PhyloP basewise conservation

Description

PhyloP basewise conservation scores derived from the Multiz alignment of 30 vertebrate species. This series includes two SGA files: one computed over all vertebrates and one over placental mammals only.

Source

Files were downloaded from the UCSC Genome Browser database.

Samples

From M. musculus (July 2007 NCBI37/mm9).

#  Filename              Description                                   Feature  GEO-ID
1  phylop_vert.sga       PhyloP vertebrate 30way (score >= 2)          PhyloP   -
2  phylop_placental.sga  PhyloP placental mammal 30way (score >= 1.5)  PhyloP   -

Technical Notes

The source files are in WIG fixedStep format. Conversion into SGA was carried out with an ad hoc Perl script. To keep the SGA files reasonably compact, only positions with phyloP scores greater than a threshold value t were retained. The original real-valued scores were first diminished by t-1 and then rounded down to the nearest integer:

count = int(phyloP_score - t + 1)

The modified phyloP score is stored in the count (5th) field of the SGA files.

Threshold values: 2.0 for vertebrates, 1.5 for placental mammals.
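
The conversion described above can be sketched in Python as follows. This is an illustration only, not the original Perl script, and it rests on several assumptions: the function names (modified_count, fixedstep_to_sga) are hypothetical; the chromosome name from the WIG declaration line is copied into the first SGA field, whereas the actual SGA files may use different sequence identifiers; the strand field is written as "0" for an unoriented feature; and positions with a score equal to t are kept, following the "score >=" notation in the Samples table.

```python
from typing import Iterable, Iterator


def modified_count(score: float, t: float) -> int:
    """Return int(score - t + 1), the value written to the SGA count field."""
    return int(score - t + 1)


def fixedstep_to_sga(lines: Iterable[str], t: float,
                     feature: str = "PhyloP") -> Iterator[str]:
    """Yield tab-separated SGA lines for positions whose score reaches t."""
    chrom, pos, step = None, 0, 1
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and optional track/browser/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("fixedStep"):
            # Declaration line, e.g. "fixedStep chrom=chr1 start=3000001 step=1"
            params = dict(field.split("=", 1) for field in line.split()[1:])
            chrom = params["chrom"]
            pos = int(params["start"])          # WIG positions are 1-based
            step = int(params.get("step", 1))
            continue
        if chrom is None:
            raise ValueError("data value found before a fixedStep declaration")
        score = float(line)
        if score >= t:  # assumption: the threshold itself is included
            # Assumed field order: sequence, feature, position, strand, count
            yield "\t".join([chrom, feature, str(pos), "0",
                             str(modified_count(score, t))])
        pos += step


# Made-up scores for illustration; not taken from the real data files.
example_wig = [
    "fixedStep chrom=chr1 start=3000001 step=1",
    "0.25",
    "2.0",
    "3.7",
    "-1.1",
    "5.2",
]

for sga_line in fixedstep_to_sga(example_wig, t=2.0):
    print(sga_line)
```

With t = 2.0, the made-up scores 2.0, 3.7 and 5.2 are kept with counts 1, 2 and 4, while 0.25 and -1.1 are dropped. A score exactly at the threshold therefore maps to a count of 1, and each further full unit of phyloP score adds 1 to the count.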

UCSC also provides phyloP files for "Euarchontoglires", a subgroup of placental mammals. Those were not converted into SGA files, as they primarily contain negative values reflecting accelerated evolution rather than conservation.

References

  1. Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A.
    Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 2010 Jan;20(1):110-21. PMID: 19858363