This track, produced as part of the ENCODE Project, displays maps of histone modifications genome-wide using ChIP-seq in different cell lines. The ChIP-seq method involves first using formaldehyde to cross-link histones and other DNA-associated proteins to genomic DNA within cells. The cross-linked chromatin is subsequently extracted, sheared, and immunoprecipitated using specific antibodies. After reversal of cross-links, the immunoprecipitated DNA is sequenced and mapped to the human reference genome. The relative enrichment of each antibody-target (epitope) across the genome is inferred from the density of mapped fragments.
Chemical modifications (e.g. methylation or acetylation) of the histone proteins present in chromatin influence gene expression by changing how accessible the chromatin is to transcription factors. Shown for each experiment (defined as a particular antibody and a particular cell type) is a track of enrichment for the specifically modified histone (Signal), along with sites that have the greatest enrichment (Peaks). Also included for each cell type is the input signal, which represents the control condition where no antibody targeting was performed. In general, the following modifications are associated with particular functional states of chromatin:
H3K4me3 and H3K9ac are considered to be marks of active or potentially active promoter regions. H3K4me1 and H3K27ac are considered to be marks of active or potentially active enhancer regions. H3K36me3 and H3K79me2 are considered to be marks of transcriptional elongation. H3K27me3 and H3K9me3 are considered to be marks of inactive regions.
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Overall design
Cells were grown according to the approved ENCODE cell culture protocols. Briefly, cells were crosslinked, chromatin was extracted and sonicated using a Bioruptor sonicator (Diagenode) to an average size of 300-500 bp, and individual ChIP assays were performed using antibodies to modified histones. For the K562 and Ntera2 histone ChIP-seq samples, immunoprecipitates were collected using protein G-coupled magnetic beads; a detailed ChIP and library protocol can be found at http://www.roadmapepigenomics.org/protocols. For the U2OS histone ChIP-seq samples, immunoprecipitates were collected using StaphA cells; a detailed protocol can be found at http://expression.genomecenter.ucdavis.edu/chip.html. Library DNA was quantitated using either a Nanodrop or a BioAnalyzer and sequenced on an Illumina GA2.
The sequencing reads were mapped to the genome using the Eland alignment program. ChIP-seq data were scored based on sequence reads (length ~30 bp) that align uniquely to the human genome. From the mapped tags, a signal map of ChIP DNA fragments (average fragment length ~200 bp) was constructed, where the signal height is the number of overlapping fragments at each nucleotide position in the genome.
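The signal-map construction described above can be illustrated with a minimal sketch. It assumes BED-formatted tags (chrom, start, end, name, score, strand) and assumes that each tag is extended from its 5' end to the fragment length in the direction of its strand; the source does not state how tags were extended, and the function name `signal_map` is hypothetical.

```shell
# Sketch only: per-base fragment coverage from BED tags.
# Assumption: each tag is extended from its 5' end to a fixed fragment length.
# Usage: signal_map [FRAGMENT_LENGTH] < tags.bed   (default length 200)
signal_map() {
  awk -v frag="${1:-200}" 'BEGIN { FS = OFS = "\t" }
  {
    # BED is 0-based, half-open. Plus-strand tags extend rightwards from
    # their start; minus-strand tags extend leftwards from their end.
    if ($6 == "-") { s = $3 - frag; if (s < 0) s = 0; e = $3 }
    else           { s = $2; e = $2 + frag }
    # Signal height = number of fragments overlapping each position.
    for (p = s; p < e; p++) depth[$1 OFS p]++
  }
  END { for (k in depth) print k, depth[k] }' | LC_ALL=C sort -k1,1 -k2,2n
}
```

The output has three tab-separated columns: chromosome, 0-based position, and signal height. Because every covered base is held in memory, this form is only practical for small regions or toy inputs, not for a whole genome.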
For each 1 Mb segment of each chromosome, a peak height threshold was determined by requiring a false discovery rate <= 0.05, where the number of peaks above threshold was compared with the number obtained from multiple simulations of a random null background with the same number of mapped reads (also accounting for the fraction of mappable bases for sequence tags in that 1 Mb segment). The number of mapped tags in a putative binding region was then compared with the number of mapped tags in the same region from an input DNA control, after normalizing the control by correlating tag counts in genomic 10 kb windows. Using a binomial test, only regions with a p-value <= 0.05 were considered significantly enriched compared to the input DNA control.
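The two filtering steps above can be sketched as follows. Both functions are hypothetical illustrations, not the pipeline's actual code. `fdr_threshold` assumes a precomputed table of candidate thresholds with observed and mean simulated peak counts for one 1 Mb segment. `binom_tail` assumes a one-sided test in which, under the null hypothesis, a tag in the region is equally likely (p = 0.5) to come from the ChIP sample or from the normalized input; the source does not specify these details.

```shell
# Sketch only: pick the lowest peak-height threshold whose estimated FDR
# (mean simulated peaks / observed peaks) does not exceed the cutoff.
# stdin: THRESHOLD OBSERVED_PEAKS MEAN_SIMULATED_PEAKS, in ascending threshold order.
fdr_threshold() {
  awk -v fdr="${1:-0.05}" '
  $2 > 0 && $3 / $2 <= fdr { print $1; found = 1; exit }
  END { if (!found) exit 1 }'
}

# Sketch only: one-sided binomial tail P(X >= k) with n = k + c and p = 0.5,
# where k = ChIP tags in the region and c = normalized input tags (rounded).
# Usage: binom_tail CHIP_TAGS NORMALIZED_INPUT_TAGS
binom_tail() {
  awk -v k="$1" -v c="$2" 'BEGIN {
    c = int(c + 0.5); n = k + c
    logc = 0; tail = 0                  # logc holds log C(n, i)
    for (i = 0; i <= n; i++) {
      if (i >= k) tail += exp(logc + n * log(0.5))
      if (i < n) logc += log(n - i) - log(i + 1)
    }
    printf "%.6g\n", tail
  }'
}
```

For example, a region with 3 ChIP tags and 1 normalized input tag gives P(X >= 3) = 5/16 = 0.3125 under these assumptions, so it would not pass the 0.05 cutoff, whereas 10 ChIP tags against 0 input tags gives 1/1024 and would pass.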
# | Filename | Description | Feature | GEO-ID |
1 | GSM788088.sga | K562 H3K27me3B | H3K27me3B | GSM788088 |
2 | GSM788085.sga | K562 H3K4me1 | H3K4me1 | GSM788085 |
3 | GSM788087.sga | K562 H3K4me3B | H3K4me3B | GSM788087 |
4 | GSM788082.sga | K562 H3K9acB | H3K9acB | GSM788082 |
5 | GSM788074.sga | K562 Input | Input | GSM788074 |
6 | GSM788071.sga | NT2-D1 H3K27me3B | H3K27me3B | GSM788071 |
7 | GSM788081.sga | NT2-D1 H3K36me3B | H3K36me3B | GSM788081 |
8 | GSM788083.sga | NT2-D1 H3K4me1 | H3K4me1 | GSM788083 |
9 | GSM788072.sga | NT2-D1 H3K4me3B | H3K4me3B | GSM788072 |
10 | GSM788086.sga | NT2-D1 H3K9acB | H3K9acB | GSM788086 |
11 | GSM788080.sga | NT2-D1 H3K9me3 | H3K9me3 | GSM788080 |
12 | GSM788077.sga | NT2-D1 Input | Input | GSM788077 |
13 | GSM818826.sga | PANC-1 H3K27ac | H3K27ac | GSM818826 |
14 | GSM818827.sga | PANC-1 H3K4me1_pAb-037-050 | H3K4me1_pAb-037-050 | GSM818827 |
15 | GSM818828.sga | PANC-1 Input | Input | GSM818828 |
16 | GSM788073.sga | PBMC H3K27me3B | H3K27me3B | GSM788073 |
17 | GSM788084.sga | PBMC H3K4me1 | H3K4me1 | GSM788084 |
18 | GSM788075.sga | PBMC H3K4me3B | H3K4me3B | GSM788075 |
19 | GSM788079.sga | PBMC H3K9me3 | H3K9me3 | GSM788079 |
20 | GSM788070.sga | PBMC Input | Input | GSM788070 |
21 | GSM788076.sga | U2OS H3K36me3B | H3K36me3B | GSM788076 |
22 | GSM788078.sga | U2OS H3K9me3 | H3K9me3 | GSM788078 |
23 | GSM788069.sga | U2OS Input | Input | GSM788069 |
SRA files were downloaded from GEO and processed using the following bash commands:
fastq-dump SAMPLE.sra
bowtie --best --strata -m1 --sam -l 36 -n 3 h_sapiens_ncbi36 -q SAMPLE.fastq > SAMPLE.sam
awk 'BEGIN {FS="\t"} $3 != "*" {print $0}' SAMPLE.sam > SAMPLE_clean.sam
samtools view -bS -o SAMPLE.bam SAMPLE_clean.sam
samtools sort SAMPLE.bam SAMPLE_sorted
bamToBed -i SAMPLE_sorted.bam > SAMPLE.bed
bed2sga.pl -s hg18 -f FEATURE < SAMPLE.bed | sort -s -k1,1 -k3,3n -k4,4 | compactsga > SAMPLE.sga
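To make the purpose of the awk step in the commands above concrete, here is a small sketch that applies the same filter to a toy SAM stream. The function name `drop_unmapped` and the toy records are hypothetical. In SAM output, the third column (RNAME) is `*` for reads that did not align, so the filter keeps header lines and aligned reads and drops the rest.

```shell
# Sketch only: keep SAM header lines and aligned reads; drop reads whose
# reference name (column 3) is "*", i.e. reads that did not align.
drop_unmapped() {
  awk 'BEGIN { FS = "\t" } $3 != "*"'
}

# Toy input: two header lines, one aligned read (r1), one unaligned read (r2).
toy_sam() {
  printf '@HD\tVN:1.0\tSO:unsorted\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf 'r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
}
```

Running `toy_sam | drop_unmapped` keeps both header lines and the r1 record and removes r2.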