This track presents a comprehensive survey of cis-regulatory elements in the mouse genome, using ChIP-seq (Robertson et al., 2007) to identify transcription factor binding sites and chromatin modification profiles in many mouse (C57BL/6) tissues and primary cells, including bone marrow, cerebellum, cortex, heart, kidney, liver, lung, spleen, mouse embryonic fibroblast cells (MEFs) and embryonic stem (ES) cells.
Specifically, the Ren lab examined RNA polymerase II (PolII), the co-activator protein p300, the insulator protein CTCF, and two chromatin modification marks, H3K4me3 and H3K4me1, because of their demonstrated utility in identifying promoters, enhancers and insulator elements (Barski et al., 2007; Blow et al., 2010; Heintzman et al., 2009; Kim et al., 2007; Kim et al., 2005a; Visel et al., 2009). Enrichment of H3K4me3 or PolII signal is a strong indicator of an active promoter, while the presence of p300 or H3K4me1 outside of promoter regions has been used as a mark for enhancers. CTCF binding sites are considered a mark for potential insulator elements. For each transcription factor or chromatin mark in each tissue, ChIP-seq was carried out with at least two biological replicates. Each experiment produced 20-30 million monoclonal, uniquely mapped tags.
Overall design
Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse/).
Enrichment and Library Preparation: Chromatin immunoprecipitation was performed according to Ren Lab ChIP Protocol (http://bioinformatics-renlab.ucsd.edu/RenLabChipProtocolV1.pdf).
Library construction was performed according to Ren Lab Library Protocol (http://bioinformatics-renlab.ucsd.edu/RenLabLibraryProtocolV1.pdf).
Sequencing and Analysis: Samples were sequenced on Illumina Genome Analyzer II, Genome Analyzer IIx, and HiSeq 2000 platforms for 36 cycles. Image analysis, base calling and alignment to the mouse genome version mm9 were performed using Illumina's RTA and Genome Analyzer Pipeline software. Alignment to the mouse genome was performed using ELAND or Bowtie (Langmead et al., 2009) with a seed length of 25 and allowing up to two mismatches. Only sequences that mapped to a single location were used for further analysis. Of those sequences, clonal reads, defined as reads having the same start position on the same strand, were discarded. BED and wig files were created using custom Perl scripts.
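The clonal-read filter described above can be sketched with standard tools. The following is a minimal illustration, not the authors' custom Perl script: the function name `dedup_clonal` is hypothetical, the input is assumed to be a six-column tab-separated BED stream, and "start position" is taken to mean the 5' end of the read (column 2 on the plus strand, column 3 on the minus strand). For fixed-length 36 bp reads this is equivalent to comparing the leftmost coordinate.

```shell
# Hypothetical helper: keep only the first read seen for each
# (chromosome, 5' end, strand) combination in a 6-column BED stream.
dedup_clonal() {
    awk 'BEGIN { FS = OFS = "\t" }
         {
             # Assumption: the read start is the 5-prime end, which is the
             # BED end coordinate for reads on the minus strand.
             pos = ($6 == "-") ? $3 : $2
             key = $1 FS pos FS $6
             if (!(key in seen)) {
                 seen[key] = 1
                 print
             }
         }'
}
```

It reads from standard input and writes to standard output, for example `dedup_clonal < SAMPLE.bed > SAMPLE_dedup.bed`, where `SAMPLE_dedup.bed` is an illustrative output name.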
No. | Filename | Description | Feature | GEO-ID |
1 | GSM769008.sga | ES-Bruce4 H3K4me3 | H3K4me3 | GSM769008 |
2 | GSM769009.sga | ES-Bruce4 H3K4me1 | H3K4me1 | GSM769009 |
3 | GSM769012.sga | Lung H3K4me3 | H3K4me3 | GSM769012 |
4 | GSM769013.sga | Lung H3K4me1 | H3K4me1 | GSM769013 |
5 | GSM769014.sga | Liver H3K4me3 | H3K4me3 | GSM769014 |
6 | GSM769015.sga | Liver H3K4me1 | H3K4me1 | GSM769015 |
7 | GSM769016.sga | Kidney H3K4me3 | H3K4me3 | GSM769016 |
8 | GSM769023.sga | Kidney H3K4me1 | H3K4me1 | GSM769023 |
9 | GSM769017.sga | Heart H3K4me3 | H3K4me3 | GSM769017 |
10 | GSM769025.sga | Heart H3K4me1 | H3K4me1 | GSM769025 |
11 | GSM769036.sga | Spleen H3K4me3 | H3K4me3 | GSM769036 |
12 | GSM769031.sga | Spleen H3K4me1 | H3K4me1 | GSM769031 |
13 | GSM769027.sga | Cerebellum H3K4me3 | H3K4me3 | GSM769027 |
14 | GSM769018.sga | Cerebellum H3K4me1 | H3K4me1 | GSM769018 |
15 | GSM769021.sga | BoneMarrow H3K4me3 | H3K4me3 | GSM769021 |
16 | GSM769024.sga | BoneMarrow H3K4me1 | H3K4me1 | GSM769024 |
17 | GSM769026.sga | Cortex H3K4me3 | H3K4me3 | GSM769026 |
18 | GSM769022.sga | Cortex H3K4me1 | H3K4me1 | GSM769022 |
19 | GSM769029.sga | MEF H3K4me3 | H3K4me3 | GSM769029 |
20 | GSM769028.sga | MEF H3K4me1 | H3K4me1 | GSM769028 |
21 | GSM769032.sga | Heart Input | Input | GSM769032 |
22 | GSM769033.sga | Kidney Input | Input | GSM769033 |
23 | GSM769034.sga | Liver Input | Input | GSM769034 |
24 | GSM769035.sga | Lung Input | Input | GSM769035 |
25 | GSM769037.sga | Spleen Input | Input | GSM769037 |
26 | GSM769010.sga | ES-Bruce4 Input | Input | GSM769010 |
27 | GSM769011.sga | BoneMarrow Input | Input | GSM769011 |
28 | GSM769019.sga | Cortex Input | Input | GSM769019 |
29 | GSM769020.sga | Cerebellum Input | Input | GSM769020 |
30 | GSM769030.sga | MEF Input | Input | GSM769030 |
SRA files were downloaded from GEO and processed using the following bash commands:
fastq-dump SAMPLE.sra
bowtie --sam -l 36 -n 3 mm9 -q SAMPLE.fastq > SAMPLE.sam
awk 'BEGIN {FS="\t"} $3 != "*" {print $0}' SAMPLE.sam > SAMPLE_clean.sam
samtools view -bS -o SAMPLE.bam SAMPLE_clean.sam
samtools sort SAMPLE.bam SAMPLE_sorted
bamToBed -i SAMPLE_sorted.bam > SAMPLE.bed
bed2sga.pl -s mm9 -f FEATURE < SAMPLE.bed | sort -s -k1,1 -k3,3n -k4,4 | compactsga > SAMPLE.sga
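The last command sorts the SGA records by sequence, position and strand before compacting them. As a rough illustration of that ordering and merging, the sketch below emulates it with `sort` and `awk`. It rests on assumptions: SGA lines are taken to have five tab-separated fields (sequence, feature, position, strand, count), and compacting is taken to mean summing the counts of lines that share the first four fields. The function name `compact_sga_sketch` is hypothetical, this is not the `compactsga` implementation, and `LC_ALL=C` is added here only to make the sort order reproducible across locales.

```shell
# Sketch only: sort SGA-like lines and merge identical positions.
# Assumes five tab-separated fields: sequence, feature, position, strand, count.
compact_sga_sketch() {
    LC_ALL=C sort -s -k1,1 -k3,3n -k4,4 |
    awk 'BEGIN { FS = OFS = "\t" }
         {
             key = $1 FS $2 FS $3 FS $4
             # A new key closes the previous group: emit it with its summed count.
             if (NR > 1 && key != prev) print prev, total
             if (key != prev) total = 0
             total += $5
             prev = key
         }
         END { if (NR > 0) print prev, total }'
}
```

Because the sort key does not include the feature field, the sketch assumes a single feature per file, as is the case for the per-sample files listed above.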