GSE31363, Transcription Factor Binding Sites by Epitope-Tag from ENCODE/UChicago.


ChIP-seq data against epitope-tagged transcription factors expressed as GFP-fusion proteins in K562 cells. The data were produced by the University of Chicago as part of the ENCODE project.

The source data were downloaded from ENCODE at UCSC



From H. sapiens (Feb 2009 GRCh37/hg19).

ChIP-seq data:

    Filename           Description                       Feature  GEO-ID
 1  EH001207_rep1.sga  K562|FOS-EGFP||UChicago rep1      FOS      GSM777644
 2  EH001207_rep2.sga  K562|FOS-EGFP||UChicago rep2      FOS      GSM777644
 3  EH001207_rep3.sga  K562|FOS-EGFP||UChicago rep3      FOS      GSM777644
 4  EH001208_rep1.sga  K562|GATA2-EGFP||UChicago rep1    GATA2    GSM777641
 5  EH001208_rep2.sga  K562|GATA2-EGFP||UChicago rep2    GATA2    GSM777641
 6  EH001209_rep1.sga  K562|HDAC8-EGFP||UChicago rep1    HDAC8    GSM777640
 7  EH001209_rep2.sga  K562|HDAC8-EGFP||UChicago rep2    HDAC8    GSM777640
 8  EH001201_rep1.sga  K562|Input (FOS)|UChicago rep1    Input    GSM777646
 9  EH001201_rep2.sga  K562|Input (FOS)|UChicago rep2    Input    GSM777646
10  EH001201_rep3.sga  K562|Input (FOS)|UChicago rep3    Input    GSM777646
11  EH001202_rep1.sga  K562|Input (GATA2)|UChicago rep1  Input    GSM777648
12  EH001202_rep2.sga  K562|Input (GATA2)|UChicago rep2  Input    GSM777648
13  EH001203_rep1.sga  K562|Input (HDAC8)|UChicago rep1  Input    GSM777647
14  EH001203_rep2.sga  K562|Input (HDAC8)|UChicago rep2  Input    GSM777647
15  EH001204_rep1.sga  K562|Input (JUNB)|UChicago rep1   Input    GSM777643
16  EH001204_rep2.sga  K562|Input (JUNB)|UChicago rep2   Input    GSM777643
17  EH001205_rep1.sga  K562|Input (JUND)|UChicago rep1   Input    GSM777642
18  EH001205_rep2.sga  K562|Input (JUND)|UChicago rep2   Input    GSM777642
19  EH001205_rep3.sga  K562|Input (JUND)|UChicago rep3   Input    GSM777642
20  EH001206_rep1.sga  K562|Input (NR4A1)|UChicago rep1  Input    GSM777645
21  EH001206_rep2.sga  K562|Input (NR4A1)|UChicago rep2  Input    GSM777645
22  EH001210_rep1.sga  K562|JUNB-EGFP||UChicago rep1     JUNB     GSM777638
23  EH001210_rep2.sga  K562|JUNB-EGFP||UChicago rep2     JUNB     GSM777638
24  EH001211_rep1.sga  K562|JUND-EGFP||UChicago rep1     JUND     GSM777639
25  EH001211_rep2.sga  K562|JUND-EGFP||UChicago rep2     JUND     GSM777639
26  EH001211_rep3.sga  K562|JUND-EGFP||UChicago rep3     JUND     GSM777639
27  EH001212_rep1.sga  K562|NR4A1-EGFP||UChicago rep1    NR4A1    GSM777637
28  EH001212_rep2.sga  K562|NR4A1-EGFP||UChicago rep2    NR4A1    GSM777637

ChIP-seq peak files:

   Filename                 Description                           Feature  GEO-ID
1  EH001207_narrowPeak.sga  K562|FOS-EGFP narrowPeak||UChicago    FOS_P    GSM777644
2  EH001208_narrowPeak.sga  K562|GATA2-EGFP narrowPeak||UChicago  GATA2_P  GSM777641
3  EH001209_narrowPeak.sga  K562|HDAC8-EGFP narrowPeak||UChicago  HDAC8_P  GSM777640
4  EH001210_narrowPeak.sga  K562|JUNB-EGFP narrowPeak||UChicago   JUNB_P   GSM777638
5  EH001211_narrowPeak.sga  K562|JUND-EGFP narrowPeak||UChicago   JUND_P   GSM777639
6  EH001212_narrowPeak.sga  K562|NR4A1-EGFP narrowPeak||UChicago  NR4A1_P  GSM777637

Notes on sample nomenclature:

All sample information was derived from a file named "files.txt" downloaded from the above-indicated URL. This file contains a list of filenames along with annotation (metadata). The sample descriptions provided here include cell type, ChIP-seq target (transcription factor), treatment and replicate. Cell types and treatments were taken unchanged from the ENCODE annotation. Some of the original ChIP-seq target names were modified in order to conform to the naming conventions of the MGA repository. Epitope-tagged transcription factors are identified by the corresponding HGNC gene symbol followed by the name of the epitope, e.g. JUND-EGFP.
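
The description strings in the tables above can be split into their parts with a few lines of Python. This is a minimal sketch, not part of the MGA tooling: the names parse_description and split_tagged_target are hypothetical, and the handling of the three-field Input rows (treatment taken as empty) is an assumption based only on the rows listed in this series.

```python
from typing import NamedTuple, Tuple


class SampleDescription(NamedTuple):
    cell_type: str
    target: str
    treatment: str
    lab: str
    replicate: str


def parse_description(description: str) -> SampleDescription:
    """Split a description such as 'K562|JUND-EGFP||UChicago rep3'."""
    parts = description.split("|")
    if len(parts) == 4:
        cell_type, target, treatment, last = parts
    elif len(parts) == 3:
        # Input rows in this series carry three fields only;
        # assumption: the treatment is empty in that case.
        cell_type, target, last = parts
        treatment = ""
    else:
        raise ValueError(f"unexpected description: {description!r}")
    # The last field holds the lab name, optionally followed by the replicate.
    lab, _, replicate = last.partition(" ")
    return SampleDescription(cell_type, target, treatment, lab, replicate)


def split_tagged_target(target: str) -> Tuple[str, str]:
    """Split 'JUND-EGFP' into (gene symbol, epitope name)."""
    gene, sep, tag = target.rpartition("-")
    if not sep:
        # No hyphen: not an epitope-tagged target (e.g. an Input sample).
        return target, ""
    return gene, tag
```

For example, parse_description("K562|JUND-EGFP||UChicago rep3") yields the cell type K562, the target JUND-EGFP, an empty treatment, the lab UChicago and the replicate rep3.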

The sample descriptions provided here were required to be unique across all ENCODE data series of the MGA repository. For this reason, the name of the data-contributing lab is indicated in the replicate field. Note further that the names of the local data files begin with the experiment identifier (dccAccession) extracted from the ENCODE annotation file. This identifier makes it possible to match a data file from this series to the corresponding peak list in hg19/encode/Uniform_TFBS.
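
Because the experiment identifier is the first part of every filename, files belonging to the same experiment can be grouped by that prefix. The sketch below is illustrative only: the function name group_by_experiment is hypothetical, and the filename pattern is inferred from the files listed in this series rather than from a documented convention.

```python
import re
from collections import defaultdict

# Pattern inferred from the filenames listed in this series (assumption).
FILENAME_RE = re.compile(r"^(EH\d+)_(rep\d+|narrowPeak)\.sga$")


def group_by_experiment(filenames):
    """Group SGA filenames by their leading experiment identifier."""
    groups = defaultdict(lambda: {"replicates": [], "peaks": []})
    for name in filenames:
        match = FILENAME_RE.match(name)
        if match is None:
            continue  # skip names that do not follow the pattern
        experiment_id, kind = match.groups()
        key = "peaks" if kind == "narrowPeak" else "replicates"
        groups[experiment_id][key].append(name)
    return dict(groups)
```

Applied to the files of this series, the identifier EH001207 would collect the three FOS replicate files together with EH001207_narrowPeak.sga.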

More information about cell lines, antibodies, treatments, protocols, etc. can be found at

Optional fields in SGA files:

In the peak files, the optional sixth field contains a positive real number reflecting the overall enrichment of the ChIP-seq signal in the peak region.
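
A peak file can be read together with its enrichment value as sketched below. The sketch assumes the usual SGA layout of five tab-separated mandatory fields (sequence ID, feature, position, strand, count) followed by optional fields; the names Peak and read_peaks are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Peak(NamedTuple):
    seq_id: str
    feature: str
    position: int
    strand: str
    count: int
    enrichment: float


def read_peaks(lines: Iterable[str]) -> Iterator[Peak]:
    """Yield one Peak per non-empty SGA line, reading the sixth field as a float."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 fields, got {len(fields)}: {line!r}")
        seq_id, feature, position, strand, count, enrichment = fields[:6]
        yield Peak(seq_id, feature, int(position), strand, int(count), float(enrichment))
```

With a file opened as handle, the peaks could then be ranked by enrichment, e.g. sorted(read_peaks(handle), key=lambda p: p.enrichment, reverse=True).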

Technical Notes

Files with a non-empty objectStatus in the file list provided by UCSC were not considered. BAM files were converted into BED using bamToBed (bedtools v2.27.0) and subsequently converted into SGA using bed2sga (ChIP-Seq v. 1.5.3). Peak files in narrowPeak format were converted into SGA using bed2sga with options --narrowPeak -e 7.
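
The objectStatus filter can be illustrated as follows. This is a sketch under stated assumptions, not the script actually used: it assumes that each line of the file list holds a filename, a tab, and then "key=value" pairs separated by semicolons, and the function names parse_file_list_line and select_files are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def parse_file_list_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split one line into the filename and a dict of metadata pairs."""
    filename, _, metadata = line.rstrip("\n").partition("\t")
    pairs = {}
    for item in metadata.split(";"):
        key, sep, value = item.strip().partition("=")
        if sep:
            pairs[key] = value
    return filename, pairs


def select_files(lines: Iterable[str], status_key: str = "objectStatus") -> List[str]:
    """Return filenames whose status field is absent or empty."""
    kept = []
    for line in lines:
        if not line.strip():
            continue
        filename, metadata = parse_file_list_line(line)
        if metadata.get(status_key, "").strip():
            continue  # a non-empty status excludes the file
        kept.append(filename)
    return kept
```

The key name is passed as a parameter so that it can be adjusted if the file list spells the field differently.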



Last update: 13 Nov 2018