A public AMI (Amazon Machine Image) containing our ChIP-Seq analysis tools has been created on the AWS cloud. The latest versions of all the tools have been installed and configured. This page describes how to use our tools on the AWS cloud. It is a three-step process (if you are using the AWS cloud for the first time).
Type the following in your terminal and follow the instructions:
$ aws configure
AWS Access Key ID: AK***************SQ
AWS Secret Access Key: AO**************************************Ek
Default region name [us-east-1]: us-east-1
Default output format [json]: json
Check the instance types offered by AWS and choose one that suits your compute requirements and budget.
# Optionally, you can also start the EC2 instance via the AWS CLI. See the AWS CLI documentation for more information.
You can try it for free with the AWS Free Tier.
Type the following in your terminal to start using the ChIP-Seq analysis tools:
The software package can be found in /home/ubuntu/chipseq. The main subdirectories are the following:
The main ChIP-Seq programs use a simplified GFF format called SGA (Simplified Genome Annotation), which is sorted by sequence name and position. In a data analysis pipeline, the SGA file is typically generated from one of several richer formats, such as Solexa genome mapping files, BED files, or the FPS (Functional Position Set) files used by the Signal Search Analysis (SSA) programs at SIB.
SGA is a single-line-oriented and tab-delimited format with the following five obligatory fields:
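The field list itself is not reproduced here. As a rough sketch only, assuming the five fields are, in order, sequence name, feature, position, strand, and tag count (an assumption about the SGA layout, not something this page confirms), a single SGA line could be read in Python as follows. The names `SgaRecord` and `parse_sga_line` are hypothetical helpers introduced for illustration; they are not part of the ChIP-Seq tools.

```python
from typing import NamedTuple


class SgaRecord(NamedTuple):
    # Assumed field order; check it against the SGA specification.
    seq_name: str
    feature: str
    position: int
    strand: str
    count: int


def parse_sga_line(line: str) -> SgaRecord:
    """Parse one tab-delimited SGA line into a record.

    Only the first five fields are read; any further fields are ignored.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 tab-separated fields, got {len(fields)}")
    seq_name, feature, position, strand, count = fields[:5]
    return SgaRecord(seq_name, feature, int(position), strand, int(count))


# Illustrative line with made-up values.
record = parse_sga_line("chr1\tH3K4me3\t10500\t+\t1\n")
print(record.seq_name, record.feature, record.position, record.strand, record.count)
```

Keeping the record as a named tuple makes the sort key explicit: sorting by `(seq_name, position)` matches the ordering by sequence name and position that the SGA format requires.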
An example of using the chipcor program (feature correlation tool) is the following:
Here, 'H3K4me3.sga' is the file containing the list of ChIP-Seq tags for the H3K4me3 histone modification. The '-c' option specifies the cut-off on input counts. Tags on the positive strand (option '-A "H3K4me3 +"') are correlated with tags for the same histone modification on the opposite strand (option '-B "H3K4me3 -"'), and their relative distances are collected in a histogram over the range [-1000, +1000] (options '-b -1000' and '-e 1000'). The output file (H3K4me3_fc_n1.out) contains all histogram entries in simple text format. Each histogram entry gives the count density (option '-n 1') of the target feature (H3K4me3 tags on the negative strand) at a given distance relative to the reference feature (H3K4me3 tags on the positive strand). 'Count density' means the number of tags per base pair.
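To make the idea of a strand correlation histogram concrete, here is a minimal Python sketch of the computation described above. It is not the chipcor implementation. It assumes that counts above the cut-off are truncated to the cut-off, and that count density is the summed target count in a bin divided by the total reference count times the bin width; both are assumptions about how the normalization works. The function name `strand_correlation` and its parameters are hypothetical.

```python
from collections import defaultdict


def strand_correlation(tags, feature, begin=-1000, end=1000, window=1, cutoff=1):
    """Histogram of '-' strand tags relative to '+' strand tags.

    tags: iterable of (seq_name, feature, position, strand, count) tuples.
    Returns a list of (bin_start_distance, count_density) pairs.
    """
    ref = defaultdict(list)  # '+' strand tags (reference), grouped by sequence
    tgt = defaultdict(list)  # '-' strand tags (target), grouped by sequence
    for seq, feat, pos, strand, count in tags:
        if feat != feature:
            continue
        count = min(count, cutoff)  # assumed meaning of the count cut-off
        if strand == "+":
            ref[seq].append((pos, count))
        elif strand == "-":
            tgt[seq].append((pos, count))

    n_bins = (end - begin) // window + 1
    hist = [0] * n_bins
    ref_total = 0
    # Brute-force pairing, quadratic per sequence; fine for a small illustration.
    for seq, ref_tags in ref.items():
        for rpos, rcount in ref_tags:
            ref_total += rcount
            for tpos, tcount in tgt.get(seq, []):
                dist = tpos - rpos
                if begin <= dist <= end:
                    hist[(dist - begin) // window] += rcount * tcount

    # Assumed normalization: target tags per base pair per reference tag.
    scale = ref_total * window
    return [
        (begin + i * window, (h / scale) if scale else 0.0)
        for i, h in enumerate(hist)
    ]


# Tiny made-up data set: two '+' tags and one '-' tag on the same sequence.
example_tags = [
    ("chr1", "H3K4me3", 100, "+", 1),
    ("chr1", "H3K4me3", 150, "-", 1),
    ("chr1", "H3K4me3", 200, "+", 1),
]
for distance, density in strand_correlation(example_tags, "H3K4me3", begin=-100, end=100):
    if density > 0:
        print(distance, density)
```

Because the sketch compares every reference tag with every target tag on the same sequence, it does not scale to real data sets. A tool working on position-sorted SGA input can restrict the comparison to a sliding window instead.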
Other useful tools installed on AWS include: