# Download matrix file from GEO:

wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE114nnn/GSE114737/matrix/GSE114737_series_matrix.txt.gz

# Extract sample information from the GEO matrix file with a Perl script:

zcat GSE114737_series_matrix.txt.gz | ./singh19_extract_sample_info.pl > samples 

# Note the ad hoc correction of a formatting inconsistency for sample GSM314902
# in the matrix file. To see the inconsistency, inspect the SRA relation lines:

zcat GSE114737_series_matrix.txt.gz | grep '^.Sample_relation.*SRA'  
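
# The relation lines are tab-separated, with one quoted field per sample, so an
# irregular entry is easier to spot with one field per line. A minimal sketch,
# assuming each SRA field embeds an experiment id of the form SRX<digits>;
# list_srx is a hypothetical helper, not part of the singh19 scripts:

```shell
# list_srx (hypothetical helper): read series-matrix text on stdin and print
# every SRX experiment id found on the Sample_relation lines, one per line.
# Usage: zcat GSE114737_series_matrix.txt.gz | list_srx
list_srx() {
    grep '^.Sample_relation' | tr '\t' '\n' | grep -o 'SRX[0-9][0-9]*'
}
```

# Comparing the number of ids printed with the number of samples shows whether
# any sample is missing an SRX entry.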

# Generation of SGA files:

# ./singh19_mk_sga.pl < samples | sh 2> /dev/null &

# Or, for parallel execution:

sh dispatch_GSE114737.sh
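
# How dispatch_GSE114737.sh distributes the work is not shown here. As a rough
# sketch of one way to parallelize, assuming ./singh19_mk_sga.pl prints one
# self-contained shell command per line (an assumption, not verified), the
# lines could be run concurrently with xargs -P, which GNU and BSD xargs
# support. run_lines_parallel is a hypothetical helper:

```shell
# run_lines_parallel (hypothetical helper): read shell commands on stdin,
# one per line, and run each in its own sh process.
# Usage: ./singh19_mk_sga.pl < samples | run_lines_parallel 4
run_lines_parallel() {
    # $1 = maximum number of concurrent jobs (defaults to 4)
    xargs -P "${1:-4}" -I CMD sh -c CMD
}
```

# Caveat: xargs interprets quotes and backslashes in its input, so command
# lines containing them would need extra care with this approach.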

# Note: the script ./singh19_mk_sga.pl calls another Perl script, ./srx2srr.pl, which
# extracts the sequence run ids (SRR#) corresponding to an experiment id (SRX#). This
# script, which accesses an HTML page at SRA, had to be updated in response to recent
# URL and format changes at the NCBI website. It may break again in the future for the
# same reason.
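
# Should the HTML page change again, one possible fallback (not what srx2srr.pl
# does) is the run-info CSV that NCBI EDirect can return, e.g.
#   esearch -db sra -query <SRX id> | efetch -format runinfo
# assuming EDirect is installed. The sketch below only parses such CSV from
# stdin and assumes the run accession is the first column; runinfo_to_srr is a
# hypothetical helper:

```shell
# runinfo_to_srr (hypothetical helper): read run-info CSV on stdin and print
# the run accessions (SRR/ERR/DRR) found in the first column, one per line.
# The header line and blank lines are skipped because they do not match.
runinfo_to_srr() {
    awk -F',' '$1 ~ /^[SED]RR[0-9]+$/ { print $1 }'
}
```

# An experiment with several runs yields several lines, one SRR id each.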

# Generation of sample description file:

./singh19_mk_txt.pl < samples > ../singh19.txt

