Next Generation Sequencing (NGS)/SOAPdenovo
We use some E. coli data from SRR001665. To download it, you could type:
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR001/SRR001665/SRR001665_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR001/SRR001665/SRR001665_2.fastq.gz
Unpack the two files:
gunzip SRR001665_1.fastq.gz
gunzip SRR001665_2.fastq.gz
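Before assembling, it can be worth checking that both mate files hold the same number of reads. The sketch below assumes plain four-line FASTQ records; the helper name count_reads is made up for this illustration and is not part of SOAPdenovo.

```shell
# Count the reads in a FASTQ file, assuming four lines per record.
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Compare the two mate files if they are present in the current directory.
if [ -f SRR001665_1.fastq ] && [ -f SRR001665_2.fastq ]; then
    n1=$(count_reads SRR001665_1.fastq)
    n2=$(count_reads SRR001665_2.fastq)
    if [ "$n1" -eq "$n2" ]; then
        echo "both files contain $n1 reads"
    else
        echo "read counts differ: $n1 vs $n2" >&2
    fi
fi
```

If the counts differ, one of the downloads was probably truncated and should be fetched again.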
You will need to get SOAPdenovo and the data prepare module:
wget http://soap.genomics.org.cn/down/x86_64.linux/SOAPdenovo31mer.tgz
tar xvzf SOAPdenovo31mer.tgz
We also have to make a config file, which we name cont.config:
#maximal read length
max_rd_len=36
[LIB]
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#use for contig building only
asm_flags=1
#in which order the reads are used while scaffolding
rank=1
#fastq files
q1=./SRR001665_1.fastq
q2=./SRR001665_2.fastq
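The value of max_rd_len should match the longest read in the data. As a rough check, the following sketch prints the maximum read length of a FASTQ file. It assumes four-line records, so the sequence is always the second line of a record; the function name max_read_len is hypothetical.

```shell
# Print the length of the longest read in a FASTQ file.
# Assumes four-line records: the sequence is line 2 of each record.
max_read_len() {
    awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
         END { print max + 0 }' "$1"
}

# Report the value for the first mate file if it has been unpacked.
if [ -f SRR001665_1.fastq ]; then
    echo "longest read: $(max_read_len SRR001665_1.fastq)"
fi
```

For this data set the result should agree with the read length of 36 used in the config file.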
We then run the assembly using a k-mer size of 31 (the read length is 36). We use the whole SOAP pipeline by specifying the "all" parameter:
./SOAPdenovo31mer all -K 31 -s cont.config -o EC
Because asm_flags is set to 1, the library is used for contig building only; setting asm_flags to 3 would use the same library for scaffolding as well. With the configuration above, SOAP terminates in the scaffolding step with a floating point exception, as there is nothing to scaffold with. The contigs will nevertheless be found in EC.contigs.
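Once the run has finished, a quick look at the contig file shows whether the assembly produced anything sensible. The sketch below counts the sequences and total bases in a FASTA file. The helper name fasta_summary is invented for this example, and the file name EC.contigs is taken from the text; adjust it if your SOAPdenovo version names the output differently.

```shell
# Print the number of sequences and the total number of bases in a FASTA file.
fasta_summary() {
    awk '/^>/ { n++; next }
         { bases += length($0) }
         END { printf "%d sequences, %d bases\n", n, bases }' "$1"
}

# Summarise the contigs if the assembly has produced the file.
if [ -f EC.contigs ]; then
    fasta_summary EC.contigs
fi
```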