BWA介绍
BWA 全名 Burrow-Wheeler Aligner
BWA是一款将DNA序列mapping到参考基因组上的软件,例如比对到人类基因组。其由三个算法组成BWA-backtrack,BWA-SW和BWA-MEM。在该软件作者的github上可以看到对三个算法的不同用处的解释https://github.com/lh3/bwa
- The first algorithm is designed for Illumina sequence reads up to 100bp, while the rest two for longer sequences ranged from 70bp to a few megabases.
- BWA-MEM and BWA-SW share similar features such as the support of long reads and chimeric alignment, but BWA-MEM, which is the latest, is generally recommended as it is faster and more accurate.
- BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
并且在BWA命令中可以分别调用这三个算法,如:aln/samse/sampe for BWA-backtrack,bwasw for BWA-SW,mem for the BWA-MEM #### BWA下载及安装
下载地址:https://sourceforge.net/projects/bio-bwa/
tar jxvf bwa-0.7.15.tar.bz2
cd bwa-0.7.15
make
可以用绝对路径调用,也可以加入环境变量
BWA使用步骤
我们使用BWA对短序列进行比对时,一般是使用BWA-MEM算法,按照作者的意思,BWA-MEM更加适合于Illumina 70-100bp reads的比对
使用BWA构建参考基因组的index
bwa index [-p prefix] [-a algoType] <in.db.fasta>
-p 输出文件的前缀,例如对hg19.fa建索引,那么输出文件前缀就写hg19
-a 选择构建索引的算法, >is 是默认的算法,是由于其快速简单,但是不适合超过2G的基因组,因此人类基因组需要指定另一个算法 > >bwtsw 对于基因组大于2G的,适合使用该算法,比如人类基因组
使用BWA-MEM算法进行比对
bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d zDropoff] [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O gapOpenPen] [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v verboseLevel] db.prefix reads.fq [mates.fq]
其实我们用不到这么多参数,主要的一般是以下几个:
-t 设置线程数
-R 设置read标头 >Complete read group header line. ’ can be used in STR and will be converted to a TAB in the output SAM. The read group ID will be attached to every read in the output. An example is ’@RG:foo:bar’.
比如模仿一个:-R @RG:sample_name:sample_name:WGS:Illumina
-M 将较短的split hits标记为secondary,与picard兼容
最后来一个例子:
# 建索引
bwa index -a bwtsw -p hg19 hg19.fa 1>hg19.bwa_index.log 2>&1
# 比对
bwa mem -t 5 -M -R @RG\tID:KPGP-00001_L1\tSM:KPGP-00001_L1\tLB:WGS\tPL:Illumina ~/reference/index/bwa/hg19 KPGP-00001_L1_R1.fq.gz KPGP-00001_L1_R2.fq.gz 1>KPGP-00001_L1.sam 2>KPGP-00001_L1.bwa.align.log
本文出自于http://www.bioinfo-scrounger.com转载请注明出处