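bwa mem performs well and is reliable and stable, so it has long been the first-choice aligner for WES/WGS data analysis. As the technology has developed, a growing number of alternatives have become available.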
1. Minimap2 ( https://github.com/lh3/minimap2 )
Heng Li, the author of bwa, wrote in his blog post "Minimap2 and the future of BWA" ( https://lh3.github.io/2018/04/02/minimap2-and-the-future-of-bwa ):
The story on short-read alignment is a little complex, though. I did plan to replace bwa-mem with minimap2 on short-read alignment, too. In the minimap2 paper, I showed that minimap2 is 3X as fast as bwa-mem and achieves comparable accuracy to bwa-mem on short variant calling (section 3.3). In the final round of the review, an reviewer still argued that minimap2 wouldn’t work well for short reads. I didn’t think so at the time given that Illumina Inc. has independently evaluated minimap2 and observed that minimap2 is highly competitive. Therefore, I didn’t follow the suggestion of that reviewer.
However, Andrew Carroll at DNAnexus has recently showed me that minimap2 was slower than bwa-mem on two NovaSeq runs at his hand. Part of the reason, I guess, is that the two NovaSeq runs have a little higher error rate, which triggers expensive heuristics in minimap2 more frequently. Furthermore, I also realize that bwa-mem will be better than minimap2 at Hi-C alignment because bwa-mem is more sensitive to short matches. In the end, I admit minimap2 is not ready to replace bwa-mem all around. I owe that reviewer an apology.
Generally, I still think minimap2 is a competitive short-read mapper and I will use it often in my research projects. However, given that the performance of minimap2 is not as consistent as bwa-mem for short reads of varying quality, bwa-mem is still better for production uses, at least before I find a way to improve minimap2.
More than three years have passed since that post, and minimap2 has gone through many releases in the meantime. Its GitHub page now says:
Minimap2 is a versatile sequence alignment program that aligns DNA or mRNA sequences against a large reference database. Typical use cases include: (1) mapping PacBio or Oxford Nanopore genomic reads to the human genome; (2) finding overlaps between long reads with error rate up to ~15%; (3) splice-aware alignment of PacBio Iso-Seq or Nanopore cDNA or Direct RNA reads against a reference genome; (4) aligning Illumina single- or paired-end reads; (5) assembly-to-assembly alignment; (6) full-genome alignment between two closely related species with divergence below ~15%.
For ~10kb noisy reads sequences, minimap2 is tens of times faster than mainstream long-read mappers such as BLASR, BWA-MEM, NGMLR and GMAP. It is more accurate on simulated long reads and produces biologically meaningful alignment ready for downstream analyses. For >100bp Illumina short reads, minimap2 is three times as fast as BWA-MEM and Bowtie2, and as accurate on simulated data. Detailed evaluations are available from the minimap2 paper or the preprint.
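Minimap2 is therefore worth trying as the aligner in research-oriented applications that do not need high precision at any individual point mutation but do care about software efficiency (for example, calling runs of homozygosity (ROH) from NGS data). It may yet replace bwa mem across the board in the future.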
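For anyone who wants to try this, here is a minimal sketch of how a short-read minimap2 run could be assembled from Python. It relies on minimap2's documented `-a` (SAM output), `-x sr` (short-read preset) and `-t` (threads) options. The function names and the file names `ref.fa`, `sample_R1.fq.gz` and `sample_R2.fq.gz` are hypothetical placeholders, not something from the original post.

```python
import shlex
import shutil
import subprocess
from typing import List, Optional


def minimap2_sr_command(reference: str, reads1: str,
                        reads2: Optional[str] = None,
                        threads: int = 8) -> List[str]:
    """Build a minimap2 command line for Illumina short reads.

    minimap2 writes SAM to standard output when -a is given.
    """
    cmd = ["minimap2", "-a", "-x", "sr", "-t", str(threads), reference, reads1]
    if reads2 is not None:
        # A second FASTQ file makes this a paired-end run.
        cmd.append(reads2)
    return cmd


def run_to_sam(cmd: List[str], out_sam: str) -> None:
    """Run an aligner command and redirect its stdout to a SAM file."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    with open(out_sam, "w") as handle:
        subprocess.run(cmd, stdout=handle, check=True)


# Hypothetical file names, used only for illustration.
cmd = minimap2_sr_command("ref.fa", "sample_R1.fq.gz", "sample_R2.fq.gz", threads=4)
print(shlex.join(cmd))
# prints: minimap2 -a -x sr -t 4 ref.fa sample_R1.fq.gz sample_R2.fq.gz
```

Calling `run_to_sam(cmd, "sample.sam")` would then run the alignment, provided minimap2 is installed and the input files exist.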
2. DRAGMAP ( https://github.com/Illumina/DRAGMAP )
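Open-source software from Illumina. It derives from the DRAGEN family of software from Edico Genome, whose alignment component has now been open-sourced. The commercially licensed DRAGEN software is of course still an option; it has now been upgraded to v3.9, with further improvements in accuracy.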
3. bwa-mem2 ( https://github.com/bwa-mem2/bwa-mem2 )
A version of bwa mem optimized for modern CPU instruction sets. Recent releases have greatly reduced its memory and storage usage.
Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. It produces alignment identical to bwa and is ~1.3-3.1x faster depending on the use-case, dataset and the running machine.
The original bwa was developed by Heng Li (@lh3). Performance enhancement in bwa-mem2 was primarily done by Vasimuddin Md (@yuk12) and Sanchit Misra (@sanchit-misra) from Parallel Computing Lab, Intel. Bwa-mem2 is distributed under the MIT license.
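Because bwa-mem2 keeps the bwa mem command-line style, switching mostly means building a bwa-mem2 index and changing the executable name. The sketch below assembles the two commands (`bwa-mem2 index` and `bwa-mem2 mem`) from Python. The helper names, file names and the read-group string are hypothetical placeholders. It assumes the reference has to be re-indexed with `bwa-mem2 index` rather than reusing an existing bwa index.

```python
import shlex
from typing import List, Optional


def bwa_mem2_index_command(reference: str) -> List[str]:
    """Build the command that creates the bwa-mem2 index for a FASTA file."""
    return ["bwa-mem2", "index", reference]


def bwa_mem2_mem_command(reference: str, reads1: str,
                         reads2: Optional[str] = None,
                         threads: int = 8,
                         read_group: Optional[str] = None) -> List[str]:
    """Build a bwa-mem2 alignment command (SAM is written to stdout)."""
    cmd = ["bwa-mem2", "mem", "-t", str(threads)]
    if read_group is not None:
        # -R takes the read-group header line, with tabs written as a literal \t.
        cmd += ["-R", read_group]
    cmd += [reference, reads1]
    if reads2 is not None:
        # A second FASTQ file makes this a paired-end run.
        cmd.append(reads2)
    return cmd


# Hypothetical file names and read group, used only for illustration.
index_cmd = bwa_mem2_index_command("ref.fa")
align_cmd = bwa_mem2_mem_command(
    "ref.fa", "sample_R1.fq.gz", "sample_R2.fq.gz",
    threads=4, read_group=r"@RG\tID:sample\tSM:sample\tPL:ILLUMINA",
)
print(shlex.join(index_cmd))
print(shlex.join(align_cmd))
```

Each list can be passed to `subprocess.run(..., check=True)` with stdout redirected to a file, in the same way as for bwa mem.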
4. Sentieon
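Sentieon is also commercially licensed software. Its aligner is likewise mainly an optimized version of bwa mem, and both its accuracy and its performance are good.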