【4.5.1】yara

Abstract

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.

Main features:

  • Exhaustive enumeration of sub-optimal end-to-end alignments under the edit distance.
  • Excellent speed, memory footprint and accuracy.
  • Accurate mapping quality computation.
  • Support for reference genomes consisting of millions of contigs.
  • Direct output in SAM/BAM format.
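
To make "end-to-end alignments under the edit distance" concrete, here is a minimal sketch in Python. It is not Yara's algorithm (Yara relies on an index and approximate seeds); it is a plain dynamic-programming scan that reports every reference position where the whole read ends with at most a given number of edits, so sub-optimal hits are listed next to the best one. The function name `end_to_end_hits` and the parameter `max_errors` are hypothetical names chosen for this illustration.

```python
def end_to_end_hits(read, reference, max_errors):
    """List (end_position, edit_distance) for every place where the whole
    read aligns to some substring of the reference with <= max_errors edits.

    Illustration only: a naive O(len(read) * len(reference)) scan,
    not the indexed search used by a real read mapper.
    """
    # prev[i] = best edit distance between read[:i] and a reference
    # substring ending at the previous reference position.
    prev = list(range(len(read) + 1))
    hits = []
    for end, ref_char in enumerate(reference, start=1):
        # Row 0 is always 0: the alignment may start anywhere in the reference.
        curr = [0]
        for i, read_char in enumerate(read, start=1):
            mismatch = 0 if read_char == ref_char else 1
            curr.append(min(
                prev[i - 1] + mismatch,  # match or substitution
                prev[i] + 1,             # reference base not covered by the read
                curr[i - 1] + 1,         # read base not covered by the reference
            ))
        # The last row covers the entire read, i.e. an end-to-end alignment.
        if curr[-1] <= max_errors:
            hits.append((end, curr[-1]))
        prev = curr
    return hits


if __name__ == "__main__":
    # The read occurs exactly once; allowing one error also reports
    # the neighbouring, sub-optimal end positions.
    print(end_to_end_hits("ACGT", "TTACGTTT", 0))
    print(end_to_end_hits("ACGT", "TTACGTTT", 1))
```

End positions are 1-based and mark the last reference base covered by the alignment. Raising `max_errors` enlarges the set of reported hits, which is the sense in which the enumeration is exhaustive.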

Supported data:

Yara has been tested on DNA reads (i.e., Whole Genome, Exome, ChIP-seq, MeDIP-seq) produced by the following sequencing platforms:

  • Illumina GA II, HiSeq and MiSeq (single-end and paired-end).
  • Life Technologies Ion Torrent Proton and PGM.

Quality trimming is necessary for Ion Torrent reads and recommended for Illumina reads.
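
As a rough illustration of what quality trimming does, the sketch below removes low-quality bases from the 3' end of a read. It assumes Phred+33 encoded quality strings and a cut-off of 20; both are assumptions made for this example, not values recommended by Yara, and `trim_3prime` is a hypothetical helper rather than part of any tool. Dedicated trimmers use more robust strategies (for example sliding windows).

```python
def trim_3prime(sequence, quality, min_phred=20, offset=33):
    """Drop trailing bases whose Phred score is below min_phred.

    sequence and quality are the two equal-length strings of a FASTQ record.
    offset=33 assumes Phred+33 encoding (an assumption of this sketch).
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    end = len(sequence)
    # Walk backwards from the 3' end until a base passes the threshold.
    while end > 0 and ord(quality[end - 1]) - offset < min_phred:
        end -= 1
    return sequence[:end], quality[:end]


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    print(trim_3prime("ACGTAC", "IIII##"))
```

Only the tail is trimmed here; a low-quality base in the middle of the read is kept, which is why this is a simplification.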

Unsupported data:

  • RNA-seq reads spanning splicing sites.
  • Long noisy reads (e.g., Pacific Biosciences RSII, Oxford Nanopore MinION).

Previous applications:

Yara is the successor to the Masai project. Use of Masai is discouraged. Nonetheless, old Masai binaries are still available for download.


Please Cite

  • E. Siragusa, D. Weese, K. Reinert, “Fast and accurate read mapping with approximate seeds and multiple backtracking”, vol. 41, iss. 7, 2013-01-28.
  • Enrico Siragusa, “Approximate string matching for high-throughput sequencing”, p. 127, 2015-07-23.
