Short-Read Sequence Alignment
Name | Description | Paired-end option | Use FASTQ quality | Gapped | Multi-threaded | License | Link |
---|---|---|---|---|---|---|---|
BarraCUDA | A GPGPU accelerated Burrows-Wheeler transform (FM-index) short read alignment program based on BWA, supports alignment of indels with gap openings and extensions. | Yes | No | Yes | Yes (POSIX Threads and CUDA) | GPL | link |
BFAST | Explicit time and accuracy tradeoff with a prior accuracy estimation, supported by indexing the reference sequences. Optimally compresses indexes. Can handle billions of short reads. Can handle insertions, deletions, SNPs, and color errors (can map ABI SOLiD color space reads). Performs a full Smith-Waterman alignment. | | | | Yes (POSIX Threads) | GPL | link |
BLASTN | BLAST's nucleotide alignment program. Slow and not accurate for short reads; uses a sequence database (EST, Sanger sequence) rather than a reference genome. | | | | | | link |
BLAT | Made by Jim Kent. Can handle one mismatch in initial alignment step. | | | | Yes (client/server) | Free for academic and non-commercial use. | link |
Bowtie | Uses a Burrows-Wheeler transform to create a permanent, reusable index of the genome; 1.3 GB memory footprint for human genome. Aligns more than 25 million Illumina reads in 1 CPU hour. Supports Maq-like and SOAP-like alignment policies. | | | | Yes (POSIX Threads) | Artistic License | link |
BWA | Uses a Burrows-Wheeler transform to create an index of the genome. Slightly slower than Bowtie, but allows indels in alignments. | Yes | No | Yes | Yes | GPL | link |
CASHX | Quantify and manage large quantities of short-read sequence data. CASHX pipeline contains a set of tools that can be used together or as independent modules on their own. This algorithm is very accurate for perfect hits to a reference genome. | No | Free for academic and non-commercial use. | link | |||
CUDA-EC | Short-read alignment error correction using GPUs. | Yes (GPU enabled) | CUDA-EC- | ||||
CUSHAW | A CUDA compatible short read aligner to large genomes based on Burrows-Wheeler transform. | Yes | Yes | No | Yes (GPU enabled) | GPL | link |
CUSHAW2 | Long read alignment based on maximal exact match seeds. | Yes | No | Yes | Yes | GPL | link |
drFAST | Read mapping alignment software that, like mrFAST and mrsFAST, implements cache obliviousness to minimize main/cache memory transfers, but is designed for the SOLiD sequencing platform (color space reads). It also returns all possible map locations for improved structural variation discovery. | Yes | Yes (for structural variation) | Yes | No | BSD | link |
ELAND | Implemented by Illumina. Includes ungapped alignment with a finite read length. | ||||||
ERNE | Extended Randomized Numerical alignEr for accurate alignment of NGS reads. It can map bisulfite-treated reads. | Yes | Low quality bases trimming | Yes | Multithreading and MPI-enabled | GPL v3 | link |
GNUMAP | Accurately performs gapped alignment of sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. Includes adaptor trimming, SNP calling and bisulfite sequence analysis. | | Yes (also supports Illumina *_int.txt and *_prb.txt files with all 4 quality scores for each base) | | Multithreading and MPI-enabled | | link |
GEM | High-quality alignment engine (exhaustive mapping, that is 100% of sensitivity, for any number of substitutions; 1 non-exhaustive indel). Several standalone applications (mapper, split mapper, mappability, and other) provided. | Yes | Yes | Yes | GPL; GEM source is currently unavailable | link | |
GensearchNGS | Complete framework with a user-friendly GUI for analysing NGS data. It integrates a proprietary high-quality alignment algorithm and a plug-in capability for integrating various public aligners, in a framework that imports short reads, aligns them, detects variants and generates reports. It is geared towards re-sequencing projects, particularly in a diagnostic setting. | Yes | No | Yes | Yes | Commercial | link |
GMAP and GSNAP | Robust, fast short-read alignment. GMAP: longer reads, with multiple indels and splices (see entry above under Genomics analysis); GSNAP: shorter reads, with a single indel or up to two splices per read. Useful for digital gene expression, SNP and indel genotyping. Developed by Thomas Wu at Genentech. Used by the National Center for Genome Resources (NCGR) in Alpheus. | Yes | Yes | Yes | Yes | Free for academic and non-commercial use. | link |
Geneious Assembler | Fast, accurate overlap assembler with the ability to handle any combination of sequencing technology, read length, any pairing orientations, with any spacer size for the pairing, with or without a reference genome. | Yes | Commercial | link | |||
iSAAC | iSAAC has been designed to take full advantage of all the computational power available on a single server node. As a result iSAAC scales well over a broad range of hardware architectures, and alignment performance improves with hardware capabilities | Yes | Yes | Yes | Yes | Free for academic and non-commercial use. | link |
LAST | Yes | Yes | Yes | GPL | link | ||
MAQ | Ungapped alignment that takes into account quality scores for each base. | | | | | GPL | link |
mrFAST and mrsFAST | Gapped (mrFAST) and ungapped (mrsFAST) alignment software that implements cache obliviousness to minimize main/cache memory transfers. They are designed for the Illumina sequencing platform and they can return all possible map locations for improved structural variation discovery. | Yes | Yes (for structural variation) | Yes | No | BSD | mrFAST mrsFAST |
MOM | MOM or maximum oligonucleotide mapping is a query matching tool that captures a maximal length match within the short read. | Yes | link | ||||
MOSAIK | Fast gapped aligner and reference-guided assembler. Aligns reads using a banded Smith-Waterman algorithm seeded by results from a k-mer hashing scheme. Supports reads ranging in size from very short to very long. | Yes | link | ||||
MPscan | Fast aligner based on a filtration strategy (no indexing; uses q-grams and Backward Nondeterministic DAWG Matching). | | | | | | link |
Novoalign & NovoalignCS | Gapped alignment of single end and paired end Illumina GA I & II, ABI Colour space & ION Torrent reads. High sensitivity and specificity, using base qualities at all steps in the alignment. Includes adapter trimming, base quality calibration, Bi-Seq alignment, and option to report multiple alignments per read. | Yes | Yes | Yes | Multi-threading and MPI versions available with paid license. | Single threaded version free for academic and non-commercial use. | Novocraft |
NextGENe | NextGENe® software has been developed specifically for use by biologists performing analysis of next generation sequencing data from Roche Genome Sequencer FLX, Illumina GA/HiSeq, Life Technologies Applied BioSystems’ SOLiD™ System, PacBio and Ion Torrent platforms. | Yes | Yes | Yes | Yes | Commercial | Softgenetics |
PALMapper | PALMapper efficiently computes both spliced and unspliced alignments at high accuracy. Relying on a machine learning strategy combined with fast mapping based on a banded Smith-Waterman-like algorithm, it aligns around 7 million reads per hour on a single CPU. It refines the originally proposed QPALMA approach. | Yes | GPL | link | |||
Partek | Partek® Flow software has been developed specifically for use by biologists and bioinformaticians. It supports ungapped, gapped and splice-junction alignment of single and paired-end reads from Illumina, Life Technologies SOLiD™, Roche 454 and Ion Torrent raw data (with or without quality information). It integrates powerful quality control at the FASTQ/Qual level and on aligned data. Additional functionality includes trimming and filtering of raw reads, SNP and indel detection, mRNA and microRNA quantification and fusion gene detection. | Yes | Yes | Yes | Multiprocessor/Core, Client-Server installation possible | Commercial, free trial version | |
PASS | Indexes the genome, then extends seeds using pre-computed alignments of words. Works with base space as well as color space (SOLID) and can align genomic and spliced RNA-seq reads. | Yes | Yes | Yes | Yes | Free for academic and non-commercial use. | PASS_HOME |
PerM | Indexes the genome with periodic seeds to quickly find alignments with full sensitivity up to four mismatches. It can map Illumina and SOLiD reads. Unlike most mapping programs, speed increases for longer read lengths. | Yes | GPL | link | |||
QPalma | Takes advantage of quality scores, intron lengths and computational splice site predictions to perform an unbiased alignment. Can be trained to the specifics of an RNA-seq experiment and genome. Useful for splice site/intron discovery and for gene model building. (See PALMapper for a faster version.) | | | | Yes (client/server) | GPLv2 | link |
RazerS | No read length limit. Hamming or edit distance mapping with configurable error rates. Configurable and predictable sensitivity (runtime/sensitivity tradeoff). Supports paired-end read mapping. | | | | | LGPL | link |
REAL, cREAL | REAL is an efficient, accurate, and sensitive tool for aligning short reads obtained from next-generation sequencing. The programme can handle an enormous amount of single-end reads generated by the next-generation Illumina/Solexa Genome Analyzer. cREAL is a simple extension of REAL for aligning short reads obtained from next-generation sequencing to a genome with circular structure. | Yes | Yes | GPL | link | ||
RMAP | Can map reads with or without error probability information (quality scores) and supports paired-end reads or bisulfite-treated read mapping. There are no limitations on read length or number of mismatches. | Yes | Yes | Yes | GPL v3 | link | |
rNA | A randomized Numerical Aligner for Accurate alignment of NGS reads | Yes | Low quality bases trimming | Yes | Multithreading and MPI-enabled | GPL v3 | link |
RTG Investigator | Extremely fast, tolerant to high indel and substitution counts. Includes full read alignment. Product includes comprehensive pipelines for variant detection and metagenomic analysis with any combination of Illumina, Complete Genomics and Roche 454 data. | Yes | Yes, for variant calling | Yes | Yes | Free for individual investigator use. | link |
Segemehl | Can handle insertions, deletions and mismatches. Uses enhanced suffix arrays. | Yes | No | Yes | Yes | Free for non-commercial use | link |
SeqMap | Up to 5 mixed substitutions and insertions/deletions. Various tuning options and input/output formats. | | | | | Free for academic and non-commercial use. | link |
Shrec | Short read error correction with a Suffix trie data structure. | Yes (Java) | link | ||||
SHRiMP | Indexes the reference genome as of version 2. Uses masks to generate possible keys. Can map ABI SOLiD color space reads. | Yes | Yes | Yes | Yes (OpenMP) | BSD derivative | link |
SLIDER | Slider is an application for the Illumina Sequence Analyzer output that uses the "probability" files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. | | | | | | link |
SOAP, SOAP2 and SOAP3 | Robust with a small (1-3) number of gaps and mismatches. Speed improvement over BLAT; uses a 12-letter hash table. SOAP2 uses a bidirectional BWT to build the index of the reference and is much faster than the first version. A GPU-accelerated version, SOAP3/GPU, is also available; it can find all 4-mismatch alignments in tens of seconds per million reads. | Yes | Yes (multithreaded); SOAP3/GPU requires a GPU. | GPL | link | ||
SOCS | For ABI SOLiD technologies. Significant increase in time to map reads with mismatches (or color errors). Uses an iterative version of the Rabin-Karp string search algorithm. | Yes | GPL | link | |||
SSAHA and SSAHA2 | Fast for a small number of variants. | | | | | Free for academic and non-commercial use. | link |
Stampy | For Illumina reads. High specificity, and sensitive for reads with indels, structural variants, or many SNPs. Slow, but speed increases dramatically when BWA is used for the first alignment pass. | Yes | Yes | Yes | No | Free for academic and non-commercial use | link |
SToRM | Experimental; for single reads only (mainly SOLiD, with experimental Illumina support), with native SAM output. Highly sensitive for reads with many errors, indels (from 1 to 16), and SNPs. Uses spaced seeds. Authors recommend SHRiMP2. | No | Yes | Yes | Yes (OpenMP) | | link |
Taipan | De novo assembler for Illumina reads. | | | | | Free for academic and non-commercial use. | link |
UGENE | Visual interface for both Bowtie and BWA, as well as an embedded aligner. | | | | | Open source, GPL | link |
VelociMapper | FPGA-accelerated reference sequence alignment mapping tool from TimeLogic. Faster than Burrows-Wheeler transform-based algorithms like BWA and Bowtie. Supports up to 7 mismatches and/or indels with no performance penalty. Produces sensitive Smith-Waterman gapped alignments. | Yes | Yes | Yes | Yes | Commercial | TimeLogic |
XpressAlign | FPGA-based sliding window short read aligner that exploits the embarrassingly parallel property of short read alignment. Performance scales linearly with the number of transistors on a chip (i.e. performance guaranteed to double with each iteration of Moore's Law without modification to algorithm). Low power consumption is useful for datacentre equipment. Predictable runtime. Better price/performance than software sliding window aligners on current hardware, but not better than software BWT-based aligners currently. Can cope with large numbers (>2) of mismatches. Will find all hit positions for all seeds. Single-FPGA experimental version; needs work to develop it into a multi-FPGA production version. | | | | | Free for academic and non-commercial use. | link |
ZOOM | 100% sensitivity for reads between 15 and 240 bp with practical mismatches. Very fast. Supports insertions and deletions. Works with Illumina and SOLiD instruments, not 454. | Yes (GUI) No (CLI). | Commercial | link |
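
Several aligners in the table (Bowtie, BWA, BarraCUDA, CUSHAW, SOAP2) are described as indexing the genome with a Burrows-Wheeler transform. The following is a minimal Python sketch of that general idea only: it builds a toy FM-index and uses backward search to find exact occurrences of a read. It is not the implementation used by any of the listed tools; the function names `build_fm_index` and `find_exact` are hypothetical, the suffix array is built naively, and mismatches, indels and quality scores are ignored.

```python
def build_fm_index(reference):
    """Build a toy FM-index (illustrative only, not any tool's real index)."""
    text = reference + "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array: fine for a demo, far too slow for a real genome.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # first_row_start[c] = number of characters in text strictly smaller than c.
    first_row_start = {}
    total = 0
    for c in alphabet:
        first_row_start[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return suffix_array, first_row_start, occ


def find_exact(read, index):
    """Return sorted 0-based positions where read occurs exactly."""
    suffix_array, first_row_start, occ = index
    lo, hi = 0, len(suffix_array)  # half-open range of matching suffixes
    for ch in reversed(read):  # backward search: last character first
        if ch not in first_row_start:
            return []
        lo = first_row_start[ch] + occ[ch][lo]
        hi = first_row_start[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


index = build_fm_index("GATTACATTAC")
print(find_exact("TTAC", index))  # [2, 7]
```

The index is built once and reused for every read, which is the property the Bowtie entry refers to as a "permanent, reusable index". Each read is then matched in time proportional to its length rather than to the size of the reference.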