Review Article

An Overview of Multiple Sequence Alignments and Cloud Computing in Bioinformatics

Table 4

Parallel bioinformatics software tools.

| Function | Algorithm | Description |
|---|---|---|
| Genomic sequence mapping | CloudAligner [70] | A MapReduce-based application for mapping short reads generated by next-generation sequencing machines. URL: http://cloudaligner.sourceforge.net/ |
| | CloudBurst [71] | A parallel read mapping algorithm used for mapping next-generation sequence data to the human genome and other genomes. URL: http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst |
| | SEAL [72] | A suite of distributed applications for aligning, manipulating, and analysing short DNA reads. URL: http://biodoop-seal.sourceforge.net/ |
| | BlastReduce [73] | A parallel short DNA sequence read mapping algorithm optimized for aligning sequence data for use in SNP discovery, genotyping, and personal genomics. URL: http://www.cbcb.umd.edu/software/blastreduce/ |
| Genomic sequencing analysis | Crossbow [74] | A scalable software pipeline that combines Bowtie and SOAPsnp for whole-genome resequencing analysis. URL: http://bowtie-bio.sourceforge.net/crossbow/index.shtml |
| | Contrail [75] | An algorithm for de novo assembly of large genomes from short sequencing reads. Contrail relies on the graph-theoretic framework of de Bruijn graphs. URL: http://sourceforge.net/apps/mediawiki/contrail-bio/index.php?title=Contrail |
| | CloudBrush [76] | A distributed genome assembler based on string graphs. URL: https://github.com/ice91/CloudBrush |
| RNA sequencing analysis | Myrna [77] | A cloud computing pipeline for calculating differential gene expression in large RNA-Seq data sets. URL: http://bowtie-bio.sourceforge.net/myrna/index.shtml |
| | FX [78] | An RNA sequence analysis tool for the estimation of gene expression levels and genomic variant calling. URL: http://fx.gmi.ac.kr/ |
| | Eoulsan [79] | An integrated and flexible solution for RNA sequence data analysis of differential expression. URL: http://transcriptome.ens.fr/eoulsan/ |
| Sequence file management | Hadoop-BAM [80] | A novel library for scalable manipulation of aligned next-generation sequencing data in the Hadoop distributed computing framework. URL: http://sourceforge.net/projects/hadoop-bam/ |
| | SeqWare [81] | A tool set for working with next-generation genome sequencing technologies (Illumina, ABI SOLiD, 454) that includes a LIMS, Pipeline, and Query Engine. URL: http://sourceforge.net/projects/seqware/ |
| | GATK [82] | A MapReduce framework for analysing next-generation DNA sequencing data. URL: http://genome.cshlp.org/content/20/9/1297 |
| Phylogenetic analysis | MrsRF [83] | A scalable, efficient multicore algorithm that uses MapReduce to quickly calculate the all-to-all Robinson-Foulds (RF) distance between large numbers of trees. URL: https://code.google.com/p/mrsrf/ |
| | Nephele [84] | A set of tools that use the complete composition vector algorithm to cluster sequences into genotypes based on a distance measure. URL: https://code.google.com/p/nephele/ |
| GPU bioinformatics software | GPU-BLAST [85] | An accelerated version of NCBI-BLAST that uses a general-purpose graphics processing unit (GPU), hardware designed to rapidly manipulate and alter memory, to accelerate overall algorithm processing. URL: http://eudoxus.cheme.cmu.edu/gpublast/gpublast.html |
| | SOAP3 [86] | A short-read alignment algorithm that uses the multiprocessors in a graphics processing unit to achieve ultrafast alignments. URL: http://soap.genomics.org.cn/soap3.html |
| Search engine implementation | Hydra [87] | A protein sequence database search engine specifically designed to run efficiently on the Hadoop MapReduce framework. URL: http://www.webcitation.org/query.php?url=http://code.google.com/p/hydra-proteomics/&refdoi=10.1186/1471-2105-13-324 |
| | CloudBLAST [88] | Scalable BLAST in the cloud. URL: http://ammatsun.acis.ufl.edu/amwiki/index.php/CloudBLAST_Project |
| Miscellaneous | PeakRanger [89] | A multipurpose, ultrafast ChIP sequence peak caller. URL: http://ranger.sourceforge.net/ |
| | YunBe [90] | A gene set analysis algorithm for biomarker identification in the cloud. URL: http://lrcv-crp-sante.s3-website-us-east-1.amazonaws.com/ |
| | BioDoop [91] | A set of tools that includes modules for handling FASTA streams, wrappers for BLAST, converting sequences between different formats, and so on. URL: http://dc.crs4.it/projects/biodoop |
| | BlueSNP [92] | An algorithm that makes computationally intensive analyses feasible for large genotype-phenotype datasets. URL: https://github.com/ibm-bioinformatics/BlueSNP |
| | Quake [93] | Detection and correction of errors in DNA sequence reads. URL: http://www.cbcb.umd.edu/software/quake/ |
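
Several of the read mappers listed in Table 4 are described as MapReduce-based. The general pattern can be illustrated with a minimal, single-process sketch, assuming exact k-mer seeds and a mismatch-only extension step. It is not the implementation of any listed tool: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reads`), the parameters, and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (offset, k-mer) for every k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def map_phase(reference, reads, k):
    """Map step: emit (seed, record) pairs keyed by k-mer.

    Every reference k-mer is emitted; each read contributes only its
    leading k bases as a seed (a simplifying assumption of this sketch).
    Reads shorter than k never match a reference seed and are dropped.
    """
    for pos, seed in kmers(reference, k):
        yield seed, ("ref", None, pos)
    for read_id, read in reads.items():
        yield read[:k], ("read", read_id, 0)


def shuffle(pairs):
    """Shuffle step: group all emitted records by their seed key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(records, reference, reads, max_mismatches):
    """Reduce step: for one seed, extend each read at each reference hit."""
    ref_positions = [pos for kind, _, pos in records if kind == "ref"]
    read_ids = [rid for kind, rid, _ in records if kind == "read"]
    for rid in read_ids:
        read = reads[rid]
        for pos in ref_positions:
            window = reference[pos:pos + len(read)]
            if len(window) < len(read):
                continue  # read would run past the end of the reference
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                yield rid, pos, mismatches


def map_reads(reference, reads, k=4, max_mismatches=1):
    """Run map, shuffle, and reduce in sequence; return sorted hits."""
    groups = shuffle(map_phase(reference, reads, k))
    hits = []
    for records in groups.values():
        hits.extend(reduce_phase(records, reference, reads, max_mismatches))
    return sorted(hits)


if __name__ == "__main__":
    toy_reference = "ACGTACGTTAGC"  # hypothetical toy data
    toy_reads = {"r1": "ACGTT", "r2": "TAGC", "r3": "GGGGG"}
    for hit in map_reads(toy_reference, toy_reads):
        print(hit)  # (read id, reference offset, mismatch count)
```

In a real Hadoop deployment the map and reduce steps would run on separate nodes and the shuffle would be handled by the framework. Here all three run in one process so that the data flow is easy to follow.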