Review Article

An Overview of Multiple Sequence Alignments and Cloud Computing in Bioinformatics

Table 3

Related software and projects on MapReduce.

Projects | Description
Avro | A data serialization system. URL: http://avro.apache.org/
Cassandra | A highly scalable, consistent, distributed, and structured multimaster database. URL: http://cassandra.apache.org/
Chukwa | An open source data collection system for monitoring large distributed systems. URL: http://incubator.apache.org/chukwa/
Dryad | An infrastructure that allows data-parallel programs to run on the resources of a computer cluster. URL: http://research.microsoft.com/en-us/projects/dryad/
Hadoop Common | The common utilities that support the other Hadoop subprojects. URL: http://hadoop.apache.org/
Hadoop MapReduce | A programming model and an associated implementation for processing and generating large data sets. URL: http://research.google.com/archive/mapreduce.html
HaLoop | A modified version of the Hadoop MapReduce framework that supports iterative applications by making the task scheduler loop-aware and by adding various caching mechanisms. URL: https://code.google.com/p/haloop/
HBase | The Hadoop database: a distributed, scalable big data store. URL: http://hbase.apache.org/
HDFS | The Hadoop Distributed File System, a distributed file system designed to run on commodity hardware. URL: http://hadoop.apache.org/docs/r1.0.4/hdfs_design.html
Hive | A data warehouse system for Hadoop that facilitates data summarization and ad hoc queries. URL: http://hive.apache.org/
Mahout | A scalable machine learning and data mining library. URL: http://mahout.apache.org/
MapR | A complete distribution for Apache Hadoop and HBase that includes Hive, Mahout, Pig, Cascading, and many other projects. URL: http://www.mapr.com/
Pig | A platform for analysing large data sets that consists of a high-level language for expressing data analysis programs. URL: http://pig.apache.org/
Pregel | A system for large-scale graph processing. URL: http://kowshik.github.com/JPregel/pregel_paper.pdf
Twister | A runtime that supports iterative MapReduce computations. URL: http://www.iterativemapreduce.org/
YARN | The next-generation Apache Hadoop MapReduce framework. URL: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
ZooKeeper | A high-performance manager for distributed applications. URL: http://hadoop.apache.org/zookeeper/
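
To make the MapReduce programming model listed in the table more concrete, the following is a minimal single-process sketch in plain Python. It does not use Hadoop or any other project from the table. The function names (map_phase, shuffle_phase, reduce_phase, run_mapreduce), the k-mer counting task, and the two toy sequences are illustrative assumptions, chosen only because k-mer counting is a familiar sequence-analysis step.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record, k=3):
    """Map: emit one (k-mer, 1) pair for every k-mer in a sequence record."""
    _, sequence = record  # record is a hypothetical (id, sequence) tuple
    return [(sequence[i:i + k], 1) for i in range(len(sequence) - k + 1)]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def run_mapreduce(records, k=3):
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = chain.from_iterable(map_phase(record, k) for record in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Toy input (assumed for illustration): two short DNA sequences.
records = [("seq1", "ACGTAC"), ("seq2", "GTACGT")]
counts = run_mapreduce(records, k=3)
print(counts)
```

In a real deployment the map and reduce functions would run on different cluster nodes and the framework would perform the shuffle over the network; the sketch only illustrates the division of work between the three phases.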