International Scholarly Research Notices

Review Article

An Overview of Multiple Sequence Alignments and Cloud Computing in Bioinformatics

Table 3

Related software and projects on MapReduce.


Project	Description

Avro	A data serialization system. URL: http://avro.apache.org/
Cassandra	A highly scalable, consistent, distributed, and structured multimaster database. URL: http://cassandra.apache.org/
Chukwa	An open source data collection system for monitoring large distributed systems. URL: http://incubator.apache.org/chukwa/
Dryad	An infrastructure that lets data-parallel programs use the resources of a computer cluster. URL: http://research.microsoft.com/en-us/projects/dryad/
Hadoop Common	The common utilities that support the other Hadoop subprojects. URL: http://hadoop.apache.org/
Hadoop MapReduce	A programming model and an associated implementation for processing and generating large data sets. URL: http://research.google.com/archive/mapreduce.html
HaLoop	A modified version of the Hadoop MapReduce framework that supports iterative applications by making the task scheduler loop-aware and by adding various caching mechanisms. URL: https://code.google.com/p/haloop/
HBase	The Hadoop database: a distributed, scalable big data store. URL: http://hbase.apache.org/
HDFS	Hadoop Distributed File System is a distributed file system designed to run on commodity hardware. URL: http://hadoop.apache.org/docs/r1.0.4/hdfs_design.html
Hive	A data warehouse system for Hadoop that facilitates data summarization and ad hoc queries. URL: http://hive.apache.org/
Mahout	A scalable machine learning and data mining library. URL: http://mahout.apache.org/
MapR	A complete distribution for Apache Hadoop and HBase that includes Hive, Mahout, Pig, Cascading, and many other projects. URL: http://www.mapr.com/
Pig	A platform for analysing large data sets that consists of a high-level language for expressing data analysis programs. URL: http://pig.apache.org/
Pregel	A system for large-scale graph processing. URL: http://kowshik.github.com/JPregel/pregel_paper.pdf
Twister	A runtime that supports iterative MapReduce computations. URL: http://www.iterativemapreduce.org/
YARN	Next Generation Apache Hadoop MapReduce Framework. URL: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
ZooKeeper	A high-performance coordination service for distributed applications. URL: http://hadoop.apache.org/zookeeper/
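
The MapReduce programming model that most of the projects in this table build on can be illustrated with a minimal single-process sketch. It is not the Hadoop API and does not reflect any project listed above. The k-mer counting task and the function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) are assumptions chosen for illustration, and the in-memory `shuffle` only imitates the grouping step that a real framework would carry out across cluster nodes.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(sequence, k=3):
    """Map step: emit a (k-mer, 1) pair for every window of length k."""
    return [(sequence[i:i + k], 1) for i in range(len(sequence) - k + 1)]


def shuffle(pairs):
    """Group values by key, imitating the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum all counts emitted for one k-mer."""
    return key, sum(values)


def count_kmers(sequences, k=3):
    """Run map, shuffle, and reduce in one process (hypothetical driver)."""
    mapped = chain.from_iterable(map_kmers(s, k) for s in sequences)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], k=3))
```

In a distributed setting, each input sequence (or block of sequences) would be mapped on a different node and the grouped pairs routed to reducers by key; the sketch keeps everything in memory only to show the data flow.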