Research Article

Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures

Table 11

Total time obtained using a distributed system (2, 3, and 4 slaves) with Enhanced SORA and input data divided equally.

Number of    Original SORA      Threaded SORA Total Time (ns)              % Threaded SORA Total Time (Input Data Divided
Gene Pairs   Total Time (ns)    (Input Data Divided Equally)               Equally) vs. Original SORA Total Time
                                2 Slaves      3 Slaves      4 Slaves       2 Slaves    3 Slaves    4 Slaves

10           1708984548         603416420     426734074     1296029007     -64.69      -75.03      -24.16
100          12714096406        4591177183    3915812000    2555994931     -63.89      -69.20      -79.90
1000         1.11216E+11        37052551034   23049069956   20918129803    -66.68      -79.28      -81.19
10000        3.51063E+13        1.23639E+13   1.2036E+13    7.32991E+12    -64.78      -65.72      -79.12
100000       X                  X             X             X              X           X           X
1000000      X                  X             X             X              X           X           X
Average      8.80799E+12        3.10154E+12   3.01584E+12   1.83867E+12    -65.01      -72.31      -66.09

X indicates that, due to limited memory, the system required many hours to find the similarity of some of the pairs.
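
The percentage columns appear to be the relative change of the distributed total time with respect to the original SORA total time, i.e. (distributed - original) / original x 100. This reading is an assumption inferred from the tabulated values, not a formula stated with the table. A minimal Python sketch of that arithmetic, using the 10-gene-pair row as input, follows; the function name percent_change is hypothetical and chosen only for illustration.

```python
def percent_change(original_ns: float, distributed_ns: float) -> float:
    """Relative change of the distributed run time versus the original, in percent.

    Assumed formula (inferred from the table, not stated by the authors):
    (distributed - original) / original * 100
    """
    return (distributed_ns - original_ns) / original_ns * 100.0


# Total times in nanoseconds, copied from the 10-gene-pair row of Table 11.
original = 1708984548
by_slaves = {2: 603416420, 3: 426734074, 4: 1296029007}

# Percentage change for each slave count, rounded to two decimals as in the table.
changes = {n: round(percent_change(original, t), 2) for n, t in by_slaves.items()}

for n, pct in changes.items():
    print(f"{n} slaves: {pct:.2f}%")
```

Under this assumption, the sketch reproduces the tabulated values for the 10-pair row (-64.69, -75.03, and -24.16), and a negative value means the distributed run took less time than the original.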