Research Article

Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures

Table 12

Enhanced SORA average time in the distributed system (2, 3 and 4 slaves) with input data divided equally.

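| Number of Gene Pairs | Original SORA Average Time (ns) | Threaded SORA Average Time (ns) (Input Data Divided Equally), 2 Slaves | Threaded SORA Average Time (ns) (Input Data Divided Equally), 3 Slaves | Threaded SORA Average Time (ns) (Input Data Divided Equally), 4 Slaves | % Threaded SORA Average Time (Input Data Divided Equally) vs. Original SORA Average Time, 2 Slaves | % Threaded SORA Average Time (Input Data Divided Equally) vs. Original SORA Average Time, 3 Slaves | % Threaded SORA Average Time (Input Data Divided Equally) vs. Original SORA Average Time, 4 Slaves |
|---|---|---|---|---|---|---|---|
| 10 | 4.14E+07 | 6.03E+08 | 5.27E+12 | 1.06E+13 | 4.82E+08 | 2.13E+11 | 1.84E+11 |
| 100 | 1.23E+08 | 7.06E+07 | 3.20E+11 | 6.63E+11 | 4.42E+07 | 1.63E+11 | 4.60E+10 |
| 1000 | 1.11E+08 | 6.15E+07 | 3.09E+10 | 6.40E+10 | 6.61E+07 | 6.75E+09 | 6.01E+09 |
| 10000 | 3.51E+09 | 1.83E+09 | 6.37E+09 | 3.56E+09 | 3.31E+08 | 6.69E+08 | 3.46E+08 |
| 100000 | X | X | X | X | X | X | X |
| 1000000 | X | X | X | X | X | X | X |
| Average | 9.47E+08 | 6.41E+08 | 1.41E+12 | 2.83E+12 | 2.31E+08 | 9.59E+10 | 5.91E+10 |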

X indicates that, due to limited memory, the system required many hours to find the similarity of some of the pairs.
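
The Average row of Table 12 appears to be the arithmetic mean of each column over the four input sizes that completed (10 to 10000 gene pairs), with the rows marked X left out. The following minimal Python sketch checks that reading against the reported figures. The values are transcribed from the table in column order; the names `rows`, `reported_average` and `column_means` are illustrative only and do not come from the article.

```python
# Values transcribed from Table 12, in the table's column order.
# Rows marked X (100000 and 1000000 gene pairs) are omitted.
rows = {
    10:    [4.14e07, 6.03e08, 5.27e12, 1.06e13, 4.82e08, 2.13e11, 1.84e11],
    100:   [1.23e08, 7.06e07, 3.20e11, 6.63e11, 4.42e07, 1.63e11, 4.60e10],
    1000:  [1.11e08, 6.15e07, 3.09e10, 6.40e10, 6.61e07, 6.75e09, 6.01e09],
    10000: [3.51e09, 1.83e09, 6.37e09, 3.56e09, 3.31e08, 6.69e08, 3.46e08],
}

# The Average row as printed in the table.
reported_average = [9.47e08, 6.41e08, 1.41e12, 2.83e12, 2.31e08, 9.59e10, 5.91e10]


def column_means(table):
    """Return the arithmetic mean of each column over all rows of the table."""
    columns = zip(*table.values())
    return [sum(column) / len(column) for column in columns]


means = column_means(rows)
for reported, computed in zip(reported_average, means):
    print(f"reported {reported:.2e}   computed {computed:.2e}")
```

Under this assumption the computed means agree with the reported Average row to within rounding of the three significant figures shown, which suggests the averages summarise only the runs up to 10000 gene pairs.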