Research Article

Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures

Table 17

Total time obtained using a distributed system (2, 3, and 4 slaves) with Enhanced SORA and input data divided by their similarity.

| Number of Gene Pairs | Original SORA Total Time (ns) | Threaded SORA Total Time (ns), Input Data Divided by Their Similarity: 2 Slaves | 3 Slaves | 4 Slaves | % Threaded SORA Total Time (Input Data Divided by Their Similarity) vs. Original SORA Total Time: 2 Slaves | 3 Slaves | 4 Slaves |
|---|---|---|---|---|---|---|---|
| 10 | 1708984548 | 482298616 | 359402373 | 399416852 | -71.78 | -78.97 | -76.63 |
| 100 | 12714096406 | 2840539191 | 2788018116 | 2459561951 | -77.66 | -78.07 | -80.65 |
| 1000 | 1.11216E+11 | 28551368161 | 23437947918 | 18670965654 | -74.33 | -78.93 | -83.21 |
| 10000 | 3.51063E+13 | 1.22567E+12 | 1.03491E+12 | 1.24612E+12 | -96.51 | -97.05 | -96.45 |
| 100000 | X | X | X | X | X | X | X |
| 1000000 | X | X | X | X | X | X | X |
| Average | 8.80799E+12 | 3.14385E+11 | 2.65373E+11 | 3.16912E+11 | -80.07 | -83.25 | -84.24 |

X indicates that, due to limited memory, the system required many hours to find the similarity of some of the pairs.
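
The percentage columns appear to be the relative change of each threaded total time with respect to the original SORA total time. The source does not state this formula; it is inferred here from the reported numbers. The following minimal Python sketch recomputes those columns for the 10 and 100 gene-pair rows under that assumption. The names `percent_change`, `ROWS`, and `percent_row` are hypothetical and are not taken from the article.

```python
def percent_change(original_ns: float, threaded_ns: float) -> float:
    """Relative change of the threaded time versus the original time, in percent.

    Assumed formula (inferred from the table, not stated in the source):
    (threaded - original) / original * 100.
    """
    return (threaded_ns - original_ns) / original_ns * 100.0


# Total times in ns copied from the table rows for 10 and 100 gene pairs:
# original SORA time, then threaded times for 2, 3, and 4 slaves.
ROWS = {
    10: (1708984548, (482298616, 359402373, 399416852)),
    100: (12714096406, (2840539191, 2788018116, 2459561951)),
}


def percent_row(pairs: int) -> list[float]:
    """Recompute the three percentage columns for one table row."""
    original, threaded = ROWS[pairs]
    return [round(percent_change(original, t), 2) for t in threaded]


if __name__ == "__main__":
    for pairs in ROWS:
        print(pairs, percent_row(pairs))
```

Under this assumption, the recomputed values for both rows agree with the table (for example, -71.78, -78.97, and -76.63 for 10 gene pairs). A negative value means the distributed run took less total time than the original SORA.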