Research Article
Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures
Table 11
Total time obtained using a distributed system (2, 3, and 4 slaves) with Enhanced SORA and input data divided equally.
| Number of Gene Pairs | Original SORA Total Time (ns) | Threaded SORA Total Time (ns), 2 Slaves | Threaded SORA Total Time (ns), 3 Slaves | Threaded SORA Total Time (ns), 4 Slaves | % Threaded vs. Original, 2 Slaves | % Threaded vs. Original, 3 Slaves | % Threaded vs. Original, 4 Slaves |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 1708984548 | 603416420 | 426734074 | 1296029007 | -64.69 | -75.03 | -24.16 |
| 100 | 12714096406 | 4591177183 | 3915812000 | 2555994931 | -63.89 | -69.20 | -79.90 |
| 1000 | 1.11216E+11 | 37052551034 | 23049069956 | 20918129803 | -66.68 | -79.28 | -81.19 |
| 10000 | 3.51063E+13 | 1.23639E+13 | 1.2036E+13 | 7.32991E+12 | -64.78 | -65.72 | -79.12 |
| 100000 | X | X | X | X | X | X | X |
| 1000000 | X | X | X | X | X | X | X |
| Average | 8.80799E+12 | 3.10154E+12 | 3.01584E+12 | 1.83867E+12 | -65.01 | -72.31 | -66.09 |
X indicates that, due to limited memory, the system required many hours to find the similarity of some of the pairs.
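
The percentage columns in Table 11 are consistent with a relative change of the threaded total time against the original total time, (threaded - original) / original x 100, so a negative value means the distributed run was faster. The following minimal Python sketch illustrates that arithmetic for the 10-gene-pair row; the formula is inferred from the tabulated values rather than stated in the table, and the function name `percent_change` is hypothetical.

```python
def percent_change(original_ns: float, threaded_ns: float) -> float:
    """Relative change of threaded time vs. original time, in percent.

    Assumed formula (inferred from the table values, not stated by the authors):
    (threaded - original) / original * 100. Negative means faster.
    """
    return (threaded_ns - original_ns) / original_ns * 100.0


# Values copied from the 10-gene-pair row of Table 11 (nanoseconds).
original_sora_ns = 1708984548
threaded_sora_ns = {2: 603416420, 3: 426734074, 4: 1296029007}  # keyed by number of slaves

# Percentage change per slave count, rounded to two decimals as in the table.
changes = {
    slaves: round(percent_change(original_sora_ns, t), 2)
    for slaves, t in threaded_sora_ns.items()
}

for slaves, pct in changes.items():
    print(f"{slaves} slaves: {pct:.2f}%")
```

Under this assumption the sketch reproduces the tabulated -64.69, -75.03, and -24.16 for 2, 3, and 4 slaves, respectively.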