Research Article
Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures
Table 17
Total time obtained using a distributed system (2, 3, and 4 slaves) with Enhanced SORA and input data divided by their similarity.
| Number of Gene Pairs | Original SORA Total Time (ns) | Threaded SORA Total Time (ns), Input Data Divided by Their Similarity: 2 Slaves | 3 Slaves | 4 Slaves | % Threaded SORA Total Time (Input Data Divided by Their Similarity) vs. Original SORA Total Time: 2 Slaves | 3 Slaves | 4 Slaves |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 1708984548 | 482298616 | 359402373 | 399416852 | -71.78 | -78.97 | -76.63 |
| 100 | 12714096406 | 2840539191 | 2788018116 | 2459561951 | -77.66 | -78.07 | -80.65 |
| 1000 | 1.11216E+11 | 28551368161 | 23437947918 | 18670965654 | -74.33 | -78.93 | -83.21 |
| 10000 | 3.51063E+13 | 1.22567E+12 | 1.03491E+12 | 1.24612E+12 | -96.51 | -97.05 | -96.45 |
| 100000 | X | X | X | X | X | X | X |
| 1000000 | X | X | X | X | X | X | X |
| Average | 8.80799E+12 | 3.14385E+11 | 2.65373E+11 | 3.16912E+11 | -80.07 | -83.25 | -84.24 |

X indicates that, due to limited memory, the system required many hours to find the similarity of some of the pairs.
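
The percentage columns of Table 17 appear to be the relative change of the distributed total time with respect to the original SORA total time. The following minimal sketch reproduces the first row (10 gene pairs) under that assumption; the function name `percent_change` and the variable names are illustrative and do not come from the article.

```python
def percent_change(original_ns: float, new_ns: float) -> float:
    """Relative change of new_ns with respect to original_ns, in percent.

    Assumed formula (not stated in the article): (new - original) / original * 100.
    """
    return (new_ns - original_ns) / original_ns * 100.0


# Values taken from the 10-gene-pair row of Table 17 (nanoseconds).
original_sora_ns = 1708984548
distributed_ns = {2: 482298616, 3: 359402373, 4: 399416852}  # slaves -> total time

# Round to two decimals, as in the table.
changes = {
    slaves: round(percent_change(original_sora_ns, t), 2)
    for slaves, t in distributed_ns.items()
}

for slaves, change in changes.items():
    print(f"{slaves} slaves: {change:.2f}%")
```

Under the same assumption, the percentages in the Average row match the mean of the four per-row percentages rather than the change between the averaged times; for 2 slaves, the mean of -71.78, -77.66, -74.33, and -96.51 is -80.07.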