The Scientific World Journal
Volume 2014, Article ID 749028, 11 pages
http://dx.doi.org/10.1155/2014/749028
Research Article

Efficient and Scalable Graph Similarity Joins in MapReduce

1 College of Information System and Management, National University of Defense Technology, Changsha 410073, China
2 Nagoya University, Nagoya, Japan

Received 17 March 2014; Accepted 29 May 2014; Published 8 July 2014

Academic Editor: Jian J. Zhang

Copyright © 2014 Yifan Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. X. Yan and J. Han, “gSpan: graph-based substructure pattern mining,” in Proceedings of the 2nd IEEE International Conference on Data Mining (ICDM '02), pp. 721–724, December 2002.
  2. X. Yan, P. S. Yu, and J. Han, “Graph indexing: a frequent structure-based approach,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '04), pp. 335–346, June 2004.
  3. H. He and A. K. Singh, “Closure-tree: an index structure for graph queries,” in Proceedings of the 22nd International Conference on Data Engineering (ICDE '06), p. 38, April 2006.
  4. X. Yan, P. S. Yu, and J. Han, “Substructure similarity search in graph databases,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 766–777, June 2005.
  5. Y. Tian and J. M. Patel, “TALE: a tool for approximate large graph matching,” in Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE '08), pp. 963–972, April 2008.
  6. A. Sanfeliu and K.-S. Fu, “A distance measure between attributed relational graphs for pattern recognition,” IEEE Transactions on Systems, Man and Cybernetics, vol. 13, no. 3, pp. 353–362, 1983.
  7. H. Bunke and G. Allermann, “Inexact graph matching for structural pattern recognition,” Pattern Recognition Letters, vol. 1, no. 4, pp. 245–253, 1983.
  8. M. R. Garey and D. S. Johnson, Computers and Intractability, vol. 174, Freeman, San Francisco, Calif, USA, 1979.
  9. X. Zhao, C. Xiao, X. Lin, and W. Wang, “Efficient graph similarity joins with edit distance constraints,” in Proceedings of the 28th IEEE International Conference on Data Engineering (ICDE '12), pp. 834–845, April 2012.
  10. J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
  11. D. Deng, G. Li, S. Hao, J. Wang, J. Feng, and W. S. Li, “MaSSJoin: a MapReduce-based method for scalable string similarity joins,” in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE '14), pp. 340–351, Chicago, Ill, USA.
  12. B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.
  13. A. Broder and M. Mitzenmacher, “Network applications of bloom filters: a survey,” Internet Mathematics, vol. 1, no. 4, pp. 485–509, 2004.
  14. S. Cohen and Y. Matias, “Spectral bloom filters,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 241–252, June 2003.
  15. F. N. Afrati and J. D. Ullman, “Optimizing joins in a map-reduce environment,” in Proceedings of the 13th International Conference on Extending Database Technology: Advances in Database Technology (EDBT '10), pp. 99–110, March 2010.
  16. Y. N. Silva and J. M. Reed, “Exploiting MapReduce-based similarity joins,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '12), pp. 693–696, May 2012.
  17. G. Wang, B. Wang, X. Yang, and G. Yu, “Efficiently indexing large sparse graphs for similarity search,” IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 3, pp. 440–451, 2012.
  18. J. Cohen, “Graph twiddling in a MapReduce world,” Computing in Science and Engineering, vol. 11, no. 4, pp. 29–41, 2009.
  19. S. Lattanzi, B. Moseley, S. Suri, and S. Vassilvitskii, “Filtering: a method for solving graph problems in MapReduce,” in Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '11), pp. 85–94, June 2011.
  20. B. Bahmani, K. Chakrabarti, and D. Xin, “Fast personalized PageRank on MapReduce,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 973–984, June 2011.
  21. U. Kang, H. Tong, J. Sun, C.-Y. Lin, and C. Faloutsos, “GBASE: a scalable and general graph management system,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), pp. 1091–1099, August 2011.
  22. B. Bahmani, R. Kumar, and S. Vassilvitskii, “Densest subgraph in streaming and MapReduce,” Proceedings of the VLDB Endowment, vol. 5, no. 5, pp. 454–465, 2012.
  23. F. N. Afrati, D. Fotakis, and J. D. Ullman, “Enumerating subgraph instances using map-reduce,” in Proceedings of the 29th International Conference on Data Engineering (ICDE '13), pp. 62–73, April 2013.
  24. V. Rastogi, A. Machanavajjhala, L. Chitnis, and A. Das Sarma, “Finding connected components in map-reduce in logarithmic rounds,” in Proceedings of the 29th International Conference on Data Engineering (ICDE '13), pp. 50–61, April 2013.
  25. G. Malewicz, M. H. Austern, A. J. C. Bik et al., “Pregel: a system for large-scale graph processing,” in Proceedings of the International Conference on Management of Data (SIGMOD '10), pp. 135–145, June 2010.
  26. E. Krepska, T. Kielmann, W. Fokkink, and H. Bal, “HipG: parallel processing of large-scale graphs,” ACM SIGOPS Operating Systems Review, vol. 45, no. 2, pp. 3–13, 2011.
  27. J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, “PowerGraph: distributed graph-parallel computation on natural graphs,” in Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI '12), pp. 17–30, 2012.
  28. Y. Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson, “From ‘think like a vertex’ to ‘think like a graph’,” Proceedings of the VLDB Endowment, vol. 7, no. 3, 2013.
  29. Z. Shang and J. X. Yu, “Catch the wind: graph workload balancing on cloud,” in Proceedings of the 29th International Conference on Data Engineering (ICDE '13), pp. 553–564, April 2013.