Scientific Programming
Volume 2017, Article ID 4382348, 11 pages
https://doi.org/10.1155/2017/4382348
Research Article

Using Hierarchical Latent Dirichlet Allocation to Construct Feature Tree for Program Comprehension

1School of Information Engineering, Yangzhou University, Yangzhou, China
2Tongda College of Nanjing University of Posts and Telecommunications, Nanjing, China
3Hainan University, Haikou, China

Correspondence should be addressed to Bin Li; lb@yzu.edu.cn

Received 9 November 2016; Revised 10 February 2017; Accepted 26 March 2017; Published 12 April 2017

Academic Editor: Michele Risi

Copyright © 2017 Xiaobing Sun et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. T. Fritz, G. C. Murphy, E. Murphy-Hill, J. Ou, and E. Hill, “Degree-of-knowledge: modeling a developer's knowledge of code,” ACM Transactions on Software Engineering and Methodology, vol. 23, no. 2, article 14, 2014.
  2. Z. Soh, “Context and vision: studying two factors impacting program comprehension,” in Proceedings of the IEEE 19th International Conference on Program Comprehension (ICPC '11), pp. 258–261, June 2011.
  3. J.-M. Burkhardt, F. Détienne, and S. Wiedenbeck, “Object-oriented program comprehension: effect of expertise, task and phase,” Empirical Software Engineering, vol. 7, no. 2, pp. 115–156, 2002.
  4. T. Nakagawa, Y. Kamei, H. Uwano, A. Monden, K. Matsumoto, and D. M. German, “Quantifying programmers' mental workload during program comprehension based on cerebral blood flow measurement: a controlled experiment,” in Proceedings of the 36th International Conference on Software Engineering (ICSE '14), pp. 448–451, June 2014.
  5. W. Maalej, R. Tiarks, T. Roehm, and R. Koschke, “On the comprehension of program comprehension,” ACM Transactions on Software Engineering and Methodology, vol. 23, no. 4, pp. 31:1–31:37, 2014.
  6. Y. Kong, M. Zhang, and D. Ye, “A belief propagation-based method for task allocation in open and dynamic cloud environments,” Knowledge-Based Systems, vol. 115, pp. 123–132, 2017.
  7. M. P. O'Brien, “Software comprehension: a review and research direction,” Tech. Rep., Department of Computer Science & Information Systems, University of Limerick, Limerick, Ireland, 2003.
  8. T. Kosar, M. Mernik, and J. C. Carver, “Program comprehension of domain-specific and general-purpose languages: comparison using a family of experiments,” Empirical Software Engineering, vol. 17, no. 3, pp. 276–304, 2012.
  9. K. Maruyama, T. Omori, and S. Hayashi, “A visualization tool recording historical data of program comprehension tasks,” in Proceedings of the 22nd International Conference on Program Comprehension (ICPC '14), pp. 207–211, June 2014.
  10. C. Y. Chong, S. P. Lee, and T. C. Ling, “Efficient software clustering technique using an adaptive and preventive dendrogram cutting approach,” Information and Software Technology, vol. 55, no. 11, pp. 1994–2012, 2013.
  11. P. Andritsos and V. Tzerpos, “Information-theoretic software clustering,” IEEE Transactions on Software Engineering, vol. 31, no. 2, pp. 150–165, 2005.
  12. M. Bauer and M. Trifu, “Architecture-aware adaptive clustering of OO systems,” in Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR '04), pp. 3–14, Tampere, Finland, March 2004.
  13. G. Santos, M. T. Valente, and N. Anquetil, “Remodularization analysis using semantic clustering,” in Proceedings of the Software Evolution Week—IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE '14), pp. 224–233, Antwerp, Belgium, February 2014.
  14. A. Kuhn, S. Ducasse, and T. Gîrba, “Semantic clustering: identifying topics in source code,” Information and Software Technology, vol. 49, no. 3, pp. 230–243, 2007.
  15. C. Li, Z. Xu, C. Qiao, and T. Luo, “Hierarchical clustering driven by cognitive features,” Science China. Information Sciences, vol. 57, no. 1, 012109, 14 pages, 2014.
  16. N. Anquetil and T. C. Lethbridge, “Experiments with clustering as a software remodularization method,” in Proceedings of the 6th Working Conference on Reverse Engineering (WCRE '99), pp. 235–255, October 1999.
  17. Z. Fu, K. Ren, J. Shu, X. Sun, and F. Huang, “Enabling personalized search over encrypted outsourced data with efficiency improvement,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 9, pp. 2546–2559, 2016.
  18. B. Dit, M. Revelle, M. Gethers, and D. Poshyvanyk, “Feature location in source code: a taxonomy and survey,” Journal of Software: Evolution and Process, vol. 25, no. 1, pp. 53–95, 2013.
  19. G. Santos, M. T. Valente, and N. Anquetil, “Remodularization analysis using semantic clustering,” in Proceedings of the 1st Software Evolution Week—IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE '14), pp. 224–233, Antwerp, Belgium, February 2014.
  20. B. L. Vinz and L. H. Etzkorn, “Improving program comprehension by combining code understanding with comment understanding,” Knowledge-Based Systems, vol. 21, no. 8, pp. 813–825, 2008.
  21. Y. S. Maarek, D. M. Berry, and G. E. Kaiser, “An information retrieval approach for automatically constructing software libraries,” IEEE Transactions on Software Engineering, vol. 17, no. 8, pp. 800–813, 1991.
  22. Z. Fu, X. Wu, C. Guan, X. Sun, and K. Ren, “Toward efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 12, pp. 2706–2716, 2016.
  23. D. M. Blei, T. L. Griffiths, and M. I. Jordan, “The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies,” Journal of the ACM, vol. 57, no. 2, article 7, 2010.
  24. D. M. Blei, T. L. Griffiths, M. I. Jordan, and J. B. Tenenbaum, “Hierarchical topic models and the nested Chinese restaurant process,” in Advances in Neural Information Processing Systems 16 [Neural Information Processing Systems, NIPS 2003, December 8–13, 2003, Vancouver and Whistler, British Columbia, Canada], pp. 17–24, 2003.
  25. X. Sun, X. Liu, J. Hu, and J. Zhu, “Empirical studies on the NLP techniques for source code data preprocessing,” in Proceedings of the 3rd International Workshop on Evidential Assessment of Software Technologies (EAST '14), pp. 32–39, May 2014.
  26. U. Erdemir, U. Tekin, and F. Buzluca, “Object oriented software clustering based on community structure,” in Proceedings of the 18th Asia Pacific Software Engineering Conference (APSEC '11), pp. 315–321, IEEE, Ho Chi Minh, Vietnam, December 2011.
  27. A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, and S. Panichella, “Using IR methods for labeling source code artifacts: is it worthwhile?” in Proceedings of the IEEE 20th International Conference on Program Comprehension (ICPC '12), pp. 193–202, Passau, Germany, June 2012.
  28. O. Maqbool and H. Babri, “Hierarchical clustering for software architecture recovery,” IEEE Transactions on Software Engineering, vol. 33, no. 11, pp. 759–780, 2007.
  29. O. Maqbool and H. A. Babri, “The weighted combined algorithm: a linkage algorithm for software clustering,” in Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR '04), pp. 15–24, Tampere, Finland, March 2004.
  30. M. Shtern and V. Tzerpos, “Clustering methodologies for software engineering,” Advances in Software Engineering, vol. 2012, Article ID 792024, 18 pages, 2012.
  31. A. Mahmoud and N. Niu, “Evaluating software clustering algorithms in the context of program comprehension,” in Proceedings of the 21st International Conference on Program Comprehension (ICPC '13), pp. 162–171, May 2013.
  32. G. Bavota, M. Gethers, R. Oliveto, D. Poshyvanyk, and A. De Lucia, “Improving software modularization via automated analysis of latent topics and dependencies,” ACM Transactions on Software Engineering and Methodology, vol. 23, no. 1, Article ID 2559935, 4 pages, 2014.
  33. C. J. van Rijsbergen, Information Retrieval, Butterworths, London, UK, 1979.
  34. X. Liu, X. Sun, B. Li, and J. Zhu, “PFN: a novel program feature network for program comprehension,” in Proceedings of the IEEE/ACIS 13th International Conference on Computer and Information Science (ICIS '14), pp. 349–354, Taiyuan, China, June 2014.
  35. V. Rajlich and N. Wilde, “The role of concepts in program comprehension,” in Proceedings of the 10th International Workshop on Program Comprehension (IWPC '02), pp. 271–278, June 2002.
  36. S. Mancoridis, B. S. Mitchell, C. Rorres, Y. Chen, and E. R. Gansner, “Using automatic clustering to produce high-level system organizations of source code,” in Proceedings of the 6th International Workshop on Program Comprehension (IWPC '98), p. 45, Ischia, Italy, June 1998.
  37. D. Binkley, D. Heinz, D. J. Lawrie, and J. Overfelt, “Understanding LDA in source code analysis,” in Proceedings of the 22nd International Conference on Program Comprehension (ICPC '14), pp. 26–36, ACM, Hyderabad, India, June 2014.
  38. B. Gu, V. S. Sheng, K. Y. Tay, W. Romano, and S. Li, “Incremental support vector learning for ordinal regression,” IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 7, pp. 1403–1416, 2015.
  39. B. S. Mitchell and S. Mancoridis, “On the automatic modularization of software systems using the bunch tool,” IEEE Transactions on Software Engineering, vol. 32, no. 3, pp. 193–208, 2006.
  40. S. Islam, J. Krinke, D. Binkley, and M. Harman, “Coherent clusters in source code,” Journal of Systems and Software, vol. 88, no. 1, pp. 1–24, 2014.
  41. S. Mirarab, A. Hassouna, and L. Tahvildari, “Using Bayesian belief networks to predict change propagation in software systems,” in Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC '07), pp. 177–186, Alberta, Canada, June 2007.
  42. F. Deng and J. A. Jones, “Weighted system dependence graph,” in Proceedings of the 5th IEEE International Conference on Software Testing, Verification and Validation (ICST '12), pp. 380–389, Montreal, Canada, April 2012.
  43. M. Gethers, A. Aryani, and D. Poshyvanyk, “Combining conceptual and domain-based couplings to detect database and code dependencies,” in Proceedings of the IEEE 12th International Working Conference on Source Code Analysis and Manipulation (SCAM '12), pp. 144–153, Riva del Garda, Italy, September 2012.
  44. L. Guerrouj, “Normalizing source code vocabulary to support program comprehension and software quality,” in Proceedings of the 35th International Conference on Software Engineering (ICSE '13), pp. 1385–1388, IEEE, San Francisco, Calif, USA, May 2013.
  45. A. De Lucia, M. Di Penta, and R. Oliveto, “Improving source code lexicon via traceability and information retrieval,” IEEE Transactions on Software Engineering, vol. 37, no. 2, pp. 205–227, 2011.
  46. N. Anquetil and T. C. Lethbridge, “Recovering software architecture from the names of source files,” Journal of Software Maintenance and Evolution, vol. 11, no. 3, pp. 201–221, 1999.
  47. K. Sartipi and K. Kontogiannis, “A user-assisted approach to component clustering,” Journal of Software Maintenance and Evolution, vol. 15, no. 4, pp. 265–295, 2003.
  48. S. Kawaguchi, P. K. Garg, M. Matsushita, and K. Inoue, “MUDABlue: an automatic categorization system for open source repositories,” in Proceedings of the 11th Asia-Pacific Software Engineering Conference (APSEC '04), pp. 184–193, Busan, Korea, December 2004.
  49. A. Kuhn, S. Ducasse, and T. Gîrba, “Enriching reverse engineering with semantic clustering,” in Proceedings of the 12th Working Conference on Reverse Engineering (WCRE '05), pp. 133–142, Pittsburgh, Pa, USA, November 2005.
  50. A. Corazza, S. Di Martino, V. Maggio, and G. Scanniello, “Investigating the use of lexical information for software system clustering,” in Proceedings of the 15th European Conference on Software Maintenance and Reengineering (CSMR '11), pp. 35–44, IEEE, Oldenburg, Germany, March 2011.
  51. G. Scanniello, M. Risi, and G. Tortora, “Architecture recovery using Latent Semantic Indexing and k-Means: an empirical evaluation,” in Proceedings of the 8th IEEE International Conference on Software Engineering and Formal Methods (SEFM '10), pp. 103–112, September 2010.
  52. A. Corazza, S. Di Martino, and G. Scanniello, “A probabilistic based approach towards software system clustering,” in Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR '10), pp. 88–96, Madrid, Spain, March 2010.
  53. G. Scanniello, A. D'Amico, C. D'Amico, and T. D'Amico, “Using the Kleinberg algorithm and vector space model for software system clustering,” in Proceedings of the 18th IEEE International Conference on Program Comprehension (ICPC '10), pp. 180–189, Minho, Portugal, July 2010.
  54. G. Scanniello and A. Marcus, “Clustering support for static concept location in source code,” in Proceedings of the IEEE 19th International Conference on Program Comprehension (ICPC '11), pp. 1–10, Kingston, Canada, June 2011.
  55. J. I. Maletic and A. Marcus, “Supporting program comprehension using semantic and structural information,” in Proceedings of the 23rd International Conference on Software Engineering (ICSE '01), pp. 103–112, Toronto, Canada, May 2001.
  56. V. Tzerpos and R. Holt, “ACDC: an algorithm for comprehension-driven clustering,” in Proceedings of the 7th Working Conference on Reverse Engineering (WCRE '00), pp. 258–267, Brisbane, Australia, November 2000.