Scientific Programming
Volume 2017 (2017), Article ID 2803091, 13 pages
https://doi.org/10.1155/2017/2803091
Research Article

A Tensor CP Decomposition Method for Clustering Heterogeneous Information Networks via Stochastic Gradient Descent Algorithms

Science and Technology on Information System Engineering Laboratory, National University of Defense Technology, Changsha, China

Correspondence should be addressed to Hongbin Huang

Received 1 December 2016; Revised 14 February 2017; Accepted 16 February 2017; Published 30 April 2017

Academic Editor: Fabrizio Riguzzi

Copyright © 2017 Jibing Wu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. Y. Sun, J. Han, P. Zhao, Z. Yin, H. Cheng, and T. Wu, “RankClus: integrating clustering with ranking for heterogeneous information network analysis,” in Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT '09), pp. 565–576, Saint Petersburg, Russia, March 2009.
  2. Y. Sun, Y. Yu, and J. Han, “Ranking-based clustering of heterogeneous information networks with star network schema,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09), pp. 797–805, Paris, France, July 2009.
  3. Y. Sun, J. Han, X. Yan, P. S. Yu, and T. Wu, “PathSim: meta path-based top-K similarity search in heterogeneous information networks,” Proceedings of the VLDB Endowment, vol. 4, no. 11, pp. 992–1003, 2011.
  4. J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, pp. 281–297, University of California Press, Berkeley, Calif, USA, 1967.
  5. S. Guha, R. Rastogi, and K. Shim, “Cure: an efficient clustering algorithm for large databases,” Information Systems, vol. 26, no. 1, pp. 35–58, 2001.
  6. M. Ester, H. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), E. Simoudis, J. Han, and U. M. Fayyad, Eds., pp. 226–231, AAAI Press, Portland, Ore, USA, 1996.
  7. W. Wang, J. Yang, and R. R. Muntz, “STING: a statistical information grid approach to spatial data mining,” in Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB '97), M. Jarke, M. J. Carey, K. R. Dittrich, F. H. Lochovsky, P. Loucopoulos, and M. A. Jeusfeld, Eds., pp. 186–195, Morgan Kaufmann, Athens, Greece, August 1997, http://www.vldb.org/conf/1997/P186.PDF.
  8. E. H. Ruspini, “New experimental results in fuzzy clustering,” Information Sciences, vol. 6, no. 73, pp. 273–284, 1973.
  9. U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
  10. C. Shi, Y. Li, J. Zhang, Y. Sun, and P. S. Yu, “A survey of heterogeneous information network analysis,” IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 1, pp. 17–37, 2017.
  11. J. Han, Y. Sun, X. Yan, and P. S. Yu, “Mining knowledge from data: an information network analysis approach,” in Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE '12), pp. 1214–1217, IEEE, Washington, DC, USA, April 2012.
  12. J. Chen, W. Dai, Y. Sun, and J. Dy, “Clustering and ranking in heterogeneous information networks via gamma-Poisson model,” in Proceedings of the SIAM International Conference on Data Mining (SDM '15), pp. 424–432, May 2015.
  13. J. Yang, L. Chen, and J. Zhang, “FctClus: a fast clustering algorithm for heterogeneous information networks,” PLoS ONE, vol. 10, no. 6, Article ID e0130086, 2015.
  14. C. Shi, R. Wang, Y. Li, P. S. Yu, and B. Wu, “Ranking-based clustering on general heterogeneous information networks by network projection,” in Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14), pp. 699–708, ACM, Shanghai, China, November 2014.
  15. Y. Sun, B. Norick, J. Han, X. Yan, P. S. Yu, and X. Yu, “Integrating meta-path selection with user-guided object clustering in heterogeneous information networks,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '12), pp. 1348–1356, ACM, Beijing, China, August 2012.
  16. Y. Sun, B. Norick, J. Han, X. Yan, P. S. Yu, and X. Yu, “PathSelClus: integrating meta-path selection with user-guided Object clustering in heterogeneous information networks,” ACM Transactions on Knowledge Discovery from Data, vol. 7, no. 3, pp. 723–724, 2013.
  17. X. Yu, Y. Sun, B. Norick, T. Mao, and J. Han, “User guided entity similarity search using meta-path selection in heterogeneous information networks,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM '12), pp. 2025–2029, ACM, November 2012.
  18. Y. Sun, C. C. Aggarwal, and J. Han, “Relation strength-aware clustering of heterogeneous information networks with incomplete attributes,” Proceedings of the VLDB Endowment, vol. 5, no. 5, pp. 394–405, 2012.
  19. M. Zhang, H. Hu, Z. He, and W. Wang, “Top-k similarity search in heterogeneous information networks with x-star network schema,” Expert Systems with Applications, vol. 42, no. 2, pp. 699–712, 2015.
  20. Y. Zhou and L. Liu, “Social influence based clustering of heterogeneous information networks,” in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 338–346, ACM, Chicago, Ill, USA, August 2013.
  21. L. R. Tucker, “Some mathematical notes on three-mode factor analysis,” Psychometrika, vol. 31, pp. 279–311, 1966.
  22. R. A. Harshman, “Foundations of the PARAFAC procedure: model and conditions for an ‘explanatory’ multi-mode factor analysis,” UCLA Working Papers in Phonetics, 1969.
  23. H. A. L. Kiers, “Towards a standardized notation and terminology in multiway analysis,” Journal of Chemometrics, vol. 14, no. 3, pp. 105–122, 2000.
  24. T. G. Kolda and B. W. Bader, “Tensor decompositions and applications,” SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.
  25. A. Cichocki, D. Mandic, L. De Lathauwer et al., “Tensor decompositions for signal processing applications: from two-way to multiway component analysis,” IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 145–163, 2015.
  26. W. Peng and T. Li, “Tensor clustering via adaptive subspace iteration,” Intelligent Data Analysis, vol. 15, no. 5, pp. 695–713, 2011.
  27. J. Håstad, “Tensor rank is NP-complete,” Journal of Algorithms. Cognition, Informatics and Logic, vol. 11, no. 4, pp. 644–654, 1990.
  28. S. Metzler and P. Miettinen, “Clustering Boolean tensors,” Data Mining & Knowledge Discovery, vol. 29, no. 5, pp. 1343–1373, 2015.
  29. X. Cao, X. Wei, Y. Han, and D. Lin, “Robust face clustering via tensor decomposition,” IEEE Transactions on Cybernetics, vol. 45, no. 11, pp. 2546–2557, 2015.
  30. I. Sutskever, R. Salakhutdinov, and J. B. Tenenbaum, “Modelling relational data using Bayesian clustered tensor factorization,” in Proceedings of the 23rd Annual Conference on Neural Information Processing Systems (NIPS '09), pp. 1821–1828, British Columbia, Canada, December 2009.
  31. B. Ermiş, E. Acar, and A. T. Cemgil, “Link prediction in heterogeneous data via generalized coupled tensor factorization,” Data Mining & Knowledge Discovery, vol. 29, no. 1, pp. 203–236, 2015.
  32. A. R. Benson, D. F. Gleich, and J. Leskovec, “Tensor spectral clustering for partitioning higher-order network structures,” in Proceedings of the SIAM International Conference on Data Mining (SDM '15), pp. 118–126, Vancouver, Canada, May 2015.
  33. L. Xiong, X. Chen, T.-K. Huang, J. G. Schneider, and J. G. Carbonell, “Temporal collaborative filtering with Bayesian probabilistic tensor factorization,” in Proceedings of the 10th SIAM International Conference on Data Mining (SDM '10), pp. 211–222, Columbus, Ohio, USA, May 2010.
  34. E. E. Papalexakis, L. Akoglu, and D. Ienco, “Do more views of a graph help? Community detection and clustering in multi-graphs,” in Proceedings of the 16th International Conference of Information Fusion (FUSION '13), pp. 899–905, Istanbul, Turkey, July 2013.
  35. M. Vandecappelle, M. Bousse, F. Van Eeghem, and L. De Lathauwer, “Tensor decompositions for graph clustering,” Internal Report 16-170, ESAT-STADIUS, KU Leuven, Leuven, Belgium, 2016, ftp://ftp.esat.kuleuven.be/pub/SISTA/sistakulak/reports/2016_Tensor_Graph_Clustering.pdf.
  36. W. Shao, L. He, and P. S. Yu, “Clustering on multi-source incomplete data via tensor modeling and factorization,” in Proceedings of the 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD '15), pp. 485–497, Ho Chi Minh City, Vietnam, 2015.
  37. J. D. Carroll and J.-J. Chang, “Analysis of individual differences in multidimensional scaling via an n-way generalization of ‘Eckart-Young’ decomposition,” Psychometrika, vol. 35, no. 3, pp. 283–319, 1970.
  38. L. De Lathauwer, B. De Moor, and J. Vandewalle, “On the best rank-1 and rank-(R1, R2, …, RN) approximation of higher-order tensors,” SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1324–1342, 2000.
  39. E. Acar, D. M. Dunlavy, and T. G. Kolda, “A scalable optimization approach for fitting canonical tensor decompositions,” Journal of Chemometrics, vol. 25, no. 2, pp. 67–86, 2011.
  40. S. Hansen, T. Plantenga, and T. G. Kolda, “Newton-based optimization for Kullback-Leibler nonnegative tensor factorizations,” Optimization Methods & Software, vol. 30, no. 5, pp. 1002–1029, 2015.
  41. N. Vervliet and L. De Lathauwer, “A randomized block sampling approach to canonical polyadic decomposition of large-scale tensors,” IEEE Journal on Selected Topics in Signal Processing, vol. 10, no. 2, pp. 284–295, 2016.
  42. R. Ge, F. Huang, C. Jin, and Y. Yuan, “Escaping from saddle points—online stochastic gradient for tensor decomposition,” http://arxiv.org/abs/1503.02101.
  43. T. G. Kolda, “Multilinear operators for higher-order decompositions,” Tech. Rep. SAND2006-2081, Sandia National Laboratories, 2006, http://www.osti.gov/scitech/biblio/923081.
  44. P. Paatero, “A weighted non-negative least squares algorithm for three-way ‘PARAFAC’ factor analysis,” Chemometrics & Intelligent Laboratory Systems, vol. 38, no. 2, pp. 223–242, 1997.
  45. P. Paatero, “Construction and analysis of degenerate PARAFAC models,” Journal of Chemometrics, vol. 14, no. 3, pp. 285–299, 2000.
  46. L. Bottou and N. Murata, “Stochastic approximations and efficient learning,” in The Handbook of Brain Theory and Neural Networks, 2nd edition, 2002.
  47. B. W. Bader and T. G. Kolda, “Efficient MATLAB computations with sparse and factored tensors,” SIAM Journal on Scientific Computing, vol. 30, no. 1, pp. 205–231, 2007.
  48. A. Strehl and J. Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,” Journal of Machine Learning Research, vol. 3, no. 3, pp. 583–617, 2003.