Mathematical Problems in Engineering
Volume 2015 (2015), Article ID 293176, 9 pages
http://dx.doi.org/10.1155/2015/293176
Research Article

Obtaining Cross Modal Similarity Metric with Deep Neural Architecture

1School of Computers, Beijing University of Posts and Telecommunications, Beijing 100876, China
2Engineering Research Center of Information Networks, Ministry of Education, Beijing 100876, China

Received 14 September 2014; Accepted 24 December 2014

Academic Editor: Florin Pop

Copyright © 2015 Ruifan Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. L. Li, H. Peng, J. Kurths, Y. Yang, and H. J. Schellnhuber, “Chaos-order transition in foraging behavior of ants,” Proceedings of the National Academy of Sciences, vol. 111, no. 23, pp. 8392–8397, 2014.
  2. Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–27, 2009.
  3. K. Chen and A. Salman, “Learning speaker-specific characteristics with a deep neural architecture,” IEEE Transactions on Neural Networks, vol. 22, no. 11, pp. 1744–1756, 2011.
  4. A.-R. Mohamed, G. E. Dahl, and G. Hinton, “Acoustic modeling using deep belief networks,” IEEE Transactions on Audio, Speech and Language Processing, vol. 20, no. 1, pp. 14–22, 2012.
  5. G. Heigold, V. Vanhoucke, A. Senior et al., “Multilingual acoustic models using distributed deep neural networks,” in Proceedings of the 38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '13), pp. 8619–8623, IEEE Computer Society, Vancouver, Canada, May 2013.
  6. D. Yu, L. Deng, and F. Seide, “The deep tensor neural network with applications to large vocabulary speech recognition,” IEEE Transactions on Audio, Speech and Language Processing, vol. 21, no. 2, pp. 388–396, 2013.
  7. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
  8. H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, “Unsupervised learning of hierarchical representations with convolutional deep belief networks,” Communications of the ACM, vol. 54, no. 10, pp. 95–103, 2011.
  9. A. Krizhevsky, I. Sutskever, and G. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, vol. 25, pp. 1106–1114, Morgan Kaufmann, Lake Tahoe, Nev, USA, 2012.
  10. I. J. Goodfellow, D. Erhan, P. L. Carrier et al., “Challenges in representation learning: a report on three machine learning contests,” in Proceedings of the 20th International Conference on Neural Information Processing, pp. 117–124, IEEE Computer Society, Daegu, Korea, 2013.
  11. C. Farabet, C. Couprie, L. Najman, and Y. Lecun, “Learning hierarchical features for scene labeling,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.
  12. F. Feng, X. Wang, and R. Li, “Cross-modal retrieval with correspondence autoencoder,” in Proceedings of the 22nd ACM International Conference on Multimedia, pp. 7–16, ACM, Orlando, Fla, USA, 2014.
  13. D. M. Blei and M. I. Jordan, “Modeling annotated data,” in Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 127–134, ACM, New York, NY, USA, 2003.
  14. E. P. Xing, R. Yan, and A. G. Hauptmann, “Mining associated text and images with dual-wing harmoniums,” in Proceedings of the 21st Annual Conference on Uncertainty in Artificial Intelligence (UAI '05), pp. 633–641, AUAI Press, Arlington, Va, USA, July 2005.
  15. Y. Jia, M. Salzmann, and T. Darrell, “Learning cross-modality similarity for multinomial data,” in Proceedings of the ACM International Conference on Multimedia Information Retrieval, pp. 2407–2414, IEEE, Washington, DC, USA, November 2011.
  16. S. Chopra, R. Hadsell, and Y. LeCun, “Learning a similarity metric discriminatively, with application to face verification,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), pp. 539–546, IEEE, Washington, DC, USA, June 2005.
  17. J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng, “Multimodal deep learning,” in Proceedings of the 28th International Conference on Machine Learning, pp. 689–696, Omnipress, Bellevue, Wash, USA, July 2011.
  18. N. Srivastava and R. Salakhutdinov, “Multimodal learning with deep Boltzmann machines,” in Advances in Neural Information Processing Systems, vol. 25, pp. 2231–2239, Morgan Kaufmann, Lake Tahoe, Nev, USA, 2012.
  19. B. McFee and G. Lanckriet, “Learning multi-modal similarity,” Journal of Machine Learning Research, vol. 12, no. 8, pp. 491–523, 2011.
  20. J. Masci, M. M. Bronstein, A. M. Bronstein, and J. Schmidhuber, “Multimodal similarity-preserving hashing,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 4, pp. 824–830, 2014.
  21. P. Smolensky, “Information processing in dynamical systems: foundations of harmony theory,” in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, D. E. Rumelhart, J. L. McClelland, and C. PDP Research Group, Eds., vol. 1, pp. 194–281, MIT Press, Cambridge, Mass, USA, 1986.
  22. M. Welling, M. Rosen-Zvi, and G. Hinton, “Exponential family harmoniums with an application to information retrieval,” in Advances in Neural Information Processing Systems 17, pp. 501–508, Morgan Kaufmann, Vancouver, Canada, 2004.
  23. R. Salakhutdinov and G. Hinton, “Replicated softmax: an undirected topic model,” in Advances in Neural Information Processing Systems, vol. 22, pp. 1607–1614, Morgan Kaufmann, Vancouver, Canada, 2009.
  24. G. E. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.
  25. Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, “Greedy layer-wise training of deep networks,” in Advances in Neural Information Processing Systems 19, pp. 153–160, MIT Press, Vancouver, Canada, 2007.
  26. G. E. Hinton, “Training products of experts by minimizing contrastive divergence,” Neural Computation, vol. 14, no. 8, pp. 1771–1800, 2002.
  27. Y. LeCun, S. Chopra, R. Hadsell, M. Ranzato, and F.-J. Huang, “A tutorial on energy-based learning,” in Predicting Structured Data, G. Bakir, T. Hofman, B. Schölkopf, A. Smola, and B. Taskar, Eds., pp. 1–59, MIT Press, Cambridge, Mass, USA, 2006.
  28. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
  29. L. von Ahn and L. Dabbish, “Labeling images with a computer game,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 319–326, ACM Press, Vienna, Austria, 2004.
  30. N. Pinto, D. D. Cox, and J. J. DiCarlo, “Why is real-world visual object recognition hard?” PLoS Computational Biology, vol. 4, no. 1, article e27, 2008.
  31. A. Coates and A. Y. Ng, “The importance of encoding versus training with sparse coding and vector quantization,” in Proceedings of the 28th International Conference on Machine Learning (ICML '11), pp. 921–928, Bellevue, Wash, USA, July 2011.
  32. F. M. Ham and I. Kostanic, Principles of Neurocomputing for Science and Engineering, McGraw-Hill Higher Education, 1st edition, 2000.
  33. T. W. Anderson, An Introduction to Multivariate Statistical Analysis, John Wiley and Sons, New York, NY, USA, 3rd edition, 2003.
  34. D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, “Canonical correlation analysis: an overview with application to learning methods,” Neural Computation, vol. 16, no. 12, pp. 2639–2664, 2004.
  35. N. Rasiwasia, J. C. Pereira, E. Coviello et al., “A new approach to cross-modal multimedia retrieval,” in Proceedings of the 18th ACM International Conference on Multimedia, pp. 251–260, ACM, New York, NY, USA, 2010.
  36. J. Kim, J. Nam, and I. Gurevych, “Learning semantics with deep belief network for cross-language information retrieval,” in Proceedings of the 24th International Conference on Computational Linguistics, pp. 579–588, ACL Press, IIT Bombay, India, 2012.
  37. T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.
  38. S. V. Shinkareva, V. L. Malave, R. A. Mason, T. M. Mitchell, and M. A. Just, “Commonality of neural representations of words and pictures,” NeuroImage, vol. 54, no. 3, pp. 2418–2425, 2011.