International Journal of Digital Multimedia Broadcasting
Volume 2010, Article ID 486487, 18 pages
http://dx.doi.org/10.1155/2010/486487
Research Article

Multimodal Indexing of Multilingual News Video

1TCS Innovation Labs Delhi, TCS Towers, 249 D&E Udyog Vihar Phase IV, Gurgaon 122015, India
2TCS Innovation Labs Mumbai, Yantra Park, Pokhran Road no. 2, Thane West 400601, India
3TCS Innovation Labs Kolkata, Plot A2, M2-N2 Sector 5, Block GP, Salt Lake Electronics Complex, Kolkata 700091, India

Received 16 September 2009; Revised 27 December 2009; Accepted 2 March 2010

Academic Editor: Ling Shao

Copyright © 2010 Hiranmay Ghosh et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. G. S. Lehal, “Optical character recognition of Gurumukhi script using multiple classifiers,” in Proceedings of the International Workshop on Multilingual OCR (MOCR '09), Barcelona, Spain, July 2009.
  2. C. V. Jawahar, M. N. S. S. K. P. Kumar, and S. S. R. Kiran, “A bilingual OCR for Hindi-Telugu documents and its applications,” in Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR '03), vol. 1, p. 408, 2003.
  3. E. Hassan, S. Chaudhury, and M. Gopal, “Shape descriptor based document image indexing and symbol recognition,” in Proceedings of the International Conference on Document Analysis and Recognition, 2009.
  4. U. Bhattacharya and B. B. Chaudhuri, “Handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 3, pp. 444–457, 2009.
  5. S. K. Parui, K. Guin, U. Bhattacharya, and B. B. Chaudhuri, “Online handwritten Bangla character recognition using HMM,” in Proceedings of the International Conference on Pattern Recognition (ICPR '08), pp. 1–4, 2008.
  6. S. Eickeler and S. Mueller, “Content-based video indexing of TV broadcast news using hidden Markov models,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '99), vol. 6, pp. 2997–3000, March 1999.
  7. J. R. Smith, M. Campbell, M. Naphade, A. Natsev, and J. Tesic, “Learning and classification of semantic concepts in broadcast video,” in Proceedings of the International Conference of Intelligence Analysis, 2005.
  8. J.-L. Gauvain, L. Lamel, and G. Adda, “Transcribing broadcast news for audio and video indexing,” Communications of the ACM, vol. 43, no. 2, pp. 64–70, 2000.
  9. H. Meinedo and J. Neto, “Detection of acoustic patterns in broadcast news using neural networks,” Acustica, 2004.
  10. C.-M. Kuo, C.-P. Chao, W.-H. Chang, and J.-L. Shen, “Broadcast video logo detection and removing,” in Proceedings of the 4th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP '08), pp. 837–840, Harbin, China, August 2008.
  11. D. A. Sadlier, S. Marlow, N. O'Connor, and N. Murphy, “Automatic TV advertisement detection from MPEG bit stream,” Pattern Recognition, vol. 35, no. 12, pp. 2719–2726, 2002.
  12. T.-Y. Liu, T. Qin, and H.-J. Zhang, “Time-constraint boost for TV commercials detection,” in Proceedings of the International Conference on Image Processing (ICIP '04), vol. 3, pp. 1617–1620, October 2004.
  13. X.-S. Hua, L. Lu, and H.-J. Zhang, “Robust learning-based TV commercial detection,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '05), pp. 149–152, Amsterdam, The Netherlands, July 2005.
  14. K. Ng and V. W. Zue, “Phonetic recognition for spoken document retrieval,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98), vol. 1, pp. 325–328, 1998.
  15. M. Laxminarayana and S. Kopparapu, “Semi-automatic generation of pronunciation dictionary for proper names: an optimization approach,” in Proceedings of the 6th International Conference on Natural Language Processing (ICON '08), pp. 118–126, CDAC, Pune, India, December 2008.
  16. J. Makhoul, F. Kubala, T. Leek et al., “Speech and language technologies for audio indexing and retrieval,” Proceedings of the IEEE, vol. 88, no. 8, pp. 1338–1352, 2000.
  17. S. Renals, D. Abberley, D. Kirby, and T. Robinson, “Indexing and retrieval of broadcast news,” Speech Communication, vol. 32, no. 1, pp. 5–20, 2000.
  18. T. Chua, S. Y. Neo, K. Li et al., “TRECVID 2004 search and feature extraction tasks by NUS PRIS,” in NIST TRECVID-2004, 2004.
  19. T. Chua, S.-F. Chang, L. Chaisorn, and W. Hsu, “Story boundary detection in large broadcast news video archives: techniques, experience and trends,” in Proceedings of the 12th ACM International Conference on Multimedia (MM '04), pp. 656–659, 2004.
  20. A. Rosenberg and J. Hirschberg, “Story segmentation of broadcast news in English, Mandarin and Arabic,” in Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, June 2006.
  21. M. Franz and J.-M. Xu, “Story segmentation of broadcast news in Arabic, Chinese and English using multi-window features,” in Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '07), pp. 703–704, 2007.
  22. M. A. Hearst, “TextTiling: segmenting text into multi-paragraph subtopic passages,” Computational Linguistics, vol. 23, no. 1, pp. 33–64, 1997.
  23. X. Gao and X. Tang, “Unsupervised video-shot segmentation and model-free anchor-person detection for news video parsing,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 9, pp. 765–776, 2002.
  24. S.-F. Chang, R. Manmatha, and T.-S. Chua, “Combining text and audio-visual features in video indexing,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '05), vol. 5, pp. 1005–1008, 2005.
  25. L. Chaisorn, T.-S. Chua, and C.-H. Lee, “A multi-modal approach to story segmentation for news video,” World Wide Web, vol. 6, no. 2, pp. 187–208, 2003.
  26. L. Besacier, G. Quénot, S. Ayache, and D. Moraru, “Video story segmentation with multi-modal features: experiments on TRECvid 2003,” in Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR '04), pp. 221–226, October 2004.
  27. Anonymous, “F1 Score,” Wikipedia—The Free Encyclopedia, February 2010, http://en.wikipedia.org/wiki/F1_score.
  28. F. Colace, P. Foggia, and G. Percannella, “A probabilistic framework for TV-news stories detection and classification,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '05), pp. 1350–1353, July 2005.
  29. G. Harit, S. Chaudhury, and H. Ghosh, “Using multimedia ontology for generating conceptual annotations and hyperlinks in video collections,” in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI '06), pp. 211–217, Hong Kong, December 2006.
  30. Anonymous, “News Ticker,” Wikipedia—The Free Encyclopedia, February 2010, http://en.wikipedia.org/wiki/News_ticker.
  31. D. Winer, “RSS 2.0 Specification,” February 2010, http://cyber.law.harvard.edu/rss/rss.html.
  32. S. Kopparapu, A. Srivastava, and P. V. S. Rao, “Minimal parsing key concept based question answering system,” Human Computer Interaction, vol. 3, 2007.
  33. P. Gelin and C. J. Wellekens, “Keyword spotting for video soundtrack indexing,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 299–302, May 1996.
  34. Y. Oh, J.-S. Park, and K.-M. Park, “Keyword spotting in broadcast news,” in Global-Network-Oriented Information Electronics (IGNOIE-COE06), pp. 208–213, Sendai, Japan, January 2007.
  35. G. Quenot, T. P. Tan, L. V. Bac, S. Ayache, L. Besacier, and P. Mulhem, “Content-based search in multi-lingual audiovisual documents using the international phonetic alphabet,” in Proceedings of the 7th International Workshop on Content-Based Multimedia Indexing (CBMI '09), Chania, Greece, June 2009.
  36. D. Dimitriadis, A. Metallinou, I. Konstantinou, G. Goumas, P. Maragos, and N. Koziris, “GRIDNEWS: a distributed automatic Greek broadcast transcription system,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '09), 2009.
  37. J. Yi, Y. Peng, and J. Xiao, “Color-based clustering for text detection and extraction in image,” in Proceedings of the ACM International Multimedia Conference and Exhibition (MM '07), pp. 847–850, Augsburg, Germany, September 2007.
  38. J. Sun, Z. Wang, H. Yu, F. Nishino, Y. Katsuyama, and S. Naoi, “Effective text extraction and recognition for WWW images,” in Proceedings of the ACM Symposium on Document Engineering (DocEng '03), pp. 115–117, Grenoble, France, November 2003.
  39. Q. Ye, Q. Huang, W. Gao, and D. Zhao, “Fast and robust text detection in images and video frames,” Image and Vision Computing, vol. 23, no. 6, pp. 565–576, 2005.
  40. J. Gllavata, R. Ewerth, and B. Freisleben, “Tracking text in MPEG videos,” ACM, 2004.
  41. A. D. Bagdanov, L. Ballan, M. Bertini, and A. Del Bimbo, “Trademark matching and retrieval in sports video databases,” in Proceedings of the International Workshop on Multimedia Information Retrieval (MIR '07), pp. 79–86, Augsburg, Germany, September 2007.
  42. N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
  43. R. Y. Tsai and T. S. Huang, “Multiple frame image restoration and registration,” in Advances in Computer Vision and Image Processing, pp. 317–339, JAI Press, Greenwich, Conn, USA, 1984.
  44. V. H. Patil, D. S. Bormane, and H. K. Patil, “Color super resolution image reconstruction,” in Proceedings of the International Conference on Computational Intelligence and Multimedia Applications (ICCIMA '07), vol. 3, pp. 366–370, 2007.
  45. P. Vandewalle, S. Süsstrunk, and M. Vetterli, “A frequency domain approach to registration of aliased images with application to super-resolution,” EURASIP Journal on Applied Signal Processing, vol. 2006, pp. 1–14, 2006.
  46. M. A. Hasnat, M. R. Chowdhury, and M. Khan, “Integrating Bangla script recognition support in Tesseract OCR,” in Proceedings of the Conference on Language and Technology, 2009.
  47. S. V. Rice, F. R. Jenkins, and T. A. Nartker, “The fourth annual test of OCR accuracy,” Tech. Rep. 95-04, Information Science Research Institute, University of Nevada, Las Vegas, Nev, USA, April 1995.
  48. R. Smith, “An overview of the Tesseract OCR engine,” in Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR '07), vol. 2, pp. 629–633, September 2007.
  49. M. Gilleland, “Levenshtein Distance, in Three Flavors,” February 2010, http://www.merriampark.com/ld.htm.
  50. R. Lienhart and A. Wernicke, “Localizing and segmenting text in images and videos,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 4, pp. 256–268, 2002.
  51. A. G. Hauptmann, R. Jin, and T. D. Ng, “Multi-modal information retrieval from broadcast video using OCR and speech recognition,” in Proceedings of the 2nd ACM International Conference on Digital Libraries, pp. 160–161, Portland, Ore, USA, 2002.