Computational Intelligence and Neuroscience
Volume 2008 (2008), Article ID 872425, 15 pages
http://dx.doi.org/10.1155/2008/872425
Research Article

Extended Nonnegative Tensor Factorisation Models for Musical Sound Source Separation

1Department of Electronic Engineering, Cork Institute of Technology, Cork, Ireland
2School of Electrical Engineering Systems, Dublin Institute of Technology, Kevin Street, Dublin, Ireland

Received 18 December 2007; Revised 3 March 2008; Accepted 17 April 2008

Academic Editor: Morten Morup

Copyright © 2008 Derry FitzGerald et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. J. P. Stautner, Analysis and synthesis of music using the auditory transform [M.S. thesis], MIT Electrical Engineering and Computer Science Department, Massachusetts Institute of Technology, Cambridge, Mass, USA, 1983.
  2. P. Comon, “Independent component analysis, a new concept?” Signal Processing, vol. 36, no. 3, pp. 287–314, 1994.
  3. M. S. Lewicki and T. J. Sejnowski, “Learning overcomplete representations,” Neural Computation, vol. 12, no. 2, pp. 337–365, 2000.
  4. B. A. Olshausen and D. J. Field, “Sparse coding of sensory inputs,” Current Opinion in Neurobiology, vol. 14, no. 4, pp. 481–487, 2004.
  5. D. Lee and H. Seung, “Learning the parts of objects by nonnegative matrix factorisation,” Nature, vol. 401, no. 6755, pp. 788–791, 1999.
  6. P. Paatero and U. Tapper, “Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values,” Environmetrics, vol. 5, no. 2, pp. 111–126, 1994.
  7. M. Casey and A. Westner, “Separation of mixed audio sources by independent subspace analysis,” in Proceedings of the International Computer Music Conference (ICMC '00), pp. 154–161, Berlin, Germany, August-September 2000.
  8. T. Virtanen, “Sound source separation using sparse coding with temporal continuity objective,” in Proceedings of the International Computer Music Conference (ICMC '03), pp. 231–234, Singapore, September 2003.
  9. P. Smaragdis and J. C. Brown, “Non-negative matrix factorization for polyphonic music transcription,” in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '03), pp. 177–180, New Paltz, NY, USA, October 2003.
  10. D. FitzGerald, B. Lawlor, and E. Coyle, “Sub-band independent subspace analysis for drum transcription,” in Proceedings of the 5th International Conference on Digital Audio Effects (DAFX '02), pp. 65–69, Hamburg, Germany, September 2002.
  11. S. Raczynski, N. Ono, and S. Sagayama, “Multipitch analysis with harmonic nonnegative matrix approximation,” in Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR '07), pp. 381–386, Vienna, Austria, September 2007.
  12. P. Sajda, S. Du, and L. Parra, “Recovery of constituent spectra using non-negative matrix factorization,” in Wavelets: Applications in Signal and Image Processing X, vol. 5207 of Proceedings of SPIE, pp. 321–331, San Diego, Calif, USA, August 2003.
  13. T. Virtanen, Sound source separation in monaural music signals [Ph.D. thesis], Tampere University of Technology, Tampere, Finland, 2006.
  14. D. FitzGerald, M. Cranitch, and E. Coyle, “Shifted non-negative matrix factorisation for sound source separation,” in Proceedings of the 13th IEEE/SP Workshop on Statistical Signal Processing, pp. 1132–1137, Bordeaux, France, July 2005.
  15. M. Mørup, L. K. Hansen, and S. M. Arnfred, “Sparse higher order non-negative matrix factorization,” Technical Report IMM2007-04658, Technical University of Denmark.
  16. S. A. Abdallah and M. D. Plumbley, “Polyphonic transcription by non-negative sparse coding of power spectra,” in Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR '04), pp. 318–325, Barcelona, Spain, October 2004.
  17. R. M. Parry and I. Essa, “Incorporating phase information for source separation via spectrogram factorization,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 2, pp. 661–664, Honolulu, Hawaii, USA, April 2007.
  18. R. M. Parry and I. Essa, “Phase-aware non-negative spectrogram factorization,” in Proceedings of the 7th International Conference on Independent Component Analysis and Signal Separation (ICA '07), vol. 4666 of Lecture Notes in Computer Science, pp. 536–543, London, UK, September 2007.
  19. R. Kompass, “A generalized divergence measure for non-negative matrix factorization,” in Proceedings of the Neuroinformatics Workshop, Torun, Poland, September 2005.
  20. A. Cichocki, R. Zdunek, and S.-I. Amari, “Csiszár's divergences for non-negative matrix factorization: family of new algorithms,” in Proceedings of the 6th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '06), vol. 3889 of Lecture Notes in Computer Science, pp. 32–39, Springer, Charleston, SC, USA, March 2006.
  21. P. D. O'Grady, Sparse separation of under-determined speech mixtures [Ph.D. thesis], National University of Ireland Maynooth, Kildare, Ireland, 2007.
  22. D. FitzGerald, Automatic drum transcription and source separation [Ph.D. thesis], Dublin Institute of Technology, Dublin, Ireland, 2004.
  23. B. W. Bader and T. G. Kolda, “Algorithm 862: MATLAB tensor classes for fast algorithm prototyping,” ACM Transactions on Mathematical Software, vol. 32, no. 4, pp. 635–653, 2006.
  24. D. FitzGerald, M. Cranitch, and E. Coyle, “Non-negative tensor factorisation for sound source separation,” in Proceedings of the Irish Signals and Systems Conference, pp. 8–12, Dublin, Ireland, September 2005.
  25. R. M. Parry and I. Essa, “Estimating the spatial position of spectral components in audio,” in Proceedings of the 6th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '06), vol. 3889 of Lecture Notes in Computer Science, pp. 666–673, Charleston, SC, USA, March 2006.
  26. D. Barry, B. Lawlor, and E. Coyle, “Sound source separation: azimuth discrimination and resynthesis,” in Proceedings of the 7th International Conference on Digital Audio Effects (DAFX '04), Naples, Italy, October 2004.
  27. P. Smaragdis, “Non-negative matrix factor deconvolution; extraction of multiple sound sources from monophonic inputs,” in Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation, vol. 3195 of Lecture Notes in Computer Science, pp. 494–499, Granada, Spain, September 2004.
  28. T. Virtanen, “Separation of sound sources by convolutive sparse coding,” in Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA '04), Jeju, Korea, October 2004.
  29. M. N. Schmidt and M. Mørup, “Nonnegative matrix factor 2-D deconvolution for blind single channel source separation,” in Proceedings of the 6th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '06), vol. 3889 of Lecture Notes in Computer Science, pp. 700–707, Charleston, SC, USA, March 2006.
  30. E. Vincent and X. Rodet, “Music transcription with ISA and HMM,” in Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), vol. 3195 of Lecture Notes in Computer Science, pp. 1197–1204, Granada, Spain, September 2004.
  31. A. B. Nielsen, S. Sigurdsson, L. K. Hansen, and J. Arenas-García, “On the relevance of spectral features for instrument classification,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 2, pp. 485–488, Honolulu, Hawaii, USA, April 2007.
  32. J. C. Brown, “Calculation of a constant Q spectral transform,” Journal of the Acoustical Society of America, vol. 89, no. 1, pp. 425–434, 1991.
  33. J. Eggert, H. Wersing, and E. Körner, “Transformation-invariant representation and NMF,” in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN '04), vol. 4, pp. 2535–2539, Budapest, Hungary, July 2004.
  34. D. FitzGerald, M. Cranitch, and E. Coyle, “Shifted 2D non-negative tensor factorisation,” in Proceedings of the Irish Signals and Systems Conference, pp. 509–513, Dublin, Ireland, June 2006.
  35. M. Mørup and M. N. Schmidt, “Sparse non-negative tensor 2D deconvolution (SNTF2D) for multi channel time-frequency analysis,” Tech. Rep., Technical University of Denmark, Copenhagen, Denmark, 2006.
  36. D. FitzGerald, M. Cranitch, and E. Coyle, “Sound source separation using shifted non-negative tensor factorisation,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '06), vol. 5, pp. 653–656, Toulouse, France, May 2006.
  37. M. Slaney, “Pattern playback in the 90s,” in Advances in Neural Information Processing Systems 7, MIT Press, Cambridge, Mass, USA, 1996.
  38. D. FitzGerald, M. Cranitch, and E. Coyle, “Resynthesis methods for sound source separation using non-negative factorisation methods,” in Proceedings of the Irish Signals and Systems Conference, Derry, Ireland, September 2007.
  39. D. FitzGerald, M. Cranitch, and M. Cychowski, “Towards an inverse constant Q transform,” in Proceedings of the 120th AES Convention, Paris, France, May 2006.
  40. M. N. Schmidt and M. Mørup, “Nonnegative matrix factor 2-D deconvolution for blind single channel source separation,” in Proceedings of the 6th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '06), vol. 3889 of Lecture Notes in Computer Science, pp. 700–707, Charleston, SC, USA, March 2006.
  41. D. DeFatta, J. Lucas, and W. Hodgkiss, Digital Signal Processing: A System Design Approach, John Wiley & Sons, New York, NY, USA, 1988.
  42. A. Freed, X. Rodet, and P. Depalle, “Performance, synthesis and control of additive synthesis on a desktop computer using FFT-1,” in Proceedings of the 19th International Computer Music Conference (ICMC '93), vol. 19, pp. 98–101, Waseda University Center for Scholarly Information, International Computer Music Association, Tokyo, Japan, September 1993.
  43. N. H. Fletcher and T. D. Rossing, The Physics of Musical Instruments, Springer, New York, NY, USA, 2nd edition, 1998.
  44. Tensor Toolbox for Matlab, http://csmr.ca.sandia.gov/~tgkolda/TensorToolbox/.
  45. J. Woodruff, B. Pardo, and R. Dannenberg, “Remixing stereo music with score-informed source separation,” in Proceedings of the 7th International Symposium on Music Information Retrieval (ISMIR '06), Victoria, Canada, October 2006.
  46. T. Virtanen and A. Klapuri, “Analysis of polyphonic audio using source-filter model and non-negative matrix factorization,” in Proceedings of the Advances in Models for Acoustic Processing, Neural Information Processing Systems Workshop, Whistler, Canada, December 2006.
  47. V. Välimäki, J. Pakarinen, C. Erkut, and M. Karjalainen, “Discrete-time modelling of musical instruments,” Reports on Progress in Physics, vol. 69, no. 1, pp. 1–78, 2006.
  48. M. R. Schroeder and B. S. Atal, “Code-excited linear prediction (CELP): high-quality speech at very low bit rates,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '85), vol. 10, pp. 937–940, Tampa, Fla, USA, April 1985.
  49. X. Serra, “Musical sound modeling with sinusoids plus noise,” in Musical Signal Processing, G. De Poli, A. Piccialli, S. T. Pope, and C. Roads, Eds., Swets & Zeitlinger, Lisse, The Netherlands, 1997.
  50. Ö. Yilmaz and S. Rickard, “Blind separation of speech mixtures via time-frequency masking,” IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830–1847, 2004.
  51. P. Siedlaczek, Advanced Orchestra Library Set, 1997.
  52. E. Vincent, R. Gribonval, and C. Fevotte, “Performance measurement in blind audio source separation,” IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 4, pp. 1462–1469, 2006.
  53. BSS_Eval toolbox, http://bassdb.gforge.inria.fr/bss_eval.