Online Non-negative Tensor Deconvolution for source detection in 3DTV audio

This article describes research on source detection in multichannel (3DTV) audio streams. The problem is difficult because a scene can contain several simultaneous layers (background music, ambience, commentator). This work develops a new algorithm that exploits the information in the different audio channels to detect, and possibly localize and separate, independent audio sources. The algorithm is based on online Non-negative Tensor Deconvolution, so that it can handle sound sources whose positions in the channel matrix vary over time. It is evaluated on 3DTV 5.1 film soundtracks and on synthetic mixes of 3DTV 5.1 audio with target sounds from a sound effects database; the results show a significant improvement in detection performance compared with other decomposition techniques.
