Location Estimation of Predominant Sound Source with Embedded Source Separation in Amplitude-Panned Stereo Signal

This letter proposes a new method for estimating the location of the predominant source in an amplitude-panned stereo signal containing two sources. When a conventional location estimation method is applied to amplitude-panned multi-source sound, interference between the sources causes serious estimation errors. To solve this problem, the proposed method embeds a source separation step based on non-negative matrix factorization (NMF): it first determines an initial estimate of the source location and then computes the basis matrix of the source from that initial estimate. In this way, the proposed method performs source separation without a training stage. A comparative evaluation confirms that the proposed method achieves higher location estimation performance than the conventional method with a training stage.
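As a rough illustration of the two building blocks named above, the following Python sketch shows (a) how an energy-based panning estimate is biased when a second panned source interferes, and (b) a generic NMF using the standard multiplicative updates of Lee and Seung for the Euclidean cost. It is not the method of the letter: the abstract does not say how the basis matrix is derived from the initial location estimate, so that step is omitted. The sine/cosine panning law, the energy-ratio estimator, and all function names (`pan`, `panning_angle`, `nmf`) are assumptions chosen for illustration.

```python
import numpy as np


def pan(signal, theta):
    """Amplitude-pan a mono signal (assumed law: left = cos, right = sin)."""
    return np.cos(theta) * signal, np.sin(theta) * signal


def panning_angle(left, right):
    """Energy-based pan angle in radians: 0 = hard left, pi/2 = hard right."""
    e_l = np.sum(np.square(left))
    e_r = np.sum(np.square(right))
    return float(np.arctan2(np.sqrt(e_r), np.sqrt(e_l)))


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V ~= W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Updates keep W and H non-negative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# (a) Interference biases the location estimate of the predominant source.
rng = np.random.default_rng(1)
s1 = rng.standard_normal(4096)        # predominant source
s2 = 0.5 * rng.standard_normal(4096)  # weaker interfering source
theta1, theta2 = np.deg2rad(20.0), np.deg2rad(70.0)
l1, r1 = pan(s1, theta1)
l2, r2 = pan(s2, theta2)
single = panning_angle(l1, r1)            # source 1 alone
mixed = panning_angle(l1 + l2, r1 + r2)   # both sources mixed

# (b) NMF on a toy non-negative matrix standing in for a magnitude spectrogram.
V = np.random.default_rng(2).random((6, 8))
W, H = nmf(V, rank=2)
err_final = np.linalg.norm(V - W @ H)
W1, H1 = nmf(V, rank=2, n_iter=1)
err_first = np.linalg.norm(V - W1 @ H1)
```

With source 1 alone, the estimate recovers the panning angle; with the second source mixed in, the estimate is pulled toward the interferer, which is the kind of error that separating the sources before estimation is meant to reduce.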
