Acoustic echo cancellation using a multi-resolution non-negative matrix factorization method

Acoustic echo cancellation is essential in modern communication owing to the widespread use of hands-free telephony and voice over Internet Protocol (VoIP) systems. In this paper, a multi-resolution non-negative matrix factorization (NMF) based acoustic echo cancellation method is proposed. The method addresses two research issues. First, conventional filter-based algorithms for acoustic echo cancellation suffer from poor tracking of the time-varying loudspeaker-enclosure-microphone (LEM) system; the proposed method cancels acoustic echo without estimating the LEM, which makes it more robust in both double-talk and single-talk scenarios. Second, NMF-based acoustic echo cancellation with a fixed time-frequency resolution degrades the speech signal; the use of adaptive multi-resolution analysis reduces this degradation and provides a better estimate of the near-end speech signal. Experiments are performed on the CMU ARCTIC database with several datasets at different values of the echo-to-near-end signal ratio. The results are compared with existing state-of-the-art echo cancellation methods using both perceptual and objective quality evaluations. The proposed method performs better than the compared methods, motivating its use for acoustic echo cancellation in teleconferencing environments.
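To make the idea of cancelling echo without estimating the LEM more concrete, the following is a minimal sketch of one common way to use NMF for this kind of separation; it is not the method of the paper. It assumes magnitude spectrograms are already available, learns spectral bases from the far-end signal, holds them fixed while factorizing the microphone signal, and applies a soft mask. The function names (`nmf_kl`, `suppress_echo`), the basis counts, and the use of KL-divergence multiplicative updates are all assumptions made for illustration. The sketch uses a single fixed time-frequency resolution and does not implement the adaptive multi-resolution step.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def nmf_kl(V, W, H, n_fixed=0, n_iter=200):
    """Approximate V by W @ H using KL-divergence multiplicative updates.

    The first `n_fixed` columns of W are held constant (hypothetical
    helper for this sketch, not taken from the paper).
    """
    W, H = W.copy(), H.copy()
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Update activations.
        R = V / (W @ H + EPS)
        H *= (W.T @ R) / (W.T @ ones + EPS)
        # Update bases, leaving the fixed columns untouched.
        R = V / (W @ H + EPS)
        W_new = W * (R @ H.T) / (ones @ H.T + EPS)
        W[:, n_fixed:] = W_new[:, n_fixed:]
    return W, H


def suppress_echo(V_mic, V_far, k_echo=8, k_near=8, n_iter=200, seed=0):
    """Return an estimate of the near-end magnitude spectrogram.

    V_mic: microphone magnitude spectrogram, shape (freq, frames).
    V_far: far-end magnitude spectrogram, shape (freq, frames_far).
    Assumption: far-end bases are a usable proxy for the echo bases,
    which ignores the spectral coloring added by the room.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V_mic.shape

    # Step 1: learn echo bases from the far-end signal alone.
    W_echo, _ = nmf_kl(
        V_far,
        rng.random((n_freq, k_echo)) + EPS,
        rng.random((k_echo, V_far.shape[1])) + EPS,
        n_iter=n_iter,
    )

    # Step 2: factorize the microphone signal with echo bases fixed
    # and additional free bases for the near-end speech.
    W0 = np.hstack([W_echo, rng.random((n_freq, k_near)) + EPS])
    H0 = rng.random((k_echo + k_near, n_frames)) + EPS
    W, H = nmf_kl(V_mic, W0, H0, n_fixed=k_echo, n_iter=n_iter)

    # Step 3: soft mask that keeps the near-end share of each bin.
    echo = W[:, :k_echo] @ H[:k_echo]
    near = W[:, k_echo:] @ H[k_echo:]
    mask = near / (near + echo + EPS)
    return mask * V_mic
```

In practice the masked magnitude would be combined with the microphone phase and inverted back to a waveform; that step, and the choice of window length that the multi-resolution approach adapts, are left out of this sketch.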
