Speech/music/noise classification in hearing aids using a two-layer classification system with MSE linear discriminants

This paper presents an automatic sound classifier for digital hearing aids, aimed at enhancing listening comprehension when the user moves from one sound environment to another. The approach divides the classification algorithm into two layers, each solving a simpler and more efficient two-class problem: the first layer discriminates the input signal into speech or non-speech, and non-speech signals are then further classified as either noise or music. The complete system therefore distinguishes three classes, labeled “speech”, “noise” and “music”. Classification is carried out with a mean squared error (MSE) linear discriminant, which provides very good results at low computational complexity. This is a crucial issue because hearing aids must operate at very low clock frequencies. The paper explores the feasibility of this approach through a number of experiments that demonstrate the advantages of the proposed two-layer system over a three-class, single-layer classifier.
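The two-layer scheme described above can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the paper's implementation: it uses synthetic two-dimensional points in place of real audio features, and trains each binary MSE linear discriminant by ordinary least squares (the standard way to minimize the mean squared error of a linear model with ±1 targets). The first discriminant separates speech from non-speech; the second, trained only on non-speech data, separates music from noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mse_discriminant(X, y):
    """Fit an MSE linear discriminant: least-squares weights w such that
    [X, 1] @ w approximates targets +1 (class 1) / -1 (class 0)."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    t = np.where(y == 1, 1.0, -1.0)
    w, *_ = np.linalg.lstsq(Xa, t, rcond=None)
    return w

def predict(w, X):
    """Return 1 where the discriminant score is non-negative, else 0."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xa @ w >= 0).astype(int)

# Synthetic 2-D clusters standing in for per-frame audio features
speech = rng.normal([2, 2], 0.5, (100, 2))
music  = rng.normal([-2, 2], 0.5, (100, 2))
noise  = rng.normal([0, -2], 0.5, (100, 2))

# Layer 1: speech vs. non-speech (music + noise pooled together)
X1 = np.vstack([speech, music, noise])
y1 = np.array([1] * 100 + [0] * 200)   # 1 = speech
w1 = train_mse_discriminant(X1, y1)

# Layer 2: music vs. noise, trained only on the non-speech data
X2 = np.vstack([music, noise])
y2 = np.array([1] * 100 + [0] * 100)   # 1 = music
w2 = train_mse_discriminant(X2, y2)

def classify(x):
    """Cascade the two binary discriminants into a three-class decision."""
    x = np.atleast_2d(x)
    if predict(w1, x)[0] == 1:
        return "speech"
    return "music" if predict(w2, x)[0] == 1 else "noise"
```

Note the design point this sketch makes concrete: each layer only has to solve a two-class problem, and the second discriminant never sees speech data, which is what lets the cascade outperform a single three-class linear classifier while keeping the per-frame cost to two inner products.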
