Discriminative Training of GMM via Log-Likelihood Ratio for Abnormal Acoustic Event Classification in Vehicular Environment

In this paper, a discriminative training technique based on the Gaussian Mixture Model (GMM) is proposed for the detection and classification of abnormal acoustic events in indoor environments. In particular, we consider small indoor spaces such as vehicular scenes and develop a two-step procedure in which statistical mapping of acoustic features is followed by abnormal event detection. In the first step, a Mel-Frequency Cepstral Coefficient (MFCC) feature set is used to construct a GMM for acoustic event mapping, and the log-likelihood ratio is used as a confidence measure to correct misrecognition over vocal/nonvocal regions. In the second step, an abnormal event is determined using a maximum likelihood estimation approach in which the ratio of abnormal events to cumulative events within an analysis window is compared to a threshold. For performance evaluation, we employ a statistically meaningful database of normal and abnormal acoustic events recorded in actual indoor scenes of two representative scenarios. The experiments demonstrate a correct detection rate of 91% for abnormal context and an error detection rate of 2.5%, which indicates that the approach is promising for real-world vehicular acoustic surveillance applications.
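The two-step procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the GMM parameters, the window length, or the threshold, so the diagonal-covariance GMM form, the function names (`gmm_log_likelihood`, `log_likelihood_ratio`, `is_abnormal_window`), and all numeric values used with them are assumptions made for illustration only.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM.

    Assumption: diagonal covariances; the abstract does not state the form.
    """
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))


def log_likelihood_ratio(x, target_gmm, competing_gmm):
    """Step 1 confidence measure: log p(x | target) - log p(x | competing).

    Each GMM is a hypothetical (weights, means, variances) tuple.
    """
    return gmm_log_likelihood(x, *target_gmm) - gmm_log_likelihood(x, *competing_gmm)


def is_abnormal_window(frame_flags, threshold):
    """Step 2 decision: compare the abnormal-to-total ratio to a threshold.

    frame_flags holds one boolean per event in the analysis window,
    True when the event was mapped to an abnormal class.
    """
    if not frame_flags:
        return False
    ratio = sum(frame_flags) / len(frame_flags)
    return ratio > threshold
```

In use, each MFCC vector would be scored with `log_likelihood_ratio` against an abnormal-event model and a competing model, the per-event decisions collected over an analysis window, and the window passed to `is_abnormal_window` with a threshold tuned on held-out data.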
