Auditory Context Recognition Using SVMs

We study auditory context recognition for context-aware mobile computing systems. Auditory contexts are recordings of a mixture of sounds, or ambient audio, from mobile users' everyday environments. For training a classifier, a set of recordings from different environments is segmented and labeled. The segments are windowed into overlapping frames for feature extraction. While previous work in auditory context recognition has often treated the problem as a sequence classification task and used HMM-based classifiers to recognize sequences of consecutive per-frame MFCCs, we instead compute the averaged Mel-spectrum over each segment and train an SVM-based classifier. On the same dataset, our scheme outperforms a previously reported HMM-based scheme. We also show that the feature sets used by previous work are often affected by attenuation, which limits their applicability in practice. Furthermore, we study the impact of segment duration on recognition accuracy.
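The pipeline above can be sketched as follows. This is a minimal illustration, not the paper's implementation: for simplicity it averages the plain magnitude spectrum over a segment's overlapping frames (a real system would apply a Mel filterbank to each frame first), and it uses synthetic noise "environments" in place of recorded ambient audio. All function names, parameters, and data here are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def averaged_spectrum(segment, frame_len=512, hop=256):
    """Average the magnitude spectrum over overlapping, windowed frames.

    Stand-in for the averaged Mel-spectrum: a real system would pass each
    frame's spectrum through a Mel filterbank before averaging.
    """
    frames = [segment[i:i + frame_len] * np.hanning(frame_len)
              for i in range(0, len(segment) - frame_len + 1, hop)]
    mags = np.abs(np.fft.rfft(frames, axis=1))
    return mags.mean(axis=0)          # one fixed-length vector per segment

# Two synthetic "environments" with different spectral shapes
# (white noise vs. low-pass-filtered noise).
rng = np.random.default_rng(0)
def make_segment(env, n=16000):
    noise = rng.standard_normal(n)
    if env == 1:
        noise = np.convolve(noise, np.ones(8) / 8, mode="same")
    return noise

# One averaged-spectrum feature vector per labeled segment.
X = np.array([averaged_spectrum(make_segment(env)) for env in [0, 1] * 50])
y = np.array([0, 1] * 50)

# Train and evaluate an SVM on the segment-level features.
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf", gamma="scale").fit(Xtr, ytr)
acc = clf.score(Xte, yte)
```

Because each segment is reduced to a single fixed-length vector, an ordinary SVM applies directly, with no sequence model needed; this is the key simplification over frame-sequence HMM classifiers.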
