Adaptive context recognition based on audio signal

Auditory data provide rich contextual cues about the environment in which they are captured. The goal of audio-based context recognition is to equip sensing devices with classification algorithms that automatically assign environments to pre-defined classes based on extracted auditory features. In this paper, we first extract various features from the audio signals. We then perform a feature analysis to identify a feature ensemble that best discriminates among different contexts. To achieve efficient and timely online classification, we adopt a coarse-to-fine training scheme in which three HMMs are trained for each context using feature ensembles of different complexities. During online recognition, we start with the coarse HMMs (using the fewest features) and progressively apply finer models only when necessary. Experiments show that this strategy yields significant savings in computational power with only a negligible loss in context recognition accuracy.
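The coarse-to-fine decision logic described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-context HMMs are replaced by hypothetical single-Gaussian scorers, the three complexity levels are mimicked by feature subsets of sizes 2, 4, and 8, and the confidence test (escalate when the best and second-best log-likelihoods are too close) is an assumed stand-in for whatever criterion the paper uses.

```python
import math

def log_likelihood(x, mean, var=1.0):
    """Log-likelihood of feature vector x under an isotropic Gaussian.
    A placeholder for scoring an observation sequence against an HMM."""
    return sum(-0.5 * ((xi - mi) ** 2) / var - 0.5 * math.log(2 * math.pi * var)
               for xi, mi in zip(x, mean))

def classify_coarse_to_fine(feature_vec, models, margin=2.0):
    """Score contexts with the coarsest models first; escalate to a finer
    feature ensemble only when the top two scores are within `margin`
    (i.e., the coarse decision is not confident). Returns (label, level)."""
    levels = [2, 4, 8]  # hypothetical ensemble sizes: coarse -> fine
    for level, n_feats in enumerate(levels):
        x = feature_vec[:n_feats]
        scores = {ctx: log_likelihood(x, mean[:n_feats])
                  for ctx, mean in models.items()}
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best, second = ranked[0], ranked[1]
        if best[1] - second[1] >= margin or level == len(levels) - 1:
            return best[0], level

# Toy "trained" context models: one mean feature vector per context.
models = {
    "street":     [3.0, 0.5, 1.0, 2.0, 0.1, 0.3, 1.5, 0.7],
    "office":     [0.2, 0.4, 0.9, 1.8, 0.2, 0.4, 1.4, 0.6],
    "restaurant": [1.5, 2.5, 0.5, 0.5, 1.0, 1.2, 0.8, 2.0],
}

obs = [3.1, 0.6, 1.1, 2.1, 0.1, 0.3, 1.4, 0.8]  # observation near "street"
label, level_used = classify_coarse_to_fine(obs, models)
print(label, level_used)  # coarse level 0 suffices for this clear-cut case
```

Because most frames are easy cases resolved at level 0, the finer (costlier) models run only on ambiguous inputs, which is where the computational savings come from.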
