The Effect of Features on Clustering in Audio Surveillance

This paper investigates how the choice of features affects unsupervised clustering in audio surveillance. The importance of individual features within a larger feature set is first analyzed by examining the component loadings in principal component analysis (PCA). The individual sound events are then assigned to clusters using self-tuning spectral clustering and the classical K-means algorithm. In addition to the original feature set and its PCA-reduced version, a weighted version of the original set is evaluated, in which the weights have been optimized by a genetic algorithm (GA) to minimize clustering error. As expected, the weighted feature set outperforms both the original feature set and its PCA-reduced version. The analysis also gives insight into the importance of individual features.
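To illustrate why per-feature weights can change a clustering result, the following is a minimal sketch, not the method used in the paper. It assumes a toy two-feature data set (one informative feature, one high-variance noise feature), a plain Lloyd-style K-means, and hand-picked weights. The GA optimization and the spectral clustering step are not shown, and all names (`kmeans`, `two_cluster_error`, `weights`) are hypothetical.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    # Initialize the centers with k distinct data points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_cluster_error(labels, truth):
    """Error rate for two clusters, ignoring which cluster gets which label."""
    mismatches = sum(lab != t for lab, t in zip(labels, truth))
    return min(mismatches, len(truth) - mismatches) / len(truth)


# Toy data (an assumption for illustration): feature 0 separates the two
# classes, feature 1 is uninformative noise with a much larger spread.
data_rng = random.Random(1)
truth = [0] * 100 + [1] * 100
points = [
    [5.0 * c + data_rng.gauss(0.0, 0.2), data_rng.uniform(-20.0, 20.0)]
    for c in truth
]

# Hand-picked weights standing in for GA-optimized ones; a weight of 0
# removes the noise feature from the distance computation entirely.
weights = [1.0, 0.0]
weighted_points = [[w * x for w, x in zip(weights, p)] for p in points]

labels_raw = kmeans(points, k=2)
labels_weighted = kmeans(weighted_points, k=2)

err_raw = two_cluster_error(labels_raw, truth)
err_weighted = two_cluster_error(labels_weighted, truth)
print("error with original features:", err_raw)
print("error with weighted features:", err_weighted)
```

Scaling a feature by a weight before computing Euclidean distances is equivalent to using a weighted distance, so the clustering algorithm itself stays unchanged. In a GA setting, the clustering error computed here would be one plausible fitness function for candidate weight vectors.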
