Nearest-neighbor Generic Sound Classification with a WordNet-based Taxonomy

Audio classification methods work well when fine-tuned to restricted domains, such as musical instrument classification or simplified sound-effects taxonomies. They cannot currently offer the level of detail needed for general sound recognition. A real-world sound recognition tool would require a taxonomy that represents the real world, together with thousands of classifiers, each specialized in distinguishing fine details. To tackle the taxonomy definition problem, we use WordNet, a semantic network that organizes real-world knowledge. To overcome the second problem, the need for a huge number of classifiers to distinguish a huge number of sound classes, we use a nearest-neighbor classifier with a database of isolated sounds unambiguously linked to WordNet concepts.
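
The nearest-neighbor idea can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional feature vectors, the Euclidean distance, the synset-style labels (`dog.n.01` and so on), and the names `DATABASE`, `euclidean`, and `classify` are all assumptions made for illustration. The abstract does not specify the features or the distance measure.

```python
import math

# Hypothetical toy database: each entry pairs a feature vector extracted
# from an isolated sound with a single WordNet-style concept label.
# Both the vectors and the labels are invented for this example.
DATABASE = [
    ((0.90, 0.10, 0.20), "dog.n.01"),
    ((0.85, 0.15, 0.25), "dog.n.01"),
    ((0.10, 0.80, 0.30), "violin.n.01"),
    ((0.20, 0.90, 0.35), "violin.n.01"),
    ((0.30, 0.20, 0.90), "rain.n.01"),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query, database=DATABASE):
    """Return the concept label of the database sound closest to `query`."""
    _, concept = min(database, key=lambda entry: euclidean(query, entry[0]))
    return concept


if __name__ == "__main__":
    # A query vector close to the two "dog" entries.
    print(classify((0.88, 0.12, 0.20)))
```

Because a single labeled database replaces per-class models, adding a new sound class in this sketch only means adding labeled entries to `DATABASE`; no classifier has to be retrained.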
