Application of Fisher Linear Discriminant Analysis to Speech/Music Classification

This paper proposes the application of Fisher linear discriminants to the problem of speech/music classification. A Fisher linear discriminant separates two classes: it computes a kind of centroid of the training data for each class and, from that information, establishes a linear boundary that is then used for classification. Results are given showing that this classification algorithm outperforms the well-known K-nearest neighbor algorithm. It is also shown that very good results in terms of probability of error can be obtained using only one feature extracted from the audio signal, which makes it possible to reduce the complexity of this kind of system so that it can be implemented in real time.
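The two-class procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. The synthetic two-dimensional "features", the function names `fit_fisher` and `predict`, and the choice of a threshold halfway between the projected class centroids are all assumptions made for illustration; the abstract does not specify the features or the threshold rule used in the paper.

```python
import numpy as np

def fit_fisher(X0, X1):
    """Fit a two-class Fisher linear discriminant; return (w, threshold)."""
    # Class centroids (means of the training data for each class).
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Fisher direction: w is proportional to S_w^{-1} (m1 - m0).
    w = np.linalg.solve(S_w, m1 - m0)
    # Assumption: place the boundary midway between the projected centroids.
    threshold = 0.5 * (w @ m0 + w @ m1)
    return w, threshold

def predict(X, w, threshold):
    """Label 1 if the projection exceeds the threshold, else 0."""
    return (X @ w > threshold).astype(int)

# Synthetic stand-ins for audio features (hypothetical, not real data).
rng = np.random.default_rng(0)
X_speech = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))  # class 0
X_music = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(200, 2))   # class 1

w, threshold = fit_fisher(X_speech, X_music)
X_all = np.vstack([X_speech, X_music])
y_all = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])
accuracy = (predict(X_all, w, threshold) == y_all).mean()
```

Once `w` and `threshold` are fitted, classifying a frame costs one dot product and one comparison, which is consistent with the abstract's point about low complexity for real-time use. With a single feature, `w` reduces to a scalar and the classifier becomes a simple threshold on that feature.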
