Towards Instrument Segmentation for Music Content Description: a Critical Review of Instrument Classification Techniques

A system capable of describing the musical content of any kind of sound file or sound stream, as MPEG-7-compliant applications are expected to do, should provide an account of the moments at which a given instrument can be heard. In this paper we review the techniques proposed so far for the automatic classification of musical instruments. As most of these techniques are usable only on "solo" performances, we evaluate their applicability to the more complex case of describing sound mixes. We conclude the survey by discussing the need for new strategies that classify sound mixes without a priori separation of the sound sources.
