A Study of Musical Instrument Classification Using Gaussian Mixture Models and Support Vector Machines

The Cambridge Research Laboratory was founded in 1987 to advance the state of the art in both core computing and human-computer interaction, and to use the knowledge so gained to support the Company's corporate objectives. We believe this is best accomplished through interconnected pursuits in technology creation, advanced systems engineering, and business development. We are multimedia data. We recognize and embrace a technology creation model which is characterized by three major phases: Freedom: The lifeblood of the Laboratory comes from the observations and imaginations of our research staff. It is here that challenging research problems are uncovered (through discussions with customers, through interactions with others in the Corporation, through other professional interactions, through reading, and the like) or that new ideas are born. For any such problem or idea, this phase culminates in the nucleation of a project team around a well-articulated central research question and the outlining of a research plan. Focus: Once a team is formed, we aggressively pursue the creation of new technology based on the plan. This may involve direct collaboration with other technical professionals inside and outside the Corporation. This phase culminates in the demonstrable creation of new technology which may take any of a number of forms—a journal article, a technical talk, a working prototype, a patent application, or some combination of these. The research team is typically augmented with other resident professionals—engineering and business development—who work as integral members of the core team to prepare preliminary plans for how best to leverage this new knowledge, either through internal transfer of technology or through other means. Follow-through: We actively pursue taking the best technologies to the marketplace. For those opportunities which are not immediately transferred internally and where the team has identified a significant opportunity, the business development and engineering staff will lead early-stage commercial development, often in conjunction with members of the research staff. While the value to the Corporation of taking these new ideas to the market is clear, it also has a significant positive impact on our future research work by providing the means to understand intimately the problems and opportunities in the market and to more fully exercise our ideas and concepts in real-world settings. Throughout this process, communicating our understanding is a critical part of what we do, and participating in the larger technical community—through the publication of refereed journal articles and the presentation of our ideas at conferences—is …

[1]  Ian Kaminskyj,et al.  Automatic source identification of monophonic musical instrument sounds , 1995, Proceedings of ICNN'95 - International Conference on Neural Networks.

[2]  Marti A. Hearst,et al.  SVMs—a practical consequence of learning theory , 1998 .

[3]  Yeong-Ho Ha,et al.  Genre classification system of TV sound signals based on a spectrogram analysis , 1998 .

[4]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[5]  A. W. M. van den Enden,et al.  Discrete Time Signal Processing , 1989 .

[6]  Seiji Inokuchi,et al.  The Kansei Music System , 1989 .

[7]  Kunio Kashino,et al.  Application of the Bayesian probability network to music scene analysis , 1998 .

[8]  Malcolm Slaney,et al.  Construction and evaluation of a robust multifeature speech/music discriminator , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[9]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[10]  Vladimir Cherkassky,et al.  The Nature Of Statistical Learning Theory , 1997, IEEE Trans. Neural Networks.

[11]  Jason Weston,et al.  Multi-Class Support Vector Machines , 1998 .

[12]  J C Brown Computer identification of musical instruments using pattern recognition with cepstral coefficients as features. , 1999, The Journal of the Acoustical Society of America.

[13]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[14]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[15]  Guy J. Brown,et al.  Computational auditory scene analysis , 1994, Comput. Speech Lang..

[16]  K. Martin Automatic transcription of simple polyphonic music , 1996 .

[17]  Walter Liggett,et al.  Insights From the Broadcast News Benchmark Tests , 1998 .

[18]  Youngmoo E. Kim,et al.  Musical instrument identification: A pattern‐recognition approach , 1998 .

[19]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.