Improving instrument recognition in polyphonic music through system integration

A method is proposed for instrument recognition in polyphonic music that combines two independent detector systems: a polyphonic musical instrument recognition system using a missing feature approach, and an automatic music transcription system, based on shift-invariant probabilistic latent component analysis, that includes instrument assignment. The two systems are integrated by fusing the instrument contributions estimated by the first system into the transcription system in the form of Dirichlet priors. Both systems, as well as the integrated system, are evaluated on a dataset of continuous polyphonic music recordings. Detailed results, reported for different training conditions, show a clear improvement in the performance of the integrated system.
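As a rough illustration of how a Dirichlet prior can steer a latent-component model, the sketch below shows a generic maximum a posteriori update for one multinomial distribution over instruments. It is not taken from the paper: the function name `map_update`, its inputs, and the prior-strength parameter `kappa` are hypothetical, and the actual update rules of the integrated system may differ.

```python
def map_update(expected_counts, prior, kappa=1.0):
    """Generic M-step for one multinomial with a Dirichlet-style prior.

    expected_counts: non-negative weights accumulated in the E-step,
                     one per instrument (hypothetical input).
    prior:           instrument contributions supplied by a separate
                     recognition system, same length (hypothetical input).
    kappa:           prior strength; kappa = 0 gives the plain
                     maximum-likelihood update.
    """
    if len(expected_counts) != len(prior):
        raise ValueError("expected_counts and prior must have equal length")
    # Add the scaled prior to the data-driven counts, then renormalize
    # so the result is a valid probability distribution.
    blended = [c + kappa * a for c, a in zip(expected_counts, prior)]
    total = sum(blended)
    if total <= 0:
        raise ValueError("nothing to normalize")
    return [b / total for b in blended]
```

With `kappa = 0` the estimate depends only on the transcription model's own evidence; larger values pull the instrument distribution toward the recognizer's output.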
