Instrument identification and pitch estimation in multi-timbre polyphonic musical signals based on probabilistic mixture model decomposition

We propose a method based on probabilistic mixture model decomposition that simultaneously identifies musical instrument types, estimates pitches and assigns each pitch to its source instrument in monaural polyphonic audio containing multiple sources. In the proposed system, the probability density function (PDF) of the observed mixture note is approximated as a weighted sum of all possible note models. These note models, covering 14 instruments and all their possible pitches, describe the dynamic frequency envelopes of the notes in probabilistic terms. The weight coefficients, each indicating the probability that a given pitch of a given instrument is present, are estimated using the Expectation-Maximization (EM) algorithm and are then used to detect the source instrument types and the pitches. Experiments involving 14 instruments within the designated pitch range F3–F6 (37 pitches) demonstrate good discrimination capability, especially in instrument identification and instrument-pitch identification. For the entire system, including the note onset detection tool, evaluated on quartet polyphonic recordings, the average F-measure values for instrument-pitch identification, instrument identification and pitch estimation were 55.4 %, 62.5 % and 86 %, respectively.
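
The weight estimation step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each note model is reduced to a fixed, static distribution over frequency bins (the paper's models describe dynamic envelopes), that the observed note is a normalized magnitude spectrum, and that only the mixture weights are updated by EM. The names `estimate_weights`, `templates` and the 0.1 detection threshold are hypothetical choices made for illustration.

```python
import numpy as np

def estimate_weights(observed, templates, n_iter=500, eps=1e-12):
    """EM for the mixture weights only; the component models stay fixed.

    observed  : (F,) nonnegative spectrum, treated as a histogram over bins
    templates : (K, F) note models, each row a distribution summing to 1
    Returns   : (K,) weights summing to (approximately) 1
    """
    observed = np.asarray(observed, dtype=float)
    templates = np.asarray(templates, dtype=float)
    n_models = templates.shape[0]
    weights = np.full(n_models, 1.0 / n_models)  # uniform initial guess
    total = observed.sum()
    for _ in range(n_iter):
        # E-step: posterior probability that bin f was produced by model k
        mixture = weights @ templates + eps               # shape (F,)
        posterior = weights[:, None] * templates / mixture  # shape (K, F)
        # M-step: new weight = share of observed energy assigned to model k
        weights = (posterior * observed).sum(axis=1) / total
    return weights

# Synthetic example (hypothetical data): 6 note models over 64 frequency bins.
rng = np.random.default_rng(0)
templates = rng.random((6, 64)) ** 4          # peaky, mutually distinct shapes
templates /= templates.sum(axis=1, keepdims=True)

# The observed note mixes model 0 and model 2 only.
true_weights = np.array([0.6, 0.0, 0.4, 0.0, 0.0, 0.0])
observed = true_weights @ templates

estimated = estimate_weights(observed, templates)
active = np.flatnonzero(estimated > 0.1)      # hypothetical detection threshold
print("estimated weights:", np.round(estimated, 3))
print("detected models:", active)
```

In this simplified setting each model index would stand for one instrument-pitch pair, so thresholding the weights yields the instrument, the pitch and their pairing in a single step. Weights of absent models shrink slowly under EM, which is why the sketch thresholds them instead of expecting exact zeros.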
