Knowledge discovery-based identification of musical pitches and instruments in polyphonic sounds

Pitch and timbre detection methods applicable to monophonic digital signals are common. Conversely, successful detection of multiple pitches and timbres in polyphonic time-invariant music signals remains a challenge. This paper presents a review of these methods, sometimes called "Blind Signal Separation". We analyze how musically trained human listeners overcome resonance, noise, and overlapping signals to identify and isolate which instruments are playing and then which pitch each instrument is playing. The part of the instrument and pitch recognition system presented in this paper that identifies the dominant instrument in a base signal uses temporal features proposed by Wieczorkowska [Slezak, D., Synak, P., Wieczorkowska, A., Wroblewski, J., 2002. KDD-based approach to musical instrument sound recognition. In: Hacid, M.-S., Ras, Z.W., Zighed, D.A., Kodratoff, Y. (Eds.), Foundations of Intelligent Systems. Proceedings of 13th Symposium ISMIS 2002, Lyon, France. Berlin, Heidelberg, pp. 28-36.] in addition to the standard 11 MPEG7 features. After retrieving a semantic match for that dominant instrument from the database, the system creates a resulting foreign set of features to form a new synthetic base signal which no longer bears the previously extracted dominant sound. The system may repeat this process until all recognizable dominant instruments in the segment are accounted for. The proposed methodology incorporates Knowledge Discovery, MPEG7 segmentation and Inverse Fourier Transforms.
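The identify, subtract, and repeat loop described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the MPEG7 and temporal feature extraction and the knowledge-discovery classifier are replaced here by a plain least-squares projection onto stored spectra, and the names `tone`, `separate`, `min_gain`, and the two-instrument database are hypothetical. It only shows the control flow, ending with an inverse Fourier transform of what remains.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT; returns the real part of the reconstructed frame."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def tone(harmonics, n=64):
    """Hypothetical database entry: a frame built from {bin: amplitude} partials."""
    return [sum(a * math.sin(2 * math.pi * k * t / n) for k, a in harmonics.items())
            for t in range(n)]

def separate(signal, database, min_gain=0.05, max_rounds=10):
    """Repeatedly pick the dominant template, subtract it, and continue.

    Assumption: "dominant" means the template whose scaled copy removes the
    most energy from the current spectrum. The paper uses a trained
    classifier over MPEG7 and temporal features instead.
    """
    spectrum = dft(signal)
    templates = {name: dft(wave) for name, wave in database.items()}
    energies = {name: sum(abs(c) ** 2 for c in spec)
                for name, spec in templates.items()}
    found = []
    for _ in range(max_rounds):
        best, best_score = None, 0.0
        for name, spec in templates.items():
            proj = sum((s * c.conjugate()).real for s, c in zip(spectrum, spec))
            if proj <= 0:
                continue
            score = proj * proj / energies[name]   # energy explained by this match
            if score > best_score:
                best, best_score = (name, proj / energies[name]), score
        if best is None or best[1] < min_gain:
            break                                  # nothing recognizable is left
        name, gain = best
        found.append((name, gain))
        # Form the new synthetic base signal without the matched instrument.
        spectrum = [s - gain * c for s, c in zip(spectrum, templates[name])]
    return found, idft(spectrum)

database = {
    "flute": tone({3: 1.0, 6: 0.3}),
    "violin": tone({5: 1.0, 10: 0.6, 15: 0.3}),
}
mixture = [1.0 * v + 0.4 * f for v, f in zip(database["violin"], database["flute"])]
found, residual = separate(mixture, database)
print([(name, round(gain, 3)) for name, gain in found])
```

The toy templates use disjoint partials, so each subtraction is exact. Real instrument spectra overlap, which is why the greedy projection above would need to be replaced by a more robust matching step such as the feature-based retrieval the abstract describes.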
