Musical instrument classification and duet analysis employing music information retrieval techniques

This paper presents solutions for identifying musical data, discussed mainly on the basis of experiments carried out at the Multimedia Systems Department, Gdansk University of Technology, Gdansk, Poland. The topics covered are automatic recognition of musical instruments and separation of duet sounds. Classification is presented as a three-stage process consisting of pitch extraction, parametrization, and pattern recognition, and each stage is discussed using experimental examples. Artificial neural networks (ANNs) serve as the decision system; they are trained with a set of feature vectors (FVs) extracted from musical sounds recorded at the Multimedia Systems Department. The frequency envelope distribution (FED) algorithm, introduced for musical duet separation, is also presented. To check the efficiency of the FED algorithm, ANNs are tested on FVs derived from musical sounds after separation. The experimental results are shown and discussed.
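The three stages named above (pitch extraction, parametrization, pattern recognition) can be illustrated with a minimal sketch. It is not the method used in the paper: the autocorrelation pitch estimator, the harmonic-magnitude feature vector, and the nearest-centroid classifier (a simple stand-in for the ANN decision system) are all assumptions chosen for brevity, and every function name and parameter here is hypothetical.

```python
import numpy as np


def synth_tone(f0, amps, sr, n):
    """Synthesize a test tone as a sum of harmonics of f0 (illustrative only)."""
    t = np.arange(n) / sr
    return sum(a * np.sin(2 * np.pi * (k + 1) * f0 * t) for k, a in enumerate(amps))


def estimate_pitch(signal, sr, fmin=50.0, fmax=2000.0):
    """Stage 1: rough pitch estimate from the autocorrelation peak.

    Assumption: a generic estimator, not the pitch extractor of the paper.
    """
    x = signal - np.mean(signal)
    # Keep non-negative lags only.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo = int(sr / fmax)  # shortest period considered, in samples
    hi = int(sr / fmin)  # longest period considered, in samples
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag


def harmonic_features(signal, sr, f0, n_harmonics=5):
    """Stage 2: feature vector of unit-normalized harmonic magnitudes."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    # Read the spectrum at the bin nearest to each harmonic of f0.
    mags = [spectrum[np.argmin(np.abs(freqs - k * f0))]
            for k in range(1, n_harmonics + 1)]
    fv = np.array(mags)
    return fv / (np.linalg.norm(fv) + 1e-12)


def classify(fv, centroids):
    """Stage 3: nearest-centroid decision (stand-in for a trained ANN)."""
    return min(centroids, key=lambda name: np.linalg.norm(fv - centroids[name]))


if __name__ == "__main__":
    sr, n = 8000, 2048
    # Two hypothetical "instruments" that differ only in harmonic content.
    timbres = {"bright": [1.0, 0.8, 0.6, 0.5, 0.4],
               "dull": [1.0, 0.2, 0.05, 0.02, 0.01]}
    centroids = {}
    for name, amps in timbres.items():
        tone = synth_tone(250.0, amps, sr, n)
        centroids[name] = harmonic_features(tone, sr, estimate_pitch(tone, sr))
    test = synth_tone(200.0, timbres["dull"], sr, n)
    f0 = estimate_pitch(test, sr)
    print(f0, classify(harmonic_features(test, sr, f0), centroids))
```

Because the feature vector is computed relative to the estimated pitch, a tone at a different pitch from the training tones can still be matched by its harmonic profile. This is why pitch extraction comes first in the pipeline.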
