Musical Instrument Recognition Based on the Bionic Auditory Model

We present a bionic auditory system for musical instrument recognition. The system is modeled on the physiological structures of the human auditory system that are essential to sound source recognition: the basilar membrane and inner hair cells in the cochlea of the inner ear, the cochlear nucleus, and the auditory cortex. A solo database of 243 acoustic and synthetic solo tones, covering the full pitch ranges of seven instruments (guitar, harp, horn, piano, saxophone, trumpet, and violin), is used to capture the range of sounds each instrument can produce. A gammatone model, the Meddis model, and a posteroventral cochlear nucleus (PVCN) model are constructed to imitate the basilar membrane, the inner hair cells, and the cochlear nucleus, respectively. A self-organizing map neural network (SOMNN), based on the function of the auditory cortex, is then used to classify the instruments, with a 33%/67% split between training and test data. The instruments are recognized with an overall success rate of over 75%. These results suggest that the bionic auditory system is both efficient and accurate for musical instrument recognition.
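To make the first stage of the pipeline concrete, the following is a minimal sketch of a single gammatone channel of the kind commonly used to imitate the basilar membrane. It is an illustration, not the authors' implementation: the abstract gives no filter parameters, so the filter order of 4, the 1.019 bandwidth factor, the Glasberg and Moore ERB approximation, the 16 kHz sample rate, the 25 ms duration, and all function names are assumptions chosen for this example.

```python
import math


def erb_bandwidth(center_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def gammatone_impulse_response(center_hz, sample_rate=16000,
                               duration_s=0.025, order=4):
    """Sampled gammatone impulse response, normalized to unit peak.

    g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*center_hz*t)
    with b = 1.019 * ERB(center_hz). All defaults are assumed values.
    """
    b = 1.019 * erb_bandwidth(center_hz)
    n_samples = int(round(duration_s * sample_rate))
    response = []
    for i in range(n_samples):
        t = i / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        response.append(envelope * math.cos(2.0 * math.pi * center_hz * t))
    peak = max(abs(v) for v in response)
    return [v / peak for v in response]


def filter_signal(signal, impulse_response):
    """Direct FIR convolution, truncated to the length of the input signal."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(impulse_response))):
            acc += impulse_response[k] * signal[n - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    # One channel centered at 1 kHz, driven by a unit impulse.
    ir = gammatone_impulse_response(1000.0)
    impulse = [1.0] + [0.0] * 9
    print(filter_signal(impulse, ir))
```

A full filter bank would repeat this for a set of center frequencies spaced along the audible range; the outputs of each channel would then feed the inner hair cell stage. The spacing and number of channels are not stated in the abstract.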
