Perceptually motivated quasi-periodic signal selection for polyphonic music transcription

A multiple fundamental frequency estimator is a key building block in music transcription and indexing. However, systems that perform this task tend to be very complex [1], because music transcription requires an analysis that accounts for both physical and psychoacoustic factors. In this work, we propose a physically motivated audio signal analysis followed by an auditory-based selection. The audio signal model allows for a better time/frequency resolution tradeoff, while the auditory distance discards redundant or non-relevant information. No prior information on the musical instrument, musical genre, or maximum polyphony is needed. Simulations show that the proposed technique achieves good transcription results for a variety of string and wind instruments. The proposed scheme is also shown to be robust in the presence of noise and percussive sounds, and in unbalanced Signal-to-Interference Ratio (SIR) situations.

[1]  Emmanuel Vincent, et al. Low Bit-Rate Object Coding of Musical Audio Using Bayesian Harmonic Models, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[2]  Mahdi Triki. Some contributions to statistical signal processing and applications to audio enhancement and mobile localization, 2007.

[3]  José M. Iñesta, et al. Multiple fundamental frequency estimation using Gaussian smoothness and short context, 2008.

[4]  Ivan Bruno, et al. Automatic music transcription supporting different instruments, 2003, Proceedings Third International Conference on WEB Delivering of Music.

[5]  Mahdi Triki. Music source separation via sparsified dictionaries vs. parametric models, 2006.

[6]  Alain de Cheveigné, et al. Separation of concurrent harmonic sounds: Fundamental frequency estimation and a time-domain cancell, 1993.

[7]  Steven van de Par, et al. Musical Key Extraction from Audio Using Profile Training, 2006, ISMIR.

[8]  Jesper Jensen, et al. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration, 2005, EURASIP J. Adv. Signal Process.

[9]  Ray Meddis, et al. Virtual pitch and phase sensitivity of a computer model of the auditory periphery, 1991.

[10]  Hirokazu Kameoka, et al. Probabilistic Approach to Automatic Music Transcription from Audio Signals, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[11]  Masataka Goto, et al. RWC Music Database: Music genre database and musical instrument sound database, 2003, ISMIR.

[12]  Anssi Klapuri, et al. Multiple Fundamental Frequency Estimation by Summing Harmonic Amplitudes, 2006, ISMIR.

[13]  G. Reis, et al. Genetic Algorithm Approach to Polyphonic Music Transcription, 2007, 2007 IEEE International Symposium on Intelligent Signal Processing.

[14]  Dirk T. M. Slock, et al. Periodic signal extraction with frequency-selective amplitude modulation and global time-warping for music signal decomposition, 2008, 2008 IEEE 10th Workshop on Multimedia Signal Processing.

[15]  A.P. Klapuri, et al. A perceptually motivated multiple-F0 estimation method, 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.

[16]  M.P. Ryynanen, et al. Polyphonic music transcription using note event modeling, 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.

[17]  Simon J. Godsill, et al. Bayesian harmonic models for musical pitch estimation and analysis, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.