Music Scene-Adaptive Harmonic Dictionary for Unsupervised Note-Event Detection

Harmonic decompositions are a powerful tool dealing with polyphonic music signals in some potential applications such as music visualization, music transcription and instrument recog- nition. The usefulness of a harmonic decomposition relies on the design of a proper harmonic dictionary. Music scene-adaptive har- monic atoms have been used with this purpose. These atoms are adapted to the musical instruments and to the music scene, in- cluding aspects related with the venue, musician, and other rele- vant acoustic properties. In this paper, an unsupervised process to obtain music scene-adaptive spectral patterns for each MIDI-note is proposed. Furthermore, the obtained harmonic dictionary is ap- plied to note-event detection with matching pursuits. In the case of a music database that only consists of one-instrument signals, promising results (high accuracy and low error rate) have been achieved for note-event detection.

[2]  Mark B. Sandler,et al.  Automatic Piano Transcription Using Frequency and Time-Domain Information , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[3]  Anssi Klapuri,et al.  Multiple Fundamental Frequency Estimation by Summing Harmonic Amplitudes , 2006, ISMIR.

[4]  Hirokazu Kameoka,et al.  Specmurt Analysis of Polyphonic Music Signals , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[5]  Laurent Daudet,et al.  Sparse and structured decompositions of signals with the molecular matching pursuit , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[6]  Y. C. Pati,et al.  Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition , 1993, Proceedings of 27th Asilomar Conference on Signals, Systems and Computers.

[7]  J. J. Carabias-Orti,et al.  Note-event Detection in Polyphonic Musical Signals based on Harmonic Matching Pursuit and Spectral Smoothness , 2008 .

[8]  Matija Marolt,et al.  A connectionist approach to automatic transcription of polyphonic piano music , 2004, IEEE Transactions on Multimedia.

[9]  Emmanuel Vincent,et al.  Instrument-Specific Harmonic Atoms for Mid-Level Music Representation , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[10]  A. Willsky,et al.  HIGH RESOLUTION PURSUIT FOR FEATURE EXTRACTION , 1998 .

[11]  Stéphane Mallat,et al.  Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..

[12]  Soledad Torres-Guijarro,et al.  Multiple Piano Note Identification Using a Spectral Matching Method with Derived Patterns , 2005 .

[13]  Masataka Goto,et al.  Development of the RWC Music Database , 2004 .

[14]  Ian Witten,et al.  Data Mining , 2000 .

[15]  D. Donoho,et al.  Basis pursuit , 1994, Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers.

[16]  Manuel Rosa-Zurera,et al.  Transient modeling by matching pursuits with a wavelet dictionary for parametric audio coding , 2004, IEEE Signal Processing Letters.

[17]  Simon Dixon,et al.  On the Computer Recognition of Solo Piano Music , 2000 .

[18]  Teresa H. Y. Meng,et al.  Sinusoidal modeling using frame-based perceptually weighted matching pursuits , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[19]  Ye Wang,et al.  Music transcription using an instrument model , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[20]  Roland Badeau,et al.  Blind Signal Decompositions for Automatic Transcription of Polyphonic Music: NMF and K-SVD on the Benchmark , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[21]  Michael M. Goodwin,et al.  Matching pursuit with damped sinusoids , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[22]  Rémi Gribonval,et al.  Harmonic decomposition of audio signals with matching pursuit , 2003, IEEE Trans. Signal Process..

[23]  Thomas F. Quatieri,et al.  Speech analysis/Synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..

[24]  José Manuel Iñesta Quereda,et al.  Multiple fundamental frequency estimation using Gaussian smoothness , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[25]  Masataka Goto,et al.  RWC Music Database: Popular, Classical and Jazz Music Databases , 2002, ISMIR.

[26]  Bhaskar D. Rao,et al.  Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm , 1997, IEEE Trans. Signal Process..

[27]  Daniel P. W. Ellis,et al.  A Discriminative Model for Polyphonic Piano Transcription , 2007, EURASIP J. Adv. Signal Process..

[28]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[29]  Masataka Goto A predominant-F/sub 0/ estimation method for CD recordings: MAP estimation using EM algorithm for adaptive tone models , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[30]  Nicolás Ruiz-Reyes,et al.  A Joint Approach to Extract Multiple Fundamental Frequency in Polyphonic Signals Minimizing Gaussian Spectral Distance , 2009 .