Efficient data adaption for musical source separation methods based on parametric models

The decomposition of a monaural audio recording into musically meaningful sound sources is one of the central research topics in music signal processing. In this context, many recent approaches employ parametric models that describe a recording in a highly structured and musically informed way. A major drawback of such approaches, however, is that the parameter learning process typically relies on computationally expensive data adaption methods. The main idea of this paper is to explicitly separate the parameters in which the model is linear from the remaining parameters. Exploiting this linearity, we translate the data adaption problem into a sparse linear least squares problem with box constraints (SLLS-BC), a class of problems for which highly efficient numerical solvers exist. First experiments show that our approach, based on modified SLLS-BC methods, accelerates the data adaption by a factor of four or more compared to recently proposed methods.
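
To make the SLLS-BC problem class concrete, the following minimal sketch minimizes 0.5 * ||Ax - b||^2 subject to lower <= x <= upper, with A stored as a list of nonzero entries. It uses plain projected gradient descent, which is chosen here only for brevity and is not the modified SLLS-BC solver the abstract refers to. All names (`solve_slls_bc`, `matvec`, `rmatvec`) and the toy data are assumptions made for illustration.

```python
from typing import List, Tuple

# A sparse matrix is represented as (row, column, value) triples of its
# nonzero entries. This representation is an assumption for illustration.
Triple = Tuple[int, int, float]


def matvec(entries: List[Triple], x: List[float], n_rows: int) -> List[float]:
    """Compute A @ x touching only the nonzero entries of A."""
    y = [0.0] * n_rows
    for i, j, v in entries:
        y[i] += v * x[j]
    return y


def rmatvec(entries: List[Triple], r: List[float], n_cols: int) -> List[float]:
    """Compute A.T @ r touching only the nonzero entries of A."""
    g = [0.0] * n_cols
    for i, j, v in entries:
        g[j] += v * r[i]
    return g


def solve_slls_bc(entries: List[Triple], b: List[float],
                  lower: List[float], upper: List[float],
                  n_iter: int = 500) -> List[float]:
    """Projected gradient descent for min 0.5*||Ax - b||^2, lower <= x <= upper."""
    n_rows, n_cols = len(b), len(lower)
    # The squared Frobenius norm bounds the Lipschitz constant of the
    # gradient from above, so 1 / (that bound) is a safe fixed step size.
    step = 1.0 / sum(v * v for _, _, v in entries)
    # Start from the point of the box that is closest to the origin.
    x = [min(max(0.0, lo), hi) for lo, hi in zip(lower, upper)]
    for _ in range(n_iter):
        residual = [yi - bi for yi, bi in zip(matvec(entries, x, n_rows), b)]
        grad = rmatvec(entries, residual, n_cols)
        # Gradient step followed by projection (clipping) onto the box.
        x = [min(max(xi - step * gi, lo), hi)
             for xi, gi, lo, hi in zip(x, grad, lower, upper)]
    return x


# Toy problem: a 3 x 2 matrix with four nonzero entries.
A_entries = [(0, 0, 1.0), (1, 1, 1.0), (2, 0, 1.0), (2, 1, 1.0)]
b = [2.0, -1.0, 1.0]
x_hat = solve_slls_bc(A_entries, b, lower=[0.0, 0.0], upper=[1.0, 1.0])
```

In this toy problem the unconstrained least squares solution is (2, -1), which lies outside the box, so both box constraints are active at the constrained solution. In a parametric signal model, the box would typically encode requirements such as non-negative, bounded activations. For realistic problem sizes, a dedicated bounded least squares solver would replace the fixed-step iteration used here.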
