Local AM/FM Parameters Estimation: Application to Sinusoidal Modeling and Blind Audio Source Separation

This letter extends our recently introduced method for estimating the instantaneous frequency and chirp rate of linearly modulated signals. We derive several new estimators, related to our previous ones, that provide in the time-frequency plane all the parameters of the investigated signal model: amplitude, frequency, and their local modulations (AM/FM). The estimators are first introduced and compared, in terms of statistical efficiency, with theoretical bounds and with other state-of-the-art estimators. They are then used to improve spectral analysis for audio sinusoidal modeling. Finally, they lead to a new source separation technique based on coherent amplitude and frequency modulation, which is evaluated on real-world music signals.
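To make the four local parameters concrete, here is a minimal sketch that synthesizes a single component with amplitude, frequency, and linear AM/FM, then reads the parameters back from the noise-free signal by finite differences. It assumes a common parameterization from the non-stationary sinusoidal modeling literature (log-amplitude and phase both polynomial in time), which may differ from the exact model of the letter. The function names `synth_am_fm` and `local_parameters` and all numeric values are hypothetical, and the finite-difference readout is only an illustration, not the time-frequency estimators derived in the letter.

```python
import numpy as np

def synth_am_fm(a0, mu, f0, chirp, fs, n):
    """Complex sinusoid with exponential AM and linear FM (assumed model).

    log-amplitude: log(a0) + mu * t              (mu: AM rate, 1/s)
    phase:         2*pi * (f0*t + chirp*t^2/2)   (chirp: FM rate, Hz/s)
    """
    t = np.arange(n) / fs
    log_amp = np.log(a0) + mu * t
    phase = 2.0 * np.pi * (f0 * t + 0.5 * chirp * t**2)
    return t, np.exp(log_amp + 1j * phase)

def local_parameters(x, fs):
    """Finite-difference readout of AM rate, instantaneous frequency, chirp rate.

    Only meaningful for a clean, single-component signal; it is not robust
    to noise or to overlapping components.
    """
    log_amp = np.log(np.abs(x))
    phase = np.unwrap(np.angle(x))
    am_rate = np.diff(log_amp) * fs                  # d/dt log a(t), in 1/s
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)  # in Hz, at sample midpoints
    chirp_rate = np.diff(inst_freq) * fs             # in Hz/s
    return am_rate, inst_freq, chirp_rate

# Hypothetical example: a decaying tone gliding upward from 440 Hz.
t, x = synth_am_fm(a0=0.5, mu=-3.0, f0=440.0, chirp=200.0, fs=8000.0, n=1024)
am_rate, inst_freq, chirp_rate = local_parameters(x, fs=8000.0)
```

On real audio, where components overlap and noise is present, such sample-wise differences break down, which is the situation that estimators defined in the time-frequency plane are meant to address.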
