Similarity graphs for the concealment of long duration data loss in music

We present a novel method for compensating long-duration data gaps in audio signals, in particular music. Concealment of such defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. An intuitive optimization scheme proposes a suitable candidate segment to substitute for the lost content, and this segment is then smoothly inserted into the gap. Extensive listening tests show that the proposed algorithm provides highly promising results when applied to a variety of real-world music signals.
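The pipeline described above (similarity graph, candidate selection, smooth insertion) can be illustrated with a small sketch. This is not the authors' algorithm: the features (log-magnitude spectra of non-overlapping frames), the Gaussian edge weights, the context-matching score, and the linear crossfade are all simplifying assumptions, and the function names and parameters (`frame_features`, `similarity_graph`, `best_candidate`, `conceal`, `win`, `ctx`, `fade`) are hypothetical. The gap is assumed to be known and aligned to frame boundaries, with enough intact signal on both sides.

```python
import numpy as np

def frame_features(x, win):
    """Log-magnitude spectra of non-overlapping Hann-windowed frames (assumed features)."""
    n = len(x) // win
    frames = x[: n * win].reshape(n, win) * np.hanning(win)
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def similarity_graph(feats):
    """Dense weight matrix: W[i, j] is close to 1 when frames i and j are spectrally similar."""
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=-1)
    sigma2 = np.median(d2) + 1e-12  # assumed kernel width
    return np.exp(-d2 / sigma2)

def best_candidate(W, g0, G, ctx):
    """Start frame of the segment whose surrounding frames best match the gap's context."""
    n = W.shape[0]
    # Context offsets: ctx frames before the segment and ctx frames after it.
    offsets = list(range(-ctx, 0)) + list(range(G, G + ctx))
    best, best_score = None, -np.inf
    for c in range(ctx, n - G - ctx + 1):
        # Skip candidates whose segment or context touches the gap frames.
        if c - ctx < g0 + G and g0 < c + G + ctx:
            continue
        score = sum(W[g0 + o, c + o] for o in offsets)
        if score > best_score:
            best, best_score = c, score
    return best

def conceal(x, g0, G, win=256, ctx=2, fade=64):
    """Fill gap frames g0 .. g0+G-1 with the best-matching segment, crossfading at both edges."""
    W = similarity_graph(frame_features(x, win))
    c = best_candidate(W, g0, G, ctx)
    gs, ge = g0 * win, (g0 + G) * win
    cs, ce = c * win, (c + G) * win
    donor = x[cs - fade : ce + fade]
    ramp = np.linspace(0.0, 1.0, fade)
    out = x.copy()
    out[gs - fade : gs] = (1 - ramp) * x[gs - fade : gs] + ramp * donor[:fade]
    out[gs:ge] = donor[fade:-fade]
    out[ge : ge + fade] = (1 - ramp) * donor[-fade:] + ramp * x[ge : ge + fade]
    return out

# Toy input: an 8-tone pattern repeated six times, with three frames zeroed out.
t = np.arange(256)
pattern = np.concatenate([np.sin(2 * np.pi * k * t / 256) for k in (8, 12, 16, 20, 24, 28, 32, 36)])
clean = np.tile(pattern, 6)
damaged = clean.copy()
damaged[20 * 256 : 23 * 256] = 0.0
restored = conceal(damaged, g0=20, G=3)
```

On this strictly periodic toy signal an earlier repetition of the pattern matches the gap's context exactly, so the substituted segment coincides with the lost content. Real music offers only approximate repetitions, which is where the choice of features and of the transition points matters.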
