Analysis of the Reconstruction of Sparse Signals in the DCT Domain Applied to Audio Signals

Sparse signals can be reconstructed from a reduced set of signal samples using compressive sensing (CS) methods. The discrete cosine transform (DCT) can provide highly concentrated representations of audio signals, which makes it a good sparsity domain for them. In this paper, the DCT is studied within the context of sparse audio signal processing using CS theory and methods. The DCT coefficients of a sparse signal, calculated from a reduced set of available samples, can be modeled as random variables. It is shown that the statistical properties of these variables are closely related to the conditions for unique reconstruction. The main result of this paper is an exact formula for the mean-square reconstruction error in the case of approximately sparse and nonsparse noisy signals reconstructed under the sparsity assumption. Based on the presented analysis, a simple and computationally efficient reconstruction algorithm is proposed. The presented theoretical concepts and the efficiency of the reconstruction algorithm are verified numerically, including examples with synthetic and recorded audio signals with unavailable or corrupted samples. Both random disturbances and disturbances simulating clicks or inpainting in audio signals are considered. Statistical verification is done on a dataset with experimental signals. The results are compared with classical and recent methods used in similar signal and disturbance scenarios.
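The setting described above can be illustrated with a minimal sketch. It is a generic greedy (matching-pursuit-style) recovery of a DCT-sparse signal from a subset of its samples, and it is not the algorithm proposed in the paper. The function names `dct_matrix` and `reconstruct_sparse_dct`, the orthonormal DCT-II convention, and the assumption that the sparsity level `K` is known are all choices made for this example only.

```python
import numpy as np


def dct_matrix(N):
    """Orthonormal DCT-II matrix C, so that X = C @ x and x = C.T @ X."""
    n = np.arange(N)
    k = n.reshape(-1, 1)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)  # the k = 0 row uses a different scale factor
    return C


def reconstruct_sparse_dct(y, idx, N, K):
    """Greedy recovery of a K-sparse DCT spectrum from samples y = x[idx].

    Assumption for this sketch: K is known and the samples are noise-free
    or only lightly noisy.
    """
    C = dct_matrix(N)
    A = C.T[idx, :]  # rows of the inverse DCT at the available instants
    support = []
    coeffs = np.zeros(0)
    residual = y.astype(float)
    for _ in range(K):
        # Initial DCT estimate computed from the available samples only.
        X0 = A.T @ residual
        X0[support] = 0.0  # never select the same position twice
        support.append(int(np.argmax(np.abs(X0))))
        # Least-squares fit of the coefficients on the current support.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    X = np.zeros(N)
    X[support] = coeffs
    return C.T @ X, X  # full-length signal estimate and its DCT
```

As a usage example, one could build a signal with three nonzero DCT coefficients, keep a random subset of its samples (say 96 out of 128), and pass the kept samples and their indices to `reconstruct_sparse_dct` with `K=3`. The missing-sample effect appears as noise-like values in the initial estimate `X0`, which is why the strongest component is detected first and removed before the next one is sought.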
