Sparse audio representations using the MCLT

We consider sparse representations of audio based on the modulated complex lapped transform (MCLT) and a generalized iteratively reweighted least squares algorithm, which can be interpreted as a variant of expectation maximization. We compare this mildly overcomplete representation with the more traditional modified discrete cosine transform (MDCT) in terms of coding cost. We also explore extending it to a dual-resolution analysis using a pair of MCLTs, and illustrate the potential application of this extension to audio modification.
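To make the reweighting idea concrete, the following is a minimal sketch of a generic FOCUSS-style iteratively reweighted least squares loop. It is not the generalized algorithm of the paper. The function names `irls_sparse` and `cosine_basis`, the parameters `p`, `lam`, and `eps`, and the spike-plus-cosine dictionary (a twofold overcomplete stand-in for an MCLT frame) are all assumptions made for illustration.

```python
import numpy as np


def cosine_basis(N):
    """Orthonormal DCT-II basis; column k is the k-th cosine atom."""
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * np.outer(n + 0.5, n) / N)
    C[:, 0] /= np.sqrt(2.0)  # rescale the constant atom to unit norm
    return C


def irls_sparse(A, y, p=1.0, lam=1e-8, n_iter=50, eps=1e-12):
    """Hypothetical FOCUSS-style IRLS for a sparse x with A @ x close to y.

    Each pass solves a weighted, regularized minimum-norm problem:
        x = W A^H (A W A^H + lam I)^{-1} y,  W = diag(|x|^(2 - p)).
    Small coefficients get small weights, so they shrink further.
    """
    m = A.shape[0]
    AH = A.conj().T
    reg = lam * np.eye(m)
    # Start from the (regularized) minimum-energy solution, which is dense.
    x = AH @ np.linalg.solve(A @ AH + reg, y)
    for _ in range(n_iter):
        w = np.abs(x) ** (2.0 - p) + eps   # eps keeps weights positive
        AW = A * w                         # scales column j of A by w[j]
        x = w * (AH @ np.linalg.solve(AW @ AH + reg, y))
    return x


# Illustrative use: one spike plus one cosine atom, N = 32 samples.
N = 32
A = np.hstack([np.eye(N), cosine_basis(N)])   # 32 x 64 dictionary
x_true = np.zeros(2 * N)
x_true[5] = 1.0          # spike atom
x_true[N + 9] = -0.8     # cosine atom
y = A @ x_true

x_min = A.T @ y / 2.0    # minimum-energy coefficients (A A^T = 2 I here)
x_hat = irls_sparse(A, y)
print("l1 norm, minimum-energy:", np.abs(x_min).sum())
print("l1 norm, reweighted:    ", np.abs(x_hat).sum())
```

With `p = 1` the weights are the current coefficient magnitudes, so each pass is a weighted minimum-norm solve that favors atoms which are already large. Because this sketch uses `A.conj().T` throughout, a complex-valued dictionary can in principle be passed in unchanged.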
