AUDIO SIGNAL REPRESENTATIONS FOR FACTORIZATION IN THE SPARSE DOMAIN

This paper introduces a new class of audio representations, together with a corresponding fast decomposition algorithm. These representations are both sparse and approximately shift-invariant, which allows similarity search to be performed directly in the sparse domain. The common sparse support of the similar patterns detected in this way is then used to factorize their representations. Preliminary experiments on simple musical data illustrate the potential of this method for joint structural analysis and compression.
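To make the idea of factorizing over a common sparse support concrete, here is a minimal toy sketch. It is not the decomposition algorithm of the paper, whose details are not given in this abstract. It assumes, purely for illustration, that each pattern has already been decomposed into a set of atoms indexed by hypothetical `(time, frequency)` integer pairs, that coefficient amplitudes can be ignored, and that shift invariance means a time-shifted pattern yields the same atoms with shifted time indices. The function names `best_shift` and `factorize` are likewise hypothetical.

```python
from collections import Counter


def best_shift(support_a, support_b):
    """Find the time shift that aligns the most atoms of support_a with support_b.

    Supports are sets of (time, freq) integer pairs (an assumed toy format).
    Returns (shift, overlap); (0, 0) if no atoms share a frequency index.
    """
    votes = Counter()
    for ta, fa in support_a:
        for tb, fb in support_b:
            if fa == fb:
                # Each pair of atoms with the same frequency votes for one shift.
                votes[tb - ta] += 1
    if not votes:
        return 0, 0
    return votes.most_common(1)[0]


def factorize(support_a, support_b):
    """Split two supports into a shared template, a shift, and two residuals."""
    shift, _ = best_shift(support_a, support_b)
    shifted_a = {(t + shift, f) for t, f in support_a}
    common = shifted_a & set(support_b)
    # Express the shared atoms in the time frame of the first pattern.
    template = {(t - shift, f) for t, f in common}
    residual_a = set(support_a) - template
    residual_b = set(support_b) - common
    return template, shift, residual_a, residual_b


# Two occurrences of the same three-atom pattern, 100 time steps apart,
# each with one extra atom that is not shared.
first = {(0, 10), (3, 12), (5, 7), (1, 40)}
second = {(100, 10), (103, 12), (105, 7), (102, 55)}
template, shift, res_first, res_second = factorize(first, second)
```

In this toy setting, the shared template is stored once along with the shift, and only the residual atoms are kept per occurrence. This is one simple way to read the link the abstract draws between detecting repeated structure and compression.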
