Unsupervised learning of sparse and shift-invariant decompositions of polyphonic music

Many time series in engineering arise from a sparse mixture of individual components. Sparse coding can be used to decompose such signals into a set of functions. Most sparse coding algorithms divide the signal into blocks. The functions learned from these blocks, however, depend on the temporal alignment of the blocks. We present a fast algorithm for sparse coding that does not depend on the block location. To reduce the dimensionality of the problem, a subspace selection step is used during signal decomposition. Because of this reduction, an iterative reweighted least squares method can be used for the constrained optimisation. We demonstrate the algorithm's abilities by learning functions from a polyphonic piano recording. The learned functions represent individual notes, and a sparse decomposition of the signal leads to a transcription of the piano recording.
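
To make the constrained optimisation step concrete, the following is a minimal sketch of an iterative reweighted least squares scheme in the FOCUSS style (a reweighted minimum-norm iteration). It is not the algorithm of this paper: the shift-invariant dictionary and the subspace selection step are omitted, and the function name `focuss`, the exponent `p`, the iteration count, and the stabiliser `eps` are assumptions made for illustration. Given a dictionary matrix `A` and a signal `y`, it looks for a sparse coefficient vector `x` satisfying `A @ x = y`.

```python
import numpy as np

def focuss(A, y, p=1.0, n_iter=50, eps=1e-6):
    """Illustrative FOCUSS-style IRLS for a sparse solution of A @ x = y.

    All parameter names and defaults are assumptions for this sketch.
    """
    # Start from the minimum 2-norm solution, which is typically dense.
    x = np.linalg.pinv(A) @ y
    for _ in range(n_iter):
        # Weights grow with the current coefficient magnitudes, so large
        # coefficients are favoured and small ones shrink further.
        # eps keeps every weight strictly positive.
        w = np.abs(x) ** (1.0 - p / 2.0) + eps
        # Scale the columns of A by the weights, solve the weighted
        # minimum-norm problem, and map back to the original variables.
        Aw = A * w
        x = w * (np.linalg.pinv(Aw) @ y)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((10, 30))      # overcomplete dictionary
    x_true = np.zeros(30)
    x_true[[3, 17]] = [1.5, -2.0]          # two active components
    y = A @ x_true
    x_hat = focuss(A, y)
    print("residual:", np.linalg.norm(A @ x_hat - y))
    print("largest coefficients at:", np.argsort(-np.abs(x_hat))[:2])
```

Each iteration requires a pseudo-inverse of the weighted dictionary, which is why reducing the number of candidate functions beforehand, as the subspace selection step in the abstract does, makes this kind of method practical.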