Classification and Representation via Separable Subspaces: Performance Limits and Algorithms

We study the classification performance of Kronecker-structured (K-S) subspace models in two asymptotic regimes and develop an algorithm for fast and compact K-S subspace learning that exploits the structure of multidimensional signals to improve their classification and representation. First, we study the classification performance in terms of <italic>diversity order</italic> and the pairwise geometry of the subspaces. We derive an exact expression for the diversity order as a function of the signal and subspace dimensions of a K-S model. Next, we study the <italic>classification capacity</italic>, the maximum rate at which the number of classes can grow as the signal dimension goes to infinity. Then, we describe a fast algorithm for <italic>Kronecker-structured learning of discriminative dictionaries</italic> (K-SLD<inline-formula><tex-math notation="LaTeX">$^2$</tex-math></inline-formula>). Finally, we evaluate the empirical classification performance of K-S models on synthetic data, showing that it agrees with the diversity order analysis. We also evaluate the performance of K-SLD<inline-formula><tex-math notation="LaTeX">$^2$</tex-math></inline-formula> on synthetic and real-world datasets, showing that K-SLD<inline-formula><tex-math notation="LaTeX">$^2$</tex-math></inline-formula> balances compact signal representation and good classification performance.
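As a rough illustration of what "Kronecker-structured" and "compact" can mean here, the sketch below assumes the usual two-dimensional reading of a K-S model: a matrix-valued signal Y = A X Bᵀ whose vectorized form lies in the column space of the Kronecker product B ⊗ A. The names `A`, `B`, `X`, `ks_basis`, and all sizes are hypothetical choices for this example and are not taken from the paper.

```python
import numpy as np

def ks_basis(A, B):
    """Return the Kronecker-structured basis B (x) A.

    Assumption of this sketch: signals are Y = A @ X @ B.T and are
    vectorized column by column (Fortran order).
    """
    return np.kron(B, A)

rng = np.random.default_rng(0)
m1, m2, k1, k2 = 6, 5, 2, 3          # hypothetical signal and subspace sizes
A = rng.standard_normal((m1, k1))    # basis acting on the rows of the signal
B = rng.standard_normal((m2, k2))    # basis acting on the columns of the signal
X = rng.standard_normal((k1, k2))    # coefficient matrix

Y = A @ X @ B.T                      # matrix-valued signal, shape (m1, m2)
D = ks_basis(A, B)                   # shape (m1 * m2, k1 * k2)

# Column-major vectorization gives the identity vec(A X B^T) = (B (x) A) vec(X).
y = Y.reshape(-1, order="F")
x = X.reshape(-1, order="F")
assert np.allclose(y, D @ x)

# The structured model stores only the two small factors.
ks_params = A.size + B.size          # m1*k1 + m2*k2
unstructured_params = D.size         # (m1*m2) * (k1*k2)
print(ks_params, unstructured_params)
```

Under these assumptions, the structured basis needs far fewer free parameters than an unstructured basis of the same shape, which is one way to read the abstract's claim of compact representation.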
