Learning From Hidden Traits: Joint Factor Analysis and Latent Clustering

Dimensionality reduction techniques play an essential role in data analytics, signal processing, and machine learning. Dimensionality reduction is usually performed in a preprocessing stage that is separate from subsequent data analysis, such as clustering or classification. Finding reduced-dimension representations that are well suited to the intended task is more appealing. This paper proposes a joint factor analysis and latent clustering framework that aims to learn cluster-aware low-dimensional representations of matrix and tensor data. The proposed approach leverages matrix and tensor factorization models that produce essentially unique latent representations of the data to unravel latent cluster structure, which is otherwise obscured by the freedom to apply an oblique transformation in latent space. At the same time, latent cluster structure is used as prior information to enhance the performance of factorization. Specific contributions include several custom-built problem formulations, corresponding algorithms, and a discussion of the associated convergence properties. In addition to extensive simulations, real-world datasets such as Reuters document data and MNIST image data are used to showcase the effectiveness of the proposed approaches.
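To make the coupling between factorization and clustering concrete, the following is a minimal sketch, not one of the paper's formulations or algorithms. It assumes a nonnegative data matrix X (features by samples) and a simplified objective, ||X - WH||_F^2 + lam * ||H - M[:, labels]||_F^2, which it reduces by alternating a k-means step on the columns of H with multiplicative updates of W and H. The function name `joint_nmf_kmeans`, the penalty weight `lam`, and the choice of multiplicative updates are all illustrative assumptions.

```python
import numpy as np


def joint_nmf_kmeans(X, rank, n_clusters, lam=1.0, n_iter=200, seed=0):
    """Illustrative sketch: NMF with a k-means penalty on the latent columns.

    Assumed objective (not taken from the paper):
        ||X - W H||_F^2 + lam * ||H - M[:, labels]||_F^2,  W >= 0, H >= 0
    X is (n_features, n_samples) and must be nonnegative.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # guards against division by zero
    n_features, n_samples = X.shape
    W = rng.random((n_features, rank))
    H = rng.random((rank, n_samples))
    # Initialize centroids from randomly chosen latent columns.
    M = H[:, rng.choice(n_samples, size=n_clusters, replace=False)].copy()
    labels = np.zeros(n_samples, dtype=int)
    history = []

    for _ in range(n_iter):
        # k-means assignment: nearest centroid for each latent column.
        d2 = ((H[:, None, :] - M[:, :, None]) ** 2).sum(axis=0)  # (k, n)
        labels = d2.argmin(axis=0)
        # k-means centroid update; empty clusters keep their old centroid.
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                M[:, k] = H[:, members].mean(axis=1)
        target = M[:, labels]  # centroid assigned to each sample, (rank, n)

        # Multiplicative updates keep W and H nonnegative.
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        # The centroid term pulls each latent column toward its cluster center.
        H *= (W.T @ X + lam * target) / ((W.T @ W) @ H + lam * H + eps)

        fit = np.linalg.norm(X - W @ H) ** 2
        clu = np.linalg.norm(H - M[:, labels]) ** 2
        history.append(fit + lam * clu)

    return W, H, labels, history
```

This sketch leaves the scaling ambiguity between W and H unresolved: shrinking H while growing W can reduce the clustering penalty without improving the clusters, so a practical formulation would need normalization or regularization. It also makes no claim about the uniqueness properties the abstract relies on.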
