A Novel Deep Density Model for Unsupervised Learning

Density models are fundamental in machine learning and are widely applied in practical cognitive modeling tasks and learning problems. In this work, we introduce a novel deep density model, referred to as deep mixtures of factor analyzers with common loadings (DMCFA), together with an efficient greedy layer-wise unsupervised learning algorithm. The model employs a mixture of factor analyzers sharing common component loadings in each layer. The common loadings can be viewed as a feature selection or reduction matrix, which makes the new model more physically meaningful. Importantly, sharing common components remarkably reduces both the number of free parameters and the computational complexity. Consequently, inference and learning in DMCFA rely on a dramatically more succinct model, while the use of Gaussian distributions as priors preserves its flexibility in estimating the data density. Our model is evaluated on five real datasets and compared with three other competitive models, namely mixtures of factor analyzers (MFA), MFA with common loadings (MCFA), and deep mixtures of factor analyzers (DMFA), as well as their collapsed counterparts. The results demonstrate the superiority of the proposed model in the tasks of density estimation, clustering, and generation.
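
To give a rough sense of why sharing a loading matrix reduces the number of free parameters, the following sketch compares naive parameter counts for a single-layer MFA and a single-layer mixture with common loadings. It is only an illustration under stated assumptions, not the counting used in the paper: the function names, the symbols p (data dimension), q (latent dimension), and g (number of components), and the choice of a per-component diagonal noise covariance for MFA are all assumptions, and identifiability corrections are ignored.

```python
# Naive free-parameter counts for one mixture layer.
# Assumptions (not taken from the paper): p = data dimension,
# q = latent dimension, g = number of mixture components;
# identifiability corrections are ignored.

def mfa_param_count(p: int, q: int, g: int) -> int:
    """Mixture of factor analyzers with per-component parameters."""
    mixing = g - 1               # mixing proportions sum to one
    per_component = (
        p                        # component mean
        + p * q                  # component-specific loading matrix
        + p                      # diagonal noise covariance (assumed per component)
    )
    return mixing + g * per_component


def mcfa_param_count(p: int, q: int, g: int) -> int:
    """Mixture of factor analyzers sharing one common loading matrix."""
    mixing = g - 1               # mixing proportions sum to one
    shared = (
        p * q                    # common loading matrix, shared by all components
        + p                      # shared diagonal noise covariance
    )
    per_component = (
        q                        # latent-space mean of the component
        + q * (q + 1) // 2       # symmetric latent-space covariance
    )
    return mixing + shared + g * per_component


if __name__ == "__main__":
    p, q, g = 100, 5, 10         # hypothetical sizes chosen for illustration
    print("MFA :", mfa_param_count(p, q, g))
    print("MCFA:", mcfa_param_count(p, q, g))
```

With these hypothetical sizes, the per-component cost in the shared-loading count depends only on q rather than on p, which is where the saving comes from when the data dimension is much larger than the latent dimension.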
