Supervised Dirichlet Process Mixtures of Principal Component Analysis

Abstract We introduce probabilistic principal component analysis (PPCA) into Dirichlet Process Mixtures of Generalized Linear Models (DPGLM) and propose a new model called Supervised Dirichlet Process Mixtures of Principal Component Analysis (SDPM-PCA). In SDPM-PCA, we assume that the covariates and the response variable are generated separately through the latent variable of PPCA and are modeled nonparametrically using a Dirichlet process mixture. By jointly learning the latent variables, cluster labels, and response variables, SDPM-PCA performs local dimensionality reduction within each mixture component and learns a supervised model on the latent variables. In this way, SDPM-PCA improves both dimensionality reduction and prediction on high-dimensional data while retaining all the advantages of DPGLM. We also develop an inference algorithm for SDPM-PCA based on variational inference, which provides faster training and a deterministic approximation compared with MCMC-based sampling algorithms. Finally, we instantiate SDPM-PCA for regression problems with a Bayesian linear regression model. We test it on several real-world datasets and compare its prediction performance with DPGLM and other standard regression models. Experimental results show that, with a properly chosen latent dimensionality, SDPM-PCA provides better prediction performance on high-dimensional regression problems and avoids the curse of dimensionality that affects DPGLM.
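To make the generative story described above concrete, the following is a minimal sketch of the kind of model SDPM-PCA posits: mixture weights drawn from a truncated stick-breaking construction of a Dirichlet process, a PPCA model (loading matrix, mean, isotropic noise) per component, and a response that is linear in the same latent variable. All hyperparameters, the truncation level, and the function name `sample_sdpm_pca` are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sdpm_pca(n, d_obs=10, d_lat=3, alpha=1.0, trunc=20):
    """Draw n points from a truncated Dirichlet-process mixture whose
    components are PPCA models, with a response linear in the latents.
    Illustrative sketch only; hyperparameters are not from the paper."""
    # Truncated stick-breaking construction of the DP mixture weights.
    v = rng.beta(1.0, alpha, size=trunc)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    w /= w.sum()
    # Per-component parameters: PPCA loading W_k, mean mu_k, regression beta_k.
    W = rng.normal(size=(trunc, d_obs, d_lat))
    mu = rng.normal(scale=3.0, size=(trunc, d_obs))
    beta = rng.normal(size=(trunc, d_lat))
    c = rng.choice(trunc, size=n, p=w)       # cluster labels
    z = rng.normal(size=(n, d_lat))          # shared latent variables
    # Covariates and response are generated separately from the same latent z.
    x = np.einsum('nij,nj->ni', W[c], z) + mu[c] + 0.1 * rng.normal(size=(n, d_obs))
    y = np.einsum('nj,nj->n', beta[c], z) + 0.1 * rng.normal(size=n)
    return x, y, c

x, y, c = sample_sdpm_pca(500)
print(x.shape, y.shape)  # prints (500, 10) (500,)
```

Because `x` and `y` share the latent `z` within a component, inference over `z` and the cluster labels is informed by the response, which is the "supervised" aspect the abstract refers to.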
