Variational Approach to Factor Analysis and Related Models

This thesis focuses on factor analysis, extended factor analysis and parallel factor analysis. For these models, expectation maximization and variational Bayes algorithms are derived. The variational algorithms are implemented with hierarchical automatic relevance determination priors in order to infer the model order automatically. The algorithms are tested on synthetic data and applied to amino acid and fMRI data. In a model selection task, the variational lower bound on the marginal likelihood reliably identified the correct model type. On the other hand, the lower bound was found to underestimate the model order compared to the Bayesian information criterion. The effect of hyperparameter optimization is discussed in relation to the generalization error. Furthermore, the variational algorithms are found to have properties similar to independent component analysis. A model called probabilistic partial least squares is proposed and used in an NPAIRS analysis of the fMRI data. The NPAIRS analysis and the parallel factor model were found to give very similar results.
