A Comparative Study on Data Smoothing Regularization for Local Factor Analysis

Selecting the number of clusters and the numbers of hidden factors for the Local Factor Analysis (LFA) model is a typical model selection problem, which is difficult when the sample size is finite or small. Data smoothing is one of three regularization techniques integrated into the statistical learning framework of Bayesian Ying-Yang (BYY) harmony learning to improve parameter learning and model selection. In this paper, we comparatively investigate the performance of five existing formulas for determining the hyper-parameter, namely the smoothing parameter, that controls the strength of data smoothing regularization. BYY learning algorithms on LFA using these formulas are evaluated by model selection accuracy on simulated data and by classification accuracy on real-world data. Two observations are obtained. First, learning with data smoothing works better than learning without it, especially when the sample size is small. Second, the gradient method derived by imposing a sample-set-based improper prior on the smoothing parameter generally outperforms the other methods, such as those derived from a Gamma or Chi-square prior and the one based on the equal covariance principle.
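
To make the role of the smoothing parameter concrete, the following is a minimal Python sketch, not the paper's algorithm. It relies on the standard fact that Gaussian-kernel data smoothing with width h amounts to adding h^2 I to the sample covariance, and it substitutes the closed-form probabilistic-PCA solution for a single factor analyzer x = A y + mu + e in place of the full iterative BYY harmony learning on the LFA mixture. The function names smoothed_covariance and fit_factor_analyzer, and all numerical settings, are our own illustrative choices.

import numpy as np

def smoothed_covariance(X, h):
    """Sample covariance under data smoothing: convolving the empirical
    density with a Gaussian kernel of width h is equivalent to adding
    h^2 * I to the plain sample covariance."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / X.shape[0]
    return S + h ** 2 * np.eye(X.shape[1])

def fit_factor_analyzer(X, m, h=0.0):
    """Fit one factor analyzer x = A y + mu + e with isotropic noise,
    using the closed-form probabilistic-PCA solution on the smoothed
    covariance (a stand-in for the iterative BYY updates on LFA)."""
    S = smoothed_covariance(X, h)
    vals, vecs = np.linalg.eigh(S)            # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]    # reorder to descending
    sigma2 = vals[m:].mean()                  # noise variance: mean of discarded eigenvalues
    A = vecs[:, :m] * np.sqrt(np.maximum(vals[:m] - sigma2, 0.0))
    mu = X.mean(axis=0)
    return A, mu, sigma2

# Toy usage: with only 30 samples in 5 dimensions, a small h > 0 acts as
# a ridge term that keeps the covariance estimate well conditioned.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5)) @ rng.normal(size=(5, 5))
A, mu, sigma2 = fit_factor_analyzer(X, m=2, h=0.1)
print(A.shape, mu.shape, round(sigma2, 4))

Here h is fixed by hand; choosing it automatically is exactly the hyper-parameter problem the five formulas compared in the paper address.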
