Visual learning given sparse data of unknown complexity

This study addresses the problem of unsupervised visual learning. It examines existing popular model order selection criteria before proposing two novel criteria for improving visual learning given sparse data and no prior knowledge of model complexity. In particular, a rectified Bayesian information criterion (BICr) and a completed likelihood Akaike's information criterion (CL-AIC) are formulated to estimate the optimal model order (complexity) for learning the dynamic structure of a visual scene. Both criteria are designed to overcome the poor model selection of existing popular criteria when the data sample size varies from very small to large. Extensive experiments on learning a dynamic scene structure are carried out to demonstrate the effectiveness of BICr and CL-AIC, compared to that of BIC (Schwarz, 1978), AIC (Akaike, 1973), ICL (Biernacki, 2000) and an MML-based criterion (Figueiredo and Jain, 2002).
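For orientation, the two classical baselines named above can be sketched in a few lines. This is a minimal illustration of the standard AIC and BIC definitions only. It does not reproduce BICr or CL-AIC, whose formulas are not given in this abstract. The function names, the candidate log-likelihoods and the sample size are hypothetical values chosen for illustration, and the parameter counts assume a 2-D Gaussian mixture with full covariances.

```python
import math


def aic(log_likelihood, num_params):
    """Standard AIC: -2 log L + 2k (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * num_params


def bic(log_likelihood, num_params, num_samples):
    """Standard BIC: -2 log L + k log N (lower is better)."""
    return -2.0 * log_likelihood + num_params * math.log(num_samples)


def select_order(candidates, num_samples, criterion="bic"):
    """Return the model order with the lowest criterion value.

    candidates maps a model order to (maximised log-likelihood, parameter count).
    """
    def score(item):
        log_likelihood, num_params = item[1]
        if criterion == "bic":
            return bic(log_likelihood, num_params, num_samples)
        return aic(log_likelihood, num_params)

    return min(candidates.items(), key=score)[0]


# Hypothetical fits of 1, 2 and 3 mixture components to N = 50 samples.
# Parameter counts assume 2-D Gaussians with full covariance: 6K - 1.
candidates = {1: (-520.0, 5), 2: (-500.0, 11), 3: (-492.0, 17)}
num_samples = 50

order_by_aic = select_order(candidates, num_samples, criterion="aic")
order_by_bic = select_order(candidates, num_samples, criterion="bic")
```

With these illustrative numbers the two criteria disagree: the BIC penalty grows with log N while the AIC penalty does not, so the two can select different orders from the same fits, and the gap between their penalties changes with sample size.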

[1]  Robert H. Shumway, et al. Improved estimators of Kullback-Leibler information for autoregressive model selection in small samples, 1990.

[2]  Gérard Govaert, et al. Assessing a Mixture Model for Clustering with the Integrated Completed Likelihood, 2000, IEEE Trans. Pattern Anal. Mach. Intell.

[3]  Jorma Rissanen, et al. Stochastic Complexity in Statistical Inquiry, 1989, World Scientific Series in Computer Science.

[4]  Stephen J. McKenna, et al. Learning spatial context from tracking using penalised likelihoods, 2004, ICPR 2004.

[5]  D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[6]  Yoshua Bengio, et al. Model Selection for Small Sample Regression, 2002, Machine Learning.

[7]  William D. Penny, et al. Bayesian Approaches to Gaussian Mixture Modeling, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[8]  Solomon Kullback, et al. Information Theory and Statistics, 1970, The Mathematical Gazette.

[9]  Takeo Kanade, et al. Recognizing Action Units for Facial Expression Analysis, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[10]  Christopher M. Bishop, et al. Mixtures of Probabilistic Principal Component Analyzers, 1999, Neural Computation.

[11]  Anil K. Jain, et al. Unsupervised Learning of Finite Mixture Models, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[12]  Shaogang Gong, et al. Autonomous Visual Events Detection and Classification without Explicit Object-Centred Segmentation and Tracking, 2002, BMVC.

[13]  L. Wasserman, et al. Computing Bayes Factors by Combining Simulation and Asymptotic Approximations, 1997.

[14]  H. Akaike, et al. Information Theory and an Extension of the Maximum Likelihood Principle, 1973.

[15]  G. Schwarz. Estimating the Dimension of a Model, 1978.

[16]  Timothy F. Cootes, et al. Active Appearance Models, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[17]  S. Kullback, et al. Information Theory and Statistics, 1959.

[18]  A. Raftery. Bayesian Model Selection in Social Research, 1995.