Variational learning of finite Dirichlet mixture models using component splitting

Finite Dirichlet mixture models have proved to be an effective tool for knowledge representation and inference in several machine learning and data mining applications. In this paper, we address the task of learning and selecting finite Dirichlet mixture models incrementally within a variational framework. We propose a learning algorithm based on component splitting and local model selection. The merits of the proposed approach are illustrated using synthetic data as well as challenging real-world applications involving object detection, text document clustering, and distinguishing photographic images from computer graphics.
