Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures

The goal of constructing models from examples has been approached from different perspectives. Statistical methods have been widely used and have proved effective in generating accurate models. Finite Gaussian mixture models have been used to describe a wide variety of random phenomena and have played a prominent role in many attempts to develop expressive statistical models in machine learning. However, their effectiveness is limited to applications where the underlying modeling assumptions (e.g., that the per-component densities are Gaussian) are reasonably satisfied. Thus, much research effort has been devoted to developing better alternatives. In this paper, we focus on constructing statistical models from positive vectors (i.e., vectors whose elements are strictly greater than zero), for which the generalized inverted Dirichlet (GID) mixture has been shown to be a flexible and powerful parametric framework. In particular, we propose a Bayesian density estimation method based upon mixtures of GIDs. Bayesian learning is attractive in several respects: it takes uncertainty into account by introducing prior information about the parameters, it allows simultaneous parameter estimation and model selection, and it helps to overcome learning problems related to over- or under-fitting. Specifically, we develop a reversible jump Markov chain Monte Carlo sampler for GID mixtures, which we apply to simultaneous clustering and feature selection in the context of some challenging real-world applications concerning scene classification, action recognition, and video forgery detection.
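To make the modeling object concrete, the following is a minimal sketch of the log-density of a single GID component and of a finite GID mixture for one positive vector. It assumes the parameterization commonly used in the GID literature, with shape parameters alpha_d and beta_d, exponents gamma_d = alpha_d + beta_d - beta_{d+1}, and beta_{D+1} = 0; the abstract itself does not state this form. The function names `gid_logpdf` and `gid_mixture_logpdf` are hypothetical, and the sketch covers only density evaluation, not the reversible jump sampler.

```python
import math


def gid_logpdf(x, alpha, beta):
    """Log-density of one GID component at a positive vector x.

    Assumed form (not given in the abstract):
      p(x) = prod_d  Gamma(a_d + b_d) / (Gamma(a_d) Gamma(b_d))
                     * x_d^(a_d - 1) / (1 + sum_{l<=d} x_l)^gamma_d
      with gamma_d = a_d + b_d - b_{d+1} and b_{D+1} = 0.
    """
    if not (len(x) == len(alpha) == len(beta)):
        raise ValueError("x, alpha and beta must have the same length")
    if any(v <= 0 for v in x):
        raise ValueError("GID is defined only for strictly positive vectors")

    dim = len(x)
    logp = 0.0
    running = 1.0  # accumulates 1 + x_1 + ... + x_d
    for d in range(dim):
        running += x[d]
        beta_next = beta[d + 1] if d + 1 < dim else 0.0
        gamma_d = alpha[d] + beta[d] - beta_next
        logp += (
            math.lgamma(alpha[d] + beta[d])
            - math.lgamma(alpha[d])
            - math.lgamma(beta[d])
            + (alpha[d] - 1.0) * math.log(x[d])
            - gamma_d * math.log(running)
        )
    return logp


def gid_mixture_logpdf(x, weights, alphas, betas):
    """Log-density of a finite GID mixture, using log-sum-exp for stability."""
    terms = [
        math.log(w) + gid_logpdf(x, a, b)
        for w, a, b in zip(weights, alphas, betas)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

In one dimension this reduces to the beta prime (inverted beta) density, which gives a simple sanity check. A sampler such as the one described above would evaluate a quantity like this repeatedly while proposing changes to both the parameters and the number of components.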
