Online Learning of Hierarchical Pitman–Yor Process Mixture of Generalized Dirichlet Distributions With Feature Selection

This paper presents a novel statistical generative model based on a hierarchical Pitman–Yor process and generalized Dirichlet (GD) distributions. The properties of the GD distribution allow the model to perform clustering and feature selection jointly. We develop an online variational inference algorithm for the resulting model, formulated as the minimization of a Kullback-Leibler divergence, that tackles the problem of learning from high-dimensional examples. This variational Bayes formulation simultaneously estimates the parameters, determines the model's complexity, and selects the features relevant to the clustering structure. Moreover, the online learning algorithm processes data instances sequentially, which is critical for large-scale and real-time applications. Experiments on two challenging applications, scene recognition and video segmentation, in which our approach serves as an unsupervised technique for visual learning in high-dimensional spaces, show that the proposed approach is suitable and promising.
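
As a point of orientation for the Pitman–Yor process named above, the following is a minimal sketch of its standard truncated stick-breaking construction, which generates the mixing weights of an infinite mixture. It is not the paper's inference algorithm, and it covers only a single (non-hierarchical) process. The function name `pitman_yor_stick_breaking` and the parameter names `discount`, `concentration`, and `truncation` are assumptions chosen for illustration.

```python
import numpy as np


def pitman_yor_stick_breaking(discount, concentration, truncation, rng=None):
    """Draw truncated stick-breaking weights of a Pitman-Yor process.

    Illustrative sketch only (hypothetical helper, not from the paper).
    Uses V_k ~ Beta(1 - discount, concentration + k * discount) for
    k = 1..truncation and pi_k = V_k * prod_{j<k} (1 - V_j).
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")
    rng = np.random.default_rng() if rng is None else rng

    k = np.arange(1, truncation + 1)
    # One stick-breaking proportion per component.
    v = rng.beta(1.0 - discount, concentration + k * discount)
    # Length of stick remaining before each break: 1, (1-V_1), (1-V_1)(1-V_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


if __name__ == "__main__":
    weights = pitman_yor_stick_breaking(0.5, 1.0, 50, np.random.default_rng(0))
    # The truncated weights sum to at most 1; the shortfall is the mass
    # assigned to components beyond the truncation level.
    print(weights.sum())
```

Setting `discount` to 0 reduces the construction to the Dirichlet process; a larger discount gives the heavier-tailed, power-law distribution of cluster sizes that distinguishes the Pitman–Yor process.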
