Toward versatile structural modification for Bayesian nonparametric time series models

Unsupervised learning techniques discover organizational structure in data, but to do so they must approach the problem with a priori assumptions. A fundamental trend in the development of these techniques has been the relaxation or elimination of the unwanted or arbitrary structural assumptions they impose. For systems that derive hidden Markov models (HMMs) from time series data, state-of-the-art techniques now assume only that the number of hidden states will be relatively small, a useful, flexible, and usually correct hypothesis. With unwanted structural constraints mitigated, we investigate a flexible means of introducing new, useful structural assumptions into an advanced HMM learning technique: assumptions that reflect details of our prior understanding of the problem. Our investigation, motivated by the unsupervised learning of view-based object models from video data, adapts a Bayesian nonparametric approach to inferring HMMs from data [1] so that it exhibits biases toward nearly block-diagonal transition dynamics and toward transitions between hidden states with similar emission models. We introduce aggressive Markov chain Monte Carlo sampling techniques for posterior inference in our generalized models, and demonstrate the technique in a collection of artificial and natural data settings, including the motivating object model learning problem.

[1] Jun S. Liu, et al. The Multiple-Try Method and Local Optimization in Metropolis Sampling, 2000.

[2] L. Baum, et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, 1970.

[3] Doris Y. Tsao, et al. A Cortical Region Consisting Entirely of Face-Selective Cells, 2006, Science.

[4] Tai Sing Lee, et al. Hierarchical Bayesian inference in the visual cortex, 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[5] D. Dunson, et al. Kernel stick-breaking processes, 2008, Biometrika.

[6] Benjamin B. Kimia, et al. A Similarity-Based Aspect-Graph Approach to 3D Object Recognition, 2004, International Journal of Computer Vision.

[7] Rajesh P. N. Rao, et al. Bilinear Sparse Coding for Invariant Vision, 2005, Neural Computation.

[8] D. Dunson, et al. The local Dirichlet process, 2011, Annals of the Institute of Statistical Mathematics.

[9] Eric P. Xing, et al. Dynamic Non-Parametric Mixture Models and the Recurrent Chinese Restaurant Process: with Applications to Evolutionary Clustering, 2008, SDM.

[10] Sven J. Dickinson, et al. Selecting canonical views for view-based 3-D object recognition, 2004, ICPR 2004.

[11] Philip S. Yu, et al. Evolutionary Clustering by Hierarchical Dirichlet Process with Hidden Markov State, 2008, 2008 Eighth IEEE International Conference on Data Mining.

[12] Thomas L. Griffiths, et al. Learning Systems of Concepts with an Infinite Relational Model, 2006, AAAI.

[13] A. Sokal, et al. Generalization of the Fortuin-Kasteleyn-Swendsen-Wang representation and Monte Carlo algorithm, 1988, Physical review. D, Particles and fields.

[14] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.

[15] Arnaud Doucet, et al. Generalized Polya Urn for Time-varying Dirichlet Process Mixtures, 2007, UAI.

[16] Manik Varma, et al. Learning The Discriminative Power-Invariance Trade-Off, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[17] Hossein Mobahi, et al. Deep learning from temporal coherence in video, 2009, ICML '09.

[18] D. I. Perrett, et al. Organization and functions of cells responsive to faces in the temporal cortex, 1992, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[19] Thomas Dean, et al. Learning invariant features using inertial priors, 2006, Annals of Mathematics and Artificial Intelligence.

[20] J. Ponce, et al. Segmenting, modeling, and matching video clips containing multiple moving objects, 2004, CVPR 2004.

[21] Bartlett W. Mel. SEEMORE: Combining Color, Shape, and Texture Histogramming in a Neurally Inspired Approach to Visual Object Recognition, 1997, Neural Computation.

[22] J. Koenderink, et al. The internal representation of solid shape with respect to vision, 1979, Biological Cybernetics.

[23] Petri Toiviainen, et al. MIDI toolbox: MATLAB tools for music research, 2004.

[24] Paul A. Viola, et al. Robust Real-time Object Detection, 2001.

[25] Shimon Ullman, et al. Image interpretation by a single bottom-up top-down cycle, 2008, Proceedings of the National Academy of Sciences.

[26] E. T. Rolls, et al. Invariant object recognition with trace learning and multiple stimuli present during training, 2007, Network.

[27] L. G. Craton, et al. The development of perceptual completion abilities: infants' perception of stationary, partially occluded objects, 1996, Child development.

[28] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[29] Scott Lindroth, et al. Dynamic Nonparametric Bayesian Models for Analysis of Music, 2010.

[30] Carl E. Rasmussen, et al. Factorial Hidden Markov Models, 1997.

[31] Andrew W. Moore, et al. Fast State Discovery for HMM Model Selection and Learning, 2007, AISTATS.

[32] Matthew J. Beal, et al. Gene Expression Time Course Clustering with Countably Infinite Hidden Markov Models, 2006, UAI.

[33] V. Hasselblad. Finite mixtures of distributions from the exponential family, 1969.

[34] Tai Sing Lee, et al. The Block Diagonal Infinite Hidden Markov Model, 2009, AISTATS.

[35] W. Metzger, et al. Laws of Seeing, 2006.

[36] Namunu Chinthaka Maddage. Automatic structure detection for popular music, 2006, IEEE Multimedia.

[37] Julian Eggert, et al. Learning viewpoint invariant object representations using a temporal coherence principle, 2005, Biological Cybernetics.

[38] Michael I. Jordan, et al. An HDP-HMM for systems with state persistence, 2008, ICML '08.

[39] G. Schwarz. Estimating the Dimension of a Model, 1978.

[40] Jason A. Duan, et al. Generalized spatial Dirichlet process models, 2007.

[41] N. Logothetis, et al. Psychophysical and physiological evidence for viewer-centered object representations in the primate, 1995, Cerebral cortex.

[42] Radford M. Neal, et al. A Split-Merge Markov chain Monte Carlo Procedure for the Dirichlet Process Mixture Model, 2004.

[43] David B. Dunson, et al. The matrix stick-breaking process for flexible multi-task learning, 2007, ICML '07.

[44] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[45] Eric T. Carlson, et al. A neural code for three-dimensional object shape in macaque inferotemporal cortex, 2008, Nature Neuroscience.

[46] Suzanna Becker, et al. Implicit Learning in 3D Object Recognition: The Importance of Temporal Context, 1999, Neural Computation.

[47] R. Baillargeon, et al. Object segregation in 8-month-old infants, 1997, Cognition.

[48] R. Vogels, et al. Inferotemporal neurons represent low-dimensional configurations of parameterized shapes, 2001, Nature Neuroscience.

[49] Silvio Savarese, et al. Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[50] N. Logothetis, et al. View-dependent object recognition by monkeys, 1994, Current Biology.

[51] S. Chib, et al. Understanding the Metropolis-Hastings Algorithm, 1995.

[52] Terrence J. Sejnowski, et al. Slow Feature Analysis: Unsupervised Learning of Invariances, 2002, Neural Computation.

[53] Tai Sing Lee, et al. Efficient belief propagation for higher-order cliques using linear constraint nodes, 2008, Comput. Vis. Image Underst.

[54] Michael I. Jordan, et al. Nonparametric Bayesian Learning of Switching Linear Dynamical Systems, 2008, NIPS.

[55] E. Zohary, et al. Inter-trial neuronal activity in inferior temporal cortex: a putative vehicle to generate long-term visual associations, 1998, Nature Neuroscience.

[56] Kunihiko Fukushima, et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980, Biological Cybernetics.

[57] K. Walker, et al. View-based active appearance models, 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[58] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.

[59] T. Poggio, et al. View-based models of 3D object recognition: invariance to imaging transformations, 1995, Cerebral cortex.

[60] J. Sethuraman. A Constructive Definition of Dirichlet Priors, 1991.

[61] Carl E. Rasmussen, et al. The Infinite Gaussian Mixture Model, 1999, NIPS.

[62] Peter Orbanz, et al. Construction of Nonparametric Bayesian Models from Parametric Bayes Equations, 2009, NIPS.

[63] D. Blackwell, et al. Ferguson Distributions Via Polya Urn Schemes, 1973.

[64] James Bailey, et al. Information theoretic measures for clusterings comparison: is a correction for chance necessary?, 2009, ICML '09.

[65] E. Parzen. On Estimation of a Probability Density Function and Mode, 1962.

[66] M. Escobar, et al. Markov Chain Sampling Methods for Dirichlet Process Mixture Models, 2000.

[67] Kevin W. Bowyer, et al. Aspect graphs: An introduction and survey of recent results, 1990, Int. J. Imaging Syst. Technol.

[68] Michael I. Jordan, et al. Hierarchical Dirichlet Processes, 2006.

[69] E. Rolls, et al. View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex, 1998, Cerebral cortex.

[70] David B. Dunson, et al. Multi-task learning for sequential data via iHMMs and the nested Dirichlet process, 2007, ICML '07.

[71] David B. Dunson, et al. The dynamic hierarchical Dirichlet process, 2008, ICML '08.

[72] Bodo Rosenhahn, et al. Three-Dimensional Shape Knowledge for Joint Image Segmentation and Pose Tracking, 2007, International Journal of Computer Vision.

[73] Joshua B. Tenenbaum, et al. Separating Style and Content with Bilinear Models, 2000, Neural Computation.

[74] Yee Whye Teh, et al. Spatial Normalized Gamma Processes, 2009, NIPS.

[75] A. Needham. Object recognition and object segregation in 4.5-month-old infants, 2001, Journal of experimental child psychology.

[76] Adrian Barbu, et al. Generalizing Swendsen-Wang to sampling arbitrary posterior probabilities, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[77] Silvio Savarese, et al. A multi-view probabilistic model for 3D object classes, 2009, CVPR.

[78] Y. Miyashita, et al. Neural organization for the long-term memory of paired associates, 1991, Nature.

[79] T. Poggio, et al. Hierarchical models of object recognition in cortex, 1999, Nature Neuroscience.

[80] Yee Whye Teh, et al. Beam sampling for the infinite hidden Markov model, 2008, ICML '08.

[81] P. Fua, et al. Pose estimation for category specific multiview object localization, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[82] Lancelot F. James, et al. Gibbs Sampling Methods for Stick-Breaking Priors, 2001.

[83] T. Ferguson. A Bayesian Analysis of Some Nonparametric Problems, 1973.

[84] Edmund T. Rolls, et al. Learning transform invariant object recognition in the visual system with multiple stimuli present during training, 2008, Neural Networks.

[85] Luc Van Gool, et al. Towards Multi-View Object Class Detection, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[86] D. Perrett, et al. Visual neurones responsive to faces in the monkey temporal cortex, 2004, Experimental Brain Research.

[87] Andrew Zisserman, et al. Video data mining using configurations of viewpoint invariant regions, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

[88] P. Földiák, et al. Learning Invariance from Transformation Sequences, 1991, Neural Computation.

[89] Andrew Zisserman, et al. Object Level Grouping for Video Shots, 2004, International Journal of Computer Vision.

[90] L. Hubert, et al. Comparing partitions, 1985.

[91] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[92] Leslie G. Ungerleider, et al. Object representations in the temporal cortex of monkeys and humans as revealed by functional magnetic resonance imaging, 2009, Journal of neurophysiology.

[93] Keiji Tanaka, et al. Inferotemporal cortex and object vision, 1996, Annual review of neuroscience.

[94] L. Wasserman. All of Nonparametric Statistics, 2005.

[95] Heinrich H. Bülthoff, et al. View-based dynamic object recognition based on human perception, 2002, Object recognition supported by user interaction for service robots.

[96] Perry R. Cook, et al. Data-Driven Recomposition using the Hierarchical Dirichlet Process Hidden Markov Model, 2009, ICMC.

[97] H. Akaike. A new look at the statistical model identification, 1974.

[98] Elizabeth S. Spelke, et al. Principles of Object Perception, 1990, Cogn. Sci.

[99] Lawrence Carin, et al. Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[100] Allen M. Waxman, et al. Adaptive 3-D Object Recognition from Multiple Views, 1992, IEEE Trans. Pattern Anal. Mach. Intell.

[101] Leslie Pack Kaelbling, et al. Automatic Class-Specific 3D Reconstruction from a Single Image, 2009.

[102] Rajesh P. N. Rao, et al. Learning the Lie Groups of Visual Invariance, 2007, Neural Computation.

[103] Thomas L. Griffiths, et al. Hierarchical Topic Models and the Nested Chinese Restaurant Process, 2003, NIPS.

[104] Bruno A. Olshausen, et al. Learning Transformational Invariants from Time-Varying Natural Images, 2008, NIPS 2008.

[105] Dileep George, et al. Towards a Mathematical Theory of Cortical Micro-circuits, 2009, PLoS Comput. Biol.