Nonparametric Bayesian models for machine learning

This thesis presents general techniques for inference in various nonparametric Bayesian models, furthers our understanding of the stochastic processes at the core of these models, and develops new models of data based on these findings. In particular, we develop new Monte Carlo algorithms for Dirichlet process mixtures based on a general framework. We extend the vocabulary of processes used for nonparametric Bayesian models by proving many properties of beta and gamma processes. Specifically, we show how to perform probabilistic inference in hierarchies of beta and gamma processes, and how this naturally leads to improvements to the well-known naive Bayes algorithm. We demonstrate the robustness and speed of the resulting methods by applying them to a classification task with 1 million training samples and 40,000 classes.
