Tailoring density estimation via reproducing kernel moment matching

Moment matching is a popular means of parametric density estimation. We extend this technique to nonparametric estimation of mixture models. Our approach embeds distributions into a reproducing kernel Hilbert space and performs moment matching in that space. This allows us to tailor density estimators to a function class of interest, i.e., the class of functions whose expectations we would like to compute. We show that our density estimation approach is useful in applications such as message compression in graphical models and image classification and retrieval.
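
To make the idea of moment matching in a reproducing kernel Hilbert space concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a one-dimensional Gaussian kernel of width `s`, Gaussian mixture components with fixed, user-chosen centers and a shared width `t`, and a plain projected-gradient solver; the function names `embedding_terms`, `objective`, `project_simplex`, and `fit_weights` are hypothetical. Under these assumptions, the mixture weights are chosen to minimize the squared RKHS distance between the empirical mean embedding of the sample and the mean embedding of the mixture, subject to the weights lying on the probability simplex.

```python
import numpy as np

def embedding_terms(samples, centers, s, t):
    """Inner products of mean embeddings for 1-D components N(c_j, t^2)
    under the kernel k(x, y) = exp(-(x - y)^2 / (2 s^2)) (assumed setup)."""
    # Q[i, j] = E k(x, x') with x ~ N(c_i, t^2), x' ~ N(c_j, t^2)
    v2 = s**2 + 2 * t**2
    d_cc = centers[:, None] - centers[None, :]
    Q = s / np.sqrt(v2) * np.exp(-d_cc**2 / (2 * v2))
    # l[j] = (1/m) sum_i E k(x, x_i) with x ~ N(c_j, t^2)
    v1 = s**2 + t**2
    d_cx = centers[:, None] - samples[None, :]
    l = (s / np.sqrt(v1) * np.exp(-d_cx**2 / (2 * v1))).mean(axis=1)
    return Q, l

def objective(alpha, Q, l):
    """Squared RKHS distance between the mixture embedding and the
    empirical embedding, up to a constant that does not depend on alpha."""
    return alpha @ Q @ alpha - 2 * alpha @ l

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_weights(Q, l, n_iter=500):
    """Projected gradient descent on the quadratic objective."""
    alpha = np.full(len(l), 1.0 / len(l))        # start from uniform weights
    step = 1.0 / (2.0 * np.linalg.eigvalsh(Q)[-1])  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = 2 * Q @ alpha - 2 * l
        alpha = project_simplex(alpha - step * grad)
    return alpha
```

A possible usage, with synthetic data chosen purely for illustration: draw samples from two Gaussians, call `Q, l = embedding_terms(samples, centers, s=1.0, t=0.5)` with a few candidate `centers`, and pass the result to `fit_weights(Q, l)`. Changing the kernel changes which function class the matched expectations are tailored to; the Gaussian kernel here is only one convenient choice because its expectations under Gaussian components have a closed form.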
