Fast Distribution To Real Regression

We study the problem of distribution to real-value regression, where one aims to estimate a mapping $f$ that takes a distribution $P\in \mathcal{I}$ as its input covariate (for a non-parametric family of distributions $\mathcal{I}$) and outputs a real-valued response $Y=f(P) + \epsilon$. This setting was recently studied, and a "Kernel-Kernel" estimator was introduced and shown to have a polynomial rate of convergence. However, the cost of evaluating a new prediction with the Kernel-Kernel estimator scales as $\Omega(N)$, where $N$ is the number of training instances. This creates a difficult situation: a large amount of data may be necessary for low estimation risk, but the computational cost of estimation becomes infeasible when the dataset is too large. To address this, we propose the Double-Basis estimator, which alleviates this big-data problem in two ways. First, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances $N$ when evaluating new predictions after training. Second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings $f\in\mathcal{F}$.
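The computational contrast described above can be illustrated with a toy sketch. It is not the Kernel-Kernel or Double-Basis estimator itself: it assumes each input distribution has already been summarized by a hypothetical fixed-length vector of `d` numbers, and it compares a generic kernel smoother, whose prediction touches all $N$ training instances, with a ridge regression on `D` random cosine features, whose prediction cost depends only on `d` and `D`. All function names and parameters are illustrative assumptions.

```python
import numpy as np

def kernel_smoother_predict(x_new, X_train, y_train, bandwidth=1.0):
    """Generic Gaussian kernel smoother: cost grows linearly with N."""
    # One squared distance per training row, so every prediction is O(N * d).
    sq_dists = np.sum((X_train - x_new) ** 2, axis=1)
    weights = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return float(weights @ y_train / weights.sum())

def make_random_features(d, D, bandwidth=1.0, seed=0):
    """Return a map from d-dimensional inputs to D random cosine features."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / bandwidth, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)

    def features(X):
        return np.sqrt(2.0 / D) * np.cos(np.atleast_2d(X) @ W + b)

    return features

def fit_ridge(Z, y, lam=1e-3):
    """Ridge regression on the feature matrix Z (N x D); returns D weights."""
    D = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)

def feature_predict(x_new, features, theta):
    """Prediction uses only the D learned weights: O(d * D), no dependence on N."""
    return float((features(x_new) @ theta)[0])

# Toy data: N vector summaries standing in for N input distributions (assumption).
rng = np.random.default_rng(1)
N, d, D = 200, 3, 50
X = rng.normal(size=(N, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=N)

features = make_random_features(d, D)
theta = fit_ridge(features(X), y)

pred_smoother = kernel_smoother_predict(X[0], X, y)   # touches all N rows
pred_features = feature_predict(X[0], features, theta)  # touches only theta
```

After training, the feature-based predictor keeps only `W`, `b`, and `theta`, so the training set can be discarded; the kernel smoother must retain and scan all of `X` and `y` for each new input.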
