High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation

The ratio between two probability density functions is an important component of various tasks, including selection bias correction, novelty detection and classification. Recently, several estimators of this ratio have been proposed. Most of these methods fail if the sample space is high-dimensional, and hence require a dimension reduction step, which can result in a significant loss of information. Here we propose a simple-to-implement, fully nonparametric density ratio estimator that expands the ratio in terms of the eigenfunctions of a kernel-based operator; these functions reflect the underlying geometry of the data (e.g., submanifold structure), often leading to better estimates without an explicit dimension reduction step. We show how our general framework can be extended to address another important problem, the estimation of a likelihood function in situations where that function cannot be well approximated by an analytical form. One is often faced with this situation when performing statistical inference with data from the sciences, due to the complexity of the data and of the processes that generated those data. We emphasize applications where using existing likelihood-free methods of inference would be challenging due to the high dimensionality of the sample space, but where our spectral series method yields a reasonable estimate of the likelihood function. We provide theoretical guarantees and illustrate the effectiveness of our proposed method with numerical experiments.

Appearing in Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS) 2014, Reykjavik, Iceland. JMLR: W&CP volume 33. Copyright 2014 by the authors.
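
The spectral series expansion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the construction, so everything specific here is an assumption: a Gaussian kernel, eigenfunctions estimated from the Gram matrix of the denominator sample and extended to new points with the Nyström formula, and coefficients estimated as numerator-sample averages of each basis function. The names `gaussian_kernel`, `fit_spectral_ratio`, `n_basis` and `bandwidth` are hypothetical and not taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def fit_spectral_ratio(x_f, x_g, n_basis, bandwidth):
    """Estimate beta(x) = g(x) / f(x) from samples x_f ~ f and x_g ~ g.

    Assumed construction (not confirmed by the abstract):
      psi_j(x) = sqrt(n) / l_j * sum_k v_j[k] * K(x, x_f[k])   (Nystrom)
      beta_j   = mean of psi_j over the numerator sample x_g
      beta(x)  = sum_j beta_j * psi_j(x), clipped at zero
    """
    n = x_f.shape[0]
    gram = gaussian_kernel(x_f, x_f, bandwidth)
    eigvals, eigvecs = np.linalg.eigh(gram)  # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_basis]  # keep the largest ones
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]

    def basis(x):
        # Nystrom extension; orthonormal w.r.t. the empirical measure of x_f.
        return np.sqrt(n) * gaussian_kernel(x, x_f, bandwidth) @ eigvecs / eigvals

    # Since beta_j = E_f[beta * psi_j] = E_g[psi_j], average over x_g.
    coefs = basis(x_g).mean(axis=0)

    def ratio(x):
        # A density ratio is non-negative, so negative values are clipped.
        return np.maximum(basis(x) @ coefs, 0.0)

    return ratio, basis


# Toy usage: two Gaussian samples in 2 dimensions with shifted means.
rng = np.random.default_rng(0)
x_f = rng.normal(loc=0.0, scale=1.0, size=(300, 2))  # denominator sample
x_g = rng.normal(loc=0.5, scale=1.0, size=(200, 2))  # numerator sample
ratio, basis = fit_spectral_ratio(x_f, x_g, n_basis=5, bandwidth=1.0)
estimates = ratio(x_g)
```

In this sketch `n_basis` and `bandwidth` are fixed by hand; in practice both would be tuning parameters, typically chosen by minimizing an estimated loss on held-out data.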