Converting High-Dimensional Regression to High-Dimensional Conditional Density Estimation

There is a growing demand for nonparametric conditional density estimators (CDEs) in fields such as astronomy and economics. In astronomy, for example, one can dramatically improve estimates of the parameters that dictate the evolution of the Universe by working with full conditional densities instead of regression (i.e., conditional mean) estimates. More generally, standard regression falls short in any prediction problem where the distribution of the response is complex, exhibiting multimodality, asymmetry, or heteroscedastic noise. Nevertheless, much of the work on high-dimensional inference concerns only regression and classification, whereas research on density estimation has lagged behind. Here we propose FlexCode, a fully nonparametric approach to conditional density estimation that reformulates CDE as a nonparametric orthogonal series problem in which the expansion coefficients are estimated by regression. With this approach, one can efficiently estimate conditional densities, and not just expectations, in high dimensions by drawing on the success of high-dimensional regression methods. Depending on the choice of regression procedure, our method can adapt to a variety of challenging high-dimensional settings with different structures in the data (e.g., a large number of irrelevant components or nonlinear manifold structure) as well as different data types (e.g., functional data, mixed data types, and sample sets). We study the theoretical and empirical performance of the proposed method, and we compare it with traditional conditional density estimators on simulated and real-world data, including photometric galaxy data, Twitter data, and line-of-sight velocities in a galaxy cluster.
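The core idea, an orthogonal series expansion in the response whose coefficients are obtained by regression on the covariates, can be illustrated with a minimal sketch. The specific choices here are assumptions made for illustration and are not stated in the abstract: a cosine basis on [0, 1], plain k-nearest-neighbor regression as the regression procedure, clipping and renormalization to obtain a valid density, and the hypothetical names `cosine_basis`, `knn_regress`, and `flexcode_sketch`. The simulated data have a bimodal response that depends on only one of three covariates.

```python
import numpy as np

def cosine_basis(z, n_basis):
    """Evaluate the first n_basis orthonormal cosine functions on [0, 1] (assumed basis)."""
    z = np.asarray(z, dtype=float).reshape(-1, 1)
    j = np.arange(n_basis).reshape(1, -1)
    basis = np.sqrt(2.0) * np.cos(np.pi * j * z)
    basis[:, 0] = 1.0  # phi_0 is the constant function
    return basis  # shape (len(z), n_basis)

def knn_regress(x_train, targets, x_query, k):
    """k-NN regression (assumed regressor): average the targets of the k closest points."""
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return targets[nearest].mean(axis=1)  # shape (n_query, n_targets)

def flexcode_sketch(x_train, z_train, x_query, z_grid, n_basis=15, k=100):
    """Estimate f(z | x) on z_grid for each row of x_query."""
    # Each coefficient beta_j(x) = E[phi_j(Z) | X = x] is a regression of phi_j(Z) on X.
    phi_train = cosine_basis(z_train, n_basis)
    beta = knn_regress(x_train, phi_train, x_query, k)
    # Series estimate: f(z | x) = sum_j beta_j(x) * phi_j(z).
    dens = beta @ cosine_basis(z_grid, n_basis).T
    # Assumed post-processing: remove negative values and renormalize on the grid.
    dens = np.clip(dens, 0.0, None)
    dz = z_grid[1] - z_grid[0]
    return dens / (dens.sum(axis=1, keepdims=True) * dz)

# Simulated example: Z is bimodal given X and depends only on the first covariate.
rng = np.random.default_rng(0)
n = 3000
x_train = rng.uniform(size=(n, 3))
sign = rng.choice([-1.0, 1.0], size=n)
z_train = 0.5 + sign * 0.25 * x_train[:, 0] + 0.03 * rng.normal(size=n)
z_train = np.clip(z_train, 0.0, 1.0)

z_grid = np.linspace(0.0, 1.0, 201)
x_query = np.array([[0.9, 0.5, 0.5]])
dens = flexcode_sketch(x_train, z_train, x_query, z_grid)
```

A regression estimate of the conditional mean at this query point would sit near 0.5, where the response has almost no mass; the estimated density instead places its mass around the two modes. Swapping `knn_regress` for another regression method is what would let the same scheme adapt to other data structures.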
