Convex Nonparanormal Regression

Quantifying uncertainty in predictions or, more generally, estimating the posterior conditional distribution, is a core challenge in machine learning and statistics. We introduce Convex Nonparanormal Regression (CNR), a conditional nonparanormal approach to this task. CNR involves a convex optimization of a posterior defined via a rich dictionary of predefined nonlinear transformations of Gaussians. It can fit an arbitrary conditional distribution, including multimodal and asymmetric posteriors. For the special but powerful case of a piecewise-linear dictionary, we provide a closed form of the posterior mean, which can be used for pointwise predictions. Finally, we demonstrate the advantages of CNR over classical competitors using synthetic and real-world data.
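The nonparanormal construction behind CNR can be illustrated with a minimal sketch: a standard Gaussian variable pushed through a monotone piecewise-linear transform built from hinge ("ReLU") dictionary atoms. Nonnegative hinge weights keep the transform monotone, so it defines a valid, generally non-Gaussian density. The knots, weights, and bias below are illustrative assumptions, not the paper's fitted parameters.

```python
import numpy as np

# Hedged sketch (not the paper's exact model): map z ~ N(0, 1) through a
# monotone piecewise-linear transform g built from hinge dictionary atoms.
rng = np.random.default_rng(0)

knots = np.array([-1.5, 0.0, 1.5])    # hinge locations t_k (assumed)
weights = np.array([0.2, 2.5, 0.2])   # nonnegative w_k keep g monotone
bias = 0.5

def g(z):
    """g(z) = bias + z + sum_k w_k * max(z - t_k, 0): monotone, piecewise linear."""
    z = np.asarray(z, dtype=float)
    return bias + z + (weights * np.maximum(z[..., None] - knots, 0.0)).sum(axis=-1)

z = rng.standard_normal(100_000)
y = g(z)  # samples from the transformed (nonparanormal) law

# Monotonicity pushes quantiles through g: the median of y is g(0).
frac_above_median = float(np.mean(y > g(0.0)))
# The larger slopes above the knots stretch the right tail, so y is right-skewed
# even though z is symmetric.
skewness = float(((y - y.mean()) ** 3).mean() / y.std() ** 3)
print(frac_above_median, skewness)
```

Because the transform is monotone in the Gaussian variable, quantiles of the transformed law are obtained by applying the transform to Gaussian quantiles, which is what makes such dictionaries convenient for distributional regression.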
