The Highly Adaptive Lasso Estimator

Estimation of a regression function is a common goal of statistical learning. We propose a novel nonparametric regression estimator that, in contrast to many existing methods, neither relies on local smoothness assumptions nor is constructed using local smoothing techniques. Instead, our estimator respects global smoothness constraints by virtue of falling in a class of right-hand continuous functions with left-hand limits whose variation norm is bounded by a constant. Using empirical process theory, we establish a fast minimal rate of convergence for the proposed estimator and illustrate how it can be constructed using standard software. In simulations, we show that the finite-sample performance of our estimator is competitive with other popular machine learning techniques across a variety of data-generating mechanisms. We also illustrate competitive performance in real data examples using several publicly available data sets.
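
As a rough illustration of how an estimator of this kind could be assembled from standard tools, the sketch below expands the covariates into indicator basis functions with knots at the observed points and fits an L1-penalized least-squares regression, so that the penalty on the coefficients bounds the variation norm of the fit. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`hal_basis`, `lasso_cd`, `hal_fit`, `hal_predict`), the plain coordinate-descent solver, and the fixed penalty `lam` are all hypothetical choices made for illustration. In practice the penalty would typically be chosen by cross-validation with an established lasso solver.

```python
from itertools import combinations

def hal_basis(X, knots):
    """Indicator basis: for each nonempty subset s of coordinates and each
    knot k, phi(x) = prod_{j in s} 1(x[j] >= k[j])."""
    d = len(knots[0])
    subsets = [s for r in range(1, d + 1) for s in combinations(range(d), r)]
    return [[1.0 if all(x[j] >= k[j] for j in s) else 0.0
             for s in subsets for k in knots] for x in X]

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return z - t if z > t else z + t if z < -t else 0.0

def lasso_cd(Phi, y, lam, n_iter=1000):
    """Minimize (1/2n)||y - b0 - Phi b||^2 + lam * ||b||_1 by cyclic
    coordinate descent; the intercept b0 is not penalized."""
    n, p = len(Phi), len(Phi[0])
    beta = [0.0] * p
    b0 = 0.0
    resid = list(y)
    col_sq = [sum(Phi[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        shift = sum(resid) / n            # intercept update
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(Phi[i][j] * resid[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Phi[i][j]
                beta[j] = new
    return b0, beta

def hal_fit(X, y, lam):
    """Fit with knots placed at the observed covariate rows (an assumption)."""
    knots = [tuple(x) for x in X]
    b0, beta = lasso_cd(hal_basis(X, knots), y, lam)
    return knots, b0, beta

def hal_predict(model, X_new):
    knots, b0, beta = model
    return [b0 + sum(b * v for b, v in zip(beta, row))
            for row in hal_basis(X_new, knots)]

# Usage: recover a step function, which no locally smooth fit represents exactly.
X = [(i / 9,) for i in range(10)]
y = [1.0 if x[0] >= 0.5 else 0.0 for x in X]
model = hal_fit(X, y, lam=0.01)
preds = hal_predict(model, [(0.1,), (0.9,)])
```

The number of basis functions grows as n(2^d - 1) for n observations and d covariates, so this brute-force construction is only practical for small problems. The sum of absolute coefficients returned by the fit plays the role of the variation-norm bound in this sketch.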
