Convex Regression with Interpretable Sharp Partitions

We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set.
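As a concrete illustration of the convex, block-partitioning fit described above, the sketch below sets up a CRISP-style objective for two covariates using the cvxpy modeling library. It is a minimal sketch under assumptions: the quantile binning into a q-by-q grid, the group penalty that fuses adjacent rows and columns of a mean matrix, and the names fit_crisp_like, q, and lam are illustrative choices, not the authors' implementation.

# Hedged sketch of a CRISP-style convex fit (assumed formulation, not the paper's code).
import numpy as np
import cvxpy as cp

def fit_crisp_like(x1, x2, y, q=4, lam=1.0):
    """Bin each covariate into q quantile bins, then fit a q-by-q mean matrix M
    whose adjacent rows/columns are encouraged to fuse, giving sharp rectangular blocks."""
    # Assign each observation to a grid cell via quantile bins (an assumption here).
    cuts1 = np.quantile(x1, np.linspace(0, 1, q + 1)[1:-1])
    cuts2 = np.quantile(x2, np.linspace(0, 1, q + 1)[1:-1])
    r = np.clip(np.searchsorted(cuts1, x1), 0, q - 1)
    c = np.clip(np.searchsorted(cuts2, x2), 0, q - 1)

    M = cp.Variable((q, q))
    # Squared-error fit of each observation to the mean of its grid cell
    # (built entry by entry for clarity; suitable only for small n).
    fitted = cp.hstack([M[r[i], c[i]] for i in range(len(y))])
    loss = 0.5 * cp.sum_squares(y - fitted)
    # Group penalty on differences of adjacent rows and columns: whole rows/columns
    # fuse together, which is what produces the interpretable sharp partition.
    pen = sum(cp.norm(M[i, :] - M[i + 1, :], 2) + cp.norm(M[:, i] - M[:, i + 1], 2)
              for i in range(q - 1))
    cp.Problem(cp.Minimize(loss + lam * pen)).solve()
    return M.value, (r, c)

In this sketch, larger values of lam fuse more adjacent rows and columns, so the fitted surface collapses into fewer, larger rectangular blocks; with lam near zero the fit is roughly the per-cell sample means for the occupied cells.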
