Measuring the Stability of Results From Supervised Statistical Learning

ABSTRACT Stability is a major requirement for drawing reliable conclusions when interpreting results from supervised statistical learning. In this article, we present a general framework for assessing and comparing the stability of results, which can be used in real-world statistical learning applications as well as in simulation and benchmark studies. We use the framework to show that stability is a property of both the algorithm and the data-generating process. In particular, we demonstrate that unstable algorithms (such as recursive partitioning) can produce stable results when the functional form of the relationship between the predictors and the response matches the algorithm. Typical uses of the framework in practical data analysis would be to compare the stability of results generated by different candidate algorithms for a dataset at hand, or to assess the stability of algorithms in a benchmark study. Code to perform the stability analyses is provided in the form of an R package. Supplementary material for this article is available online.
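The core idea described in the abstract can be sketched in a few lines: refit the same learner on pairs of resampled datasets and measure how similar the resulting predictions are on a grid of evaluation points. The snippet below is a minimal, self-contained illustration of that idea, not the paper's actual R package; the stump learner, the bootstrap pairing, and the mean-absolute-difference agreement measure are all illustrative placeholders chosen for brevity. It also mirrors the abstract's point that an "unstable" partitioning-style learner can yield stable results when the data-generating process matches its functional form: here the true regression function is a step function, which a one-split stump can represent exactly.

```python
# Sketch of a resampling-based stability assessment: fit a learner on pairs
# of bootstrap resamples and compare their predictions on evaluation points.
# Small average disagreement = stable results.
import random
import statistics

random.seed(1)

def fit_stump(xs, ys):
    """Fit a one-split regression stump: split at the median predictor
    value and predict the mean response on each side."""
    cut = statistics.median(xs)
    left = [y for x, y in zip(xs, ys) if x <= cut]
    right = [y for x, y in zip(xs, ys) if x > cut]
    lmean = statistics.mean(left) if left else statistics.mean(ys)
    rmean = statistics.mean(right) if right else statistics.mean(ys)
    return lambda x: lmean if x <= cut else rmean

def bootstrap(xs, ys):
    """Draw a bootstrap resample (sampling rows with replacement)."""
    idx = [random.randrange(len(xs)) for _ in range(len(xs))]
    return [xs[i] for i in idx], [ys[i] for i in idx]

# Data-generating process: a step function plus noise. Its functional form
# matches the stump learner, so the fitted results should be stable even
# though single trees are usually considered unstable.
xs = [random.uniform(0, 1) for _ in range(200)]
ys = [(1.0 if x > 0.5 else 0.0) + random.gauss(0, 0.1) for x in xs]

eval_points = [i / 50 for i in range(51)]
dists = []
for _ in range(25):
    f1 = fit_stump(*bootstrap(xs, ys))
    f2 = fit_stump(*bootstrap(xs, ys))
    # Mean absolute difference of predictions across the evaluation grid.
    dists.append(statistics.mean(abs(f1(x) - f2(x)) for x in eval_points))

instability = statistics.mean(dists)
print(round(instability, 3))
```

Repeating the same comparison for several candidate algorithms on the same dataset gives the kind of side-by-side stability assessment the abstract describes; swapping the step-function data for a smooth one would show the stump's instability reappear.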
