Graphical tools for model selection in generalized linear models

Model selection techniques have existed for many years; however, simple, clear and effective methods for visualizing the model building process remain scarce. This article describes graphical methods that assist in selecting models and in comparing many different selection criteria. Specifically, we describe, for logistic regression, how to visualize measures of description loss and of model complexity to help resolve the model selection dilemma. We advocate the use of the bootstrap to assess the stability of selected models and to enhance our graphical tools. We demonstrate which variables are important using variable inclusion plots and show that these plots can be invaluable in the model building process. Using two case studies, we show how the proposed tools help to identify important variables in the data and to understand the model building process.
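To make the idea behind a variable inclusion plot concrete, the following is a minimal sketch and not the article's implementation. It assumes the deviance as the measure of description loss, the number of fitted parameters as the measure of model complexity, an exhaustive search over all subsets, and the ordinary nonparametric bootstrap. The helper names (`simulate`, `fit_deviance`, `inclusion_frequencies`) and the synthetic data are hypothetical.

```python
import itertools
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for c in reversed(range(n)):
        x[c] = (M[c][n] - sum(M[c][k] * x[k] for k in range(c + 1, n))) / M[c][c]
    return x


def fit_deviance(X, y, cols, iters=25):
    """Fit a logistic regression on the chosen columns (plus an intercept)
    by Newton-Raphson and return the deviance, -2 * log-likelihood."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    p = len(Z[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        # Assumption: a tiny ridge on the Hessian only, to keep the solve stable.
        hess = [[1e-6 if a == b else 0.0 for b in range(p)] for a in range(p)]
        for z, yi in zip(Z, y):
            mu = sigmoid(sum(bj * zj for bj, zj in zip(beta, z)))
            w = mu * (1.0 - mu)
            for a in range(p):
                grad[a] += (yi - mu) * z[a]
                for b in range(p):
                    hess[a][b] += w * z[a] * z[b]
        step = solve(hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    loglik = 0.0
    for z, yi in zip(Z, y):
        eta = sum(bj * zj for bj, zj in zip(beta, z))
        loglik += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return -2.0 * loglik


def inclusion_frequencies(X, y, lambdas, n_boot=20, seed=0):
    """For each penalty multiplier, return the share of bootstrap resamples
    in which each variable appears in the selected model."""
    rng = random.Random(seed)
    n, k = len(y), len(X[0])
    subsets = [s for r in range(k + 1) for s in itertools.combinations(range(k), r)]
    counts = {lam: [0] * k for lam in lambdas}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        dev = {s: fit_deviance(Xb, yb, s) for s in subsets}
        for lam in lambdas:
            # Criterion: description loss + lam * model complexity.
            best = min(subsets, key=lambda s: dev[s] + lam * (len(s) + 1))
            for j in best:
                counts[lam][j] += 1
    return {lam: [c / n_boot for c in counts[lam]] for lam in lambdas}


def simulate(n, seed):
    """Hypothetical data: only the first of three predictors affects y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(1.5 * row[0]) else 0 for row in X]
    return X, y


X, y = simulate(100, seed=1)
freqs = inclusion_frequencies(X, y, lambdas=[0.0, 2.0, math.log(100), 1000.0])
for lam, f in freqs.items():
    print(round(lam, 2), f)
```

Plotting each variable's frequency against the penalty multiplier would give a curve per variable in the spirit of a variable inclusion plot. In this sketch, a multiplier of 2 corresponds to an AIC-type penalty and log(n) to a BIC-type penalty.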
