Local Versus Global Models for Classification Problems

It is generally argued that the predictive or decision-making steps in statistics are separate from the model-building or inferential steps. In many problems, however, predictive accuracy matters more in some parts of the data space than in others, and it is appropriate to aim for greater model effectiveness in those regions. If the relevant parts of the space depend on the use to which the model is to be put, then the best model will also depend on this intended use. We illustrate this point with examples from supervised classification.
