Evaluating models for classifying customers in retail banking collections

When establishing a repayment strategy for delinquent borrowers, it is useful to predict how they are likely to behave, so that resources can be used optimally. We examine two behavioural classifications (‘settle immediately’ versus ‘not settle immediately’, and ‘make some repayment’ versus ‘make no repayment’) and apply a variety of rules for predicting the class to which each customer is likely to belong. Since no such rule yields perfect predictions, how performance is evaluated is crucial in choosing a good rule, and hence in obtaining accurate predictions of future behaviour. We examine some popular standard performance evaluation criteria and show that they have major weaknesses. We then describe and illustrate an alternative measure that overcomes these weaknesses.
