Choosing Data-Mining Methods for Multiple Classification: Representational and Performance Measurement Implications for Decision Support

Data-mining techniques are designed for classification problems in which each observation is a member of one and only one category. We formulate ten data representations that could be used to extend those methods to problems in which observations may be full members of multiple categories. We propose an audit matrix methodology for evaluating the performance of three popular data-mining techniques (linear discriminant analysis, neural networks, and decision tree induction), using the representations that each technique can accommodate. We then empirically test our approach on an actual surgical data set. Tree induction gives the lowest rate of false positive predictions, and a version of discriminant analysis yields the lowest rate of false negatives for multiple-category problems, but neural networks give the best overall results for the largest multiple-classification cases. There is substantial room for improvement in overall performance for all techniques.
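To make the idea of multiple category membership and of false positive and false negative rates concrete, the following is a minimal sketch, not the paper's audit matrix methodology. It assumes a binary indicator (multi-hot) encoding, which may or may not be among the ten representations the paper formulates. The category names and the helper functions `multi_hot` and `error_rates` are hypothetical.

```python
from typing import Dict, List, Set


def multi_hot(labels: Set[str], categories: List[str]) -> List[int]:
    """Encode a set of category memberships as a 0/1 vector (assumed representation)."""
    return [1 if c in labels else 0 for c in categories]


def error_rates(actual: List[List[int]], predicted: List[List[int]]) -> Dict[str, float]:
    """Pool false positive and false negative rates over all (observation, category) cells."""
    fp = fn = positives = negatives = 0
    for a_row, p_row in zip(actual, predicted):
        for a, p in zip(a_row, p_row):
            if a == 1:
                positives += 1
                if p == 0:
                    fn += 1  # a true membership the classifier missed
            else:
                negatives += 1
                if p == 1:
                    fp += 1  # a membership predicted but not present
    return {
        "false_positive_rate": fp / negatives if negatives else 0.0,
        "false_negative_rate": fn / positives if positives else 0.0,
    }


# Hypothetical categories and observations, for illustration only.
categories = ["A", "B", "C"]
actual = [multi_hot(s, categories) for s in [{"A", "B"}, {"C"}]]
predicted = [multi_hot(s, categories) for s in [{"A"}, {"B", "C"}]]
rates = error_rates(actual, predicted)
```

Here the first observation is a full member of both A and B, so a prediction of A alone counts as one false negative, and the extra B predicted for the second observation counts as one false positive.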
