Data envelopment analysis models for probabilistic classification

Abstract: We propose and test three probabilistic classification techniques based on data envelopment analysis (DEA). The first two techniques assume parametric inefficiency probability distributions, one exponential and one half-normal. The third technique is a hybrid of DEA and a probabilistic neural network. We test the proposed methods on simulated and real-world datasets and compare them with cost-sensitive support vector machines and with traditional probabilistic classifiers that minimize Bayesian misclassification cost risk. The results of our experiments indicate that the hybrid approach performs as well as or better than the other techniques when misclassification costs are asymmetric. The performance of DEA classifiers with an exponential inefficiency distribution is similar to or better than that of traditional probabilistic neural networks. We illustrate that there are certain classification problems where probabilistic DEA-based classifiers may provide superior performance compared to competing classification techniques.
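To make the baseline idea concrete, the following is a minimal sketch of a traditional probabilistic neural network style classifier that picks the class with the lowest expected misclassification cost. It is not the DEA-based method proposed in the paper. The function names (`parzen_density`, `min_cost_class`), the Gaussian kernel, the smoothing width `sigma`, the toy data, and the cost values are all assumptions chosen for illustration.

```python
import math

def parzen_density(x, samples, sigma):
    """Parzen-window estimate: average Gaussian kernel over one class's samples."""
    d = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / (len(samples) * norm)

def min_cost_class(x, class_samples, priors, cost, sigma=0.5):
    """Return the label with the lowest expected misclassification cost.

    cost[j][k] is the (assumed) cost of predicting k when the true class is j.
    """
    labels = sorted(class_samples)
    # Unnormalized posterior for each class: prior times estimated density.
    scores = {j: priors[j] * parzen_density(x, class_samples[j], sigma)
              for j in labels}
    # Expected cost of predicting k, summed over the other true classes.
    risk = {k: sum(cost[j][k] * scores[j] for j in labels if j != k)
            for k in labels}
    return min(labels, key=lambda k: risk[k])

# Hypothetical two-class toy data.
class_samples = {
    0: [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2)],
    1: [(2.0, 2.0), (2.1, 1.9), (1.8, 2.2)],
}
priors = {0: 0.5, 1: 0.5}
symmetric = {0: {0: 0.0, 1: 1.0}, 1: {0: 1.0, 1: 0.0}}
# Missing a true class-1 case is assumed to be 20 times as costly.
asymmetric = {0: {0: 0.0, 1: 1.0}, 1: {0: 20.0, 1: 0.0}}

point = (0.9, 0.9)
label_symmetric = min_cost_class(point, class_samples, priors, symmetric)
label_asymmetric = min_cost_class(point, class_samples, priors, asymmetric)
```

In this toy setup the point lies closer to the class 0 samples, so equal costs favor class 0, while the heavily asymmetric cost matrix shifts the decision to class 1. That shift is the kind of behavior the asymmetric-cost comparison in the abstract is concerned with.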
