Bank efficiency assessment using a hybrid approach of random forests and data envelopment analysis

This study introduces a three-stage integrated framework that combines data envelopment analysis (DEA), random forests, and logistic regression to examine and predict the impact of environmental variables on bank performance. Applied to 151 banks in Middle Eastern and North African (MENA) countries over the period 2008-2010, the framework identified five important environmental variables and their effects on bank performance.
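As a rough illustration of the first (DEA) stage only, the sketch below scores banks in the special case of one input and one output under constant returns to scale, where the CCR efficiency score reduces to each bank's output/input ratio divided by the best ratio in the sample. The function name `ccr_efficiency_single`, the bank names, and all figures are hypothetical and not taken from the study. The binary efficient/inefficient label at the end is an assumption about how a later classification stage might consume the scores, not a description of the study's actual procedure.

```python
from typing import Dict, Tuple


def ccr_efficiency_single(banks: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Score banks given name -> (input, output), one input and one output.

    Under constant returns to scale with a single input and a single output,
    the CCR efficiency of a bank equals its output/input ratio divided by the
    largest ratio in the sample, so no linear program is needed here.
    """
    for name, (inp, out) in banks.items():
        if inp <= 0 or out < 0:
            raise ValueError(f"invalid input/output for {name}")
    ratios = {name: out / inp for name, (inp, out) in banks.items()}
    best = max(ratios.values())
    if best == 0:
        raise ValueError("at least one bank must have positive output")
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical data: (input, output) per bank, e.g. operating cost and loans.
banks = {
    "bank_a": (100.0, 80.0),
    "bank_b": (200.0, 120.0),
    "bank_c": (50.0, 45.0),
}

scores = ccr_efficiency_single(banks)

# Assumed hand-off to a later stage: 1 = on the frontier, 0 = inefficient.
labels = {name: int(score >= 1.0 - 1e-9) for name, score in scores.items()}

for name in banks:
    print(f"{name}: score={scores[name]:.3f}, efficient={labels[name]}")
```

With several inputs and outputs, the score would instead come from solving one linear program per bank, and the environmental variables would enter only in the later random forest and logistic regression stages.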
