Data envelopment analysis and data mining to efficiency estimation and evaluation

Purpose: This paper assesses the application of seven statistical and data mining techniques to second-stage data envelopment analysis (DEA) of bank performance.

Design/methodology/approach: Different statistical and data mining techniques are applied in the second stage of DEA, as part of an attempt to build a model of bank performance with effective predictive ability. The data mining tools considered are classification and regression trees (CART), conditional inference trees (CIT), random forests based on CART and on CIT, bagging, and artificial neural networks, together with their statistical counterpart, logistic regression.

Findings: The results show that random forests and bagging outperform the other methods in terms of predictive power.

Originality/value: This is the first study to assess the impact of environmental factors on banking performance in Middle East and North Africa countries.
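The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the six banks and all their values are hypothetical, DEA is reduced to the single-input, single-output case (where the constant-returns efficiency score is each unit's output/input ratio divided by the best ratio, so no linear program is needed), the 0.75 cut-off for labelling a bank "efficient" is arbitrary, and bagged one-split decision stumps stand in for the tree ensembles named in the abstract.

```python
import random

def ccr_efficiency(inputs, outputs):
    """Stage 1: efficiency scores for the single-input, single-output case.
    With several inputs or outputs a linear program per unit is needed instead."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

def fit_stump(xs, ys):
    """Fit a one-split tree: choose the threshold and direction with fewest errors."""
    best = None
    for t in sorted(set(xs)):
        for above in (0, 1):  # label predicted when x > t
            errors = sum((above if x > t else 1 - above) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, above)
    return best[1], best[2]

def predict_stump(stump, x):
    t, above = stump
    return above if x > t else 1 - above

def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Stage 2: bagging, i.e. one stump per bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Majority vote over the bagged stumps."""
    votes = sum(predict_stump(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical data: operating cost (input), loans (output), and one
# environmental factor (say, GDP growth in %) for six banks.
costs = [10, 20, 30, 40, 50, 60]
loans = [10, 18, 24, 20, 25, 24]
gdp_growth = [5.0, 4.5, 4.0, 2.0, 1.5, 1.0]

scores = ccr_efficiency(costs, loans)
labels = [1 if s >= 0.75 else 0 for s in scores]  # arbitrary cut-off
models = bagged_stumps(gdp_growth, labels)

print([round(s, 2) for s in scores])
print(predict_bagged(models, 4.8), predict_bagged(models, 0.5))
```

A study of this kind would replace the ratio shortcut with a full DEA model and compare several classifiers on held-out data. The sketch only shows how first-stage efficiency scores become the target of a second-stage predictive model driven by environmental factors.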

[1]  J. Morgan,et al.  Problems in the Analysis of Survey Data, and a Proposal , 1963 .

[2]  Abraham Charnes,et al.  Measuring the efficiency of decision making units , 1978 .

[3]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[4]  A. Charnes,et al.  Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis , 1984 .

[5]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[6]  Geoffrey E. Hinton,et al.  Learning representations by back-propagation errors, nature , 1986 .

[7]  Geoffrey E. Hinton,et al.  Learning representations of back-propagation errors , 1986 .

[8]  Subhash C. Ray,et al.  Data envelopment analysis, nondiscretionary inputs and efficiency: an alternative interpretation , 1988 .

[9]  David W. Hosmer,et al.  Applied Logistic Regression , 1991 .

[10]  Subhash C. Ray,et al.  Resource-Use Efficiency in Public Schools: A Study of Connecticut Data , 1991 .

[11]  M. LeBlanc,et al.  Relative risk trees for censored survival data. , 1992, Biometrics.

[12]  Allen N. Berger,et al.  Efficiency of financial institutions: International survey and directions for future research , 1997 .

[13]  Emmanuel Thanassoulis,et al.  Introduction to the theory and application of data envelopment analysis , 2001 .

[14]  B. Thompson,et al.  Use of Structure Coefficients in Published Multiple Regression Articles: β is not Enough , 2001 .

[15]  Emmanuel Thanassoulis,et al.  Introduction to the Theory and Application of Data Envelopment Analysis: A Foundation Text with Integrated Software , 2001 .

[16]  H. O. Fried,et al.  Accounting for Environmental Effects and Statistical Noise in Data Envelopment Analysis , 2002 .

[17]  D. Budescu,et al.  The dominance analysis approach for comparing predictors in multiple regression. , 2003, Psychological methods.

[18]  C. Ling,et al.  AUC: a Statistically Consistent and more Discriminating Measure than Accuracy , 2003, IJCAI.

[19]  B. Casu,et al.  A comparative study of efficiency in European banking , 2003 .

[20]  James M. LeBreton,et al.  History and Use of Relative Importance Indices in Organizational Research , 2004 .

[21]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[22]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[23]  C. Sutton Classification and Regression Trees, Bagging, and Boosting , 2005 .

[24]  K. Hornik,et al.  Unbiased Recursive Partitioning: A Conditional Inference Framework , 2006 .

[25]  K. Hornik,et al.  A Lego System for Conditional Inference , 2006 .

[26]  Achim Zeileis,et al.  Bias in random forest variable importance measures: Illustrations, sources and a solution , 2007, BMC Bioinformatics.

[27]  Amparo Alonso-Betanzos,et al.  A Very Fast Learning Method for Neural Networks Based on Sensitivity Analysis , 2006, J. Mach. Learn. Res..

[28]  YongSeog Kim,et al.  Toward a successful CRM: variable selection, sampling, and ensemble , 2006, Decis. Support Syst..

[29]  Denis Larocque,et al.  Discrete‐time survival trees , 2007 .

[30]  Jeewon Choi,et al.  A framework for benchmarking service process using data envelopment analysis and decision tree , 2007, Expert Syst. Appl..

[31]  U. Grömping Estimators of Relative Importance in Linear Regression Based on Variance Decomposition , 2007 .

[32]  Fotios Pasiouras,et al.  The Effect of Board Size and Composition on the Efficiency of UK Banks , 2008 .

[33]  M. Ariff,et al.  Cost and profit efficiency of Chinese banks: A non-parametric analysis , 2008 .

[34]  Hui Li,et al.  Data mining method for listed companies' financial distress prediction , 2008, Knowl. Based Syst..

[35]  Ali Emrouznejad,et al.  Evaluation of research in efficiency and productivity: A survey and analysis of the first 30 years , 2008 .

[36]  Andreas Burger,et al.  PRODUCTIVITY IN BANKS: MYTHS AND TRUTHS OF THE COST INCOME RATIO , 2008 .

[37]  I-Cheng Yeh,et al.  The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients , 2009, Expert Syst. Appl..

[38]  Denis Larocque,et al.  Discrete-time survival trees and forests with time-varying covariates , 2009 .

[39]  Fadzlan Sufian,et al.  Determinants of bank efficiency during unstable macroeconomic environment: Empirical evidence from Malaysia , 2009 .

[40]  N. Hermes,et al.  The impact of financial liberalization on bank efficiency: evidence from Latin America and Asia , 2007 .

[41]  Ali Emrouznejad,et al.  Data envelopment analysis with classification and regression tree – a case of banking efficiency , 2010, Expert Syst. J. Knowl. Eng..

[42]  Philip Molyneux,et al.  Total factor productivity and shareholder returns in banking. , 2010 .

[43]  A. Anouze Evaluating productive efficiency:comparative study of commercial banks in Gulf countries , 2010 .

[44]  Ali Emrouznejad,et al.  COOPER-framework: A unified process for non-parametric projects , 2010, Eur. J. Oper. Res..

[45]  Sarah M. Estelle,et al.  Three-Stage DEA Models for Incorporating Exogenous Inputs , 2009, Comput. Oper. Res..

[46]  Geraldo da Silva e Souza,et al.  Evolution of bank efficiency in Brazil: A DEA approach , 2010, Eur. J. Oper. Res..

[47]  Dan Zhu,et al.  A hybrid approach for efficient ensembles , 2010, Decis. Support Syst..

[48]  G. S. K. Niazi,et al.  Impact of financial reforms on efficiency of state-owned, private and foreign banks in Pakistan , 2010 .

[49]  Fotios Pasiouras,et al.  Assessing Bank Efficiency and Performance with Operational Research and Artificial Intelligence Techniques: A Survey , 2009, Eur. J. Oper. Res..

[50]  Kristiaan Kerstens,et al.  Bank Productivity and Performance Groups: A Decomposition Approach Based Upon the Luenberger Productivity Indicator , 2011, Eur. J. Oper. Res..

[51]  Shujie Yao,et al.  World Financial Crisis and Efficiency of Chinese Commercial Banks , 2011 .

[52]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..

[53]  Wei-Kang Wang,et al.  Designing a knowledge-based system for benchmarking: A DEA approach , 2011, Knowl. Based Syst..

[54]  A. George Assaf,et al.  Technical efficiency in Saudi banks , 2011, Expert Syst. Appl..

[55]  Tze San Ong,et al.  A comparison on efficiency of domestic and foreign banks in Malaysia: a DEA approach , 2011 .

[56]  P. Molyneux,et al.  Determinants of efficiency in South East Asian banking , 2011 .

[57]  Rasoul Rezvanian,et al.  Cost efficiency, technological progress and productivity growth of Chinese banking pre- and post-WTO accession , 2011 .

[58]  Reza Tavakkoli-Moghaddam,et al.  An integrated Data Envelopment Analysis-Artificial Neural Network-Rough Set Algorithm for assessment of personnel efficiency , 2011, Expert Syst. Appl..

[59]  J. Nankervis,et al.  Are there any cost and profit efficiency gains in financial conglomeration? Evidence from the accession countries , 2011 .

[60]  Ali Emrouznejad,et al.  Flexible measures in production process: A DEA-based approach , 2011, RAIRO Oper. Res..

[61]  Y. Wang,et al.  Bank holding company diversification and production efficiency , 2012 .

[62]  Evgeny A. Antipov,et al.  Mass Appraisal of Residential Apartments: An Application of Random Forest for Valuation and a CART-Based Approach for Model Diagnostics , 2010, Expert Syst. Appl..

[63]  Ming-Fu Hsu,et al.  Credit risk assessment and decision making by a fusion approach , 2012, Knowl. Based Syst..

[64]  Jonchi Shyu,et al.  Measuring the true managerial efficiency of bank branches in Taiwan: A three-stage DEA analysis , 2012, Expert Syst. Appl..

[65]  Chih-Ching Yang Service, investment, and risk management performance in commercial banks , 2012 .

[66]  Naveen Kumar,et al.  Data Mining for Business Intelligence–Concepts, Techniques, and Applications in Microsoft Office Excel® with XLMiner® , 2012 .

[67]  Kent Matthews,et al.  Efficiency convergence properties of Indonesian banks 1992–2007 , 2012 .

[68]  A. Akin,et al.  Managerial and Technical Inefficiencies of Foreign and Domestic Banks in Turkey During the 2008 Global Crisis , 2013 .

[69]  Alireza Amirteimoori,et al.  Production planning in data envelopment analysis without explicit inputs , 2013, RAIRO Oper. Res..

[70]  C. Girardone,et al.  Financial freedom and bank efficiency: Evidence from the European Union , 2013 .

[71]  K. Matthews Risk Management and Managerial Efficiency in Chinese Banks: A Network DEA Framework , 2011 .

[72]  Norazlina Abd. Wahab,et al.  Efficiency of Islamic banks during the financial crisis: An analysis of Middle Eastern and Asian countries , 2014 .

[73]  J. Retolaza,et al.  Efficiency in Spanish banking: A multistakeholder approach analysis , 2014 .

[74]  J. R. Sobrino,et al.  Main determinants of efficiency and implications on banking concentration in the European Union , 2014 .

[75]  Alireza Amirteimoori,et al.  A DEA model for two-stage parallel-series production processes , 2014, RAIRO Oper. Res..

[76]  Bora Aktan,et al.  Efficiency and risk in commercial banking: empirical evidence from East Asian countries , 2014 .

[77]  Ali Emrouznejad,et al.  Neural network DEA for measuring the efficiency of mutual funds , 2014, Int. J. Appl. Decis. Sci..

[78]  Xiaohui Hou,et al.  Market structure, risk taking, and the efficiency of Chinese commercial banks , 2014 .

[79]  Mehdi Toloo,et al.  Evaluation efficiency of large-scale data set with negative data: an artificial neural network approach , 2015, The Journal of Supercomputing.

[80]  Peter F. Wanke,et al.  Financial distress drivers in Brazilian banks: A dynamic slacks approach , 2015, Eur. J. Oper. Res..

[81]  R. Yadav,et al.  Technical Efficiency of Malaysia’s Development Financial Institutions: Application of Two-Stage DEA Analysis , 2015 .

[82]  Shaikh Hamzah Shaikh Abdul Razak,et al.  Efficiency assessment of banking sector in Yemen using data envelopment window analysis: A comparative analysis of Islamic and conventional banks , 2015 .

[83]  F. Sufian,et al.  Determinants of revenue efficiency of Islamic banks: Empirical evidence from the Southeast Asian countries , 2015 .

[84]  P. Singh,et al.  Dynamics of scale effficiency of Indian banks: A deterministic frontier approach , 2016 .

[85]  C. Barros,et al.  Predicting efficiency in Malaysian Islamic banks: A two-stage TOPSIS and neural networks approach , 2016 .

[86]  Maoguo Wu,et al.  Efficiency evaluation of banks in China: A dynamic two-stage slacks-based measure approach , 2016 .

[87]  Denis Larocque,et al.  An integrated approach of data envelopment analysis and boosted generalized linear mixed models for efficiency assessment , 2017, Ann. Oper. Res..

[88]  Maha Alandejani,et al.  Nonperforming loans in the GCC banking sectors: Does the Islamic finance matter? , 2017 .

[89]  Yong Tan,et al.  The Impacts of Risk-Taking Behaviour and Competition on Technical Efficiency: Evidence from the Chinese Banking Industry , 2017 .

[90]  Mohammad Dulal Miah,et al.  Efficiency and stability: A comparative study between islamic and conventional banks in GCC countries , 2017 .

[91]  Jonathan Crook,et al.  Dynamic Prediction of Financial Distress Using Malmquist DEA , 2017, Expert Syst. Appl..

[92]  T. L. Nguyen Diversification and bank efficiency in six ASEAN countries , 2018, Global Finance Journal.

[93]  Charalampos Stasinakis,et al.  Two-stage DEA-Truncated Regression: Application in banking efficiency and financial development , 2018, Expert Syst. Appl..

[94]  E. Alzate Modelos de mezclas Bernoulli con regresión logística: una aplicación en la valoración de carteras de crédito , 2020 .

[95]  Cardona Alzate,et al.  Predicción y selección de variables con bosques aleatorios en presencia de variables correlacionadas , 2020 .

[96]  Andrés Camilo,et al.  Evaluación de la eficiencia relativa de los sistemas de producción porcícolas del departamento de Cundinamarca, utilizando análisis envolvente de datos (DEA) , 2020 .