Aggregation of inputs and outputs prior to Data Envelopment Analysis under big data

The main goal of this paper is to explore possible solutions to a ‘big data’ problem: input–output data of very large dimension. In particular, we focus on cases where a severe ‘curse of dimensionality’ requires dimension reduction prior to using Data Envelopment Analysis. To this end, we present some theoretical grounds and perform a simulation study, new to the literature, in which we explore price-based aggregation as a way to address the problem of very large dimensions.
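As a rough illustration of the idea of price-based aggregation, the following minimal sketch collapses many inputs into a single cost aggregate and many outputs into a single revenue aggregate before computing efficiency scores. It is an assumption-laden toy and not the paper's simulation design: the data are randomly generated, the price vectors `w` and `p` are hypothetical and common to all units, and the helper names `price_aggregate` and `crs_efficiency_1x1` are invented here. With one aggregate input and one aggregate output, the constant-returns-to-scale DEA score reduces to each unit's output-to-input ratio divided by the best ratio in the sample, so no linear-programming solver is needed.

```python
import numpy as np


def price_aggregate(quantities, prices):
    """Collapse many columns into one price-weighted aggregate per unit."""
    quantities = np.asarray(quantities, dtype=float)
    prices = np.asarray(prices, dtype=float)
    return quantities @ prices


def crs_efficiency_1x1(agg_input, agg_output):
    """CRS DEA score for the special case of one input and one output.

    In this case the score is the unit's output/input ratio relative to
    the largest ratio observed in the sample.
    """
    productivity = agg_output / agg_input
    return productivity / productivity.max()


# Hypothetical data: 50 units, 200 inputs, 100 outputs (far too many
# dimensions for DEA on the raw data to discriminate between units).
rng = np.random.default_rng(0)
n_units, n_inputs, n_outputs = 50, 200, 100
X = rng.uniform(1.0, 10.0, size=(n_units, n_inputs))
Y = rng.uniform(1.0, 10.0, size=(n_units, n_outputs))
w = rng.uniform(0.5, 2.0, size=n_inputs)   # assumed input prices
p = rng.uniform(0.5, 2.0, size=n_outputs)  # assumed output prices

cost = price_aggregate(X, w)       # one aggregate input per unit
revenue = price_aggregate(Y, p)    # one aggregate output per unit
scores = crs_efficiency_1x1(cost, revenue)
```

Here `scores` holds one value per unit in the interval (0, 1], with the best-performing unit scoring exactly 1. Aggregating to fewer than all dimensions (for example, a handful of price-weighted groups) would require a general DEA linear program rather than this ratio shortcut.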
