Data science and productivity: A bibliometric review of data science applications and approaches in productivity evaluations

Abstract: This paper provides a comprehensive review of the applications of data science techniques and methodologies in productivity evaluation. It combines a bibliometric analysis with an empirical review. The bibliometric analysis reviews and discusses the sources, authorship, and documents, supported by summary tables and figures. The empirical review classifies the corpus of 533 identified articles by the application area of the data science approach and by the primary methodology of each paper, and discusses in detail the most impactful and relevant papers in each methodological category. The objective is to provide an overview of the predominant current trends and patterns in data science and productivity, explore how the interplay between the two has manifested, and offer an outlook on future research directions.
