Data mining and machine learning in the context of sustainable evaluation: a literature review

Measuring and evaluating the sustainable performance of an organization has become an important and challenging topic because it involves the economic, social and environmental dimensions, supports the development of policies, and is a strategic factor in the decision-making process. However, managers still have difficulty adequately assessing sustainability at the corporate level. In this context, data mining (DM) and machine learning (ML) offer techniques for extracting potentially useful information and generating knowledge. The purpose of this article is therefore to identify, by means of a literature review, the different approaches used to assist in the evaluation of sustainable performance. The review followed the method called Methodi Ordinatio, and the analysis used the software tools VOSviewer and RStudio. This methodological procedure identified 33 significant articles from the Web of Science, Scopus and Science Direct databases, with the analysis focusing mainly on the techniques applied. This study seeks to stimulate research on the use of DM and ML in the sustainability context, which is essential to supporting the Sustainable Development Goals.
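
As an illustration of the ranking step in Methodi Ordinatio, the sketch below scores papers by combining journal impact factor, publication year and citation count. The abstract does not state the equation, so the form used here, InOrdinatio = IF/1000 + alpha * (10 - (research year - publication year)) + citations, is an assumption based on the method as it is commonly described, and the weight `alpha`, the `Paper` class, the function names and the sample papers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record holding the three inputs the ranking needs."""
    title: str
    impact_factor: float
    year: int
    citations: int


def in_ordinatio(paper: Paper, research_year: int, alpha: float = 10.0) -> float:
    """Score a paper with the assumed InOrdinatio form.

    alpha is the weight the reviewer gives to recency (assumed range 1 to 10).
    """
    recency = 10 - (research_year - paper.year)
    return paper.impact_factor / 1000 + alpha * recency + paper.citations


def rank(papers: list[Paper], research_year: int, alpha: float = 10.0) -> list[Paper]:
    """Return papers ordered from highest to lowest score."""
    return sorted(
        papers,
        key=lambda p: in_ordinatio(p, research_year, alpha),
        reverse=True,
    )


# Made-up portfolio, for illustration only.
papers = [
    Paper("C", impact_factor=1.0, year=2005, citations=120),
    Paper("A", impact_factor=2.0, year=2015, citations=50),
    Paper("B", impact_factor=4.0, year=2018, citations=5),
]
ranked = rank(papers, research_year=2019)
for p in ranked:
    print(p.title, round(in_ordinatio(p, 2019), 3))
```

With a high `alpha`, recent papers with few citations can outrank older, heavily cited ones; lowering `alpha` shifts the weight back toward citation counts.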
