A text mining framework for advancing sustainability indicators

Assessing and tracking sustainability indicators (SI) is challenging because studies are often expensive and time-consuming, the resulting indicators are difficult to track, and they usually have limited social input and acceptance, which is a critical element of sustainability. This work explores the feasibility of identifying, tracking, and reporting SI by analyzing unstructured digital news articles with text mining methods. Using San Mateo County, California, as a case study, a non-mutually exclusive supervised classification algorithm (one in which a single article may be assigned to several indicators at once) is combined with natural language processing techniques to analyze sustainability content in news articles, and the results are compared with SI reports created by Sustainable San Mateo County (SSMC) using traditional methods. The text mining approach identified all of the indicators highlighted as important in the reports. The method also shows potential for identifying region-specific SI and for providing insights into the underlying causes of sustainability problems.

Region-specific sustainability indicators were identified by analyzing news. Text mining allowed the tracking and reporting of the sustainability indicators. Text mining provided insights into sustainability issues in San Mateo County, CA. A document classification algorithm was developed to handle overlapping indicators. Chronic problems considered less newsworthy proved more difficult to track.
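To illustrate what "non-mutually exclusive" classification means in practice, the following is a minimal sketch, not the authors' algorithm. It assumes a one-vs-rest scheme in which one binary naive Bayes classifier is trained per indicator, so that an article can receive zero, one, or several indicator labels. The class name `OneVsRestNaiveBayes`, the indicator labels, and the four training sentences are all hypothetical and were invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class OneVsRestNaiveBayes:
    """Hypothetical multi-label classifier: one binary naive Bayes per label."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha   # Laplace smoothing constant
        self.models = {}     # label -> (word counts, document counts)
        self.vocab = set()

    def fit(self, docs, label_sets):
        tokens = [tokenize(d) for d in docs]
        self.vocab = {t for doc in tokens for t in doc}
        for label in set().union(*label_sets):
            # Split the corpus into "has this label" / "does not have it".
            counts = {True: Counter(), False: Counter()}
            n_docs = {True: 0, False: 0}
            for doc, labels in zip(tokens, label_sets):
                key = label in labels
                counts[key].update(doc)
                n_docs[key] += 1
            self.models[label] = (counts, n_docs)
        return self

    def _log_score(self, tokens, counts, n_docs, key):
        total_docs = n_docs[True] + n_docs[False]
        # Smoothed log prior for the class (label present or absent).
        score = math.log((n_docs[key] + self.alpha) / (total_docs + 2 * self.alpha))
        denom = sum(counts[key].values()) + self.alpha * len(self.vocab)
        for t in tokens:
            if t in self.vocab:  # words never seen in training are ignored
                score += math.log((counts[key][t] + self.alpha) / denom)
        return score

    def predict(self, doc):
        """Return the set of all labels whose binary classifier fires."""
        tokens = tokenize(doc)
        predicted = set()
        for label, (counts, n_docs) in self.models.items():
            present = self._log_score(tokens, counts, n_docs, True)
            absent = self._log_score(tokens, counts, n_docs, False)
            if present > absent:
                predicted.add(label)
        return predicted


# Invented toy corpus; each article may carry several indicator labels.
train_docs = [
    "traffic congestion on the highway worsens commute",
    "new housing development raises rent prices",
    "housing near transit stations reduces traffic and commute times",
    "drought lowers reservoir water supply",
]
train_labels = [
    {"transportation"},
    {"housing"},
    {"housing", "transportation"},
    {"water"},
]

clf = OneVsRestNaiveBayes().fit(train_docs, train_labels)
print(clf.predict("housing near transit reduces commute traffic"))
```

Because each label is decided independently, overlapping indicators do not compete for the same article, which is the property the abstract describes. A real study would need a far larger labeled corpus and proper evaluation; this toy data only demonstrates the mechanics.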
