Investigating the indoor environmental quality of different workplaces through web-scraping and text-mining of Glassdoor reviews

ABSTRACT The analysis of occupants’ perception can improve building indoor environmental quality (IEQ). Going beyond conventional surveys, this study presents an innovative analysis of occupants’ feedback about the IEQ of different workplaces based on web-scraping and text-mining of online job reviews. A total of 1,158,706 job reviews posted on Glassdoor about 257 large organizations (with more than 10,000 employees) are scraped and analyzed. Of these reviews, 10,593 include complaints about at least one IEQ aspect. The analysis of this large body of feedback covering several workplaces is the first of its kind and leads to two main results: (1) IEQ complaints mostly arise in workplaces that are not office buildings, especially regarding poor thermal and indoor air quality conditions in warehouses, stores, kitchens, and trucks; (2) reviews containing IEQ complaints are more negative than reviews without IEQ complaints. The first result highlights the need for IEQ investigations beyond office buildings. The second result strengthens the evidence for the potential detrimental effect that uncomfortable IEQ conditions can have on job satisfaction. This study demonstrates the potential of User-Generated Content and text-mining techniques to analyze the IEQ of workplaces as an alternative to conventional surveys, for scientific and practical purposes.
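The two-step idea summarized in the abstract (flag reviews that mention an IEQ complaint, then compare them with the remaining reviews) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the keyword lists, the `cons` and `rating` field names, and the sample reviews are all hypothetical, and a simple comparison of mean ratings stands in for whatever statistical analysis the study actually applies.

```python
import re
from statistics import mean

# Hypothetical keyword lists (assumption): the study's own dictionary is not given here.
IEQ_KEYWORDS = {
    "thermal": ["too hot", "too cold", "freezing", "no air conditioning"],
    "air_quality": ["stuffy", "dusty", "bad smell", "poor ventilation"],
    "noise": ["too noisy", "too loud"],
    "lighting": ["too dark", "no windows", "poor lighting"],
}


def ieq_aspects(text):
    """Return the set of IEQ aspects whose keywords occur in the review text."""
    lowered = text.lower()
    found = set()
    for aspect, phrases in IEQ_KEYWORDS.items():
        for phrase in phrases:
            # Word boundaries avoid matching inside longer words.
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                found.add(aspect)
                break
    return found


def compare_ratings(reviews):
    """Return (mean rating with IEQ complaints, mean rating without)."""
    with_ieq = [r["rating"] for r in reviews if ieq_aspects(r["cons"])]
    without_ieq = [r["rating"] for r in reviews if not ieq_aspects(r["cons"])]
    return (
        mean(with_ieq) if with_ieq else None,
        mean(without_ieq) if without_ieq else None,
    )


# Made-up sample reviews (assumption), using hypothetical field names.
reviews = [
    {"cons": "Warehouse is freezing in winter and dusty.", "rating": 2},
    {"cons": "Long hours, low pay.", "rating": 3},
    {"cons": "Kitchen is too hot and too loud.", "rating": 1},
    {"cons": "Management could communicate better.", "rating": 4},
]

if __name__ == "__main__":
    for review in reviews:
        print(sorted(ieq_aspects(review["cons"])), review["rating"])
    print(compare_ratings(reviews))
```

Plain keyword matching of this kind misses negation and context ("never too hot"), so a real analysis would need a validated dictionary and manual checks on a sample of flagged reviews.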
