Topic Modeling as a Strategy of Inquiry in Organizational Research: A Tutorial With an Application Example on Organizational Culture

Research has emphasized the limitations of qualitative and quantitative approaches to studying organizational phenomena. For example, in-depth interviews are resource-intensive, while questionnaires with closed-ended questions can only measure predefined constructs. With the recent availability of large textual data sets and increased computational power, text mining has become an attractive method with the potential to mitigate some of these limitations. We therefore suggest applying topic modeling, a specific text mining technique, as a new and complementary strategy of inquiry for studying organizational phenomena. In particular, we outline the potential of structural topic modeling for organizational research and provide a step-by-step tutorial on how to apply it. Our application example builds on 428,492 reviews of Fortune 500 companies from the online platform Glassdoor, on which employees can evaluate organizations. We demonstrate how structural topic models allow researchers to inductively identify topics that matter to employees and to quantify the relationship of these topics with employees’ perception of organizational culture. We discuss the advantages and limitations of topic modeling as a research method and outline how future research can apply the technique to study organizational phenomena.
