Analyzing Meaning in Big Data: Performing a Map Analysis Using Grammatical Parsing and Topic Modeling

Social scientists have recently begun to discuss text-mining tools as a fruitful way to scale up inductively grounded close reading. We aim to advance this discussion. Focusing on map analysis, we demonstrate the potential of text-mining tools for in-depth text analysis that is inductive yet still formal. We propose that combining text-mining tools that address different layers of meaning facilitates a closer analysis of the dynamics of manifest and latent meanings than is currently acknowledged. To illustrate our approach, we combine grammatical parsing and topic modeling to operationalize communication structures within sentences and the semantic surroundings of these structures. Using a reliable, downloadable software application, we analyze how these two layers of meaning interlace over time in 15,371 newspaper articles on corporate responsibility published in the United States from 1950 to 2013.
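
To make the idea of relating two layers of meaning over time concrete, the following is a minimal sketch, not the software application used in the study. It assumes that a grammatical parser has already reduced each sentence to a subject-verb-object triple and that a topic model has already assigned a dominant topic to the surrounding article. The records, topic labels, and the helper names `decade` and `cross_tabulate` are hypothetical and serve illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical pre-processed records (illustrative data, not from the study).
# "triple": a subject-verb-object structure, as a dependency parser might yield.
# "topic":  the dominant topic of the article, as a topic model might yield.
records = [
    {"year": 1955, "triple": ("corporation", "support", "community"), "topic": "philanthropy"},
    {"year": 1955, "triple": ("corporation", "support", "community"), "topic": "philanthropy"},
    {"year": 1998, "triple": ("corporation", "support", "community"), "topic": "shareholder_value"},
    {"year": 1998, "triple": ("investor", "pressure", "corporation"), "topic": "shareholder_value"},
]


def decade(year):
    """Map a year to the first year of its decade, e.g. 1955 -> 1950."""
    return year // 10 * 10


def cross_tabulate(records):
    """Count how often each triple occurs within each topic, per decade."""
    table = defaultdict(Counter)
    for record in records:
        key = (record["triple"], record["topic"])
        table[decade(record["year"])][key] += 1
    return table


table = cross_tabulate(records)
for dec in sorted(table):
    for (triple, topic), count in sorted(table[dec].items()):
        print(dec, triple, topic, count)
```

In this toy data the same communication structure (a corporation supporting a community) appears in different semantic surroundings in different decades, which is the kind of shift between manifest and latent meaning that the combined analysis is meant to surface.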
