Mining Marketing Meaning from Online Chatter: Strategic Brand Analysis of Big Data Using Latent Dirichlet Allocation

Online chatter, or user-generated content, is an emerging source from which marketers can mine meaning at a high temporal frequency. This article posits that mining this meaning consists of extracting the key latent dimensions of consumer satisfaction with quality and ascertaining the valence, labels, validity, importance, dynamics, and heterogeneity of those dimensions. The authors propose a unified framework for this purpose using unsupervised latent Dirichlet allocation. The sample of user-generated content consists of product reviews across 15 firms in five markets over four years. The results suggest that a few dimensions with good face validity and external validity are enough to capture quality. Dynamic analysis enables marketers to track the importance of each dimension and to map competitive brand positions on those dimensions over time. For vertically differentiated markets (e.g., mobile phones, computers), objective dimensions dominate and are similar across markets, heterogeneity is low across dimensions, and stability is high over time. For horizontally differentiated markets (e.g., shoes, toys), subjective dimensions dominate but vary across markets, heterogeneity is high across dimensions, and stability is low over time.
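
The abstract names unsupervised latent Dirichlet allocation but gives no implementation details. The following is a minimal sketch of plain LDA fitted by collapsed Gibbs sampling, a standard inference method chosen here for illustration and not necessarily the authors' procedure. The function name `fit_lda`, the hyperparameter values, and the toy reviews are all assumptions; the sketch does not cover the valence, labeling, or dynamic parts of the framework.

```python
import random


def fit_lda(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit plain LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (vocab, theta, phi): theta[d][k] is the estimated share of
    topic k in document d, phi[k][v] the estimated probability of word
    vocab[v] under topic k. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return vocab, theta, phi


# Toy, made-up reviews (not data from the article).
reviews = [
    "battery life short battery drains".split(),
    "screen bright screen sharp display".split(),
    "battery charge slow battery life".split(),
    "display sharp screen colors bright".split(),
]
vocab, theta, phi = fit_lda(reviews, n_topics=2)
for k, row in enumerate(phi):
    top_words = [w for _, w in sorted(zip(row, vocab), reverse=True)[:3]]
    print(f"dimension {k}: {top_words}")
```

In this sketch each row of `phi` plays the role of one latent dimension, and the highest-probability words are what an analyst would inspect to label it. Tracking the average of `theta` per brand and time period is one simple way to approximate the kind of dynamic mapping the abstract describes.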
