Scalable Inference of Customer Similarities from Interactions Data Using Dirichlet Processes

Under the sociological theory of homophily, people who are similar to one another are more likely to interact with one another. Marketers often have access to data on interactions among customers from which, with homophily as a guiding principle, inferences could be made about the underlying similarities. However, the number of potential pairwise interactions that need to be modeled grows quadratically with the size of the network. This scalability problem renders probability models of social interactions computationally infeasible for all but the smallest networks. In this paper, we develop a probabilistic framework for modeling customer interactions that is both grounded in the theory of homophily and flexible enough to account for random variation in who interacts with whom. In particular, we present a novel Bayesian nonparametric approach, using Dirichlet processes, to moderate the scalability problems that marketing researchers encounter when working with networked data. We find that this framework is a powerful way to draw insights into latent similarities of customers, and we discuss how marketers can apply these insights to segmentation and targeting activities.
