Multidimensional diffusion processes in dynamic online networks

We develop a dynamic matched-sample estimation algorithm that distinguishes peer influence from homophily in item adoption decisions in dynamic networks where numerous items diffuse simultaneously. We infer each agent's preferences by applying a machine learning algorithm to previous adoption decisions, and we match agents on those inferred preferences. We show that ignoring previous adoption decisions leads to significantly overestimating the role of peer influence in the diffusion of information, because influence-based contagion is confounded with diffusion driven by common preferences. Our machine-learning-based matching on preferences reduces the estimated relative effect of peer influence on item adoption decisions in this network significantly more than matching on earlier adoption decisions, as well as on other observable characteristics. We also show significant and intuitive heterogeneity in the relative effect of peer influence.
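The matching idea described above can be illustrated with a minimal sketch. It is not the paper's estimator: a truncated SVD stands in for the unspecified machine learning algorithm, matching is static nearest-neighbor matching with replacement, and the function names (`infer_preferences`, `match_on_preferences`, `adoption_ratio`) and the toy data are hypothetical.

```python
import numpy as np

def infer_preferences(past_adoptions, rank=2):
    """Embed agents with a truncated SVD of the agent-by-item matrix of
    past adoptions (an assumed stand-in for the paper's ML step)."""
    U, s, _ = np.linalg.svd(past_adoptions.astype(float), full_matrices=False)
    return U[:, :rank] * s[:rank]

def match_on_preferences(prefs, treated):
    """Pair each treated agent with the nearest untreated agent in
    preference space (Euclidean distance, matching with replacement)."""
    treated_idx = np.flatnonzero(treated)
    control_idx = np.flatnonzero(~treated)
    diffs = prefs[treated_idx, None, :] - prefs[None, control_idx, :]
    dist = np.linalg.norm(diffs, axis=2)
    return treated_idx, control_idx[dist.argmin(axis=1)]

def adoption_ratio(adopted, group_a, group_b):
    """Adoption rate of group_a divided by the adoption rate of group_b."""
    rate_b = adopted[group_b].mean()
    return adopted[group_a].mean() / rate_b if rate_b > 0 else float("inf")

# Toy data (hypothetical): agents 0-2 share one taste, agents 3-5 another.
past = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
# "Treated" agents have a friend who already adopted the focal item.
treated = np.array([True, True, False, False, False, False])
# Focal-item adoption follows taste only, so there is no true peer effect.
adopted = np.array([1, 1, 1, 0, 0, 0])

# Naive comparison: treated agents versus all untreated agents.
naive_ratio = adoption_ratio(adopted, np.flatnonzero(treated),
                             np.flatnonzero(~treated))

# Matched comparison: treated agents versus their preference-matched controls.
prefs = infer_preferences(past, rank=2)
treated_idx, matched_idx = match_on_preferences(prefs, treated)
matched_ratio = adoption_ratio(adopted, treated_idx, matched_idx)

print(naive_ratio)    # 4.0: homophily is mistaken for influence
print(matched_ratio)  # 1.0: the apparent effect vanishes after matching
```

In this toy setting the naive ratio attributes the shared taste of agents 0 to 2 to peer influence, while matching on inferred preferences compares treated agents only with an untreated agent of the same taste.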
