Measuring Influence in Twitter Ecosystems using a Counting Process Modeling Framework

Data extracted from social media platforms, such as Twitter, are both large in scale and complex in nature, since they contain both unstructured text and structured data, such as time stamps and interactions between users. A key question for such platforms is how to identify influential users, in the sense that they generate interactions among members of the platform. Common measures, used both in the academic literature and by companies that provide analytics services, are variants of the popular web-search PageRank algorithm applied to networks that capture connections between users. In this work, we develop a modeling framework using multivariate interacting counting processes to capture the detailed actions that users undertake on such platforms, namely posting original content, reposting other users' postings, and mentioning other users. Based on the proposed model, we also derive a novel influence measure. We discuss estimation of the model parameters through maximum likelihood and establish the asymptotic properties of the resulting estimators. The proposed model and the accompanying influence measure are illustrated on a data set covering a five-year period of the Twitter actions of the members of the US Senate, as well as mainstream news organizations and media personalities.
