Probabilistic graphical models in modern social network analysis

Abstract: The advent and availability of technology have brought us closer than ever through social networks. Consequently, there is a growing emphasis on mining social networks to extract information for knowledge and discovery. However, methods for social network analysis (SNA) have not kept pace with the data explosion. In this review, we describe directed and undirected probabilistic graphical models (PGMs) and highlight recent applications to social networks. PGMs represent a flexible class of models that can be adapted to address many of the current challenges in SNA. We motivate their use with simple and accessible examples that demonstrate the modeling and connect it to theory. In addition, recent applications in modern SNA are highlighted, including the estimation and quantification of importance, propagation of influence, trust (and distrust), link and profile prediction, privacy protection, and news spread through microblogging. Applications are selected to demonstrate the flexibility and predictive capabilities of PGMs in SNA. Finally, we conclude with a discussion of challenges and opportunities for PGMs in social networks.
