Mining social media with social theories: a survey

The increasing popularity of social media encourages more and more users to participate in various online activities and produces data at an unprecedented rate. Social media data is big, linked, noisy, highly unstructured, and incomplete, and thus differs from the data used in traditional data mining, which has given rise to a new research field: social media mining. Social theories from the social sciences help explain social phenomena. However, the scale and properties of social media data are very different from those of the data that social sciences use to develop social theories. As a new type of social data, social media data raises a fundamental question: can we apply social theories to social media data? Recent advances in computer science provide the necessary computational tools and techniques to verify social theories on large-scale social media data, and social theories have been applied to mining social media. In this article, we review some key social theories in mining social media, their verification approaches, interesting findings, and state-of-the-art algorithms. We also discuss some future directions in this active area of mining social media with social theories.
