Predicting Human Preferences Using the Block Structure of Complex Social Networks

As the volume of available data keeps growing, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a “new” computational social science. Here, we propose a novel approach based on stochastic block models, which were developed by sociologists as plausible models of complex networks of social interactions. Our approach is in the spirit of predicting individuals' preferences from the preferences of others but, rather than fitting a single model, we rely on a Bayesian approach that samples over the ensemble of all possible models. We show that our approach is considerably more accurate than leading recommender algorithms, with relative improvements of between 38% and 99% over industry-level algorithms. In addition, our approach sheds light on decision-making processes by identifying groups of individuals that have consistently similar preferences and by enabling the analysis of the characteristics of those groups.
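To make the idea of averaging over block models concrete, the following is a minimal sketch and not the authors' implementation. It assumes that users and items are each assigned to groups, that the rating distribution of a user-item pair depends only on the pair of groups it falls in, and that predictions are averaged over several candidate partitions. The function names, the Laplace-style smoothing, the toy data, and the equal weighting of partitions are all illustrative assumptions; in a full Bayesian treatment the partitions would be drawn from their posterior distribution, for example by Markov chain Monte Carlo.

```python
from collections import defaultdict


def block_rating_probs(ratings, user_group, item_group, rating_values):
    """Return a function giving the smoothed rating distribution of a (user, item) pair.

    The distribution depends only on the block (user group, item group).
    Add-one smoothing is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for (user, item), r in ratings.items():
        counts[(user_group[user], item_group[item])][r] += 1
    k = len(rating_values)

    def probs(user, item):
        block = counts[(user_group[user], item_group[item])]
        total = sum(block.values())
        return {r: (block[r] + 1) / (total + k) for r in rating_values}

    return probs


def predict(ratings, partitions, user, item, rating_values):
    """Average the block-level rating distribution over the given partitions.

    Each partition is a (user_group, item_group) pair of dicts. Equal weights
    are an assumption standing in for posterior sampling.
    """
    avg = {r: 0.0 for r in rating_values}
    for user_group, item_group in partitions:
        p = block_rating_probs(ratings, user_group, item_group, rating_values)(user, item)
        for r in rating_values:
            avg[r] += p[r] / len(partitions)
    return avg


# Hypothetical toy data: observed ratings keyed by (user, item).
ratings = {
    ("ann", "x"): 5, ("bob", "x"): 5,
    ("ann", "y"): 1, ("bob", "y"): 1, ("cat", "y"): 1,
}
rating_values = [1, 5]
items = {"x": 0, "y": 1}
partitions = [
    ({"ann": 0, "bob": 0, "cat": 0}, items),  # cat grouped with ann and bob
    ({"ann": 0, "bob": 0, "cat": 1}, items),  # cat in a group of her own
]

result = predict(ratings, partitions, "cat", "x", rating_values)
print(result)
```

In this toy case the first partition borrows information from ann and bob, who rated item x, while the second leaves cat's block empty and falls back to a uniform guess; the averaged prediction sits between the two.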
