Finding your friends and following them to where you are

Location plays an essential role in our lives, bridging our online and offline worlds. This paper explores the interplay between people's location, interactions, and their social ties within a large real-world dataset. We present and evaluate Flap, a system that solves two intimately related tasks: link and location prediction in online social networks. For link prediction, Flap infers social ties by considering patterns in friendship formation, the content of people's messages, and user location. We show that while each component is a weak predictor of friendship alone, combining them results in a strong model, accurately identifying the majority of friendships. For location prediction, Flap implements a scalable probabilistic model of human mobility, where we treat users with known GPS positions as noisy sensors of the location of their friends. We explore supervised and unsupervised learning scenarios, and focus on the efficiency of both learning and inference. We evaluate Flap on a large sample of highly active users from two distinct geographical areas and show that it (1) reconstructs the entire friendship graph with high accuracy even when no edges are given; and (2) infers people's fine-grained location, even when they keep their data private and we can only access the location of their friends. Our models significantly outperform current comparable approaches to either task.

[1]  Stuart J. Russell,et al.  Dynamic bayesian networks: representation, inference and learning , 2002 .

[2]  Jure Leskovec,et al.  Supervised random walks: predicting and recommending links in social networks , 2010, WSDM '11.

[3]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[4]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[5]  David Liben-Nowell,et al.  The link-prediction problem for social networks , 2007 .

[6]  Michael I. Jordan Learning in Graphical Models , 1999, NATO ASI Series.

[7]  Lars Backstrom,et al.  Find me if you can: improving geographical prediction with social and spatial proximity , 2010, WWW '10.

[8]  Dan Cosley,et al.  Inferring social ties from geographic coincidences , 2010, Proceedings of the National Academy of Sciences.

[9]  Albert-László Barabási,et al.  Limits of Predictability in Human Mobility , 2010, Science.

[10]  Ed H. Chi,et al.  Tweets from Justin Bieber's heart: the dynamics of the location field in user profiles , 2011, CHI.

[11]  Cecilia Mascolo,et al.  Socio-Spatial Properties of Online Location-Based Social Networks , 2011, ICWSM.

[12]  Naonori Ueda,et al.  Deterministic annealing EM algorithm , 1998, Neural Networks.

[13]  Yehuda Koren,et al.  Modeling relationships at multiple scales to improve accuracy of large recommender systems , 2007, KDD '07.

[14]  Aravind Srinivasan,et al.  Modelling disease outbreaks in realistic urban social networks , 2004, Nature.

[15]  B. Wellman,et al.  Imagining Twitter as an Imagined Community , 2011 .

[16]  Alex Pentland,et al.  Reality mining: sensing complex social systems , 2006, Personal and Ubiquitous Computing.

[17]  Jure Leskovec,et al.  Friendship and mobility: user movement in location-based social networks , 2011, KDD.

[18]  Hosung Park,et al.  What is Twitter, a social network or a news media? , 2010, WWW '10.

[19]  E. David,et al.  Networks, Crowds, and Markets: Reasoning about a Highly Connected World , 2010 .

[20]  Michael I. Jordan,et al.  Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.

[21]  David A. Smith,et al.  Dependency Parsing by Belief Propagation , 2008, EMNLP.

[22]  Ben Taskar,et al.  Link Prediction in Relational Data , 2003, NIPS.

[23]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[24]  Jasmine Novak,et al.  Geographic routing in social networks , 2005, Proc. Natl. Acad. Sci. USA.

[25]  Zoubin Ghahramani,et al.  Learning Dynamic Bayesian Networks , 1997, Summer School on Neural Networks.

[26]  Timothy W. Finin,et al.  Why we twitter: understanding microblogging usage and communities , 2007, WebKDD/SNA-KDD '07.

[27]  John Krumm,et al.  Exploring end user preferences for location obfuscation, location-based services, and the value of location , 2010, UbiComp.

[28]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[29]  Alex Pentland,et al.  Honest Signals - How They Shape Our World , 2008 .