Inferring Social Strength from Spatiotemporal Data

The advent of geolocation technologies has generated unprecedentedly rich datasets of people’s location information at very high fidelity. These location datasets can be used to study human behavior; for example, social studies have shown that people who are frequently seen together at the same place and time are most likely socially related. In this article, we infer these social connections by analyzing people’s location information, which is useful in a variety of application domains, from sales and marketing to intelligence analysis. In particular, we propose an entropy-based model (EBM) that not only infers social connections but also estimates their strength by analyzing people’s co-occurrences in space and time. We examine two independent ways in which co-occurrences contribute to the strength of a social connection: diversity and weighted frequency. In addition, we take the characteristics of each location into consideration to compensate for cases where only limited location information is available. We also study the role of location semantics in improving our computation of social strength. We develop a parallel implementation of our algorithm using MapReduce to provide a scalable and efficient solution for online applications. We conducted extensive experiments with real-world datasets that include both people’s location data and their social connections, using the latter as ground truth to verify the results of applying our approach to the former. We show that our approach is valid across different networks and outperforms competing approaches.
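
To make the two measures concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no formulas, so everything here is an assumption: check-ins are taken to be `(user, location, time_slot)` tuples, diversity is taken to be the exponential of the Shannon entropy of a pair's co-occurrences across locations, and weighted frequency discounts each co-occurrence by `exp(-H)`, where `H` is the entropy of visitors at that location. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of positive counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def co_occurrences(checkins, u, v):
    """Count, per location, the time slots in which both u and v were present."""
    present = defaultdict(set)
    for user, loc, slot in checkins:
        present[(loc, slot)].add(user)
    counts = Counter()
    for (loc, _slot), users in present.items():
        if u in users and v in users:
            counts[loc] += 1
    return counts


def diversity(co_counts):
    """Assumed form: effective number of distinct meeting places."""
    if not co_counts:
        return 0.0
    return math.exp(shannon_entropy(list(co_counts.values())))


def location_entropy(checkins):
    """Entropy of the visitor distribution at each location (high = crowded)."""
    per_loc = defaultdict(Counter)
    for user, loc, _slot in checkins:
        per_loc[loc][user] += 1
    return {loc: shannon_entropy(list(c.values())) for loc, c in per_loc.items()}


def weighted_frequency(co_counts, loc_entropy):
    """Assumed form: co-occurrences at crowded locations count for less."""
    return sum(n * math.exp(-loc_entropy[loc]) for loc, n in co_counts.items())


# Toy data: users a and b meet at a cafe and a gym; c also visits the cafe.
checkins = [
    ("a", "cafe", 1), ("b", "cafe", 1),
    ("a", "gym", 2), ("b", "gym", 2),
    ("c", "cafe", 3),
]
co = co_occurrences(checkins, "a", "b")
d = diversity(co)
wf = weighted_frequency(co, location_entropy(checkins))
```

In this toy data the pair meets once at each of two locations, so the assumed diversity is 2, while the cafe co-occurrence is discounted more heavily than the gym one because the cafe has more distinct visitors. How the two measures are combined into a single strength score is not stated in the abstract and is left out here.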
