Finding Similar Mobile Consumers with a Privacy-Friendly Geosocial Design

This paper focuses on finding the same and similar users based on location-visitation data in a mobile environment. We propose a new design that uses consumer-location data from mobile devices (smartphones, smart pads, laptops, etc.) to build a "geosimilarity network" among users. The geosimilarity network (GSN) could be used for a variety of analytics-driven applications, such as targeting advertisements to the same user on different devices or to users with similar tastes, and improving online interactions by selecting users with similar tastes. The basic idea is that two devices are similar, and thereby connected in the GSN, when they share at least one visited location. They are more similar as they visit more shared locations and as the locations they share are visited by fewer people. This paper first introduces the main ideas and ties them to theory and related work. It next introduces a specific design for selecting entities with similar location distributions, and evaluates that design using real mobile location data across seven ad exchanges. We focus on two high-level questions: (1) Does geosimilarity allow us to find different entities corresponding to the same individual, for example, as seen through different bidding systems? And (2) do entities linked by similarities in local mobile behavior show similar interests, as measured by visits to particular publishers? The results are positive for both. Specifically, for (1), even with the data sample's limited observability, 70%-80% of the time the same individual is connected to herself in the GSN. For (2), the GSN neighbors of visitors to a wide variety of publishers are substantially more likely also to visit those same publishers. Highly similar GSN neighbors show very substantial lift.
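The basic idea described above (devices are linked when they share at least one location, and the link is stronger for more shared locations and for less-visited ones) can be illustrated with a minimal sketch. The function name `build_gsn`, the toy visit data, and in particular the inverse-log popularity weighting are assumptions made for illustration; the abstract does not specify the paper's actual similarity measure.

```python
from collections import defaultdict
from itertools import combinations
import math


def build_gsn(visits):
    """Build a toy geosimilarity network.

    visits: iterable of (device_id, location_id) pairs (hypothetical format).
    Returns a dict mapping a sorted device pair (a, b) to an edge weight.
    """
    # Collect the set of distinct devices seen at each location.
    devices_at = defaultdict(set)
    for device, location in visits:
        devices_at[location].add(device)

    weights = defaultdict(float)
    for devices in devices_at.values():
        if len(devices) < 2:
            continue  # a location seen by one device links nobody
        # Assumed weighting (not from the paper): a shared location counts
        # for less the more devices visit it.
        contribution = 1.0 / math.log(1 + len(devices))
        for a, b in combinations(sorted(devices), 2):
            # Summing over locations means more shared locations
            # produce a stronger edge.
            weights[(a, b)] += contribution
    return dict(weights)


# Toy data: d1 and d2 share two rarely visited locations plus a busy one;
# d3 and d4 appear only at the busy location.
visits = [
    ("d1", "cafe"), ("d2", "cafe"),
    ("d1", "home_ip"), ("d2", "home_ip"),
    ("d1", "airport"), ("d2", "airport"),
    ("d3", "airport"), ("d4", "airport"),
]
gsn = build_gsn(visits)
```

In this toy network every pair of devices seen at the airport is connected, but the d1-d2 edge is the strongest because those two devices also share two locations that few others visit.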
