A Utility-Theoretic Approach to Privacy and Personalization

Online services such as web search, news portals, and e-commerce applications face the challenge of providing high-quality experiences to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by personalizing services based on special knowledge about users. For example, a user's location, demographics, and search and browsing history may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy by both users, providers, and government agencies acting on behalf of citizens, may limit access to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information in return for enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can identify a near-optimal solution to the utility-privacy tradeoff. We evaluate the methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey, aimed at eliciting preferences about peoples' willingness to trade the sharing of personal data in returns for gains in search efficiency. We show that a significant level of personalization can be achieved using only a small amount of information about users.

[1]  Masatoshi Yoshikawa,et al.  Adaptive web search based on user profile constructed without any effort from users , 2004, WWW '04.

[2]  Rahul Telang,et al.  Examining the Personalization-Privacy Tradeoff – an Empirical Investigation with Email Advertisements , 2005 .

[3]  Eric Horvitz Machine Learning, Reasoning, and Intelligence in Daily Life: Directions and Challenges , 2006 .

[4]  Eytan Adar,et al.  Valuating Privacy , 2005, WEIS.

[5]  G. A. Tijssen,et al.  The Data-Correcting Algorithm for the Minimization of Supermodular Functions , 1999 .

[6]  Ji-Rong Wen,et al.  A large-scale evaluation and analysis of personalized search strategies , 2007, WWW '07.

[7]  Eytan Adar,et al.  User 4XXXXX9: Anonymizing Query Logs , 2007 .

[8]  Ashwin Machanavajjhala,et al.  l-Diversity: Privacy Beyond k-Anonymity , 2006, ICDE.

[9]  Latanya Sweeney,et al.  k-Anonymity: A Model for Protecting Privacy , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..

[10]  Jonathan Grudin,et al.  A study of preferences for sharing and privacy , 2005, CHI Extended Abstracts.

[11]  Andreas Krause,et al.  Near-optimal Nonmyopic Value of Information in Graphical Models , 2005, UAI.

[12]  Raghu Ramakrishnan,et al.  Privacy Skyline: Privacy with Multidimensional Adversarial Knowledge , 2007, VLDB.

[13]  Kai Lung Hui,et al.  Online Information Privacy: Measuring the Cost-Benefit Trade-Off , 2002, ICIS.

[14]  G. Nemhauser,et al.  Maximizing Submodular Set Functions: Formulations and Analysis of Algorithms* , 1981 .

[15]  Susan T. Dumais,et al.  Personalizing Search via Automated Analysis of Interests and Activities , 2005, SIGIR.

[16]  Cynthia Dwork,et al.  Differential Privacy , 2006, ICALP.

[17]  David J. DeWitt,et al.  Mondrian Multidimensional K-Anonymity , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[18]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[19]  Elisa Bertino,et al.  Beyond k-Anonymity: A Decision Theoretic Framework for Assessing Privacy Risk , 2009, Trans. Data Priv..

[20]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[21]  John R. Beaumont,et al.  Studies on Graphs and Discrete Programming , 1982 .

[22]  Doug Downey,et al.  Models of Searching and Browsing: Languages, Studies, and Application , 2007, IJCAI.

[23]  Ke Wang,et al.  Privacy-enhancing personalized web search , 2007, WWW '07.

[24]  Vahab S. Mirrokni,et al.  Maximizing Non-Monotone Submodular Functions , 2011, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[25]  Sharad Mehrotra,et al.  Flexible Anonymization For Privacy Preserving Data Publishing: A Systematic Search Based Approach , 2007, SDM.

[26]  Vahab Mirrokni,et al.  Maximizing Non-Monotone Submodular Functions , 2007, FOCS 2007.