Disclosing users' data in an environment that preserves privacy

The conflict between Web service personalization and user privacy is a challenge in the information society. In this paper we address this challenge by introducing MASKS, an architecture that provides Web services with data on users' interests without violating the users' privacy. The proposed approach hides the actual identity of users by classifying them into groups according to the interests they exhibit while interacting with a Web service. By making requests on behalf of a group rather than an individual user, MASKS gives Web services relevant information without disclosing the identity of the users. We implemented and tested a grouping algorithm based on categories defined by the DMOZ semantic tree, and we evaluated it using access logs from actual e-commerce sites. Our tests show that 64% of the requests made to the e-commerce service could be grouped into meaningful categories. This indicates that e-commerce sites could use the information provided by MASKS to personalize services without having access to the individual users in the groups.
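The grouping idea can be illustrated with a minimal sketch. It is not the MASKS implementation: the keyword table `CATEGORY_INDEX`, the functions `categorize`, `assign_group`, and `mask_request`, and the choice to truncate category paths to a fixed depth are all assumptions made for illustration. The sketch maps each request to a DMOZ-style category path, picks the most frequent category in a session as the user's group, and forwards requests labeled with the group only.

```python
from collections import Counter

# Hypothetical keyword-to-category table standing in for the DMOZ semantic tree.
CATEGORY_INDEX = {
    "guitar": "Top/Shopping/Music/Instruments",
    "piano": "Top/Shopping/Music/Instruments",
    "novel": "Top/Shopping/Publications/Books",
    "laptop": "Top/Shopping/Computers/Hardware",
}


def categorize(request_terms):
    """Return the category path of the first known term, or None if none match."""
    for term in request_terms:
        path = CATEGORY_INDEX.get(term.lower())
        if path is not None:
            return path
    return None


def truncate(path, depth):
    """Keep only the first `depth` levels of a slash-separated category path."""
    return "/".join(path.split("/")[:depth])


def assign_group(session_requests, depth=3):
    """Choose the most frequent (truncated) category in a session as the group.

    Requests that match no category are ignored; returns None if nothing matches.
    """
    counts = Counter()
    for terms in session_requests:
        path = categorize(terms)
        if path is not None:
            counts[truncate(path, depth)] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def mask_request(user_id, terms, group):
    """Build the request forwarded to the Web service.

    The user identifier is deliberately dropped; only the group label is sent.
    """
    return {"group": group, "terms": list(terms)}


session = [["guitar", "strings"], ["piano"], ["novel"]]
group = assign_group(session)
forwarded = mask_request("alice", ["guitar"], group)
```

In this sketch the Web service sees only the group label and the request terms, so it can tailor content to the group's interests without learning which individual made the request.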
