Faking contextual data for fun, profit, and privacy

The amount of contextual data collected, stored, mined, and shared is increasing exponentially. Street cameras, credit card transactions, chat and Twitter logs, e-mail, web site visits, phone logs and recordings, and social networking sites are all sources of data that persists in a manner not under individual control, leading some to declare the death of privacy. We argue here that the ability to generate convincing fake contextual data can be a basic tool in the fight to preserve privacy. One use for the technology is for an individual to make their actual data indistinguishable within a pile of false data. In this paper we consider two examples of contextual data: search engine query data and location data. We describe the current state of faking these types of data and our own efforts in this direction.
