ISP-Enabled Behavioral Ad Targeting without Deep Packet Inspection

Online advertising is a rapidly growing industry currently dominated by the search engine 'giant' Google. In an attempt to tap into this huge market, Internet Service Providers (ISPs) have started deploying deep packet inspection techniques to track and collect user browsing behavior. However, such techniques violate wiretap laws that explicitly prohibit intercepting the contents of communication without the consent of consumers. In this paper, we show that it is possible for ISPs to extract user browsing patterns without inspecting the contents of communication. Our contributions are threefold. First, we develop a methodology and implement a system capable of extracting web browsing features from stored non-content-based records of online communication, which could be legally shared. When such browsing features are correlated with information collected by independently crawling the Web, it becomes possible to recover the actual web pages accessed by clients. Second, we systematically evaluate our system on the Internet and demonstrate that it can successfully recover user browsing patterns with high accuracy. Finally, our findings call for comprehensive legislative reform that would not only enable fair competition in the online advertising business but, more importantly, protect consumer rights more effectively.
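
The correlation step described above can be illustrated with a minimal sketch. The abstract does not specify which browsing features the system extracts, so the following is purely an assumption for illustration: it treats the set of destination server addresses seen in a client's non-content flow records as the feature, and compares it against per-page server sets gathered by an independent crawl using Jaccard similarity. The names `jaccard`, `best_match`, the `threshold` parameter, and the example URLs and addresses are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(
    observed: Set[str],
    profiles: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Tuple[Optional[str], float]:
    """Return the crawled page whose server set best matches the observation.

    observed: destination addresses from non-content flow records (assumed feature).
    profiles: URL -> addresses contacted when a crawler loaded that URL.
    Returns (None, score) if no profile reaches the threshold.
    """
    best_url: Optional[str] = None
    best_score = 0.0
    for url, servers in profiles.items():
        score = jaccard(observed, servers)
        if score > best_score:
            best_url, best_score = url, score
    if best_score >= threshold:
        return best_url, best_score
    return None, best_score


# Hypothetical crawl results (documentation-range addresses).
profiles = {
    "http://example.com/a": {"192.0.2.1", "192.0.2.2", "192.0.2.3"},
    "http://example.com/b": {"192.0.2.1", "198.51.100.7"},
}
observed = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}
match = best_match(observed, profiles)
```

In this sketch no packet payload is consulted: only which servers a client contacted. A real system would presumably need richer features to separate pages hosted on the same servers, which the abstract leaves unspecified.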
