Tracking website data-collection and privacy practices with the iWatch web crawler

We introduce the iWatch web crawler, a tool designed to catalogue and analyze online data practices and the use of privacy-related indicators and technologies. Our goal in developing iWatch was to enable a new type of analysis covering trends, the impact of legislation on practices, and geographic and social differences online. We present preliminary findings from two sets of data collected 15 months apart and analyzed with this tool. The combined samples included more than 240,000 pages from over 24,000 domains and 47 countries. In addition to providing needed data on the state of online data practices, we show that iWatch is a promising approach to studying the web ecosystem.
