A Crawler-based Study of Spyware in the Web

Malicious spyware poses a significant threat to desktop security and integrity. This paper examines that threat from an Internet perspective. Using a crawler, we performed a large-scale, longitudinal study of the Web, sampling both executables and conventional Web pages for malicious objects. Our results show the extent of spyware content. For example, in a May 2005 crawl of 18 million URLs, we found spyware in 13.4% of the 21,200 executables we identified. In the same crawl, we found scripted “drive-by download” attacks in 5.9% of the Web pages we processed. Our analysis quantifies the density of spyware, the types of threats, and the most dangerous Web zones in which spyware is likely to be encountered. We also show the frequency with which specific spyware programs were found in the content we crawled. Finally, we measured changes in the density of spyware over time; e.g., our October 2005 crawl saw a substantial reduction in the presence of drive-by download attacks compared with those we detected in May.
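To make the headline figure concrete, the short Python sketch below converts the reported percentage back into an approximate count of executables. The function name `implied_count` is illustrative and does not come from the paper, and the result is only an estimate because the published percentage is rounded.

```python
def implied_count(total: int, percent: float) -> int:
    """Approximate number of items that a rounded percentage corresponds to."""
    return round(total * percent / 100)


# Figures quoted in the abstract for the May 2005 crawl.
executables = 21_200
spyware_percent = 13.4

# Estimate only: the exact count is not stated in the abstract.
infected = implied_count(executables, spyware_percent)
print(f"roughly {infected} of {executables} executables contained spyware")
```

The same helper cannot be applied to the 5.9% drive-by download figure, because the abstract does not state how many Web pages were processed.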
