Website defacement is the practice of altering the web pages of a website after its compromise. The altered pages, called deface pages, can negatively affect the reputation and business of the victim site. Previous research has focused primarily on detection, rather than exploring the defacement phenomenon in depth. While investigating several defacements, we observed that the artifacts left by the defacers allow an expert analyst to investigate the actors' modus operandi and social structure, and expand from the single deface page to a group of related defacements (i.e., a campaign). However, manually performing such analysis on millions of incidents is tedious and poses scalability challenges. From these observations, we propose an automated approach that efficiently builds intelligence information out of raw deface pages. Our approach streamlines the analyst's job by automatically recognizing defacement campaigns and assigning meaningful textual labels to them. Applied to a comprehensive dataset of 13 million defacement records, from Jan. 1998 to Sept. 2016, our approach allowed us to conduct the first large-scale measurement of web defacement campaigns. In addition, our approach is meant to be adopted operationally by analysts to identify live campaigns in the field. We go beyond confirming anecdotal evidence: we analyze the social structure of modern defacers, which includes lone individuals as well as actors that cooperate with each other or with teams, which evolve over time and dominate the scene. We conclude by drawing a parallel between the timeline of world-shaping events and defacement campaigns, representing the evolution of the interests and orientation of modern defacers.