URLs often use query strings (i.e., key-value pairs appended to the URL path) to pass session parameters and form data. Often these arguments are not privacy sensitive but are necessary to render the web page. However, query strings may also contain tracking mechanisms, usernames, email addresses, and other information that users may not wish to reveal. In isolation such URLs are not particularly problematic, but the growth of Web 2.0 platforms such as social networks and micro-blogging means that URLs (often copy-pasted from web browsers) are increasingly broadcast publicly. This position paper argues that the threat posed by such privacy disclosures is significant and prevalent. It demonstrates this by analyzing 892 million user-submitted URLs, many disseminated in (semi-)public forums. Within this corpus, our case study identifies troves of personal data, including 1.7 million email addresses. In the most egregious examples, the query string contains plaintext usernames and passwords for administrative and extremely sensitive accounts. With this as motivation, the authors propose a privacy-aware service named “CleanURL”. CleanURL’s goal is to transform input addresses by stripping nonessential key-value pairs and/or notifying users when sensitive data is critical to proper page rendering. This logic is based on difference algorithms, mining of URL corpora, and human feedback loops. Though realized as a link shortener in its prototype implementation, CleanURL could be leveraged on any platform to scan URLs before they are published or to retroactively sanitize existing links.
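To make the stripping idea concrete, the following is a minimal sketch in Python, not the CleanURL implementation itself. It assumes that a key-value pair counts as nonessential when removing it leaves the fetched page nearly unchanged, as measured by a text-similarity ratio. The function names (`flag_sensitive`, `strip_nonessential`), the `fetch` callable, the email pattern, and the 0.95 threshold are all hypothetical choices made for illustration. The corpus-mining and human-feedback components mentioned above are not modeled.

```python
import re
from difflib import SequenceMatcher
from typing import Callable
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: a deliberately simple pattern for email-like values.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def with_query(url: str, pairs: list[tuple[str, str]]) -> str:
    """Rebuild `url` with its query string replaced by `pairs`."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def flag_sensitive(url: str) -> list[str]:
    """Return the query keys whose decoded values look like email addresses."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return [key for key, value in pairs if EMAIL_RE.fullmatch(value)]


def strip_nonessential(
    url: str,
    fetch: Callable[[str], str],
    threshold: float = 0.95,  # hypothetical similarity cutoff
) -> str:
    """Drop each key-value pair whose removal leaves the page nearly unchanged.

    `fetch` is a caller-supplied function mapping a URL to page text, so the
    sketch stays independent of any particular HTTP client.
    """
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    baseline = fetch(url)
    kept = list(range(len(pairs)))  # indices of pairs still considered essential

    for i in range(len(pairs)):
        trial = [j for j in kept if j != i]
        candidate = fetch(with_query(url, [pairs[j] for j in trial]))
        similarity = SequenceMatcher(None, baseline, candidate).ratio()
        if similarity >= threshold:
            kept = trial  # page did not change: treat pair i as nonessential

    return with_query(url, [pairs[j] for j in kept])
```

A caller might run `flag_sensitive` first to warn the user, then pass the URL and an HTTP-backed `fetch` to `strip_nonessential`. Pages with dynamic content (ads, timestamps) would need a more tolerant comparison than a single ratio, and a sensitive pair that the page actually depends on is kept, which is the case where the user would need to be notified instead.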