The E-Commerce Market for "Lemons": Identification and Analysis of Websites Selling Counterfeit Goods

We investigate the practice of websites selling counterfeit goods. We inspect web search results for 225 queries across 25 brands. We devise a binary classifier that predicts whether a given website is selling counterfeits by examining automatically extracted features such as WHOIS information, pricing, and website content. We then apply the classifier to results collected between January and August 2014. We find that, overall, 32% of search results point to websites selling fakes. For 'complicit' search terms, such as "replica rolex", 39% of the search results point to fakes, compared to 20% for 'innocent' terms, such as "hermes buy online". Using a linear regression, we find that brands with a higher street price for fakes have a higher incidence of counterfeits in search results, but that brands that take active countermeasures, such as filing DMCA requests, experience a lower incidence. Finally, we study how the incidence of counterfeits evolves over time, finding that the fraction of search results pointing to fakes remains remarkably stable.
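
As a rough illustration of the classification step, the sketch below trains a small binary classifier on per-website features and labels each site as selling counterfeits or not. The abstract does not say which model was used, so logistic regression is an assumption here, as are the three feature names, the helper function names, and the training rows, which are toy values invented for the example and are not data from the study.

```python
import math

# Hypothetical feature vector per website (names are illustrative assumptions):
#   [whois_privacy_enabled, discount_vs_retail, mentions_replica_keyword]
# Toy, made-up rows; label 1 = sells counterfeits, 0 = legitimate.
TRAIN = [
    ([1.0, 0.90, 1.0], 1),
    ([1.0, 0.85, 0.0], 1),
    ([0.0, 0.80, 1.0], 1),
    ([0.0, 0.05, 0.0], 0),
    ([0.0, 0.10, 0.0], 0),
    ([1.0, 0.00, 0.0], 0),
]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that a site sells counterfeits."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(rows, epochs=2000, lr=0.5):
    """Fit logistic regression with batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in rows:
            err = predict_proba(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += err * x
            grad_b += err
        # Step against the gradient averaged over all rows.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def is_counterfeit(weights, bias, features, threshold=0.5):
    """Binary decision: True if the estimated probability meets the threshold."""
    return predict_proba(weights, bias, features) >= threshold


WEIGHTS, BIAS = train(TRAIN)
```

Calling `is_counterfeit(WEIGHTS, BIAS, features)` on a new feature vector returns the predicted label. A real pipeline would extract the features automatically from WHOIS records, listed prices, and page content, and would evaluate the classifier on held-out labeled sites rather than on its own training rows.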
