Inferring the Security Performance of Providers from Noisy and Heterogeneous Abuse Datasets

Abuse data offers one of the very few empirical measurements of the security performance of defenders. As such, it can play an important role in strengthening and aligning security incentives in a variety of markets. Measuring security performance with abuse data, however, suffers from a number of problems. Abuse data is notoriously noisy, highly heterogeneous, often incomplete, biased, and driven by a multitude of causal factors that are hard to disentangle. We present the first comprehensive approach to measuring defender security performance from a combination of heterogeneous abuse datasets, taking all of these issues into account. We present a causal model of incidents, test for biases across seven abuse datasets, and then propose a new modeling approach. Using Item Response Theory, we estimate the security performance of providers as a latent, unobservable trait. The approach also allows us to quantify the uncertainty of the performance estimates. Despite the uncertainties, we demonstrate the effectiveness of the approach by using the security performance estimates to predict a large portion of the variance in the abuse counts observed in independent datasets, after controlling for various exposure effects such as the size and business type of the providers.
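To make the Item Response Theory idea concrete: a minimal sketch of how a latent security trait can be estimated from several binary abuse feeds. This is not the authors' actual model; it is a simplified Rasch-style (1PL) formulation on hypothetical synthetic data, where each provider has a latent trait theta and each feed has a "difficulty" b, and the probability of a provider being listed in a feed is sigmoid(theta - b). The feed count (7), provider count, and the ridge prior are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Hypothetical setup: 40 providers observed across 7 abuse feeds,
# mirroring the seven datasets mentioned in the abstract.
rng = np.random.default_rng(0)
n_prov, n_feeds = 40, 7
theta_true = rng.normal(0.0, 1.0, n_prov)   # latent (in)security per provider
b_true = rng.normal(0.0, 1.0, n_feeds)      # per-feed "difficulty" of being listed

# Simulate binary observations: 1 = provider appears in the feed.
p = expit(theta_true[:, None] - b_true[None, :])
y = rng.binomial(1, p)

def nll(params):
    """Negative Bernoulli log-likelihood of the Rasch model,
    plus a weak ridge prior on theta to fix the latent scale."""
    theta, b = params[:n_prov], params[n_prov:]
    logits = theta[:, None] - b[None, :]
    loglik = (y * logits - np.logaddexp(0.0, logits)).sum()
    return -loglik + 0.5 * (theta ** 2).sum()

res = minimize(nll, np.zeros(n_prov + n_feeds), method="L-BFGS-B")
theta_hat = res.x[:n_prov]

# With only 7 feeds the estimates are noisy, but the ranking of
# providers should still correlate with the true latent trait.
corr = np.corrcoef(theta_true, theta_hat)[0, 1]
```

In a full treatment (as the abstract suggests), a Bayesian fit would also yield posterior uncertainty for each provider's trait, and abuse *counts* would be modeled with exposure controls (provider size, business type) rather than reduced to binary listings.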
