Analysis of Trending Topics and Text-based Channels of Information Delivery in Cybersecurity

Computer users often struggle to make correct security decisions. While fewer and fewer people are trying or willing to take formal security training, online sources including news, security blogs, and websites are continuously making security knowledge more accessible. Analysis of cybersecurity texts can provide insights into trending topics, identify current security issues, and show how cyber attacks evolve over time. These insights can in turn help researchers and practitioners predict and prepare for such attacks. Comparing different sources may also help ordinary users learn by retaining the security knowledge gained from different cybersecurity contexts. Prior studies have neither systematically analysed the wide range of digital sources nor provided any standardisation for analysing trending topics in recent security texts. Although Latent Dirichlet Allocation (LDA) has been widely adopted for topic generation, the topics it generates do not fully cover cybersecurity concepts and overlap considerably. To address this issue, we propose a semi-automated classification method that generates comprehensive security categories in place of LDA-generated topics. We further compare the 16 identified security categories across different sources based on their popularity and impact. Our analysis reveals several surprising findings. (1) The impact reflected in cybersecurity texts strongly correlates with the monetary loss caused by cybercrimes. (2) For most categories, security blogs have the highest popularity and the largest absolute and relative impact over time. (3) Websites deliver security information with little regard for timeliness: one third of their articles do not specify a date, and the rest lag in covering emerging security issues.
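
The abstract does not define how popularity is measured. As one possible reading, a category's popularity within a source could be taken as the share of that source's articles assigned to the category. The Python sketch below assumes this definition and uses made-up records; the function name `popularity_by_source` and the sample sources and category labels are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict

# Hypothetical toy records of (source, category).
# These values are illustrative only and are not data from the study.
articles = [
    ("blog", "phishing"),
    ("blog", "malware"),
    ("blog", "phishing"),
    ("news", "data breach"),
    ("news", "phishing"),
    ("website", "malware"),
]


def popularity_by_source(records):
    """Return {source: {category: share of that source's articles}}.

    Assumption: popularity is the fraction of a source's articles
    that fall into a category. The study may define it differently.
    """
    counts = defaultdict(Counter)
    for source, category in records:
        counts[source][category] += 1

    shares = {}
    for source, counter in counts.items():
        total = sum(counter.values())
        shares[source] = {cat: n / total for cat, n in counter.items()}
    return shares


shares = popularity_by_source(articles)
for source, by_category in sorted(shares.items()):
    print(source, by_category)
```

Normalising by each source's article count, rather than comparing raw counts, keeps sources of very different sizes comparable, which matters when blogs, news, and websites publish at different volumes.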
