Paper Information - Text Mining in Cybersecurity

Text Mining in Cybersecurity

The growth of data volume has changed cybersecurity activities, demanding a higher level of automation. In this new cybersecurity landscape, text mining has emerged as an alternative to improve the efficiency of activities involving unstructured data. This article presents a Systematic Literature Review (SLR) of the application of text mining in the cybersecurity domain. Using a systematic protocol, we identified 2,196 studies, of which 83 were summarized. As a contribution, we propose a taxonomy of the different activities in the cybersecurity domain that are supported by text mining. We also detail the strategies evaluated in the application of text mining tasks and the use of neural networks to support activities involving unstructured data. The work also discusses text classification performance with a view to its application in real-world solutions. Finally, the SLR highlights open gaps for future research, such as the analysis of non-English content and increased use of neural networks.

[1] Hassan Takabi,et al. Automatic Extraction of Access Control Policies from Natural Language Documents , 2020, IEEE Transactions on Dependable and Secure Computing.

[2] Rossouw von Solms,et al. From information security to cyber security , 2013, Comput. Secur..

[3] Vadlamani Ravi,et al. A survey of the applications of text mining in financial domain , 2016, Knowl. Based Syst..

[4] Barnali Gupta Banik,et al. Novel Text Steganography Using Natural Language Processing and Part-of-Speech Tagging , 2020 .

[5] Awais Rashid,et al. Panning for gold: Automatically analysing online social engineering attack surfaces , 2017, Comput. Secur..

[6] Paulo Shakarian,et al. Early Warnings of Cyber Threats in Online Discussions , 2017, 2017 IEEE International Conference on Data Mining Workshops (ICDMW).

[7] Carina Jacobi,et al. Quantitative analysis of large amounts of journalistic texts using topic modelling , 2016, Rethinking Research Methods in an Age of Digital Journalism.

[8] Thomas Johnson,et al. Computer Security Incident Handling Guide , 2005 .

[9] Alex S. Wilner,et al. Cybersecurity and its discontents: Artificial intelligence, the Internet of Things, and digital misinformation , 2018, International Journal: Canada's Journal of Global Policy Analysis.

[10] Vallipuram Muthukkumarasamy,et al. A survey on data leakage prevention systems , 2016, J. Netw. Comput. Appl..

[11] Robert E. Crossler,et al. Taking stock of organisations’ protection of privacy: categorising and assessing threats to personally identifiable information in the USA , 2017, Eur. J. Inf. Syst..

[12] Sinan Aral,et al. The spread of true and false news online , 2018, Science.

[13] Vishal Gupta,et al. Recent automatic text summarization techniques: a survey , 2016, Artificial Intelligence Review.

[14] Xi Chen,et al. Assessing the severity of phishing attacks: A hybrid data mining approach , 2011, Decis. Support Syst..

[15] Mikhail Petrovskiy,et al. Applying text mining methods for data loss prevention , 2015, Programming and Computer Software.

[16] Mathieu Roche,et al. A survey of the applications of text mining for agriculture , 2019, Comput. Electron. Agric..

[17] Jacob Mashiah,et al. It's time for a change. , 2005, Clinics in dermatology.

[18] Jianwu Yang,et al. A semi-structured document model for text mining , 2008, Journal of Computer Science and Technology.

[19] Katrin Franke,et al. Extracting cyber threat intelligence from hacker forums: Support vector machines versus convolutional neural networks , 2017, 2017 IEEE International Conference on Big Data (Big Data).

[20] Paul Rimba,et al. Data-Driven Cybersecurity Incident Prediction: A Survey , 2019, IEEE Communications Surveys & Tutorials.

[21] Ronen Feldman,et al. Book Reviews: The Text Mining Handbook: Advanced Approaches to Analyzing Unstructured Data by Ronen Feldman and James Sanger , 2008, CL.

[22] Pranjal Singh,et al. A comparison of classifiers and features for authorship authentication of social networking messages , 2017, Concurr. Comput. Pract. Exp..

[23] Dirk Thorleuchter,et al. Improved multilevel security with latent semantic indexing , 2012, Expert Syst. Appl..

[24] Peng Wang,et al. Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification , 2016, Neurocomputing.

[25] Ying Wah Teh,et al. Text mining for market prediction: A systematic review , 2014, Expert Syst. Appl..

[26] Aitor Couce Vieira,et al. Assessing and Forecasting Cybersecurity Impacts , 2020, Decis. Anal..

[27] Eduardo B. Fernández,et al. An analysis of security issues for cloud computing , 2013, Journal of Internet Services and Applications.

[28] Michal Munk,et al. Data Pre-Processing Evaluation for Text Mining: Transaction/Sequence Model , 2013, ICCS.

[29] Amber Jaycocks,et al. Human-machine detection of online-based malign information , 2020 .

[30] Jackie Rees Ulmer,et al. The Association Between the Disclosure and the Realization of Information Security Risk Factors , 2013, Inf. Syst. Res..

[31] Gurpreet Singh Lehal,et al. A Survey of Text Mining Techniques and Applications , 2009 .

[32] Cheng Huang,et al. Analyzing and Identifying Data Breaches in Underground Forums , 2019, IEEE Access.

[33] Ivan K. Ash,et al. Improving employees’ intellectual capacity for cybersecurity through evidence-based malware training , 2019 .

[34] Liusheng Huang,et al. Steganalysis against substitution-based linguistic steganography based on context clusters , 2011, Comput. Electr. Eng..

[35] Peter K. Smith,et al. Cyberbullying: another main type of bullying? , 2008, Scandinavian journal of psychology.

[36] Miguel Morales-Sandoval,et al. A policy-based containerized filter for secure information sharing in organizational environments , 2019, Future Gener. Comput. Syst..

[37] Feng Yu,et al. Attention-based convolutional approach for misinformation identification from massive and noisy microblog posts , 2019, Comput. Secur..

[38] Peng Wang,et al. Self-Taught Convolutional Neural Networks for Short Text Clustering , 2017, Neural Networks.

[39] Barbara Gaudenzi,et al. Effects of data breaches from user-generated content: A corporate reputation analysis , 2019, European Management Journal.

[40] Rui Li,et al. Sentiment classification with adversarial learning and attention mechanism , 2020, Comput. Intell..

[41] Hassan Takabi,et al. Towards a Top-down Policy Engineering Framework for Attribute-based Access Control , 2017, SACMAT.

[42] Emilio Ferrara,et al. Deep Neural Networks for Bot Detection , 2018, Inf. Sci..

[43] M. Petró‐Turza,et al. The International Organization for Standardization. , 2003 .

[44] R. M. Chandrasekaran,et al. A comparative performance evaluation of neural network based approach for sentiment classification of online reviews , 2016, J. King Saud Univ. Comput. Inf. Sci..

[45] Georgios Kambourakis,et al. Automatic Detection of Online Recruitment Frauds: Characteristics, Methods, and a Public Dataset , 2017, Future Internet.

[46] Hongsong Zhu,et al. Social Engineering in Cybersecurity: Effect Mechanisms, Human Vulnerabilities and Attack Methods , 2021, IEEE Access.

[47] Guy Lapalme,et al. A systematic analysis of performance measures for classification tasks , 2009, Inf. Process. Manag..

[48] A. White. VULNERABILITY MANAGEMENT , 2013 .

[49] Lynne Edwards,et al. Detecting Cyberbullying Activity Across Platforms , 2020 .

[50] Flora Amato,et al. Analyse digital forensic evidences through a semantic-based methodology and NLP techniques , 2019, Future Gener. Comput. Syst..

[51] John Elder,et al. Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications , 2012 .

[52] Albert Y. Zomaya,et al. A survey on text mining in social networks , 2015, The Knowledge Engineering Review.

[53] Fausto Giunchiglia,et al. Deep Feature-Based Text Clustering and its Explanation , 2022, IEEE Transactions on Knowledge and Data Engineering.

[54] Murat Can Ganiz,et al. Semantic text classification: A survey of past and recent advances , 2018, Inf. Process. Manag..

[55] Low Tang Jung,et al. Data security rules/regulations based classification of file data using TsF-kNN algorithm , 2016, Cluster Computing.

[56] Monther Aldwairi,et al. Detecting Fake News in Social Media Networks , 2018, EUSPN/ICTH.

[57] Marek Pawlicki,et al. On the Impact of Network Data Balancing in Cybersecurity Applications , 2020, ICCS.

[58] Thomas C. Eskridge,et al. A hybrid approach to improving program security , 2017, 2017 IEEE Symposium Series on Computational Intelligence (SSCI).

[59] Ying Dong,et al. A Novel Automatic Severity Vulnerability Assessment Framework , 2015, J. Commun..

[60] Richard Kissel,et al. Glossary of Key Information Security Terms , 2014 .

[61] Kim-Kwang Raymond Choo,et al. Visual Question Authentication Protocol (VQAP) , 2017, Comput. Secur..

[62] Sunny Behal,et al. Distributed Denial of Service Attacks and Defense Mechanisms: Current Landscape and Future Directions , 2018 .

[63] Ahmet Ali Süzen. A Risk-Assessment of Cyber Attacks and Defense Strategies in Industry 4.0 Ecosystem , 2020, International Journal of Computer Network and Information Security.

[64] Fang Liu,et al. Enterprise data breach: causes, challenges, prevention, and future directions , 2017, WIREs Data Mining Knowl. Discov..

[65] Yorick Wilks,et al. Cyberattack Prediction Through Public Text Analysis and Mini-Theories , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[66] Tim Menzies,et al. What is wrong with topic modeling? And how to fix it using search-based software engineering , 2016, Inf. Softw. Technol..

[67] Evita March,et al. Predicting perpetration of intimate partner cyberstalking: Gender and the Dark Tetrad , 2017, Comput. Hum. Behav..

[68] Wojciech Mazurczyk,et al. Trends in steganography , 2014, Commun. ACM.

[69] Raed Abu Zitar,et al. Genetic optimized artificial immune system in spam detection: a review and a model , 2011, Artificial Intelligence Review.

[70] Gang Wang,et al. Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media , 2017, CIKM.

[71] Liusheng Huang,et al. Detection of substitution-based linguistic steganography by relative frequency analysis , 2011, Digit. Investig..

[72] Arun Kumar Sangaiah,et al. SMSAD: a framework for spam message and spam account detection , 2017, Multimedia Tools and Applications.

[73] Manar Alohaly,et al. A Deep Learning Approach for Extracting Attributes of ABAC Policies , 2018, SACMAT.

[74] Timothy W. Finin,et al. Extracting Cybersecurity Related Linked Data from Text , 2013, 2013 IEEE Seventh International Conference on Semantic Computing.

[75] Christopher D. Manning,et al. Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..

[76] Shigang Liu,et al. A performance evaluation of deep‐learnt features for software vulnerability detection , 2018, Concurr. Comput. Pract. Exp..

[77] Justin W. Patchin,et al. Sextortion Among Adolescents: Results From a National Survey of U.S. Youth , 2018, Sexual abuse : a journal of research and treatment.

[78] Salim Hariri,et al. Autonomic Author Identification in Internet Relay Chat (IRC) , 2018, 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA).

[79] Charu C. Aggarwal,et al. Mining Text Data , 2012, Springer US.

[80] Asad Waqar Malik,et al. A machine learning framework for investigating data breaches based on semantic analysis of adversary's attack patterns in threat intelligence repositories , 2019, Future Gener. Comput. Syst..

[81] Richard Frank,et al. Identifying digital threats in a hacker web forum , 2015, 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).

[82] Nicole Beebe,et al. Clustering digital forensic string search output , 2014, Digit. Investig..

[83] Jitendra Kumar Rout,et al. Deceptive review detection using labeled and unlabeled data , 2016, Multimedia Tools and Applications.

[84] Seetha Hari,et al. Learning From Imbalanced Data , 2019, Advances in Computer and Electrical Engineering.

[85] Serkan Günal,et al. The impact of preprocessing on text classification , 2014, Inf. Process. Manag..

[86] Bülent Sankur,et al. Natural language watermarking via morphosyntactic alterations , 2009, Comput. Speech Lang..

[87] Simon Parkinson,et al. Fog computing security: a review of current applications and security solutions , 2017, Journal of Cloud Computing.

[88] Nur Al Hasan Haldar,et al. BiSAL - A bilingual sentiment analysis lexicon to analyze Dark Web forums for cyber security , 2015 .

[89] Satoshi Sekine,et al. A survey of named entity recognition and classification , 2007 .

[90] Ali Yazdian Varjani,et al. New rule-based phishing detection method , 2016, Expert Syst. Appl..

[91] Vallipuram Muthukkumarasamy,et al. Adaptable N-gram classification model for data leakage prevention , 2013, 2013 7th International Conference on Signal Processing and Communication Systems (ICSPCS).

[92] Philip S. Yu,et al. A Survey on Text Classification: From Traditional to Deep Learning , 2020, ACM Trans. Intell. Syst. Technol..

[93] A. Nur Zincir-Heywood,et al. User identification via neural network based language models , 2019, Int. J. Netw. Manag..

[94] Selim Akyokus,et al. Deep Learning- and Word Embedding-Based Heterogeneous Classifier Ensembles for Text Classification , 2018, Complex..

[95] Justin Hsu,et al. Fake News Detection via NLP is Vulnerable to Adversarial Attacks , 2019, ICAART.

[96] Ke Wang,et al. Anonymizing bag-valued sparse data by semantic similarity-based clustering , 2013, Knowledge and Information Systems.

[97] Lefteris Angelis,et al. The impact of information security events to the stock market: A systematic literature review , 2016, Comput. Secur..

[98] Ali Dehghantanha,et al. Machine learning aided Android malware classification , 2017, Comput. Electr. Eng..

[99] Jungkook An,et al. A Data Analytics Approach to the Cybercrime Underground Economy , 2018, IEEE Access.

[100] Al-Sakib Khan Pathan,et al. Innovations of Phishing Defense: The Mechanism, Measurement and Defense Strategies , 2018, Int. J. Commun. Networks Inf. Secur..

[101] J. I. Sheeba,et al. Online Social Network Bullying Detection Using Intelligence Techniques , 2015 .

[102] Marianne Schneider. Election Security: Increasing Election Integrity by Improving Cybersecurity , 2019, The Future of Election Administration.

[103] Salwani Abdullah,et al. Approaches to Cross-Domain Sentiment Analysis: A Systematic Literature Review , 2017, IEEE Access.

[104] Daniel S. Berman,et al. A Survey of Deep Learning Methods for Cyber Security , 2019, Inf..

[105] Natalia Miloslavskaya,et al. Big Data, Fast Data and Data Lake Concepts , 2016, BICA.

[106] Romilla Syed,et al. Enterprise reputation threats on social media: A case of data breach framing , 2019, J. Strateg. Inf. Syst..

[107] Laura Ferrari,et al. A Comparison between Preprocessing Techniques for Sentiment Analysis in Twitter , 2016, KDWeb.

[108] Kim-Kwang Raymond Choo,et al. A machine learning-based FinTech cyber threat attribution framework using high-level indicators of compromise , 2019, Future Gener. Comput. Syst..

[109] Timothy W. Finin,et al. CyberTwitter: Using Twitter to generate alerts for cybersecurity threats and vulnerabilities , 2016, 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).

[110] Mukesh K. Mohania,et al. The Mask of ZoRRo: preventing information leakage from documents , 2014, Knowledge and Information Systems.

[111] El-Sayed M. El-Alfy,et al. Spam filtering framework for multimodal mobile communication based on dendritic cell algorithm , 2016, Future Gener. Comput. Syst..

[112] Torsten Oliver Salge,et al. The application of text mining methods in innovation research: current state, evolution patterns, and development priorities , 2020, R&D Management.

[113] Evandro Costa,et al. Text mining in education , 2019, WIREs Data Mining Knowl. Discov..

[114] Chen Huang,et al. Learning Deep Representation for Imbalanced Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[115] Kevin Matthe Caramancion. An Exploration of Disinformation as a Cybersecurity Threat , 2020, 2020 3rd International Conference on Information and Computer Technologies (ICICT).

[116] Mark Heitmann,et al. Comparing automated text classification methods , 2019, International Journal of Research in Marketing.

[117] Ángel Martín del Rey,et al. A New Proposal on the Advanced Persistent Threat: A Survey , 2020, Applied Sciences.

[118] Mourad Debbabi,et al. SONAR: Automatic Detection of Cyber Security Events over the Twitter Stream , 2017, ARES.

[119] Laith Mohammad Abualigah,et al. A Novel Weighting Scheme Applied to Improve the Text Document Clustering Techniques , 2018 .

[120] Walaa Medhat,et al. Sentiment analysis algorithms and applications: A survey , 2014 .

[121] Jeffrey M. Keisler,et al. What it takes to get retweeted: An analysis of software vulnerability messages , 2018, Comput. Hum. Behav..

[122] Li-Jia Wei,et al. Sec-Buzzer: cyber security emerging topic mining with open threat intelligence retrieval and timeline event annotation , 2016, Soft Computing.

[123] From unstructured data to actionable intelligence , 2003 .

[124] Yi Yang,et al. Beating the Artificial Chaos: Fighting OSN Spam Using Its Own Templates , 2016, IEEE/ACM Transactions on Networking.

[125] Yuval Elovici,et al. CoBAn: A context based model for data leakage prevention , 2014, Inf. Sci..

[126] Cheng Huang,et al. A study on Web security incidents in China by analyzing vulnerability disclosure platforms , 2016, Comput. Secur..

[127] Stephen C. Adams,et al. Selecting System Specific Cybersecurity Attack Patterns Using Topic Modeling , 2018, 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE).

[128] Hang Li. Learning to Rank for Information Retrieval and Natural Language Processing , 2011, Synthesis Lectures on Human Language Technologies.

[129] Jian-hua Li,et al. Cyber security meets artificial intelligence: a survey , 2018, Frontiers of Information Technology & Electronic Engineering.

[130] Pierre Zweigenbaum,et al. Text mining applications in psychiatry: a systematic literature review , 2016, International journal of methods in psychiatric research.

[131] Ghazi Al-Naymat,et al. SMS Spam Detection using H2O Framework , 2017, EUSPN/ICTH.

[132] Mandar Mitra,et al. Information Retrieval from Documents: A Survey , 2000, Information Retrieval.

[133] Akshi Kumar,et al. Empirical Evaluation of Shallow and Deep Classifiers for Rumor Detection , 2020 .

[134] Lefteris Angelis,et al. A multi-target approach to estimate software vulnerability characteristics and severity scores , 2018, J. Syst. Softw..

[135] Vipul K. Dabhi,et al. A survey on semantic document clustering , 2015, 2015 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT).

[136] Jong Hyuk Park,et al. S-Detector: an enhanced security model for detecting Smishing attack for mobile computing , 2017, Telecommun. Syst..

[137] Georgios P. Petasis. Machine Learning in Natural Language Processing , 2012 .

[138] Awais Ahmad,et al. Towards ontology-based multilingual URL filtering: a big data problem , 2018, The Journal of Supercomputing.

[139] Yuqing Zhang,et al. ASVC: An Automatic Security Vulnerability Categorization Framework Based on Novel Features of Vulnerability Data , 2015, J. Commun..

[140] Basemah Alshemali,et al. Improving the Reliability of Deep Neural Networks in NLP: A Review , 2020, Knowl. Based Syst..

[141] Hsinchun Chen,et al. Selecting Attributes for Sentiment Classification Using Feature Relation Networks , 2011, IEEE Transactions on Knowledge and Data Engineering.

[142] Roberto Camacho Barranco,et al. Analyzing Evolving Trends of Vulnerabilities in National Vulnerability Database , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[143] Julian Jang,et al. A survey of emerging threats in cybersecurity , 2014, J. Comput. Syst. Sci..

[144] Wiem Tounsi,et al. A survey on technical threat intelligence in the age of sophisticated cyber attacks , 2018, Comput. Secur..

[145] Stavros Shiaeles,et al. Localising social network users and profiling their movement , 2019, Comput. Secur..

[146] Xingming Sun,et al. Linguistic steganalysis using the features derived from synonym frequency , 2012, Multimedia Tools and Applications.

[147] G. Manimaran,et al. Cybersecurity for Critical Infrastructures: Attack and Defense Modeling , 2010, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[148] Jay F. Nunamaker,et al. Identifying and Profiling Key Sellers in Cyber Carding Community: AZSecure Text Mining System , 2016, J. Manag. Inf. Syst..

[149] ChengXiang Zhai,et al. Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining , 2016 .

[150] Nicole Beebe,et al. Post-retrieval search hit clustering to improve information retrieval effectiveness: Two digital forensics case studies , 2011, Decis. Support Syst..

[151] Kyungho Lee,et al. Detecting Potential Insider Threat: Analyzing Insiders' Sentiment Exposed in Social Media , 2018, Secur. Commun. Networks.

[152] David M. Blei,et al. Probabilistic topic models , 2012, Commun. ACM.

[153] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..

[154] Flora Amato,et al. An integrated framework for securing semi-structured health records , 2015, Knowl. Based Syst..

[155] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.

[156] Stuart E. Madnick,et al. Systematically Understanding the Cyber Attack Business , 2018, ACM Comput. Surv..

[157] W. Bruce Croft,et al. A Deep Look into Neural Ranking Models for Information Retrieval , 2019, Inf. Process. Manag..

[158] Dr. Charu C. Aggarwal. Machine Learning for Text , 2018, Springer International Publishing.

[159] Nir Kshetri,et al. The simple economics of cybercrimes , 2006, IEEE Security & Privacy Magazine.

[160] Hongxia Jin,et al. Data leakage mitigation for discretionary access control in collaboration clouds , 2011, SACMAT '11.

[161] Carolyn F. Holton,et al. Identifying disgruntled employee systems fraud risk through text mining: A simple solution for a multi-billion dollar problem , 2009, Decis. Support Syst..

[162] Cristina Nita-Rotaru,et al. Leveraging Textual Specifications for Grammar-based Fuzzing of Network Protocols , 2019, AAAI.

[163] S. Ananiadou,et al. Using text mining for study identification in systematic reviews: a systematic review of current approaches , 2015, Systematic Reviews.

[164] Lejla Turulja,et al. Text Mining for Big Data Analysis in Financial Sector: A Literature Review , 2019, Sustainability.

[165] Vishal Gupta,et al. A systematic review of text stemming techniques , 2016, Artificial Intelligence Review.

[166] William J. Buchanan,et al. Machine learning and semantic analysis of in-game chat for cyberbullying , 2018, Comput. Secur..

[167] Danny Hendler,et al. Detecting Malicious PowerShell Commands using Deep Neural Networks , 2018, AsiaCCS.

[168] Bin Zhang,et al. Examining Hacker Participation Length in Cybercriminal Internet-Relay-Chat Communities , 2016, J. Manag. Inf. Syst..

[169] Ike Vayansky,et al. A review of topic modeling methods , 2020, Inf. Syst..

[170] C. Goose,et al. Glossary of Terms , 2004, Machine Learning.

[171] Mingxing He,et al. An efficient phishing webpage detector , 2011, Expert Syst. Appl..

[172] Maode Ma,et al. A Novel Mechanism for Fast Detection of Transformed Data Leakage , 2018, IEEE Access.

[173] Fengjun Li,et al. Cyber-Physical Systems Security—A Survey , 2017, IEEE Internet of Things Journal.

[174] Laurie A. Williams,et al. Relation extraction for inferring access control rules from natural language artifacts , 2014, ACSAC.

[175] Stephen Clark,et al. Practical Linguistic Steganography using Contextual Synonym Substitution and a Novel Vertex Coding Method , 2014, CL.

[176] Lior Rokach,et al. SFEM: Structural feature extraction methodology for the detection of malicious office documents using machine learning methods , 2016, Expert Syst. Appl..

[177] Sanggil Kang,et al. Code authorship identification using convolutional neural networks , 2019, Future Gener. Comput. Syst..

[178] Rui Zhao,et al. Fuzzy Bag-of-Words Model for Document Representation , 2018, IEEE Transactions on Fuzzy Systems.

[179] Jun Zhao,et al. Recurrent Convolutional Neural Networks for Text Classification , 2015, AAAI.

[180] Bhaskar Mitra,et al. Neural Models for Information Retrieval , 2017, ArXiv.

[181] Raghu Kacker,et al. It Doesn’t Have to Be Like This: Cybersecurity Vulnerability Trends , 2017, IT Professional.