An Enhanced Multiclass Support Vector Machine Model and its Application to Classifying File Systems Affected by a Digital Crime

Abstract The digital revolution we are witnessing goes hand in hand with a revolution in cybercrime, which has made digital forensics (DF) a pressing and timely topic to investigate. The file system is a rich source of digital evidence that may prove or disprove a digital crime. Yet, although many tools can extract potentially conclusive evidence from the file system, there is still a need for effective techniques that evaluate the extracted evidence and link it directly to a digital crime. Machine learning is a possible solution. This article proposes an Enhanced Multiclass Support Vector Machine (EMSVM) model that aims to improve classification performance. The EMSVM introduces a new technique for selecting the most effective set of parameters when building an SVM model. In addition, since DF is a multiclass classification problem, because a file system might be accessed by more than one application, the EMSVM enhances the class assignment mechanism by supporting multiclass classification. The article then investigates the applicability of the proposed model to analysing incriminating digital evidence by inspecting the historical activities of file systems to determine whether a malicious program manipulated them. The results obtained from the proposed model were promising when compared with several machine-learning algorithms.
