A Digital Forensics Triage methodology based on feature manipulation techniques

The evolution of modern digital devices is outpacing the scalability and effectiveness of Digital Forensics techniques. Digital Forensics Triage is one solution to this problem, as it can extract evidence quickly at the crime scene and provide vital intelligence in time-critical investigations. Such methodologies can also be used in a laboratory to prioritize deeper analysis of digital devices and alleviate the examination backlog. Developments in Digital Forensics Triage methodologies have moved towards automating the device classification process, and those which incorporate Machine Learning principles have proven successful. Such an approach depends on crime-related features, which provide a relevant basis upon which device classification can take place. In addition, to be an accepted and viable methodology, it should also be as accurate as possible. Previous work has concentrated on feature extraction and classification, while less attention has been paid to improving classification accuracy through feature manipulation. Among the several techniques available for this purpose, we concentrate on feature weighting, a process which places more importance on specific features. A twofold approach is followed: on the one hand, automated feature weights are quantified using the Kullback-Leibler measure and applied to the training set; on the other hand, manual weights are determined with the contribution of surveyed digital forensic experts. Experimental results for manual and automatic feature weighting are described, and they show that both techniques are effective in improving device classification accuracy in crime investigations.
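To make the automated weighting idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes discrete features and defines the weight of a feature as the average Kullback-Leibler divergence between the class distribution conditioned on each feature value and the class prior, with each value weighted by its frequency. The function name `kl_feature_weights`, the toy data, and the omission of any normalization step are all assumptions made for illustration.

```python
import math
from collections import Counter


def kl_feature_weights(rows, labels):
    """Return one weight per feature column (hypothetical helper).

    Assumed definition: weight_i = sum over values v of feature i of
    P(v) * KL(P(C | v) || P(C)), using the natural logarithm.
    """
    n = len(rows)
    prior = {c: k / n for c, k in Counter(labels).items()}
    weights = []
    for i in range(len(rows[0])):
        # Count class labels separately for each observed value of feature i.
        by_value = {}
        for row, label in zip(rows, labels):
            by_value.setdefault(row[i], Counter())[label] += 1
        weight = 0.0
        for counts in by_value.values():
            total = sum(counts.values())
            # KL divergence between P(C | value) and the class prior P(C).
            kl = sum(
                (k / total) * math.log((k / total) / prior[c])
                for c, k in counts.items()
            )
            weight += (total / n) * kl
        weights.append(weight)
    return weights


# Toy data (invented): feature 0 separates the classes, feature 1 does not.
rows = [("hi", "x"), ("hi", "y"), ("lo", "x"), ("lo", "y")]
labels = ["a", "a", "b", "b"]
weights = kl_feature_weights(rows, labels)
```

In this toy example the first feature receives a positive weight and the second a weight of zero, which is the behavior one would want from a weighting scheme that emphasizes class-discriminating features. How such weights are then applied to the training set or the classifier is left open here.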
