Synthetic Data Creation for Forensic Tool Testing: Improving Performance of the 3LSPG Framework

Growing volumes of data demand more effective and efficient forensic tools. Newly developed tools have to be evaluated, e.g., by applying them to test data. 3LSPG was recently proposed as a framework for generating synthetic test data by simulating the activities of subjects using Markov chains. However, the generation of test data should itself be efficient. In this paper, we show how to improve the efficiency of 3LSPG considerably compared to its original proposal. We speed up the calculation of state transition probabilities in the Markov model of 3LSPG by proposing an algorithm that is much faster and more reliable than the one originally used. Our algorithm is based on the simplex algorithm, although the latter is typically used for a different purpose, namely solving optimization problems. It enables the creation of synthetic data for forensic tool testing with 3LSPG in significantly less time.
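
To illustrate what simulating the activities of a subject with a Markov chain can look like, the following is a minimal sketch under stated assumptions. The state names, the transition probabilities, and the functions `validate` and `simulate` are hypothetical and not taken from 3LSPG. The sketch only samples an activity sequence from a given transition matrix; it does not reproduce the simplex-based calculation of transition probabilities proposed in the paper.

```python
import random

# Hypothetical activity states and transition probabilities (illustrative
# values only, not taken from 3LSPG). Each row maps a current state to the
# probabilities of the possible next states.
TRANSITIONS = {
    "idle":   {"idle": 0.5, "browse": 0.3, "email": 0.2},
    "browse": {"idle": 0.4, "browse": 0.5, "email": 0.1},
    "email":  {"idle": 0.6, "browse": 0.2, "email": 0.2},
}


def validate(transitions, tol=1e-9):
    """Check that every row is a probability distribution over known states."""
    for state, row in transitions.items():
        if any(target not in transitions for target in row):
            raise ValueError(f"unknown target state in row {state!r}")
        if any(p < 0 for p in row.values()):
            raise ValueError(f"negative probability in row {state!r}")
        if abs(sum(row.values()) - 1.0) > tol:
            raise ValueError(f"row {state!r} does not sum to 1")


def simulate(transitions, start, steps, rng):
    """Sample a sequence of steps + 1 states, beginning with start."""
    validate(transitions)
    state = start
    trace = [state]
    for _ in range(steps):
        targets = list(transitions[state])
        weights = [transitions[state][t] for t in targets]
        # Draw the next state according to the current row's probabilities.
        state = rng.choices(targets, weights=weights, k=1)[0]
        trace.append(state)
    return trace


if __name__ == "__main__":
    # A seeded generator makes the synthetic activity sequence reproducible.
    print(simulate(TRANSITIONS, "idle", 10, random.Random(42)))
```

Passing a seeded `random.Random` instance keeps a generated sequence reproducible, which is convenient when the same test data has to be regenerated for repeated tool evaluations.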
