Big forensic data reduction: digital forensic images and electronic evidence

The increasing volume of data and the growing number of devices continue to challenge digital forensics. One proposed way to deal with the problem of “big digital forensic data” (the volume, variety, and velocity of digital forensic data) is to reduce the volume of data at either the collection stage or the processing stage. We have developed a novel approach which significantly improves on current practice, and in this paper we outline our data volume reduction process, which focuses on imaging a selection of key files and data such as registry, documents, spreadsheets, email, internet history, communications, logs, pictures, videos, and other relevant file types. When applied to test cases, a hundredfold reduction of original media volume was observed. When applied to real world cases of an Australian Law Enforcement Agency, the data volume was further reduced to a small percentage of the original media volume, whilst retaining key evidential files and data. The reduction process was applied to a range of real world cases, and review by experienced investigators and detectives highlighted that evidential data was present in the data-reduced forensic subset files. A data reduction approach is applicable in a range of areas, including digital forensic triage, analysis, review, intelligence analysis, presentation, and archiving. In addition, the data reduction process outlined can be applied using common digital forensic hardware and software solutions available in appropriately equipped digital forensic labs, without requiring the purchase of additional software or hardware. The process can be applied to a wide variety of cases, such as terrorism and organised crime investigations. It is intended to provide a capability to rapidly process data and to gain an understanding of the information and/or locate key evidence or intelligence in a timely manner.
