A scalable file-based data store for forensic analysis

In the field of remote forensics, the GRR Response Rig has been used to access and store data from thousands of enterprise machines. Handling large numbers of machines requires efficient and scalable storage mechanisms that allow concurrent data operations and efficient data access, independent of the size of the stored data and the number of machines in the network. We studied the available GRR storage mechanisms and found them lacking in both speed and scalability. In this paper, we propose a new distributed data store that partitions data into database files that can be accessed independently so that distributed forensic analysis can be done in a scalable fashion. We also show how to use the NSRL software reference database in our scalable data store to avoid wasting resources when collecting harmless files from enterprise machines.
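As a rough illustration of the two ideas in this abstract, the sketch below keeps one database file per machine so that different machines can be read and written independently, and it skips files whose hash appears in a known-software set. It is a minimal sketch under stated assumptions, not the data store proposed in the paper: SQLite as the file format, the `PartitionedStore` class, the `should_collect` helper, the table layout, and the client IDs are all hypothetical, and a small in-memory set stands in for the NSRL hash list.

```python
import hashlib
import os
import sqlite3
import tempfile


class PartitionedStore:
    """Toy store (hypothetical): one SQLite file per client machine."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, client_id):
        # Hash the client ID so arbitrary IDs map to safe file names.
        name = hashlib.sha1(client_id.encode("utf-8")).hexdigest()
        return os.path.join(self.root, name + ".sqlite")

    def _connect(self, client_id):
        # Each client has its own file, so a write for one client
        # never takes a lock on another client's data.
        conn = sqlite3.connect(self._path(client_id))
        conn.execute(
            "CREATE TABLE IF NOT EXISTS attrs ("
            "subject TEXT, attribute TEXT, value TEXT, "
            "PRIMARY KEY (subject, attribute))"
        )
        return conn

    def set(self, client_id, subject, attribute, value):
        conn = self._connect(client_id)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT OR REPLACE INTO attrs VALUES (?, ?, ?)",
                    (subject, attribute, value),
                )
        finally:
            conn.close()

    def get(self, client_id, subject, attribute):
        conn = self._connect(client_id)
        try:
            row = conn.execute(
                "SELECT value FROM attrs WHERE subject = ? AND attribute = ?",
                (subject, attribute),
            ).fetchone()
        finally:
            conn.close()
        return row[0] if row else None


def should_collect(file_bytes, known_hashes):
    """Return False when the file's SHA-1 is in the known-software set."""
    return hashlib.sha1(file_bytes).hexdigest() not in known_hashes


# Usage: the known-hash set is a stand-in for an NSRL lookup (assumption).
root = tempfile.mkdtemp()
store = PartitionedStore(root)
known = {hashlib.sha1(b"common system file").hexdigest()}
samples = [("C.1", b"common system file"), ("C.2", b"unusual binary")]
for client, data in samples:
    if should_collect(data, known):
        digest = hashlib.sha1(data).hexdigest()
        store.set(client, "/fs/sample", "sha1", digest)
```

Because each client maps to a separate file, workers handling different machines do not contend for the same database lock, which is the property the abstract relies on for scalable distributed analysis. Filtering on the hash before storing avoids spending transfer and storage on files that are already catalogued.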
