Open Computer Forensic Architecture a Way to Process Terabytes of Forensic Disk Images

This chapter describes the Open Computer Forensics Architecture (OCFA), an automated system that dissects complex file types, extracts metadata from files and ultimately creates indexes on forensic images of seized computers. It consists of a set of collaborating processes, called modules. Each module is specialized in processing a certain file type. When it receives a so called ’evidence’, the information that has been extracted so far about the file together with the actual data, it either adds new information about the file or uses the file to derive a new ’evidence’. All evidence, original and derived, is sent to a router after being processed by a particular module. The router decides which module should process the evidence next, based upon the metadata associated with the evidence. Thus the OCFA system can recursively process images until from every compound file the embedded files, if any, are extracted, all information that the system can derive, has been derived and all extracted text is indexed. Compound files include, but are not limited to, archive- and zip-files, disk images, text documents of various formats and, for example, mailboxes. The output of an OCFA run is a repository full of derived files, a database containing all extracted information about the files and an index which can be used when searching. This is presented in a web interface. Moreover, processed data is easily fed to third party software for further analysis or to be used in data mining or text mining-tools. The main advantages of the OCFA system are: 1. Scalability, it is able to process large amounts of data. 2. Extendable, it is easy to develop and plug in custom modules. 3. Open, the output is well suited to be used as input for other systems. 4. Analysts and tactical investigators may search the evidence without the constant intervention of digital investigators.