Unique file identification in the National Software Reference Library

The National Software Reference Library (NSRL) provides a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations involved in computer forensic investigations. An investigator may encounter hundreds of thousands of files during a single forensic investigation. The NSRL is used to identify which of those files are already known, which can reduce the time spent examining a computer: files that match common operating systems and applications do not need to be searched for evidence, either manually or electronically. The NSRL is also used to determine which software applications are present on a system, which may suggest how the computer was being used and indicate how and where to search for evidence. This paper examines whether the techniques used to create file signatures in the NSRL produce unique results, a core characteristic on which the majority of the NSRL's uses depend. The uniqueness of the file identification is analyzed in two ways: an empirical analysis of the file signatures within the NSRL, and a review of recent attacks on the hash algorithms used to generate those signatures.
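The known-file filtering and the empirical uniqueness check described above can be illustrated with a minimal sketch. It assumes signatures are SHA-1 and MD5 digests of file contents and that the reference set is an in-memory set of SHA-1 values; the function names (`file_signature`, `split_known_unknown`, `find_collisions`) and the sample data are hypothetical and are not taken from the NSRL tooling or from the paper's method.

```python
import hashlib
from collections import defaultdict


def file_signature(data: bytes) -> tuple[str, str]:
    """Return (SHA-1, MD5) hex digests of a file's contents."""
    return hashlib.sha1(data).hexdigest(), hashlib.md5(data).hexdigest()


def split_known_unknown(files: dict[str, bytes], known_sha1: set[str]):
    """Partition file names into known (digest in reference set) and unknown."""
    known, unknown = [], []
    for name, data in files.items():
        sha1, _ = file_signature(data)
        (known if sha1 in known_sha1 else unknown).append(name)
    return known, unknown


def find_collisions(files: dict[str, bytes]) -> dict[str, set[bytes]]:
    """Return SHA-1 digests shared by files whose contents differ."""
    contents_by_digest = defaultdict(set)
    for data in files.values():
        sha1, _ = file_signature(data)
        contents_by_digest[sha1].add(data)
    # Identical contents under one digest are duplicates, not collisions.
    return {d: c for d, c in contents_by_digest.items() if len(c) > 1}


if __name__ == "__main__":
    # Hypothetical sample data standing in for files found on a seized disk.
    files = {
        "a.dll": b"system library",
        "b.txt": b"user notes",
        "c.dll": b"system library",
    }
    reference = {hashlib.sha1(b"system library").hexdigest()}
    known, unknown = split_known_unknown(files, reference)
    print("known:", known)
    print("needs review:", unknown)
    print("collisions:", find_collisions(files))
```

Only the files in the unknown list would need further examination. A non-empty result from `find_collisions` would indicate that a signature does not uniquely identify a file, which is the property the paper sets out to test.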
