A Frequency Approach to Creation of Executable File Signatures for their Identification

The paper presents methods for creating executable file signatures based on the frequency distributions of their informative features, for use in program identification. Identification is understood here as the process of recognizing a file by establishing that it matches a particular program. A new approach to building an archive of program signatures is presented, using both the byte-frequency distribution of a program's binary code and the frequency distribution of assembler commands in its disassembled code. A new method of executable file identification is proposed, and the results of identification experiments using Fisher's φ* statistical criterion and analysis of the slope are reported. The proposed method can be used to audit data storage media.
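As an illustration of the two ingredients named above, the following is a minimal sketch, not the authors' implementation. It assumes that a byte-frequency signature is simply the relative frequency of each of the 256 byte values, and that the φ* criterion takes its common textbook form (Fisher's angular transformation, φ = 2·arcsin(√p), applied to two proportions). The function names `byte_frequency_signature` and `fisher_phi_star` are hypothetical, and the paper may define or apply the criterion differently.

```python
import math
from collections import Counter


def byte_frequency_signature(data: bytes) -> list[float]:
    """Relative frequency of each of the 256 byte values (assumed signature form)."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def fisher_phi_star(p1: float, n1: int, p2: float, n2: int) -> float:
    """Textbook phi* statistic for two proportions p1, p2 with sample sizes n1, n2.

    Assumption: phi = 2 * asin(sqrt(p)) and
    phi* = |phi1 - phi2| * sqrt(n1 * n2 / (n1 + n2)).
    """
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return abs(phi1 - phi2) * math.sqrt(n1 * n2 / (n1 + n2))


if __name__ == "__main__":
    # Two made-up byte strings standing in for executable contents.
    a = bytes([0x90] * 30 + [0x00] * 70)
    b = bytes([0x90] * 25 + [0x00] * 75)
    sig_a = byte_frequency_signature(a)
    sig_b = byte_frequency_signature(b)
    # Compare the share of byte 0x90 in the two samples.
    print(fisher_phi_star(sig_a[0x90], len(a), sig_b[0x90], len(b)))
```

Under these assumptions, a small φ* value for a given byte indicates that its share in the two files does not differ noticeably. How per-feature comparisons are combined into an identification decision is left to the method described in the paper.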