Malware classification using instruction frequencies

Developing variants of malware is a common and effective method to avoid the signature detection of antivirus programs. Malware analysis and signature abstraction are essential technologies to update the detection signature DB for malware detection. Since most malware binary analysis processes are performed manually, malware binary analysis is a time-consuming job. Therefore, efficient malware classification can be used to speed up malware binary analysis. As malware variants of the same malware family may share a portion of their binary code, the sequences of instructions may be similar, or even identical. In this paper, we propose a malware classification method that uses instruction frequencies. Our test results show that there are clear distinctions among malware and normal programs.