Chi-squared distance and metamorphic virus detection

Metamorphic malware changes its internal structure with each generation while maintaining its original behavior. Current commercial antivirus software generally scans for known malware signatures and therefore cannot detect metamorphic malware that sufficiently morphs its internal structure. Machine learning methods such as hidden Markov models (HMMs) have shown promise for detecting hacker-produced metamorphic malware. However, previous research has shown that HMM-based detection can be evaded by carefully morphing with content from benign files. In this paper, we combine HMM detection with a statistical technique based on the chi-squared test to build an improved detection method. We discuss our technique in detail and provide experimental evidence to support our claim of improved detection.
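To make the chi-squared idea concrete, the following is a minimal sketch of a Pearson chi-squared statistic computed over opcode counts. It is an illustration under stated assumptions, not the scoring procedure from the paper: the function name `chi_squared_score`, the `expected_freq` dictionary of opcode frequencies estimated from known family members, and the `floor` smoothing value are all hypothetical choices made for this example.

```python
from collections import Counter


def chi_squared_score(opcodes, expected_freq, floor=1e-6):
    """Pearson chi-squared statistic of observed opcode counts vs. expected counts.

    opcodes:       list of opcode mnemonics extracted from the file under test
    expected_freq: dict mapping opcode -> relative frequency in the reference
                   family (hypothetical input, assumed to sum to about 1)
    floor:         small frequency assigned to opcodes missing from
                   expected_freq, to avoid division by zero (an assumption)
    """
    n = len(opcodes)
    if n == 0:
        raise ValueError("empty opcode sequence")

    observed = Counter(opcodes)
    symbols = set(expected_freq) | set(observed)

    score = 0.0
    for s in symbols:
        # Expected count of opcode s in a sequence of length n.
        expected = max(expected_freq.get(s, 0.0), floor) * n
        diff = observed.get(s, 0) - expected
        score += diff * diff / expected
    return score


if __name__ == "__main__":
    family = {"mov": 0.5, "push": 0.25, "call": 0.25}
    print(chi_squared_score(["mov", "mov", "push", "call"], family))
    print(chi_squared_score(["mov", "mov", "mov", "mov"], family))
```

In this sketch a lower score means the file's opcode distribution is closer to the reference distribution. How such a score would be thresholded or combined with an HMM score is not specified in this abstract and is left open here.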
