ELF-Miner: using structural knowledge and data mining methods to detect new (Linux) malicious executables

Linux malware can pose a significant threat because Linux penetration is increasing exponentially while little is known or understood about Linux OS vulnerabilities. We believe that now is the right time to devise non-signature-based zero-day (previously unknown) malware detection strategies, before Linux intruders take us by surprise. Therefore, in this paper, we first perform a forensic analysis of Linux Executable and Linkable Format (ELF) files. This analysis provides insight into features that have the potential to discriminate malicious executables from benign ones. As a result, we select a set of 383 features extracted from ELF headers. We quantify the classification potential of these features using information gain and then remove redundant features by employing preprocessing filters. Finally, we extensively evaluate classical rule-based machine learning classifiers (RIPPER, PART, C4.5 Rules, and the decision tree J48) and bio-inspired classifiers (cAnt Miner, UCS, XCS, and GAssist) to select the best classifier for our system. We have evaluated our approach on an available collection of 709 Linux malware samples from VX Heavens and Offensive Computing. Our experiments show that ELF-Miner provides more than 99% detection accuracy with a false alarm rate of less than 0.1%.
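
To make the first two steps concrete, the following is a minimal sketch of reading structural fields from an ELF file header and scoring a feature with information gain. It is not the ELF-Miner implementation: the abstract does not enumerate the 383 features, so the sketch reads only the fixed-size file header fields defined by the ELF specification. The function names (`elf_header_features`, `information_gain`, `demo_header`) are hypothetical, and the information gain shown assumes discrete feature values; continuous features would need to be discretized first.

```python
import math
import struct
from collections import Counter

# Field names of the ELF file header that follow the 16-byte e_ident array.
FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff", "e_shoff",
          "e_flags", "e_ehsize", "e_phentsize", "e_phnum", "e_shentsize",
          "e_shnum", "e_shstrndx")


def elf_header_features(data: bytes) -> dict:
    """Return the ELF file header fields of `data` as a feature dictionary."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]  # 1/2 = 32/64-bit, 1/2 = little/big endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or data encoding")
    endian = "<" if ei_data == 1 else ">"
    # e_entry, e_phoff and e_shoff are 4 bytes wide in ELF32 and 8 bytes in ELF64.
    layout = "HHIIIIIHHHHHH" if ei_class == 1 else "HHIQQQIHHHHHH"
    values = struct.unpack_from(endian + layout, data, 16)
    features = dict(zip(FIELDS, values))
    features["ei_class"] = ei_class
    features["ei_data"] = ei_data
    return features


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels) -> float:
    """H(class) minus the expected H(class | feature value), for a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def demo_header(e_shnum: int) -> bytes:
    """Build a synthetic 64-bit little-endian ELF header (illustration only)."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    body = struct.pack("<HHIQQQIHHHHHH",
                       2, 62, 1, 0x400000, 64, 0, 0, 64, 56, 1, 64, e_shnum, 0)
    return ident + body


if __name__ == "__main__":
    # Toy data: the labels and section counts are made up for illustration.
    headers = [demo_header(n) for n in (0, 0, 27, 27)]
    labels = ["malicious", "malicious", "benign", "benign"]
    shnum = [elf_header_features(h)["e_shnum"] for h in headers]
    print("information gain of e_shnum:", information_gain(shnum, labels))
```

In a pipeline of the kind the abstract describes, each extracted feature would be scored this way against the malicious/benign labels, and low-scoring or redundant features would be filtered out before training a classifier.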
