Ensemble Learning for Effective Run-Time Hardware-Based Malware Detection: A Comprehensive Analysis and Classification

Malware detection at the hardware level has recently emerged as a promising way to improve the security of computing systems. Hardware-based malware detectors use Machine Learning (ML) classifiers to detect patterns of malicious applications at run-time. These classifiers are trained on low-level features, such as data from processor Hardware Performance Counters (HPCs), which are captured at run-time to represent application behaviour. Recent studies show the potential of standard ML-based classifiers for detecting malware by analysing a large number of microarchitectural events, more than the very limited number of HPC registers available in today’s microprocessors, which varies from 2 to 8. Collecting the required data therefore means executing the application more than once, which makes the solution less practical for effective run-time malware detection. Our results show a clear trade-off between the performance of standard ML classifiers and the number and diversity of HPCs available in modern microprocessors. This paper proposes a machine learning-based solution that breaks this trade-off to realize effective run-time detection of malware. We propose ensemble learning techniques that improve the performance of hardware-based malware detectors while using only a very small number of microarchitectural events captured at run-time by existing HPCs, eliminating the need to run an application several times. For this purpose, eight robust machine learning models and two well-known ensemble learning classifiers applied to all studied ML models (sixteen in total) are implemented for malware detection and compared and characterized in terms of detection accuracy, robustness, performance (accuracy × robustness), and hardware overheads.
The experimental results show that the proposed ensemble learning-based malware detection with just 2 HPCs outperforms standard classifiers with 8 HPCs by up to 17%. In addition, it can match the robustness and performance of standard ML-based detectors with 16 HPCs while using only 4 HPCs, allowing effective run-time detection of malware.
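
To make the idea of ensemble learning over a very small HPC feature set concrete, the following is a minimal sketch, not the paper's implementation. It assumes bagging as the ensemble technique (the abstract does not name the two ensemble classifiers used), decision stumps as a hypothetical weak base learner, and purely synthetic two-feature data standing in for samples from 2 HPCs. All function names are illustrative.

```python
import random


def train_stump(rows, labels):
    """Fit a one-feature threshold rule with the lowest training error."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for polarity in (1, -1):
                # polarity 1: predict malware (1) when the value exceeds t
                preds = [1 if polarity * (r[f] - t) > 0 else 0 for r in rows]
                err = sum(p != y for p, y in zip(preds, labels))
                if best is None or err < best[0]:
                    best = (err, f, t, polarity)
    return best[1:]


def stump_predict(stump, row):
    f, t, polarity = stump
    return 1 if polarity * (row[f] - t) > 0 else 0


def train_bagging(rows, labels, n_estimators=15, seed=0):
    """Train each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return ensemble


def bagging_predict(ensemble, row):
    """Majority vote over the base learners."""
    votes = sum(stump_predict(s, row) for s in ensemble)
    return 1 if 2 * votes > len(ensemble) else 0


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def make_synthetic(n, seed):
    """Synthetic stand-in for 2-HPC samples (assumption, not real counter data).

    Label 1 (malware) shifts both hypothetical counter values upward.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randrange(2)
        rows.append((rng.gauss(1.0 + 2.0 * y, 1.0), rng.gauss(5.0 + 2.0 * y, 1.0)))
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    train_rows, train_labels = make_synthetic(200, seed=1)
    test_rows, test_labels = make_synthetic(200, seed=2)
    model = train_bagging(train_rows, train_labels)
    preds = [bagging_predict(model, r) for r in test_rows]
    print("ensemble accuracy:", accuracy(preds, test_labels))
```

The sketch only illustrates the mechanism: several weak classifiers, each seeing a resampled view of the same few features, are combined by voting. It says nothing about the accuracy, robustness, or hardware overhead figures reported in the paper.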
