Preliminary Analysis of Malware Detection in Opcode Sequences within IoT Environment

With advances in technology and communication, the Internet of Things (IoT) has come to play an essential role in providing many everyday services through millions of heterogeneous but interconnected devices and nodes. This growth also introduces many security and privacy challenges that can lead to complete network breakdown, bypassed access control, or the loss of critical data. This paper presents a preliminary analysis of malware detection in data generated by IoT-based devices and services in the form of operational code (opcode) sequences. Three machine learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and k-Nearest Neighbors (k-NN), are evaluated and compared on accuracy, precision, recall, and F-measure. The results show that RF achieved the best accuracy of 98%, followed by SVM and k-NN, both with 91%. The results are further analyzed using the Receiver Operating Characteristic (ROC) curve and the Precision-Recall curve to illustrate the differences in performance among the three algorithms on IoT data.
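
For readers unfamiliar with the four evaluation measures named above, the following is a minimal sketch of how they are conventionally computed from a classifier's predictions in a binary malware/benign setting. It is not the paper's implementation: the function name `binary_metrics`, the `Metrics` container, the label encoding (1 for malware, 0 for benign), and the sample labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-measure for a binary task.

    `positive` is the label treated as the malware class (an assumption;
    the paper does not state its label encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is the balanced F1 score: the harmonic mean of
    # precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f_measure)


# Hypothetical labels: 4 malware samples (1) and 6 benign samples (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
m = binary_metrics(y_true, y_pred)
print(m)
```

Reporting precision and recall alongside accuracy matters when the malware and benign classes are imbalanced, since accuracy alone can look high even if most malware samples are missed.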
