Machine Learning for Anomaly Detection and Categorization in Multi-Cloud Environments

Cloud computing has been widely adopted by application service providers (ASPs) and enterprises to reduce both capital expenditures (CAPEX) and operational expenditures (OPEX). Applications and services previously running in private data centers are now being migrated to private or public clouds. Since most ASPs and enterprises have globally distributed user bases, their services need to be distributed across multiple clouds spread across the globe, which can improve latency, scalability, and load balancing. This shift has led the research community to study multi-cloud environments. However, the widespread acceptance of such environments has been hampered by major security concerns. Firewalls and traditional rule-based security techniques are not sufficient to protect user data in multi-cloud scenarios. Recently, advances in machine learning have drawn the research community toward building intrusion detection systems (IDS) that can detect anomalies in network traffic. Most research works, however, do not differentiate among types of attacks, although this distinction is necessary for choosing appropriate countermeasures and defenses. In this paper, we investigate both detecting and categorizing anomalies, rather than detection alone, which is the common trend in contemporary research. We use a popular publicly available dataset to build and test learning models for both detection and categorization of different attacks. Specifically, we use two supervised machine learning techniques, namely linear regression (LR) and random forest (RF). We show that even if detection is perfect, categorization can be less accurate due to similarities between attacks. Our results demonstrate more than 99% detection accuracy and a categorization accuracy of 93.6%, with some attacks that could not be categorized. Further, we argue that such categorization can be applied to multi-cloud environments using the same machine learning techniques.
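The gap between detection and categorization described above can be made concrete with a small sketch. It is not the paper's evaluation code: the label names, the sample predictions, and both helper functions are hypothetical, and categorization accuracy is assumed here to mean the fraction of attack records assigned to the correct attack category. The sketch only shows how confusing two similar attack types lowers categorization accuracy while the normal-versus-attack decision stays perfect.

```python
from typing import Sequence

NORMAL = "normal"  # hypothetical label for benign traffic


def detection_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of records whose normal-vs-attack decision is correct."""
    hits = sum((t == NORMAL) == (p == NORMAL) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def categorization_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of true attack records given the correct attack category."""
    attacks = [(t, p) for t, p in zip(y_true, y_pred) if t != NORMAL]
    return sum(t == p for t, p in attacks) / len(attacks)


# Made-up labels for illustration only; not taken from the paper's dataset.
y_true = ["normal", "dos", "dos", "exploits", "exploits", "normal", "fuzzers", "dos"]
y_pred = ["normal", "dos", "exploits", "exploits", "dos", "normal", "fuzzers", "dos"]

det = detection_accuracy(y_true, y_pred)
cat = categorization_accuracy(y_true, y_pred)
print(f"detection accuracy:      {det:.3f}")
print(f"categorization accuracy: {cat:.3f}")
```

In this toy data every attack is flagged as an attack, so detection is perfect, yet two of the six attack records are assigned to the wrong category because the two hypothetical attack types are confused with each other.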
