Experimenting and assessing machine learning tools for detecting and analyzing malicious behaviors in complex environments

This paper proposes the application and experimental assessment of machine learning tools for security problems in complex environments, specifically the identification and analysis of malicious behaviors. To evaluate how effectively machine learning algorithms detect anomalies, we consider three real-world case studies: (i) detecting and analyzing Tor traffic with a machine learning-based discrimination technique; (ii) identifying and analyzing CAN bus attacks via deep learning; and (iii) detecting and analyzing mobile malware, in particular ransomware in Android environments, by means of structural entropy-based classification. The observations derived from these case studies confirm the effectiveness of machine learning in supporting the security of complex environments.
