Modeling and clustering attacker activities in IoT through machine learning techniques

Abstract With the rise of the Internet of Things (IoT), malicious attacks pose serious threats to the large number of vulnerable IoT devices. Recently, attackers have launched increasingly coordinated attacks, commonly through botnets. However, effectively detecting a botnet from attacker activities has proven challenging. In this paper, we propose a machine learning-based method for modeling attacker activities, based on the following intuitive observation: attackers in the same botnet tend to launch temporally close attacks. We directly model the temporal patterns of attacks using a special class of point process called the Multivariate Hawkes Process. Intuitively, the Multivariate Hawkes Process identifies the latent influences between attackers by exploiting its mutually exciting property. We then cluster the attacker activities based on the inferred weighted influence matrix using a graph-based clustering approach. To evaluate the applicability of our method, we deployed 10 honeypots in a real-world environment and conducted experiments on the collected dataset. The results show that our method can effectively identify the activity patterns and structure of botnets.
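The two steps described in the abstract, a mutually exciting intensity and clustering on the influence matrix, can be illustrated with a minimal sketch. The abstract does not specify the kernel, the inference procedure, or the clustering algorithm, so everything below is an assumption made for illustration: an exponential decay kernel with rate `beta`, a hand-written influence matrix `alpha` standing in for the inferred one, and connected components of a thresholded graph standing in for the graph-based clustering. The function names `hawkes_intensity` and `cluster_by_influence` are hypothetical.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of every attacker at time t.

    events: list of (time, attacker_index) pairs for past attacks
    mu[i]: baseline attack rate of attacker i
    alpha[i][j]: influence of attacker j's past attacks on attacker i
    beta: decay rate of the exponential kernel (an assumed kernel choice)
    """
    lam = list(mu)
    for s, j in events:
        if s < t:
            # Each past attack by j raises i's rate, fading over time.
            decay = math.exp(-beta * (t - s))
            for i in range(len(mu)):
                lam[i] += alpha[i][j] * decay
    return lam

def cluster_by_influence(alpha, threshold):
    """Group attackers by connected components of the influence graph.

    Attackers u and v are linked when the influence in either direction
    reaches the threshold (a simple stand-in for graph-based clustering).
    """
    n = len(alpha)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                linked = max(alpha[u][v], alpha[v][u]) >= threshold
                if labels[v] == -1 and linked:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

# Toy data: attackers 0 and 1 excite each other, as do attackers 2 and 3.
mu = [0.1, 0.1, 0.1, 0.1]
alpha = [
    [0.0, 0.6, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.4, 0.0],
]
events = [(0.0, 0), (1.0, 2)]

lam = hawkes_intensity(2.0, events, mu, alpha, beta=1.0)
labels = cluster_by_influence(alpha, threshold=0.3)
print(lam)
print(labels)
```

In this toy setup the attack by attacker 0 raises only attacker 1's intensity, and the attack by attacker 2 raises only attacker 3's, so the thresholded graph splits into two groups. In practice the influence matrix would be estimated from honeypot logs rather than written by hand.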
