Data Clustering for Anomaly Detection in Content-Centric Networks

Content-Centric Networks (CCNs) have recently emerged as an innovative approach to overcoming many inherent security problems of IP-based (host-based) networks by securing the content itself rather than the channel through which it travels. New kinds of attacks, ranging from DoS to privacy attacks, will appear in this network architecture. It is therefore becoming necessary to design a flexible and powerful mechanism that can detect them intelligently the first time they are employed. This paper proposes a novel anomaly detection system that detects both known and previously unknown types of attacks over CCN traffic flows. The system uses an efficient unsupervised learning engine based on clustering, and aims to achieve the optimal number of clusters, a high detection rate, and a low false positive rate at the same time. The paper compares the performance of five clustering algorithms in the proposed anomaly detection system: K-means and Farthest First (partitioning clustering), Cobweb (hierarchical clustering), DBSCAN (density-based clustering), and Self-Organizing Map (SOM) (model-based clustering). The results show that DBSCAN is the most efficient method for this purpose, since it outperforms the others by achieving a high detection rate and a low false positive rate at the same time.
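The two evaluation criteria named in the abstract, detection rate and false positive rate, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a clustering step has already assigned each traffic flow a cluster label, that flows left outside every cluster carry the label -1 (a convention borrowed from common DBSCAN implementations), and that such flows are flagged as anomalies. The function names, the labels, and the ground-truth values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

NOISE = -1  # assumed label for flows that belong to no cluster


def flag_anomalies(cluster_labels: Sequence[int]) -> List[bool]:
    """Flag a flow as anomalous when the clustering left it unassigned."""
    return [label == NOISE for label in cluster_labels]


def detection_and_false_positive_rate(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Return (detection rate, false positive rate).

    detection rate      = TP / (TP + FN)
    false positive rate = FP / (FP + TN)
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Hypothetical cluster labels for eight flows and their true status.
labels = [0, 0, 1, -1, -1, 1, 0, -1]
is_attack = [False, False, False, True, True, False, True, False]

dr, fpr = detection_and_false_positive_rate(flag_anomalies(labels), is_attack)
print(f"detection rate = {dr:.3f}, false positive rate = {fpr:.3f}")
```

In this made-up example two of the three attacks are flagged and one of the five normal flows is flagged by mistake, so a method that does well on both criteria needs a detection rate near 1 and a false positive rate near 0 together.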
