Anomaly detection using firefly harmonic clustering algorithm

The performance of communication networks can be affected by a number of factors including misconfigu-ration, equipments outages, attacks originated from legitimate behavior or not, software errors, among many other causes. These factors may cause an unexpected change in the traffic behavior, creating what we call anomalies that may represent a loss of performance or breach of network security. Knowing the behavior pattern of the network is essential to detect and characterize an anomaly. Therefore, this paper presents an algorithm based on the use of Digital Signature of Network Segment (DSNS), used to model the traffic behavior pattern. We propose a clustering algorithm, K-Harmonic means (KHM), combined with a new heuristic approach, Firefly Algorithm (FA), for network volume anomaly detection. The KHM calculate a weighting function of each point to calculate new centroids and circumventing the initialization problem present in most center based clustering algorithm and exploits the search capability of FA from escaping local optima. Processing the DSNS data and real traffic adata is possible to detect and point intervals considered anomalous with a trade-off between the 90% true-positive rate and 30% false-positive rate.

[1]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[2]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[3]  Mohammed J. Zaki,et al.  ADMIT: anomaly-based data mining for intrusions , 2002, KDD.

[4]  Jung-Min Park,et al.  An overview of anomaly detection techniques: Existing solutions and latest technological trends , 2007, Comput. Networks.

[5]  Zülal Güngör,et al.  K-harmonic means data clustering with simulated annealing heuristic , 2007, Appl. Math. Comput..

[6]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[7]  Joel J. P. C. Rodrigues,et al.  Parameterized Anomaly Detection System with Automatic Configuration , 2009, GLOBECOM 2009 - 2009 IEEE Global Telecommunications Conference.

[8]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[9]  Sameh Otri,et al.  Data clustering using the bees algorithm , 2007 .

[10]  Tieli Sun,et al.  An efficient hybrid data clustering method based on K-harmonic means and Particle Swarm Optimization , 2009, Expert Syst. Appl..

[11]  Mario Lemes Proença,et al.  Baseline to help with network management , 2004, e-Business and Telecommunication Networks.

[12]  Qingbo Yang,et al.  A Survey of Anomaly Detection Methods in Networks , 2009, 2009 International Symposium on Computer Network and Multimedia Technology.

[13]  Shokri Z. Selim,et al.  K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Xin-She Yang,et al.  Firefly Algorithms for Multimodal Optimization , 2009, SAGA.

[15]  Umeshwar Dayal,et al.  K-Harmonic Means - A Data Clustering Algorithm , 1999 .

[16]  Joel J. P. C. Rodrigues,et al.  Anomaly detection using baseline and K-means clustering , 2010, SoftCOM 2010, 18th International Conference on Software, Telecommunications and Computer Networks.

[17]  Xin-She Yang,et al.  Nature-Inspired Metaheuristic Algorithms , 2008 .