An Algorithm for Automatic Cluster Number Determination in Network Intrusion Detection

To address the requirement of the fuzzy C-means (FCM) algorithm that the number of clusters be defined in advance, a clustering algorithm, F-CMSVM (fuzzy C-means and support vector machine), is proposed that determines the number of clusters automatically. First, the data set is partitioned into two clusters by FCM. Then, a support vector machine (SVM) with a fuzzy membership function is used to test whether the data set can be partitioned further. Finally, the resulting clusters are obtained by repeating this process. Because the membership matrix, obtained by introducing SVM into FCM, is defined to be the fuzzy membership function, each input sample can have a different penalty value, and the separating hyperplane is optimized. F-CMSVM is an unsupervised algorithm that requires neither a labeled training data set nor a specified number of clusters. Simulation experiments on network connection records from the KDD CUP 1999 data set show that F-CMSVM performs effectively in both cluster number optimization and intrusion detection.
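The split-and-test loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the SVM-based test nor the exact membership function, so several parts are assumptions. The function names (`fcm_two_clusters`, `fuzzy_svm_inputs`, `split_recursively`), the fuzzifier `m = 2`, the `min_size` guard, and the choice of each sample's largest membership as its penalty weight are all hypothetical. The SVM test itself is left as a caller-supplied callback, `can_split`, which stands in for whatever criterion F-CMSVM actually applies.

```python
import math


def fcm_two_clusters(points, m=2.0, iters=100, tol=1e-9):
    """Fuzzy C-means with c = 2; returns (centers, u), u[i][k] in [0, 1]."""
    # Deterministic start (an assumption): the first point and the point
    # farthest from it serve as the two initial centres.
    first = points[0]
    far = max(points, key=lambda p: math.dist(p, first))
    centers = [list(first), list(far)]
    u = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        u = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Centre update: v_k = sum_i u_ik^m x_i / sum_i u_ik^m.
        new_centers = []
        for k in range(2):
            w = [row[k] ** m for row in u]
            sw = sum(w)
            new_centers.append(
                [sum(wi * p[j] for wi, p in zip(w, points)) / sw
                 for j in range(len(first))]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def fuzzy_svm_inputs(u, C=1.0):
    """Derive SVM labels (+1 / -1) and per-sample penalties from memberships.

    Assumption: sample i gets penalty C * s_i, with s_i its largest membership.
    """
    labels = [1 if row[0] >= row[1] else -1 for row in u]
    penalties = [C * max(row) for row in u]
    return labels, penalties


def split_recursively(points, can_split, min_size=2):
    """Bisect with FCM until can_split (hypothetical SVM test) refuses."""
    if len(points) < 2 * min_size:
        return [points]
    _, u = fcm_two_clusters(points)
    labels, penalties = fuzzy_svm_inputs(u)
    if not can_split(points, labels, penalties):
        return [points]
    left = [p for p, y in zip(points, labels) if y == 1]
    right = [p for p, y in zip(points, labels) if y == -1]
    if not left or not right:
        return [points]
    return (split_recursively(left, can_split, min_size)
            + split_recursively(right, can_split, min_size))
```

The number of clusters is then simply `len(split_recursively(points, can_split))`, so it emerges from the recursion rather than being supplied up front. In a fuller version, `can_split` would train a weighted SVM on `labels` with the per-sample `penalties` and decide from the resulting hyperplane whether the two halves are genuinely separable.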