A multi-protocol network log clustering method based on grid
To handle large-scale network logs and provide concise data sources for later log analysis, this thesis proposes a grid-based clustering method for multi-protocol network logs. The method maps every log record into a data grid and performs an initial clustering within the grid. It then applies a similarity judgment to the initial clusters to carry out a secondary clustering, and finally outputs the clustered logs together with sparse data and outlier data. Experimental results show that the method can effectively compress log storage, reduce time complexity, handle real dynamic data, and support incremental clustering.
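The two-stage procedure in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes each log record has already been turned into a numeric feature vector, and the function name `grid_cluster`, the parameters `cell_size`, `min_points`, and `merge_distance`, and the use of Euclidean distance between centroids as the similarity judgment are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import dist, floor


def grid_cluster(points, cell_size=1.0, min_points=2, merge_distance=1.5):
    """Two-stage grid clustering sketch (all parameters are hypothetical)."""
    # Stage 1: assign each point to a grid cell by flooring its coordinates.
    cells = defaultdict(list)
    for p in points:
        key = tuple(floor(x / cell_size) for x in p)
        cells[key].append(p)

    # Dense cells become initial clusters; points in the remaining cells
    # are set aside as sparse or outlier candidates.
    initial = [pts for pts in cells.values() if len(pts) >= min_points]
    sparse = [p for pts in cells.values() if len(pts) < min_points for p in pts]

    # Centroid of each initial cluster, used for the assumed similarity test.
    centroids = [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in initial]

    # Stage 2: merge initial clusters with close centroids via union-find.
    parent = list(range(len(initial)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(initial)):
        for j in range(i + 1, len(initial)):
            if dist(centroids[i], centroids[j]) <= merge_distance:
                parent[find(j)] = find(i)

    merged = defaultdict(list)
    for i, pts in enumerate(initial):
        merged[find(i)].extend(pts)
    return list(merged.values()), sparse
```

Calling `grid_cluster(points)` returns a pair: the merged clusters and the leftover points from low-density cells. In this sketch the second stage compares only the dense cells with one another rather than all raw records, so its cost depends on the number of initial clusters, not on the number of logs.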