Identifying dominant network flows is important for network anomaly detection. Estan et al. proposed an algorithm that detects dominant network flows by constructing multidimensional clusters based on the "natural hierarchy" present in the five-tuple information of network flows. Wang et al. improved this algorithm by significantly reducing its computational complexity. In practice, however, the algorithm's execution time can still be long when it handles large volumes of traffic with a low threshold. In this paper, we introduce a practical technique that further improves the time efficiency of Wang et al.'s algorithm. Our approach simplifies the hierarchical structure of network traffic by using local IP subnet information. We compare the performance of our approach with that of Wang et al.'s algorithm using real NetFlow data collected from a large campus network. The experimental results show that our algorithm is considerably more time-efficient than Wang et al.'s algorithm.
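To make the notions of "natural hierarchy" and threshold concrete, the following is a minimal one-dimensional sketch, not the algorithm of Estan et al. or Wang et al. It aggregates bytes per source-IP prefix and reports every prefix whose share of traffic reaches a threshold. The function name `dominant_prefixes`, the `(src_ip, byte_count)` input shape, and the `prefix_lengths` parameter are hypothetical choices made for illustration only; how the paper derives its simplified hierarchy from local subnet information is not specified in this abstract.

```python
from collections import Counter
import ipaddress


def dominant_prefixes(flows, threshold, prefix_lengths=(8, 16, 24, 32)):
    """Return {prefix: bytes} for source prefixes carrying >= threshold of all bytes.

    flows: iterable of (src_ip, byte_count) pairs (hypothetical input shape).
    threshold: fraction of total bytes a prefix must reach to be reported.
    prefix_lengths: hierarchy levels to examine (an illustrative assumption).
    """
    flows = list(flows)
    total = sum(nbytes for _, nbytes in flows)
    volume = Counter()
    for src_ip, nbytes in flows:
        # Credit the flow to each of its ancestors in the prefix hierarchy.
        for plen in prefix_lengths:
            net = ipaddress.ip_network(f"{src_ip}/{plen}", strict=False)
            volume[str(net)] += nbytes
    cutoff = threshold * total
    return {prefix: v for prefix, v in volume.items() if v >= cutoff}


flows = [
    ("10.1.1.5", 500),
    ("10.1.1.9", 350),
    ("10.1.2.7", 50),
    ("192.168.0.1", 100),
]
full = dominant_prefixes(flows, threshold=0.3)
coarse = dominant_prefixes(flows, threshold=0.3, prefix_lengths=(16, 32))
```

In this toy input the four-level hierarchy reports five prefixes, while the two-level hierarchy reports three. This only illustrates why examining fewer hierarchy levels means fewer candidate clusters to count; it makes no claim about the speedup measured in the paper.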
[1] George Varghese, et al. Automatically inferring patterns of resource consumption in network traffic. SIGCOMM '03, 2003.
[2] Rajeev Motwani, et al. Approximate Frequency Counts over Data Streams. VLDB, 2012.
[3] George Kesidis, et al. Efficient Mining of the Multidimensional Traffic Cluster Hierarchy for Digesting, Visualization, and Anomaly Identification. IEEE Journal on Selected Areas in Communications, 2006.
[4] Graham Cormode, et al. What's new: finding significant differences in network data streams. IEEE/ACM Transactions on Networking, 2004.