An effective similarity metric for application traffic classification

Application-level traffic classification is one of the major issues in network monitoring and traffic engineering. In our previous study, we proposed a new traffic classification method that uses a flow similarity function based on Cosine Similarity. This paper compares the classification accuracy of three similarity metrics (Jaccard Similarity, Cosine Similarity, and the Gaussian Radial Basis Function) to select an appropriate similarity metric for application traffic classification. It also defines a new two-stage traffic classification algorithm that can guarantee high classification accuracy, even in an asymmetric routing environment, with reasonable complexity.
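
To make the comparison concrete, the three metrics can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each flow is represented as a term-frequency mapping over payload tokens, and the function names, the `sigma` parameter, and the sample flows are hypothetical.

```python
import math

def jaccard_similarity(a, b):
    """Jaccard similarity over the sets of terms present in each flow."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 1.0  # two empty flows are treated as identical (assumption)
    return len(terms_a & terms_b) / len(union)

def cosine_similarity(a, b):
    """Cosine of the angle between the two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # undefined for a zero vector; 0.0 is an assumed convention
    return dot / (norm_a * norm_b)

def gaussian_rbf_similarity(a, b, sigma=1.0):
    """Gaussian RBF: exp(-||a - b||^2 / (2 * sigma^2)); sigma is an assumed width."""
    terms = a.keys() | b.keys()
    squared_distance = sum((a.get(t, 0) - b.get(t, 0)) ** 2 for t in terms)
    return math.exp(-squared_distance / (2 * sigma ** 2))

# Hypothetical term-frequency vectors for two flows.
flow_x = {"GET": 2, "HTTP": 1, "Host": 1}
flow_y = {"GET": 1, "HTTP": 1, "User-Agent": 1}

print(jaccard_similarity(flow_x, flow_y))       # 2 shared terms out of 4 -> 0.5
print(cosine_similarity(flow_x, flow_y))        # 3 / sqrt(18), about 0.707
print(gaussian_rbf_similarity(flow_x, flow_y))  # exp(-1.5), about 0.223
```

All three functions return values in [0, 1], with 1 meaning identical flows, so their outputs can be compared against a common threshold. Jaccard ignores term frequencies, whereas Cosine and the Gaussian RBF depend on them, which is one reason the metrics can rank the same pair of flows differently.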
