Paper information - Improving Performance of Network Traffic Classification Systems by Cleaning Training Data

In this paper we propose applying an algorithm that detects and cleans mislabeled training samples in an adversarial learning context, in which a malicious user tries to camouflage training patterns in order to degrade the performance of the classification system. In particular, we describe how this algorithm can be effectively applied to the problem of identifying HTTP traffic flowing through TCP port 80, where mislabeled samples can be introduced by means of port-spoofing attacks.
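The abstract does not say which cleaning algorithm is used, so the following is only a minimal sketch of one generic way to flag mislabeled training samples, not the authors' method. It assumes a cross-validated filter: each sample is classified by a model trained on the other folds, and samples whose predicted label disagrees with the given label are flagged. The nearest-centroid classifier, the two-feature flow vectors, the label names, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict


def nearest_centroid_fit(samples, labels):
    """Return {label: centroid} computed from the given feature vectors."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def flag_suspect_labels(samples, labels, n_folds=5):
    """Return sorted indices whose label disagrees with the prediction of a
    model trained on the other folds (a generic cross-validated filter)."""
    suspects = []
    for fold in range(n_folds):
        train_idx = [i for i in range(len(samples)) if i % n_folds != fold]
        test_idx = [i for i in range(len(samples)) if i % n_folds == fold]
        centroids = nearest_centroid_fit(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        for i in test_idx:
            if nearest_centroid_predict(centroids, samples[i]) != labels[i]:
                suspects.append(i)
    return sorted(suspects)


# Hypothetical two-feature flow descriptors (values invented for illustration).
samples = [
    (10.0, 1.0), (11.0, 1.5), (9.5, 0.5), (10.5, 1.2), (9.0, 1.1),   # HTTP-like
    (1.0, 10.0), (1.5, 11.0), (0.5, 9.5), (1.2, 10.5), (1.1, 9.0),   # non-HTTP
    (1.0, 10.2),  # non-HTTP flow on port 80, labeled "http" via port spoofing
]
labels = ["http"] * 5 + ["other"] * 5 + ["http"]

print(flag_suspect_labels(samples, labels))  # [10]
```

In this toy data only the spoofed sample at index 10 is flagged. Flagged samples could then be removed or relabeled before the traffic classifier is retrained.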

Carlo Sansone | Francesco Gargiulo
