Unexpected means of protocol inference

Network managers are inevitably called upon to associate network traffic with particular applications. This operation is critical for management functions ranging from debugging and security to analytics and policy support. Traditionally, managers have relied on applications adhering to a well-established global port mapping: Web traffic on port 80, mail traffic on port 25, and so on. However, a range of factors (including firewall port blocking, tunneling, dynamic port allocation, and a proliferation of new distributed applications) has weakened the value of this approach. We analyze three alternative mechanisms that use statistical and structural content models to automatically identify traffic that uses the same application-layer protocol, relying solely on flow content. In this manner, known applications can be identified regardless of port number, and traffic from one unknown application can be distinguished from traffic from another. We evaluate each mechanism's classification performance using real-world traffic traces from multiple sites.
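As a rough illustration of what a statistical content model over flow content can look like, the sketch below builds a smoothed byte-value distribution from the first few payload bytes of labeled flows and assigns a new flow to the model with the lowest relative entropy. This is an assumption-laden toy, not one of the three mechanisms evaluated here: the function names, the 4-byte prefix, the add-one smoothing, and the sample payloads are all hypothetical choices made for the example.

```python
import math
from collections import Counter

def byte_distribution(flows, n_bytes=4):
    """Add-one-smoothed distribution over byte values in the first n_bytes of each flow."""
    counts = Counter()
    for payload in flows:
        counts.update(payload[:n_bytes])  # iterating bytes yields ints 0..255
    total = sum(counts.values()) + 256    # 256 pseudo-counts, one per byte value
    return [(counts.get(b, 0) + 1) / total for b in range(256)]

def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits; both inputs are strictly positive."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

def classify(flow, models, n_bytes=4):
    """Return the label of the model closest to the flow's own byte distribution."""
    p = byte_distribution([flow], n_bytes)
    return min(models, key=lambda label: relative_entropy(p, models[label]))

# Hypothetical training payloads; ports are never consulted.
models = {
    "http": byte_distribution([b"GET / HTTP/1.1", b"POST /a HTTP/1.1", b"GET /x HTTP/1.0"]),
    "smtp": byte_distribution([b"220 mail.example.com ESMTP", b"EHLO client.example.org", b"250 OK"]),
}

print(classify(b"GET /favicon.ico HTTP/1.1", models))
print(classify(b"220 smtp.example.net ready", models))
```

Because the decision depends only on payload bytes, the same flow receives the same label whatever port it arrives on. A realistic model would need far more training data and would likely track byte position as well as byte value.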
