Fine‐grained traffic classification based on functional separation

SUMMARY Current efforts to classify Internet traffic emphasize accuracy. Previous studies have focused on detecting major applications such as P2P and streaming applications. However, these applications can generate various types of traffic that are often treated as minor and therefore ignored. As network applications become more complex, neglecting these minor traffic classes reduces both accuracy and completeness. In this context, we propose a fine-grained traffic classification scheme and a detailed method for it, called functional separation. Our proposal can detect the different types of traffic generated by a single application according to their functionalities, and should increase completeness by reducing the amount of undetected traffic. We verify our method with real-world traffic. Our performance comparison against existing DPI-based classification frameworks shows that the fine-grained classification scheme achieves consistently higher accuracy and completeness. Copyright © 2013 John Wiley & Sons, Ltd.
