Detection and classification of peer-to-peer traffic: A survey

The emergence of new Internet paradigms has changed the common properties of network data, increasing bandwidth consumption and balancing traffic in both directions. These changes raise important challenges and make it necessary to devise effective solutions for managing network traffic. Since traditional methods are rather ineffective and easily bypassed, particular attention has been paid to the development of new approaches for traffic classification. This article surveys the studies on peer-to-peer traffic detection and classification, providing an extended review of the literature. It also offers a comprehensive analysis of the concepts and strategies for network monitoring.
