CobWeb: A System for Automated In-Network Cobbling of Web Service Traffic

We consider the problem of in-network categorization of all traffic associated with a given set of web services. While this problem can be viewed as a generalization of per-session traffic monitoring, a key difficulty is that we must construct the entire session tree, which represents the transitive closure of all traffic downloaded as a result of a user accessing a given web service. Such in-network session tree construction and monitoring is useful for many measurement and monitoring tasks, and for new types of billing services such as ‘reverse billing’, where usage charges are paid by either the service provider or the ISP itself as an incentive to the user. Automated construction of the session tree from observed network traffic is challenging and, to our knowledge, unaddressed. The challenges arise from the complexity inherent in today’s web services and from the lack of universally followed standards for designing them. This necessitates the use of heuristics that rely on prevalent web service design practices. In this paper, we present a system, called COBWEB, that performs this automated in-network cobbling and monitoring of web service traffic. We evaluate the classification accuracy of COBWEB through extensive experimentation using controlled downloads and through analysis of about 100 popular web sites using large traffic traces (over 700 GB) collected at a major university’s gateway. Our experiments suggest that COBWEB can achieve good accuracy, with false positive and false negative rates below 5%.
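
To make the session tree idea concrete, the following is a minimal sketch, not COBWEB's actual method. It assumes, purely for illustration, that each observed HTTP request carries a Referer value and that following Referer links approximates the transitive closure described above. The function names `host_of` and `cobble`, the record layout, and the example hosts are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase host of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def cobble(records, service_hosts):
    """Attribute observed bytes to monitored services (illustrative heuristic).

    records: iterable of (url, referer, nbytes) in observed order;
             referer may be None.
    service_hosts: set of hosts identifying the monitored web services.
    Returns (totals, unattributed): bytes per service host, and bytes that
    could not be attached to any session tree.
    """
    root_of = {}               # url -> root url of the tree it belongs to
    totals = defaultdict(int)  # service host -> attributed bytes
    unattributed = 0
    for url, referer, nbytes in records:
        if referer is not None and referer in root_of:
            # Child request: inherit the tree of the page that triggered it.
            root = root_of[referer]
        elif host_of(url) in service_hosts:
            # Direct access to a monitored service starts a new tree.
            root = url
        else:
            unattributed += nbytes
            continue
        root_of.setdefault(url, root)
        totals[host_of(root)] += nbytes
    return dict(totals), unattributed


# Hypothetical trace: one monitored service pulling content from a CDN.
records = [
    ("http://video.example.com/watch", None, 1000),
    ("http://cdn.example.net/player.js", "http://video.example.com/watch", 500),
    ("http://cdn.example.net/chunk1", "http://cdn.example.net/player.js", 2000),
    ("http://other.example.org/", None, 300),
]
totals, unattributed = cobble(records, {"video.example.com"})
print(totals, unattributed)
```

In this sketch the CDN traffic is charged to the monitored service even though it comes from a different host, which is the point of building the tree rather than matching on host alone. Real traffic often lacks a usable Referer (for example, script-generated or encrypted requests), which is one reason a single rule like this would not suffice.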
