SVM learning of IP address structure for latency prediction

We examine whether the hierarchical structure of Internet addresses can be exploited to give network agents predictive capabilities. Specifically, we use Support Vector Machines (SVMs) to predict round-trip latency to random network destinations with which the agent has not previously interacted. Kernel functions transform the structured, yet fragmented and discontinuous, IP address space into a feature space amenable to SVMs. Our SVM approach is accurate, fast, suitable for online learning, and generalizes well. SVM regression on a large, randomly collected data set of 30,000 Internet latencies yields a mean prediction error of 25 ms using only 20% of the samples for training. Feature selection analysis finds that the eight most significant IP address bits provide surprisingly strong discriminative power. These results are promising for equipping end-nodes with intelligence for service selection, user-directed routing, resource scheduling, and network inference.
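As a rough illustration of how an IP address might be turned into SVM-ready features, the sketch below maps an IPv4 address to a vector of its most significant bits and evaluates a Gaussian kernel on two such vectors. The abstract does not specify the exact feature encoding or kernel, so the function names `ip_to_bits` and `rbf_kernel`, the bit-vector encoding, and the `gamma` value are all assumptions made for illustration only.

```python
import ipaddress
import math


def ip_to_bits(addr, n_bits=32):
    """Return the n_bits most significant bits of an IPv4 address as 0/1 features.

    Hypothetical encoding: one feature per address bit, most significant first.
    """
    value = int(ipaddress.IPv4Address(addr))
    return [(value >> (31 - i)) & 1 for i in range(n_bits)]


def rbf_kernel(a, b, n_bits=32, gamma=0.1):
    """Gaussian kernel on bit vectors (assumed kernel, not taken from the paper).

    For 0/1 vectors the squared Euclidean distance equals the Hamming distance.
    """
    xa = ip_to_bits(a, n_bits)
    xb = ip_to_bits(b, n_bits)
    hamming = sum(p != q for p, q in zip(xa, xb))
    return math.exp(-gamma * hamming)


if __name__ == "__main__":
    # Using only the first octet (8 bits) mirrors the feature-selection finding.
    print(ip_to_bits("128.0.0.1", n_bits=8))
    print(rbf_kernel("10.0.0.1", "10.0.0.2", n_bits=8))
    print(rbf_kernel("10.0.0.1", "192.168.0.1"))
```

Setting `n_bits=8` restricts the comparison to the first octet, so two addresses that share an octet are treated as identical by the kernel. A kernel of this form could be passed to any SVM regression library that accepts precomputed or custom kernels.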
