IDES: An Internet Distance Estimation Service for Large Networks

The responsiveness of networked applications is limited by communication delays, making network distance an important parameter in optimizing the choice of communication peers. Since accurate global snapshots are difficult and expensive to gather and maintain, it is desirable to use sampling techniques in the Internet to predict unknown network distances from a set of partially observed measurements. This paper makes three contributions. First, we present a model for representing and predicting distances in large-scale networks by matrix factorization, which can model suboptimal and asymmetric routing policies, an improvement on previous approaches. Second, we describe two algorithms, singular value decomposition (SVD) and non-negative matrix factorization (NMF), for representing a matrix of network distances as the product of two smaller matrices. Third, based on our model and algorithms, we have designed and implemented a scalable system, the Internet Distance Estimation Service (IDES), that predicts large numbers of network distances from limited samples of Internet measurements. Extensive simulations on real-world data sets show that IDES leads to more accurate, efficient and robust predictions of latencies in large-scale networks than existing approaches.
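The idea of representing a distance matrix as the product of two smaller matrices can be illustrated with a minimal sketch. It assumes a standard multiplicative-update scheme for non-negative matrix factorization, which may differ from the procedure used in the paper. The latency matrix `D`, the dimension `d`, and all function names are hypothetical and chosen only for illustration. Each host receives an outgoing vector (a row of `X`) and an incoming vector (a row of `Y`), so the predicted distance from host i to host j need not equal the distance from j to i.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(D, X, Y):
    """Squared Frobenius error between D and its approximation X Y^T."""
    approx = matmul(X, transpose(Y))
    return sum((D[i][j] - approx[i][j]) ** 2
               for i in range(len(D)) for j in range(len(D[0])))

def nmf(D, d, iters=200, seed=0, eps=1e-9):
    """Factor D (n x m, non-negative) as X Y^T with X (n x d) and Y (m x d).

    Uses multiplicative updates, which keep all entries non-negative.
    This is an illustrative assumption, not necessarily the paper's algorithm.
    """
    rng = random.Random(seed)
    n, m = len(D), len(D[0])
    X = [[rng.random() + 0.1 for _ in range(d)] for _ in range(n)]
    Y = [[rng.random() + 0.1 for _ in range(d)] for _ in range(m)]
    Dt = transpose(D)
    for _ in range(iters):
        # X <- X * (D Y) / (X Y^T Y), element-wise
        num = matmul(D, Y)
        den = matmul(X, matmul(transpose(Y), Y))
        X = [[X[i][k] * num[i][k] / (den[i][k] + eps) for k in range(d)]
             for i in range(n)]
        # Y <- Y * (D^T X) / (Y X^T X), element-wise
        num = matmul(Dt, X)
        den = matmul(Y, matmul(transpose(X), X))
        Y = [[Y[j][k] * num[j][k] / (den[j][k] + eps) for k in range(d)]
             for j in range(m)]
    return X, Y

def predict(X, Y, i, j):
    """Estimated distance from host i to host j: dot product of X[i] and Y[j]."""
    return sum(a * b for a, b in zip(X[i], Y[j]))

# Hypothetical asymmetric latency matrix in milliseconds (D[i][j] != D[j][i]).
D = [
    [0.0, 12.0, 30.0, 45.0],
    [14.0, 0.0, 25.0, 40.0],
    [28.0, 27.0, 0.0, 20.0],
    [50.0, 38.0, 22.0, 0.0],
]

X, Y = nmf(D, d=3)
print("estimated 0 -> 3:", round(predict(X, Y, 0, 3), 2))
print("estimated 3 -> 0:", round(predict(X, Y, 3, 0), 2))
```

In this sketch every host stores only `d` outgoing and `d` incoming coordinates, and any pairwise estimate is a single dot product. How a deployed service would obtain these vectors from a limited set of measurements is outside what the sketch covers.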
