Measuring the effects of Internet path faults on reactive routing

Empirical evidence suggests that reactive routing systems improve resilience to Internet path failures. They detect and route around faulty paths based on measurements of path performance. This paper seeks to understand why and under what circumstances these techniques are effective.

To do so, this paper correlates end-to-end active probing experiments, loss-triggered traceroutes of Internet paths, and BGP routing messages. These correlations shed light on three questions about Internet path failures: (1) Where do failures appear? (2) How long do they last? (3) How do they correlate with BGP routing instability?

Data collected over 13 months from an Internet testbed of 31 topologically diverse hosts suggests that most path failures last less than fifteen minutes. Failures that appear in the network core correlate better with BGP instability than failures that appear close to end hosts. On average, most failures precede BGP messages by about four minutes, but there is often increased BGP traffic both before and after failures. Our findings suggest that reactive routing is most effective between hosts that have multiple connections to the Internet. The data set also suggests that passive observations of BGP routing messages could be used to predict about 20% of impending failures, allowing re-routing systems to react more quickly to failures.
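The kind of correlation described above, between failure onsets and BGP routing messages, can be illustrated with a minimal sketch. It is not the paper's method: the function name `fraction_preceded`, the plain-timestamp inputs, and the 300-second look-back window are all assumptions chosen for illustration. It computes the fraction of failures that had at least one BGP message in the window just before onset, which is the quantity a passive predictor would depend on.

```python
from bisect import bisect_left


def fraction_preceded(failure_times, bgp_times, window_s=300.0):
    """Fraction of failures with >= 1 BGP message in [t - window_s, t).

    failure_times: failure onset times in seconds (hypothetical input format).
    bgp_times:     BGP message arrival times in seconds (hypothetical input format).
    window_s:      look-back window; 300 s is an arbitrary illustrative value.
    """
    if not failure_times:
        return 0.0
    bgp_sorted = sorted(bgp_times)
    hits = 0
    for t in failure_times:
        # Index of the first message at or after the window start.
        lo = bisect_left(bgp_sorted, t - window_s)
        # Index of the first message at or after the failure onset itself.
        hi = bisect_left(bgp_sorted, t)
        if hi > lo:  # at least one message fell inside the window
            hits += 1
    return hits / len(failure_times)
```

Messages that arrive exactly at the failure onset are excluded, since they could not have served as advance warning. A real analysis would also need to match each failure to BGP messages for the affected prefixes, which this sketch leaves out.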
