Troubleshooting on intra-domain routing instability

Routing instability directly affects the reliability of the Internet. While a great deal of effort has been devoted to inter-domain routing instability, studies of intra-domain routing have been quite limited. Most network operators still lack sufficient knowledge of this problem and often complain that (i) they do not know to what extent intra-domain routing instability occurs on their networks, because it is difficult to detect, and (ii) the causes of the instability are difficult to find. In this paper, we first present the results of passive measurements of intra-domain routing instability. We show statistics of the OSPF routing information (for both IPv4 and IPv6) that we collected on the WIDE Internet and the APAN Tokyo-XP network. Through these statistics, we demonstrate how serious routing instability can be on a service network. We then propose an approach to help network operators isolate its causes. We emphasize the importance of gathering useful troubleshooting data in an event-driven fashion and propose using SNMP or telnet to do so. We then explain what kind of data should be collected for troubleshooting and how to use this data to isolate the problem.
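To make the idea of event-driven data gathering concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that a monitor reports the arrival time of each routing update, and it fires a collection callback once when the number of updates within a sliding window reaches a threshold. The class name `FlapTrigger`, the `threshold` and `window_s` parameters, and the callback are all hypothetical; in practice the callback might issue SNMP queries or open a telnet session to the affected routers.

```python
from collections import deque
from typing import Callable, Deque


class FlapTrigger:
    """Hypothetical trigger: call `collect` once per burst of routing updates."""

    def __init__(self, threshold: int, window_s: float,
                 collect: Callable[[float], None]) -> None:
        self.threshold = threshold    # updates per window that count as a burst (assumed)
        self.window_s = window_s      # sliding window length in seconds (assumed)
        self.collect = collect        # placeholder for an SNMP or telnet collector
        self._times: Deque[float] = deque()
        self._armed = True            # prevents repeated firing within one burst

    def on_update(self, t: float) -> None:
        """Record one routing update observed at time t (seconds)."""
        self._times.append(t)
        # Discard updates that have fallen out of the sliding window.
        while self._times and t - self._times[0] > self.window_s:
            self._times.popleft()
        if len(self._times) >= self.threshold:
            if self._armed:
                self._armed = False
                self.collect(t)
        else:
            # The burst has subsided, so a later burst may trigger again.
            self._armed = True
```

Collecting only when a burst begins keeps the polling load on the routers low while still capturing router state close to the moment the instability occurs.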
