A principled approach to managing routing in large ISP networks

Internet Service Providers (ISPs) are the core building blocks of the Internet. They play a crucial role in keeping the Internet well-connected and stable, and in providing services that meet the needs of other Autonomous Systems (ASes) and their users. As a result, an ISP plays several roles in its operation: (1) as part of the Internet, an ISP is expected to help keep the global network stable; (2) when interacting with neighboring networks, an ISP faces diverse requirements from different neighbors about the kinds of routes they prefer; and (3) internally, an ISP needs to maintain and upgrade its own network periodically, and wants to avoid disruptions during those operations as much as possible. As the Internet has become an integral part of the world’s communications infrastructure, today’s ISPs face routing management challenges at each of these scopes: (i) maintaining the stability of the global Internet while meeting their customers’ increasing demands for diverse routes, (ii) supporting more flexible routing policy configuration in bilateral contractual relationships with their neighbors, and (iii) making network maintenance and other network management operations in their own networks easier and less disruptive to routing protocols and data traffic. This dissertation takes a principled approach to addressing these challenges. We propose three abstractions that guide the design and implementation of our system solutions. First, we propose the abstraction of a “neighbor-specific route selection problem” and a corresponding “Neighbor-Specific BGP” (NS-BGP) model that capture the requirement of customized route selection for different neighbors. Since one ISP’s route selection decisions could cause the global Internet to become unstable, we prove the conditions under which the Internet is guaranteed to remain stable even if individual ISPs make the transition to this more flexible route-selection model.
Second, we model policy configuration as a decision problem, which offers an abstraction that supports the reconciliation of multiple objectives. Guided by this abstraction and the Analytic Hierarchy Process, a decision-theoretic technique for balancing conflicting objectives, we designed and implemented a prototype of an extensible routing control platform (Morpheus) that enables an ISP to select routes for different neighbors individually and make flexible trade-offs among policy objectives through a simple and intuitive configuration interface. Finally, we propose the abstraction of the separation between “physical” and “logical” configurations of routers, which leads us to the design and prototype implementation of “virtual router migration” (VROOM), a new, generic technique to simplify and enable a broad range of network management tasks, from planned maintenance to reducing power consumption. Collectively, the contributions of the dissertation provide simple system solutions for an ISP to autonomously manage its routing more flexibly and effectively without affecting global routing stability.
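To make the idea of neighbor-specific route selection guided by the Analytic Hierarchy Process more concrete, the following is a minimal sketch, not the Morpheus implementation. It assumes three hypothetical policy criteria (latency, security, cost), made-up per-route scores in [0, 1], and made-up pairwise comparison matrices for two neighbors. It also uses the row geometric-mean method, a common approximation to the principal-eigenvector weights of the Analytic Hierarchy Process.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights using the row geometric-mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j (so pairwise[j][i] should be its reciprocal).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def select_route(routes, weights):
    """Return the candidate route with the highest weighted score."""
    return max(
        routes,
        key=lambda r: sum(w * s for w, s in zip(weights, r["scores"])),
    )

# Hypothetical criteria order: (latency, security, cost).
# Neighbor 1 cares mostly about latency; neighbor 2 mostly about security.
latency_sensitive = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
security_sensitive = [
    [1.0, 1 / 5, 1.0],
    [5.0, 1.0, 5.0],
    [1.0, 1 / 5, 1.0],
]

# Hypothetical candidate routes to one destination prefix; higher scores are better.
routes = [
    {"path": "via AS-A", "scores": [0.9, 0.3, 0.5]},
    {"path": "via AS-B", "scores": [0.4, 0.9, 0.5]},
]

# The same candidates yield different choices for different neighbors.
best_for_neighbor_1 = select_route(routes, ahp_weights(latency_sensitive))
best_for_neighbor_2 = select_route(routes, ahp_weights(security_sensitive))
print(best_for_neighbor_1["path"], best_for_neighbor_2["path"])
```

In this sketch the ISP keeps a single set of candidate routes and varies only the weights per neighbor, which is one simple way to express trade-offs among policy objectives; how the dissertation's system encodes and evaluates those objectives may differ.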
