NOX: towards an operating system for networks

As anyone who has operated a large network can attest, enterprise networks are difficult to manage. That they have remained so despite significant commercial and academic efforts suggests the need for a different network management paradigm. Here we turn to operating systems as an instructive example in taming management complexity. In the early days of computing, programs were written in machine languages that had no common abstractions for the underlying physical resources. This made programs hard to write, port, reason about, and debug. Modern operating systems facilitate program development by providing controlled access to high-level abstractions for resources (e.g., memory, storage, communication) and information (e.g., files, directories). These abstractions enable programs to carry out complicated tasks safely and efficiently on a wide variety of computing hardware.
In contrast, networks are managed through low-level configuration of individual components. Moreover, these configurations often depend on the underlying network; for example, blocking a user’s access with an ACL entry requires knowing the user’s current IP address. More complicated tasks require more extensive network knowledge; forcing guest users’ port 80 traffic to traverse an HTTP proxy requires knowing the current network topology and the location of each guest. In this way, an enterprise network resembles a computer without an operating system, with network-dependent component configuration playing the role of hardware-dependent machine-language programming.
What we clearly need is an “operating system” for networks, one that provides a uniform and centralized programmatic interface to the entire network. Analogous to the read and write access to various resources provided by computer operating systems, a network operating system provides the ability to observe and control a network.
A network operating system does not manage the network itself; it merely provides a programmatic interface. Applications implemented on top of the network operating system perform the actual management tasks. The programmatic interface should be general enough to support a broad spectrum of network management applications.
Such a network operating system represents two major conceptual departures from the status quo. First, the network operating system presents programs with a centralized programming model; programs are written as if the entire network were present on a single machine (for example, one would use Dijkstra to compute shortest paths, not Bellman-Ford). This requires (as in [3, 8, 14] and elsewhere) centralizing network state. Second, programs are written in terms of high-level abstractions (e.g., user and host names), not low-level configuration parameters (e.g., IP and MAC addresses). This allows management directives to be enforced independent of the underlying network topology, but it requires that the network operating system carefully maintain the bindings (i.e., mappings) between these abstractions and the low-level configurations.
Thus, a network operating system allows management applications to be written as centralized programs over high-level names, as opposed to the distributed algorithms over low-level addresses we are forced to use today. While clearly a desirable goal, achieving this transformation from distributed algorithms to centralized programming presents significant technical challenges, and the question we pose here is: Can one build a network operating system at significant scale?
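To make the two departures concrete, the following is a minimal Python sketch of what a management application might look like under these assumptions. It is not the NOX interface: the `TOPOLOGY` and `BINDINGS` tables, the switch names, the addresses, and the functions `shortest_path` and `path_between` are all hypothetical and exist only for illustration. The sketch assumes the network operating system exposes a centralized topology view and a name-to-location binding table, so the application can run Dijkstra directly over names such as "alice" and "proxy".

```python
import heapq

# Hypothetical centralized network view: switch -> {neighbor: link cost}.
TOPOLOGY = {
    "s1": {"s2": 1, "s3": 4},
    "s2": {"s1": 1, "s3": 1, "s4": 5},
    "s3": {"s1": 4, "s2": 1, "s4": 1},
    "s4": {"s2": 5, "s3": 1},
}

# Hypothetical bindings from high-level names to low-level state.
# In this sketch the network operating system is assumed to keep them current.
BINDINGS = {
    "alice": {"ip": "10.0.0.5", "mac": "00:00:00:00:00:05", "switch": "s1"},
    "proxy": {"ip": "10.0.0.80", "mac": "00:00:00:00:00:50", "switch": "s4"},
}


def shortest_path(topology, src, dst):
    """Dijkstra over the centralized view; returns (cost, path) or None."""
    dist = {src: 0}
    prev = {}
    visited = set()
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == dst:
            break
        for nbr, weight in topology[node].items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None  # destination unreachable
    # Walk predecessor links back from dst to src.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[dst], path


def path_between(bindings, topology, src_name, dst_name):
    """Resolve high-level names to attachment switches, then route."""
    src = bindings[src_name]["switch"]
    dst = bindings[dst_name]["switch"]
    return shortest_path(topology, src, dst)


if __name__ == "__main__":
    print(path_between(BINDINGS, TOPOLOGY, "alice", "proxy"))
    # If alice moves, only the binding changes; the application code does not.
    BINDINGS["alice"]["switch"] = "s3"
    print(path_between(BINDINGS, TOPOLOGY, "alice", "proxy"))
```

The application never mentions an IP or MAC address. When a user moves, only the binding table changes, which is the sense in which directives written over names remain independent of the underlying topology.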