Virtual network diagnosis as a service

Today's cloud network platforms allow tenants to construct sophisticated virtual network topologies among their VMs on a shared physical network infrastructure. However, these platforms provide little support for tenants to diagnose problems in their virtual networks. Network virtualization hides the underlying infrastructure from tenants and prevents them from deploying existing network diagnosis tools. This paper makes a case for providing virtual network diagnosis as a service in the cloud. We identify a set of technical challenges in providing such a service and propose a Virtual Network Diagnosis (VND) framework. VND exposes abstract configuration and query interfaces that cloud tenants use to troubleshoot their virtual networks. It controls software switches to collect flow traces, distributes trace storage, and executes distributed queries on behalf of different tenants. It reduces data collection and processing overhead by capturing flows locally and executing queries on demand. Our experiments validate VND's functionality and show its feasibility in terms of quick service response and acceptable overhead; our simulation shows that the VND architecture scales to the size of a real data center network.
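The capture-and-query flow described above can be illustrated with a minimal sketch. It is not VND's actual interface: the names `FlowRecord`, `CaptureAgent`, `configure`, and `query_bytes_by_pair` are hypothetical, and the sketch only assumes that each software switch keeps its captured records locally and that a query runs at every agent before the partial results are merged.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    # Hypothetical trace record; the fields are assumptions for illustration.
    tenant: str
    src: str
    dst: str
    nbytes: int


class CaptureAgent:
    """Hypothetical per-switch agent that stores captured records locally."""

    def __init__(self, name):
        self.name = name
        self.filters = {}                # tenant -> capture predicate
        self.traces = defaultdict(list)  # tenant -> locally stored records

    def configure(self, tenant, predicate):
        # Configuration interface: the tenant declares which flows to capture.
        self.filters[tenant] = predicate

    def observe(self, record):
        # Local flow capture: keep a record only if its tenant asked for it.
        predicate = self.filters.get(record.tenant)
        if predicate is not None and predicate(record):
            self.traces[record.tenant].append(record)

    def local_bytes_by_pair(self, tenant):
        # On-demand query executed where the data lives; returns a partial sum.
        totals = defaultdict(int)
        for r in self.traces.get(tenant, []):
            totals[(r.src, r.dst)] += r.nbytes
        return dict(totals)


def query_bytes_by_pair(agents, tenant):
    # Distributed query: merge the small partial results, not the raw traces.
    merged = defaultdict(int)
    for agent in agents:
        for pair, n in agent.local_bytes_by_pair(tenant).items():
            merged[pair] += n
    return dict(merged)
```

In this sketch only the per-agent aggregates cross the network, which is one way to picture how local capture and on-demand execution keep overhead down; a tenant sees nothing from flows it did not configure, and nothing belonging to other tenants.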
