Verification in the Age of Microservices

Many large applications are now built as collections of microservices, each deployed in an isolated container and interacting with the others through remote procedure calls (RPCs). Microservices improve both scalability, since each component of an application can be scaled independently, and deployability. However, such applications are inherently distributed, and current tools provide no mechanisms to reason about or ensure their global behavior. In this paper we argue that recent advances in formal methods and software packet processing pave the way toward mechanisms that can ensure correctness for such systems, both while they are being built and at runtime. These techniques impose minimal runtime overhead and are amenable to production deployment.
