A Hardware-Software Stack for Serverless Edge Swarms

Swarms of autonomous devices are growing in both ubiquity and size, which makes rethinking their hardware-software system stack critical. We present HiveMind, the first swarm coordination platform that enables programmable execution of complex task workflows between cloud and edge resources in a performant and scalable manner. HiveMind is a software-hardware platform that includes a domain-specific language to simplify the programming of cloud-edge applications, a program synthesis tool to automatically explore task placement strategies, a centralized controller that leverages serverless computing to elastically scale cloud resources, and a reconfigurable hardware acceleration fabric for network and remote memory
