OneEdge: An Efficient Control Plane for Geo-Distributed Infrastructures

Resource management for geo-distributed infrastructures is challenging due to the scarcity and non-uniformity of edge resources, as well as the high client mobility and workload surges inherent to situation awareness applications. Because they are centralized, state-of-the-art schedulers that work well in datacenters cannot meet the performance and feature requirements of such applications. We present OneEdge, a hybrid control plane that enables autonomous decision-making at edge sites for localized, rapid single-site application deployment. Edge sites handle mobility, churn, and load spikes by cooperating with a centralized controller that allows coordinated multi-site scheduling and dynamic reconfiguration. OneEdge's scheduling decisions are driven by each application's end-to-end service level objective (E2E SLO) as well as the specific requirements of situation awareness applications. OneEdge's novel distributed state management combines autonomous decision-making at the edge sites for rapid localized resource allocations with decision-making at the central controller when multi-site application deployment is needed. Using a mix of applications on multi-region Azure instances, we show that, in contrast to centralized or fully distributed control planes, OneEdge caters to the unique requirements of situation awareness applications. Compared to a centralized control plane, OneEdge reduces deployment latency by 66% for single-site applications, without compromising E2E SLOs.
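
The split of responsibilities described above (single-site deployments decided at the edge site, multi-site deployments decided at the central controller) can be illustrated with a minimal sketch. This is not OneEdge's implementation or API: the names `DeploymentRequest`, `choose_decision_point`, and the fields `sites` and `e2e_slo_ms` are hypothetical, and the rule shown is only the coarse routing decision stated in the abstract, without any SLO-driven placement logic.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DeploymentRequest:
    """Hypothetical request shape; not taken from the OneEdge paper."""
    app_name: str
    sites: Tuple[str, ...]   # edge sites the application must run on (assumed field)
    e2e_slo_ms: float        # end-to-end SLO; carried along but unused in this sketch


def choose_decision_point(request: DeploymentRequest) -> str:
    """Return which component would make the scheduling decision.

    Assumption: a request touching exactly one site is handled
    autonomously by that site; anything else goes to the central
    controller for coordinated multi-site scheduling.
    """
    unique_sites = set(request.sites)
    if not unique_sites:
        raise ValueError("a deployment request must name at least one site")
    if len(unique_sites) == 1:
        # Single-site application: decided locally at the edge site.
        return "edge:" + next(iter(unique_sites))
    # Multi-site application: needs the centralized controller.
    return "central-controller"
```

For example, a request naming only `"site-a"` would be routed to `"edge:site-a"`, while one naming both `"site-a"` and `"site-b"` would be routed to `"central-controller"`.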
