Falcon: Low Latency, Network-Accelerated Scheduling

We present Falcon, a novel scheduler design for large-scale data analytics workloads. To improve the quality of scheduling decisions, Falcon uses a single central scheduler. To scale this central scheduler to large clusters, Falcon offloads the scheduling operation to a programmable switch. The core of the Falcon design is a pipeline-based scheduling logic that can schedule tasks at line rate. Our prototype evaluation on a cluster with a Barefoot Tofino switch shows that the proposed approach can reduce scheduling overhead by 26 times and increase scheduling throughput by 25 times compared to state-of-the-art centralized and decentralized schedulers.
