Design and Implementation of Coroutine Scheduling System on SW26010

With the rapid development of cloud computing in recent years, improving a system's concurrency has become increasingly important. SW26010 is the heterogeneous many-core processor used to build the Sunway TaihuLight supercomputer. Because this processor can run only a single thread on each CPE (computing processing element), its concurrency is limited. Coroutines are lightweight user-mode threads that occupy fewer resources than ordinary threads. Their scheduling and switching can be completed entirely in user mode, which makes them well suited to processing concurrent tasks. This paper designs and implements a coroutine scheduling system on the SW26010 processor that increases CPE concurrency and improves program performance. The system has three modules: the scheduler, the executors, and the coroutine tasks. The scheduler runs on the MPE (management processing element) of SW26010 and controls the executors on the CPEs; each executor in turn controls a queue of coroutine tasks. To evaluate the system, this paper runs convolution calculations of different scales and creates different numbers of coroutine tasks for them. The performance of programs using the coroutine scheduling system is improved by 23 times compared with using only the CPE.
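
The three-module structure described above can be illustrated with a minimal sketch. It is not the paper's implementation: Python generators stand in for coroutines, all names (conv_task, Executor, Scheduler, demo) are hypothetical, the round-robin policy is an assumption, and the executors are stepped one after another in a single thread rather than running in parallel on CPEs.

```python
from collections import deque


def conv_task(task_id, rows, log):
    """Hypothetical coroutine task: handle one row of work, then yield."""
    for row in range(rows):
        log.append((task_id, row))  # placeholder for one row of a convolution
        yield  # hand control back to the executor


class Executor:
    """Stands in for one CPE: a single thread multiplexing coroutine tasks."""

    def __init__(self):
        self.queue = deque()

    def submit(self, task):
        self.queue.append(task)

    def step(self):
        """Resume the task at the head of the queue.

        Returns False if the queue was empty, True otherwise.
        """
        if not self.queue:
            return False
        task = self.queue.popleft()
        try:
            next(task)
        except StopIteration:
            pass  # task finished, drop it
        else:
            self.queue.append(task)  # task yielded, requeue it
        return True


class Scheduler:
    """Stands in for the MPE-side scheduler that controls the executors."""

    def __init__(self, num_executors):
        self.executors = [Executor() for _ in range(num_executors)]

    def dispatch(self, tasks):
        # Assumed policy: distribute tasks round-robin across executors.
        for i, task in enumerate(tasks):
            self.executors[i % len(self.executors)].submit(task)

    def run(self):
        # Step every executor in turn until all queues are drained.
        busy = True
        while busy:
            busy = False
            for executor in self.executors:
                if executor.step():
                    busy = True


def demo():
    log = []
    scheduler = Scheduler(num_executors=2)
    scheduler.dispatch(conv_task(t, rows=2, log=log) for t in range(4))
    scheduler.run()
    return log
```

Calling demo() returns the order in which task steps ran. With two executors and four tasks, each executor interleaves two tasks on its own queue, which is the sense in which a single-threaded CPE gains concurrency from coroutines.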