Curriculum Reinforcement Learning-Based Computation Offloading Approach in Space-Air-Ground Integrated Network

The space-air-ground integrated network (SAGIN) is emerging as a prominent framework for supporting the ever-growing number of Internet of Things (IoT) applications in areas without infrastructure. In this paper, we investigate IoT task offloading in a SAGIN scenario where multiple IoT devices cooperatively use computing resources. We formulate task offloading as the problem of minimizing the processing delay of all tasks, taking into account the dynamics of the tasks generated by each IoT device, the mobility of the unmanned aerial vehicle (UAV), and the difference in computing power between the UAV and the low earth orbit (LEO) satellite. We then model the problem as a Markov decision process (MDP). To cope with the dynamics and complexity of the system, as well as the training difficulties caused by the large number of agents, we propose a curriculum learning-multi-agent deep deterministic policy gradient (CL-MADDPG) approach to learn a near-optimal offloading strategy. Simulation results show that the proposed method converges satisfactorily and can significantly reduce the average task processing delay.