Demo: Resource Allocation for Wafer-Scale Deep Learning Accelerator

Driven by the rapid development of deep learning (DL), artificial intelligence (AI) chips have been developed that combine traditional computing architectures with simulated neural network structures in order to improve energy efficiency. Emerging DL-oriented AI chips pose a new challenge: computing resources must be allocated according to the structure of a deep neural network (DNN) so that tasks using the DNN can be processed in a parallel and distributed manner. In this paper, we combine graph theory and combinatorial optimization techniques to devise a fast floorplanning approach that maps the layers of a DNN onto the mesh of computing units called the Wafer-Scale-Engine (WSE). The approach is based on the kernel graph structure provided by Cerebras Systems Inc. Numerical experiments on the public benchmarks, using the public evaluation criteria, demonstrate a performance gain compared with state-of-the-art algorithms.