RISC-V Barrel Processor for Deep Neural Network Acceleration

This paper presents a barrel RISC-V processor designed to control a deep neural network accelerator. Our design has a 5-stage pipelined datapath with 8 hardware threads (harts). The threads are issued under a strict round-robin schedule, and each thread is responsible for providing data and control signals to one neural network processing element (PE). Each PE is capable of arbitrary-precision GEneral Matrix-Vector (GEMV) operations. Each thread executes independently of the others, and any communication between threads is handled in software through shared memory. To reduce implementation area, our processor implements the RV32I base instruction set plus a set of custom CSRs for controlling the PEs. Our design passes all riscv-tests, which are written in assembly and compiled with RISC-V GCC. Our 8-hart barrel processor runs at 250 MHz with a CPI of 1 and consumes 0.372 W. To demonstrate the capabilities of our design, we computed a GEMV operation with an input matrix of size 8 by 128 and a weight matrix of size 128 by 128 at two-bit precision in only 16 clock cycles.
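
The strict round-robin issue order described above can be illustrated with a small Python model. This is a minimal sketch and not the authors' hardware description: the function name `simulate` and the one-cycle-per-stage, no-stall pipeline model are assumptions made for illustration, and only the hart count (8) and pipeline depth (5) come from the abstract. It shows a general property of barrel designs: when the number of harts is at least the pipeline depth, no hart ever has more than one instruction in flight, and the pipeline retires one instruction per cycle once it is full.

```python
from collections import Counter

NUM_HARTS = 8        # from the abstract
PIPELINE_STAGES = 5  # from the abstract


def simulate(cycles, num_harts=NUM_HARTS, stages=PIPELINE_STAGES):
    """Toy model of a barrel pipeline (assumed: one cycle per stage, no stalls).

    Returns (issue_order, max_in_flight, retired), where max_in_flight is the
    largest number of instructions from a single hart seen in the pipeline
    at the same time.
    """
    pipeline = [None] * stages  # pipeline[0] is the first stage
    issue_order = []
    max_in_flight = 0
    retired = 0
    for cycle in range(cycles):
        hart = cycle % num_harts  # strict round robin: hart 0, 1, ..., 7, 0, ...
        issue_order.append(hart)
        # Every instruction moves one stage forward; the new one enters stage 0.
        pipeline = [hart] + pipeline[:-1]
        counts = Counter(h for h in pipeline if h is not None)
        max_in_flight = max(max_in_flight, max(counts.values()))
        # An instruction occupying the last stage completes in this cycle.
        if pipeline[-1] is not None:
            retired += 1
    return issue_order, max_in_flight, retired


if __name__ == "__main__":
    order, in_flight, retired = simulate(1000)
    print("first issues:", order[:10])
    print("max instructions in flight per hart:", in_flight)
    print("retired instructions:", retired)
```

Calling `simulate(100, num_harts=4)` in the same model puts two instructions from one hart in the pipeline at once, which is the situation where a conventional pipeline would need forwarding or stall logic.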