Memory-Efficient CMSIS-NN with Replacement Strategy

Microcontroller Units (MCUs) are widely used for industrial field applications and, thanks to their reliability, low cost, and energy efficiency, are increasingly being used for machine learning at the edge. Because of MCU resource limitations, deployed ML models must be optimized, particularly in terms of memory footprint. In this paper, we propose an in-place computation strategy to reduce the memory requirements of neural network inference. The strategy exploits the single-core MCU architecture, in which layers execute sequentially. Experimental analysis using the CMSIS-NN library on the CIFAR-10 dataset shows that the proposed optimization method can reduce the memory required by an NN model by more than 9%, without impacting execution performance or accuracy. The reduction grows further with deeper network architectures.
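
The sketch below illustrates the core idea under simplified assumptions: instead of allocating disjoint input and output buffers for a layer, both alias a single arena, and the output overwrites input elements that sequential execution has already consumed. The layer kernel `layer_forward` and the sizes are hypothetical stand-ins for a CMSIS-NN operator (e.g., pooling), not the paper's actual implementation.

```c
#include <stdint.h>

/* Hypothetical layer dimensions, for illustration only. */
#define IN_SIZE   4096
#define OUT_SIZE  1024  /* output smaller than input, as after pooling */

/* Conventional layout: two disjoint buffers (IN_SIZE + OUT_SIZE bytes).
 * In-place layout: one arena of max(IN_SIZE, OUT_SIZE) bytes. On a
 * single-core MCU the layer reads input element 4*j before it writes
 * output element j <= 4*j, so the write never clobbers unread input. */
static int8_t arena[IN_SIZE];

/* Hypothetical stand-in for a CMSIS-NN operator: a 4-to-1 average that
 * consumes the input strictly left to right. */
static void layer_forward(const int8_t *in, int8_t *out)
{
    for (int j = 0; j < OUT_SIZE; j++) {
        int16_t acc = 0;
        for (int k = 0; k < 4; k++)
            acc += in[4 * j + k];      /* reads indices >= 4*j          */
        out[j] = (int8_t)(acc / 4);    /* write at index j, already read */
    }
}

int main(void)
{
    /* Input and output share the arena, saving OUT_SIZE bytes of RAM. */
    layer_forward(arena, arena);
    return 0;
}
```

Safety follows from the sequential access pattern: when `out[j]` is written to arena index `j`, every future read targets index `4*(j+1)` or higher, so no live input data is overwritten.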