Maximizing Parallel Activation of Word-Lines in MRAM-based Binary Neural Network Accelerators

Magnetic RAM (MRAM)-based crossbar arrays have great potential as a platform for in-memory binary neural network (BNN) computing. However, the number of word-lines that can be activated simultaneously is limited by the low $I_{H}/I_{L}$ ratio of MRAM, which makes BNNs more vulnerable to device variation. To address this issue, we propose an algorithm/hardware co-design methodology. First, we choose a promising memristor crossbar array (MCA) structure based on an analysis of its sensitivity to process variation. Since the selected MCA structure becomes more tolerant to device variation as the number of 1s in the input activations decreases, we apply an input distribution regularization scheme during training to reduce the number of 1s in the BNN inputs. We further improve robustness against device variation by adopting a retraining scheme based on knowledge distillation. Experimental results show that the proposed method makes BNNs more tolerant to MRAM variation and significantly increases the number of word-lines that can be activated in parallel, thereby improving throughput and energy efficiency.
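
The link between a low $I_{H}/I_{L}$ ratio and the limit on parallel word-line activation can be illustrated with a small Monte Carlo sketch. This is not the paper's model or method: the function name `misread_rate`, the independent Gaussian per-cell variation, the ideal nearest-level decoder, and all numeric values are assumptions chosen purely for illustration. Each activated cell contributes either a high or a low current to the bit-line, and the sketch estimates how often the summed current is decoded to the wrong count of high-current cells.

```python
import numpy as np

def misread_rate(n_wl, ratio, sigma_rel, trials=20000, seed=0):
    """Estimate how often a summed bit-line current decodes to the wrong count.

    n_wl      : number of word-lines activated at once (illustrative parameter)
    ratio     : I_H / I_L of a cell (illustrative parameter)
    sigma_rel : per-cell current std. dev. relative to its nominal value
                (assumed independent and Gaussian)
    """
    rng = np.random.default_rng(seed)
    i_low, i_high = 1.0, float(ratio)  # currents normalized to I_L = 1

    # True number of high-current cells in each trial, uniform in [0, n_wl].
    k = rng.integers(0, n_wl + 1, size=trials)

    # In each trial the first k cells carry I_H and the rest carry I_L.
    is_high = np.arange(n_wl)[None, :] < k[:, None]
    nominal = np.where(is_high, i_high, i_low)

    # Apply multiplicative per-cell variation and sum along the bit-line.
    noise = rng.standard_normal(nominal.shape)
    total = (nominal * (1.0 + sigma_rel * noise)).sum(axis=1)

    # Ideal decoder: invert total = k * I_H + (n_wl - k) * I_L, then round.
    k_hat = np.rint((total - n_wl * i_low) / (i_high - i_low))
    return float(np.mean(k_hat != k))

if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, misread_rate(n, ratio=2.5, sigma_rel=0.05))
```

Under these assumptions, adjacent output levels are always separated by $I_{H} - I_{L}$, while the accumulated variation grows with the number of activated cells, so the misread rate rises as more word-lines are activated and falls as the ratio increases.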