Neural interfaces based on motor units (MUs) decomposed from surface electromyography (sEMG) provide non-invasive access to neural control signals and offer a novel approach to intuitive human-machine interaction. However, most existing methods based on decomposed MUs adopt only the discharge rate (DR) as the feature representation, which may lack local information around each discharge instant and ignore subtle interactions among different MUs. In this study, we propose an MU-specific image-based scheme for wrist torque estimation. Specifically, high-density sEMG signals were decomposed into motor unit spike trains (MUSTs), and MU-specific images were then reconstructed from the MUSTs and their corresponding motor unit action potentials (MUAPs). A convolutional neural network was used to automatically learn representative features from the MU-specific images and to estimate wrist torques. The proposed method outperformed three conventional regression approaches and one deep-learning regression approach using DR features, achieving an estimation accuracy (R²) of 0.82 ± 0.09 and 0.89 ± 0.06, and a normalized root-mean-square error (nRMSE) of 12.6 ± 2.5% and 11.0 ± 3.1%, for pronation/supination and flexion/extension, respectively. Furthermore, the features extracted from MU-specific images correlated more strongly with the recorded torques than DR did, indicating the effectiveness of the proposed method. The outcomes of this study provide a novel and promising perspective for intuitive control in neural interfacing.
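The reconstruction step described above can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: it assumes each MU-specific image is obtained by convolving an MU's binary spike train with its per-channel MUAP template and then cutting the result into fixed-length windows. The function names (`reconstruct_mu_signals`, `window_images`), the array layouts, and the window parameters are all hypothetical.

```python
import numpy as np


def reconstruct_mu_signals(spike_trains, muaps):
    """Convolve each MU's spike train with its MUAP on every channel.

    Assumed (hypothetical) layouts:
      spike_trains: (n_mu, n_samples), 1 at discharge instants, else 0
      muaps:        (n_mu, n_channels, muap_len), one template per MU/channel
    Returns an array of shape (n_mu, n_channels, n_samples).
    """
    n_mu, n_samples = spike_trains.shape
    n_channels = muaps.shape[1]
    recon = np.zeros((n_mu, n_channels, n_samples))
    for m in range(n_mu):
        for c in range(n_channels):
            # mode="same" keeps the output aligned with the spike train,
            # centring each MUAP on its discharge instant.
            recon[m, c] = np.convolve(spike_trains[m], muaps[m, c], mode="same")
    return recon


def window_images(recon, win, step):
    """Slice reconstructed signals into overlapping windows.

    Returns an array of shape (n_windows, n_mu, n_channels, win); each
    entry along the first axis is one set of MU-specific images that a
    CNN could take as input (assumption, not the paper's exact format).
    """
    n_samples = recon.shape[-1]
    starts = range(0, n_samples - win + 1, step)
    return np.stack([recon[..., s:s + win] for s in starts])
```

Unlike a DR feature, which reduces each window to a spike count per MU, this representation keeps the waveform around every discharge instant, which is the local information the abstract argues DR lacks. For example, calling `window_images(reconstruct_mu_signals(spikes, muaps), win=8, step=4)` on a 20-sample recording yields four windows per MU and channel.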