Brain-computer interfaces (BCIs) based on motor imagery (MI) electroencephalogram (EEG) decoding allow motor-disabled patients to communicate with external devices directly, enabling human-computer interaction and assisted living. The core problem of MI EEG decoding is extracting as many types of features as possible from multi-channel EEG time series in order to understand brain activity accurately. Recently, deep learning has been widely applied to EEG decoding. However, simple network architectures lack the flexibility required for this complex decoding task. This paper proposes a multi-scale fusion convolutional neural network based on the attention mechanism (MS-AMF). The network extracts spatiotemporal multi-scale features from signals representing multiple brain regions and is supplemented by a dense fusion strategy that retains the maximum information flow. The attention mechanism added to the network improves its sensitivity. Experimental results on the BCI Competition IV-2a dataset show that the network achieves better classification performance than the baseline methods. We also conducted visualization analysis on multiple parts of the network; the results show that the attention mechanism facilitates analysis of the underlying information flow in EEG decoding, verifying the effectiveness of the MS-AMF method.