论文信息 - MS-Net: A lightweight separable ConvNet for multi-dimensional image processing

MS-Net: A lightweight separable ConvNet for multi-dimensional image processing

As the core technology of deep learning, convolutional neural networks have been widely applied in a variety of computer vision tasks and have achieved state-of-the-art performance. However, it’s difficult and inefficient for them to deal with high dimensional image signals due to the dramatic increase of training parameters. In this paper, we present a lightweight and efficient MS-Net for the multi-dimensional(MD) image processing, which provides a promising way to handle MD images, especially for devices with limited computational capacity. It takes advantage of a series of one dimensional convolution kernels and introduces a separable structure in the ConvNet throughout the learning process to handle MD image signals. Meanwhile, multiple group convolutions with kernel size 1 × 1 are used to extract channel information. Then the information of each dimension and channel is fused by a fusion module to extract the complete image features. Thus the proposed MS-Net significantly reduces the training complexity, parameters and memory cost. The proposed MS-Net is evaluated on both 2D and 3D benchmarks CIFAR-10, CIFAR-100 and KTH. Extensive experimental results show that the MS-Net achieves competitive performance with greatly reduced computational and memory cost compared with the state-of-the-art ConvNet models.

Baocai Yin | Yunhui Shi | Jin Wang | Yingxuan Cui | Zhenning Hou

[1] Juan Carlos Niebles,et al. Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2008, International Journal of Computer Vision.

[2] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.

[4] Martin A. Fischler,et al. The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[5] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.

[6] Recent Advances in Computer Vision - Theories and Applications , 2019, Recent Advances in Computer Vision.

[7] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.