Efficient Capsule Network with Multi-Subspace Learning

The capsule network (CapsNet) is a promising direction in deep learning that addresses a fundamental shortcoming of Convolutional Neural Networks (CNNs): their limited ability to represent spatial relationships within feature maps and to encode corresponding positions. Yet its potential has not been fully realized. Almost all CapsNet variants, even those with only a few capsule layers, have parameter counts in the millions, demanding substantial computational resources and training time. In this paper, we propose the Multi-Head Self-Attention Capsule Network (MA-CapsNet) to address these drawbacks. We design a lateral inhibition scheme to highlight the target capsule. Unlike other CapsNets, we route capsules with multi-head self-attention rather than the recently popular tensor-convolution routing or the older iterative routing, reducing the entire network to about 19k parameters, less than 0.3% of the original CapsNet proposed by Hinton. Our routing mechanism strengthens the model's ability to attend to different positions in feature maps and expands the representation subspace of the attention layer. With MA-CapsNet, we surpass state-of-the-art CapsNet results on MNIST, FashionMNIST, and smallNORB with the fewest model parameters.
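To make the idea of attention-based capsule routing concrete, the sketch below shows one way a lower capsule layer could be routed to a higher one with multi-head self-attention. This is a minimal illustration, not the authors' MA-CapsNet layer: the learned parent queries, the projection dimensions, and the use of the standard squash nonlinearity are all assumptions made for demonstration.

```python
# Illustrative sketch of multi-head attention routing between capsule layers.
# NOT the exact MA-CapsNet design: parent queries, dimensions, and squash are assumed.
import torch
import torch.nn as nn


def squash(s, dim=-1, eps=1e-8):
    """Standard capsule squashing: preserves direction, maps the norm into [0, 1)."""
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)


class AttentionCapsuleRouting(nn.Module):
    """Routes lower-level (child) capsules to higher-level (parent) capsules
    using multi-head attention; the attention weights play the role of
    routing coefficients."""

    def __init__(self, in_dim, out_caps, out_dim, num_heads=4):
        super().__init__()
        # Learned queries act as seeds for the parent capsules (assumed design choice).
        self.parent_queries = nn.Parameter(torch.randn(out_caps, out_dim) * 0.02)
        self.proj_in = nn.Linear(in_dim, out_dim)  # lift child capsules to out_dim
        self.attn = nn.MultiheadAttention(out_dim, num_heads, batch_first=True)

    def forward(self, child_caps):
        # child_caps: (batch, in_caps, in_dim)
        b = child_caps.size(0)
        kv = self.proj_in(child_caps)                      # (batch, in_caps, out_dim)
        q = self.parent_queries.unsqueeze(0).expand(b, -1, -1)
        parents, _ = self.attn(q, kv, kv)                  # attention over child capsules
        return squash(parents)                             # (batch, out_caps, out_dim)


# Usage: 32 child capsules of dim 8 routed to 10 parent capsules of dim 16.
caps_in = torch.randn(2, 32, 8)
layer = AttentionCapsuleRouting(in_dim=8, out_caps=10, out_dim=16)
print(layer(caps_in).shape)  # torch.Size([2, 10, 16])
```

Because the routing weights come from a single attention pass rather than an iterative agreement loop, a layer of this kind needs only the query, key, value, and output projections, which is consistent with the small parameter budget the abstract describes.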