Do Capsule Networks Solve the Problem of Rotation Invariance for Traffic Sign Classification?

Detecting and classifying traffic signs is an important step toward autonomous driving. In contrast to earlier approaches based on handcrafted features, modern neural networks learn the class representations themselves. Current convolutional neural networks achieve very high accuracy in image classification, but they lack robustness to shift and rotation. This work evaluates Capsule Networks, a more recent technique, and compares the results with a standard Convolutional Neural Network and a Spatial Transformer Network. In addition, various methods for augmenting the training data are evaluated. The comparison shows the clear advantages of Capsule Networks as well as their limitations: they help considerably with the shift and rotation problems mentioned above, but their computational complexity is much higher than that of convolutional neural networks.
