Deep Quaternion Networks

The field of deep learning has advanced significantly in recent years, but most existing work has focused on real-valued models. Recent work has shown that a deep learning system using complex numbers can achieve greater depth for a fixed parameter budget than its real-valued counterpart. In this work, we explore the benefits of generalizing one step further, to the hypercomplex numbers, specifically the quaternions, and provide the architecture components needed to build deep quaternion networks. We develop the theoretical basis by reviewing quaternion convolutions and deriving a novel quaternion weight initialization scheme and novel algorithms for quaternion batch normalization. These components are tested by end-to-end training of a classification model on the CIFAR-10 and CIFAR-100 data sets and of a segmentation model on the KITTI Road Segmentation data set. The resulting quaternion networks show improved convergence compared to real-valued and complex-valued networks, especially on the segmentation task, while using fewer parameters.
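As a concrete illustration of the quaternion layers the abstract refers to, the minimal sketch below applies a quaternion-valued weight to a quaternion-valued input via the Hamilton product, the standard construction in the quaternion network literature: each quaternion tensor is stored as four real-valued arrays (r, i, j, k components), and the layer combines them with four shared real weight arrays. The function name `quaternion_linear` and the NumPy setup are illustrative choices, not from the paper; a quaternion convolution follows the same algebra with each matrix product replaced by a real-valued convolution.

```python
import numpy as np

def quaternion_linear(x, W):
    """Apply a quaternion-valued linear map via the Hamilton product.

    x: tuple (xr, xi, xj, xk) of arrays with shape (in_features,)
    W: tuple (Wr, Wi, Wj, Wk) of arrays with shape (out_features, in_features)

    Illustrative sketch: a quaternion convolution layer has the same
    structure, with each matrix-vector product below replaced by a
    real-valued convolution over the corresponding component maps.
    """
    xr, xi, xj, xk = x
    Wr, Wi, Wj, Wk = W
    # Hamilton product (Wr + Wi*i + Wj*j + Wk*k)(xr + xi*i + xj*j + xk*k),
    # expanded component by component:
    yr = Wr @ xr - Wi @ xi - Wj @ xj - Wk @ xk
    yi = Wr @ xi + Wi @ xr + Wj @ xk - Wk @ xj
    yj = Wr @ xj - Wi @ xk + Wj @ xr + Wk @ xi
    yk = Wr @ xk + Wi @ xj - Wj @ xi + Wk @ xr
    return yr, yi, yj, yk

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = tuple(rng.standard_normal(8) for _ in range(4))          # 8 quaternion inputs
    W = tuple(0.1 * rng.standard_normal((3, 8)) for _ in range(4))  # 3 quaternion outputs
    yr, yi, yj, yk = quaternion_linear(x, W)
    print(yr.shape)  # (3,)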
