Dynamic Steerable Frame Networks

Filters in a convolutional network are typically parametrized in a pixel basis. As an orthonormal basis, pixels can represent any vector in R^d. In this paper, we relax this orthonormality requirement and extend the set of viable bases to the more general notion of frames. When applying suitable frames to ResNets on CIFAR-10+, we demonstrate improved error rates by substitution only. By exploiting the transformation properties of such generalized bases, we arrive at steerable frames, which allow CNN filters to be continuously transformed under arbitrary Lie groups and, in turn, allow us to locally separate pose from canonical appearance. We implement this in the Dynamic Steerable Frame Network, which dynamically estimates the transformations of its filters conditioned on its input. The derived method is a hybrid of Dynamic Filter Networks and Spatial Transformer Networks that can be implemented in any convolutional architecture, as we illustrate in two examples. First, we illustrate the estimation properties of steerable frames by comparing a Dynamic Steerable Frame Network with a Dynamic Filter Network on the task of edge detection, where we show clear advantages of the derived steerable frames. Second, we insert the Dynamic Steerable Frame Network as a module in a convolutional LSTM on the task of limited-data hand-gesture recognition from video, where it provides effective dynamic regularization and shows clear advantages over Spatial Transformer Networks. In summary, this paper lays out the foundations of frame-based convolutional networks and Dynamic Steerable Frame Networks and illustrates their advantages for continuously transforming features and for data-efficient learning.
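To make the notion of steering concrete, the following is a minimal sketch of the classical steerability property for first-order Gaussian derivative filters, assuming rotation as the transformation group. It is not the authors' implementation: the function names, the 7x7 filter size, and sigma = 1.0 are illustrative assumptions. The sketch shows that a rotated filter can be obtained exactly as a linear combination of two fixed basis filters, with coefficients that depend only on the angle.

```python
import numpy as np


def _grid(size):
    """Centered integer pixel coordinates (x, y) on a size x size grid."""
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    return x, y


def gaussian_derivative_basis(size=7, sigma=1.0):
    """First-order Gaussian derivative filters (d/dx, d/dy).

    These two fixed filters play the role of the steerable basis
    (an illustrative choice, not taken from the paper).
    """
    x, y = _grid(size)
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    gx = -x / sigma ** 2 * g
    gy = -y / sigma ** 2 * g
    return gx, gy


def steer(gx, gy, theta):
    """Steer the derivative filter to angle theta (radians).

    Only the two mixing coefficients depend on theta; the basis
    filters themselves stay fixed.
    """
    return np.cos(theta) * gx + np.sin(theta) * gy


def directional_derivative(size, sigma, theta):
    """Reference: derivative of the Gaussian along direction theta,
    computed directly on the pixel grid."""
    x, y = _grid(size)
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    u = x * np.cos(theta) + y * np.sin(theta)
    return -u / sigma ** 2 * g


gx, gy = gaussian_derivative_basis(size=7, sigma=1.0)
steered = steer(gx, gy, theta=0.7)
reference = directional_derivative(7, 1.0, theta=0.7)
# The steered filter matches the directly computed one up to rounding.
max_abs_difference = np.max(np.abs(steered - reference))
```

In a dynamic setting, the angle (or more general transformation parameters) would be predicted per location by a small network instead of being fixed by hand, which is the role the abstract assigns to the Dynamic Steerable Frame Network.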
