A Mathematical Motivation for Complex-Valued Convolutional Networks

A complex-valued convolutional network (convnet) recursively applies a composition of three operations to an input vector of nonnegative real numbers: (1) convolution with complex-valued vectors, followed by (2) taking the absolute value of every entry of the resulting vectors, followed by (3) local averaging. For processing real-valued random vectors, complex-valued convnets can be viewed as data-driven multiscale windowed power spectra, data-driven multiscale windowed absolute spectra, data-driven multiwavelet absolute values, or (in their most general configuration) data-driven nonlinear multiwavelet packets. Indeed, complex-valued convnets can calculate multiscale windowed spectra when the convnet filters are windowed complex-valued exponentials. Standard real-valued convnets, which use, for example, rectified linear units (ReLUs), sigmoidal (e.g., logistic or tanh) nonlinearities, or max pooling, do not obviously exhibit the same exact correspondence with data-driven wavelets, whereas for complex-valued convnets the correspondence is much more than a vague analogy. Courtesy of this exact correspondence, the remarkably rich and rigorous body of mathematical analysis for wavelets applies directly to (complex-valued) convnets.
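As an illustration of the three operations named above, the following is a minimal NumPy sketch of a single layer, not the authors' implementation. The function names, the Hann window, the filter width, and the non-overlapping block averaging are all assumptions chosen for the example; in an actual convnet the filters would be learned from data rather than fixed as windowed complex exponentials.

```python
import numpy as np


def windowed_exponentials(width, num_freqs):
    """Hypothetical filter bank: a Hann window times complex exponentials.

    Row k holds the filter tuned to frequency k / width cycles per sample.
    """
    n = np.arange(width)
    window = np.hanning(width)
    freqs = np.arange(num_freqs) / width
    return window[None, :] * np.exp(-2j * np.pi * freqs[:, None] * n[None, :])


def convnet_layer(x, filters, pool):
    """One layer: (1) complex convolution, (2) absolute value, (3) local averaging."""
    # (1) Convolve the real input with every complex-valued filter.
    conv = np.stack([np.convolve(x, f, mode="valid") for f in filters])
    # (2) Take the absolute value of every entry.
    mag = np.abs(conv)
    # (3) Average over non-overlapping blocks of `pool` samples
    #     (an assumed, simple form of local averaging).
    usable = (mag.shape[1] // pool) * pool
    return mag[:, :usable].reshape(mag.shape[0], -1, pool).mean(axis=2)


# Example: a pure tone at 4/16 cycles per sample.
n = np.arange(128)
x = np.cos(2 * np.pi * (4 / 16) * n)
filters = windowed_exponentials(width=16, num_freqs=8)
out = convnet_layer(x, filters, pool=4)
strongest_channel = int(out.mean(axis=1).argmax())
```

With these filters, each output channel behaves like one band of a windowed absolute spectrum, so the channel tuned to the tone's frequency carries the largest response. Because every output entry is a nonnegative real number, each channel can be fed to another such layer, which is the recursion the abstract describes.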
