On Data-Driven Saak Transform

Motivated by the multilayer RECOS (REctified-COrrelations on a Sphere) transform, we develop a data-driven Saak (Subspace approximation with augmented kernels) transform in this work. The Saak transform consists of three steps: 1) building the optimal linear subspace approximation with orthonormal bases using the second-order statistics of the input vectors, 2) augmenting each transform kernel with its negative, and 3) applying the rectified linear unit (ReLU) to the transform output. The Karhunen-Loève transform (KLT) is used in the first step. The integration of Steps 2 and 3 is powerful since, together, they resolve the sign confusion problem, remove the rectification loss, and allow a straightforward implementation of the inverse Saak transform. Multiple Saak transforms are cascaded to transform images of a larger size. All Saak transform kernels are derived from the second-order statistics of the input random vectors in a one-pass feedforward manner; neither data labels nor backpropagation is used in kernel determination. Multi-stage Saak transforms offer a family of joint spatial-spectral representations between two extremes, namely, the full spatial-domain representation and the full spectral-domain representation. We select Saak coefficients of higher discriminant power to form a feature vector for pattern recognition, and use the MNIST digit classification problem as an illustrative example.
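
The three steps admit a compact sketch in NumPy. The following single-stage example is a reconstruction from the description above, not the authors' reference implementation; the function name saak_stage and the flattened-patch input format are assumptions made for illustration.

    import numpy as np

    def saak_stage(patches):
        """One Saak transform stage.

        patches: (n_samples, dim) array of flattened input vectors.
        Returns the rectified Saak coefficients and the augmented kernels.
        """
        # Step 1: KLT -- orthonormal bases from the second-order statistics
        # (eigenvectors of the covariance matrix of the input vectors).
        mean = patches.mean(axis=0)
        centered = patches - mean
        cov = centered.T @ centered / len(patches)
        eigvals, eigvecs = np.linalg.eigh(cov)
        kernels = eigvecs[:, ::-1].T   # rows = kernels, descending eigenvalue

        # Step 2: augment each kernel with its negative, so a projection that
        # is negative under a kernel becomes positive under its mirror.
        augmented = np.concatenate([kernels, -kernels], axis=0)

        # Step 3: project onto the augmented kernels and apply ReLU.
        # Thanks to the augmentation, no response is lost to rectification.
        coeffs = np.maximum(centered @ augmented.T, 0.0)
        return coeffs, augmented

Because max(x, 0) - max(-x, 0) = x, the signed KLT coefficients can be recovered as coeffs[:, :dim] - coeffs[:, dim:], and the orthonormality of the kernels then yields the inverse Saak transform with a single matrix product. This is why the kernel augmentation removes the rectification loss and makes the inverse transform straightforward.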
