On Data-Driven Saak Transform

Motivated by the multilayer RECOS (REctified-COrrelations on a Sphere) transform, we develop a data-driven Saak (Subspace approximation with augmented kernels) transform in this work. The Saak transform consists of three steps: 1) building the optimal linear subspace approximation with orthonormal bases using the second-order statistics of the input vectors, 2) augmenting each transform kernel with its negative, and 3) applying the rectified linear unit (ReLU) to the transform output. The Karhunen-Loève transform (KLT) is used in the first step. The integration of Steps 2 and 3 is powerful since, together, they resolve the sign confusion problem, remove the rectification loss, and allow a straightforward implementation of the inverse Saak transform. Multiple Saak transforms are cascaded to transform images of a larger size. All Saak transform kernels are derived from the second-order statistics of the input random vectors in a one-pass feedforward manner; neither data labels nor backpropagation is used in kernel determination. Multi-stage Saak transforms offer a family of joint spatial-spectral representations between two extremes, namely, the full spatial-domain representation and the full spectral-domain representation. We select Saak coefficients of higher discriminant power to form a feature vector for pattern recognition, and use the MNIST digit classification problem as an illustrative example.
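
The three steps admit a compact sketch in NumPy. The following single-stage example is a reconstruction from the description above, not the authors' reference implementation; the function name saak_stage and the flattened-patch input format are assumptions made for illustration.

    import numpy as np

    def saak_stage(patches):
        """One Saak transform stage.

        patches: (n_samples, dim) array of flattened input vectors.
        Returns the rectified Saak coefficients and the augmented kernels.
        """
        # Step 1: KLT -- orthonormal bases from the second-order statistics
        # (eigenvectors of the covariance matrix of the input vectors).
        mean = patches.mean(axis=0)
        centered = patches - mean
        cov = centered.T @ centered / len(patches)
        eigvals, eigvecs = np.linalg.eigh(cov)
        kernels = eigvecs[:, ::-1].T   # rows = kernels, descending eigenvalue

        # Step 2: augment each kernel with its negative, so a projection that
        # is negative under a kernel becomes positive under its mirror.
        augmented = np.concatenate([kernels, -kernels], axis=0)

        # Step 3: project onto the augmented kernels and apply ReLU.
        # Thanks to the augmentation, no response is lost to rectification.
        coeffs = np.maximum(centered @ augmented.T, 0.0)
        return coeffs, augmented

Because max(x, 0) - max(-x, 0) = x, the signed KLT coefficients can be recovered as coeffs[:, :dim] - coeffs[:, dim:], and the orthonormality of the kernels then yields the inverse Saak transform with a single matrix product. This is why the kernel augmentation removes the rectification loss and makes the inverse transform straightforward.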
