Invariant Scattering Convolution Networks

A wavelet scattering network computes a translation-invariant image representation that is stable to deformations and preserves high-frequency information for classification. It cascades wavelet transform convolutions with nonlinear modulus and averaging operators. The first network layer outputs SIFT-type descriptors, whereas the next layers provide complementary invariant information that improves classification. The mathematical analysis of wavelet scattering networks explains important properties of deep convolution networks for classification. A scattering representation of stationary processes incorporates higher-order moments and can thus discriminate textures that have the same Fourier power spectrum. State-of-the-art classification results are obtained for handwritten digits and texture discrimination, with a Gaussian kernel SVM and a generative PCA classifier.
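The cascade described above (wavelet convolution, modulus, averaging) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a 1D signal instead of an image, circular convolutions computed with the FFT, Gaussian band-pass filters as a stand-in for the wavelets, and a global mean in place of the local low-pass averaging. The names `filter_bank` and `scattering` and the filter parameters are hypothetical choices made for the example.

```python
import numpy as np

def filter_bank(n, num_scales):
    """Band-pass filters in the Fourier domain, one per dyadic scale.

    Assumption: Gaussian bumps centred at 0.25 / 2**j stand in for wavelets.
    """
    omega = np.fft.fftfreq(n)  # normalized frequencies in [-0.5, 0.5)
    filters = []
    for j in range(num_scales):
        center = 0.25 / 2 ** j
        width = center / 2
        filters.append(np.exp(-((omega - center) ** 2) / (2 * width ** 2)))
    return filters

def scattering(x, num_scales=4):
    """Order 0, 1 and 2 scattering-style coefficients of a 1D signal."""
    psi_hat = filter_bank(len(x), num_scales)
    x_hat = np.fft.fft(x)
    coeffs = [np.mean(x)]  # order 0: plain average of the signal
    for j1 in range(num_scales):
        # First layer: wavelet convolution followed by the modulus.
        u1 = np.abs(np.fft.ifft(x_hat * psi_hat[j1]))
        coeffs.append(u1.mean())  # averaging gives the invariant coefficient
        u1_hat = np.fft.fft(u1)
        # Second layer: filter the envelope again, at coarser scales only.
        for j2 in range(j1 + 1, num_scales):
            u2 = np.abs(np.fft.ifft(u1_hat * psi_hat[j2]))
            coeffs.append(u2.mean())
    return np.array(coeffs)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
s_original = scattering(x)
s_shifted = scattering(np.roll(x, 17))  # circularly translated copy
print(np.allclose(s_original, s_shifted))
```

Because the sketch averages over the whole signal and uses circular convolutions, its coefficients are unchanged by a circular shift of the input, up to floating-point error. A local averaging window would give invariance only up to the window scale, while keeping more spatial information.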
