Paper Information - Scattering Representations for Recognition

Scattering Representations for Recognition

This thesis addresses the problem of pattern and texture recognition from a mathematical perspective. These high-level tasks require signal representations with specific invariance, stability and consistency properties, which linear representations do not satisfy. Scattering operators cascade wavelet decompositions and complex modulus operations, followed by lowpass filtering. They define a non-linear representation which is locally translation invariant and Lipschitz continuous to the action of diffeomorphisms. They also define a texture representation that captures high-order moments and can be consistently estimated from few realizations. The thesis derives new mathematical properties of scattering representations and demonstrates their efficiency on pattern and texture recognition tasks. Because scattering representations are Lipschitz continuous to the action of diffeomorphisms, small deformations of the signal are linearized, which can be exploited in applications with a generative affine classifier, yielding state-of-the-art results on handwritten digit classification. Expected scattering representations are applied to image and auditory texture datasets, showing their capacity to capture high-order moment information with consistent estimators. Scattering representations are particularly efficient for the estimation and characterization of fractal parameters. A renormalization of scattering coefficients is introduced, giving new insight into fractal description, in particular the ability to characterize multifractal intermittency using consistent estimators.
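The cascade described in the abstract (wavelet decomposition, complex modulus, lowpass filtering) can be illustrated with a minimal 1D sketch. This is not the thesis's implementation: the filter bank below uses hypothetical Gaussian bumps in the Fourier domain rather than the wavelets used in the thesis, the function names `gaussian_bandpass_bank`, `gaussian_lowpass` and `scattering_1d` are invented for illustration, and restricting second-order paths to coarser scales (`j2 > j1`) is assumed as a common convention.

```python
import numpy as np

def gaussian_bandpass_bank(n, num_scales):
    """Hypothetical filter bank: one Gaussian bump per octave, defined in the Fourier domain."""
    freqs = np.fft.fftfreq(n)  # normalized frequencies in [-0.5, 0.5)
    bank = []
    for j in range(num_scales):
        center = 0.25 / 2 ** j  # center frequency halves at each scale
        bump = np.exp(-0.5 * ((freqs - center) / (center / 2.0)) ** 2)
        # Keep positive frequencies only: the filter is then analytic with zero mean,
        # so the complex modulus of its output behaves like an envelope.
        bump[freqs <= 0] = 0.0
        bank.append(bump)
    return bank

def gaussian_lowpass(n, num_scales):
    """Hypothetical averaging filter, narrower in frequency than the coarsest band-pass."""
    freqs = np.fft.fftfreq(n)
    sigma = 0.25 / 2 ** num_scales
    return np.exp(-0.5 * (freqs / sigma) ** 2)

def scattering_1d(x, num_scales=4):
    """Return order-0, order-1 and order-2 scattering-like coefficients of a 1D signal."""
    n = len(x)
    psis = gaussian_bandpass_bank(n, num_scales)
    phi = gaussian_lowpass(n, num_scales)

    def conv(signal, filt_hat):
        # Circular convolution computed as a product in the Fourier domain.
        return np.fft.ifft(np.fft.fft(signal) * filt_hat)

    def average(signal):
        # Final lowpass filtering, which provides the local translation invariance.
        return conv(signal, phi).real

    order0 = average(x)
    order1, order2 = [], []
    for j1, psi1 in enumerate(psis):
        u1 = np.abs(conv(x, psi1))  # wavelet decomposition followed by complex modulus
        order1.append(average(u1))
        for j2 in range(j1 + 1, num_scales):
            u2 = np.abs(conv(u1, psis[j2]))  # second layer of the cascade
            order2.append(average(u2))
    return order0, np.array(order1), np.array(order2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(256)
    s0, s1, s2 = scattering_1d(x)
    print(s0.shape, s1.shape, s2.shape)
```

Because every step is either a circular convolution or a pointwise modulus, shifting the input shifts each output by the same amount. The final lowpass then makes the coefficients vary slowly, so small shifts change them only slightly, which is the local translation invariance the abstract refers to.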

Joan Bruna
