A Theoretical Analysis of Optimization by Gaussian Continuation

Optimization by continuation is a widely used approach for solving nonconvex minimization problems. Although the method does not, in general, guarantee a global minimum, empirically it often reaches a better local minimum than alternatives such as gradient descent. Theoretical analysis of the method, however, is largely unavailable. Here we provide an analysis that bounds the end-point solution of the continuation method. The derived bound depends on a problem-specific characteristic that we refer to as optimization complexity. We show that this characteristic can be computed analytically when the objective function is expressed in terms of suitable basis functions. Our analysis combines elements of scale-space theory, regularization, and differential equations.
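To make the idea of Gaussian continuation concrete, the following is a minimal sketch under stated assumptions, not the algorithm or the bound analyzed in the paper. It assumes a hypothetical one-dimensional objective written as a sum of Gaussian basis functions, for which convolution with a Gaussian kernel of standard deviation `sigma` has a closed form (each variance simply grows by `sigma**2`). The names `TERMS`, `smoothed`, `smoothed_grad`, `descend`, and `continuation`, the step size, and the `sigma` schedule are all illustrative choices.

```python
import math

# Hypothetical objective (an assumption for illustration):
#   f(x) = sum_k a_k * exp(-(x - c_k)^2 / (2 * s_k^2))
# Each tuple is (a_k, c_k, s_k): a wide, deep well at x = 2 and
# a narrow, shallower well at x = -2.
TERMS = [(-2.0, 2.0, 1.0), (-1.0, -2.0, 0.3)]


def smoothed(x, sigma):
    """f convolved with a unit-mass Gaussian of std sigma (closed form).

    With sigma = 0 this is the original objective f.
    """
    total = 0.0
    for a, c, s in TERMS:
        var = s * s + sigma * sigma  # variances add under convolution
        total += a * s / math.sqrt(var) * math.exp(-(x - c) ** 2 / (2.0 * var))
    return total


def smoothed_grad(x, sigma):
    """Derivative of smoothed(x, sigma) with respect to x."""
    total = 0.0
    for a, c, s in TERMS:
        var = s * s + sigma * sigma
        bump = a * s / math.sqrt(var) * math.exp(-(x - c) ** 2 / (2.0 * var))
        total += bump * (-(x - c) / var)
    return total


def descend(x, sigma, lr=0.05, steps=2000):
    """Plain gradient descent on the objective smoothed at a fixed sigma."""
    for _ in range(steps):
        x -= lr * smoothed_grad(x, sigma)
    return x


def continuation(x0, sigmas=(3.0, 2.0, 1.0, 0.5, 0.25, 0.0)):
    """Minimize a heavily smoothed objective first, then track the minimizer
    as sigma shrinks to zero, warm-starting each stage from the previous one."""
    x = x0
    for sigma in sigmas:
        x = descend(x, sigma)
    return x


# Both runs start in the narrow well at x = -2.
x_plain = descend(-2.0, 0.0)   # gradient descent on f itself
x_cont = continuation(-2.0)    # Gaussian continuation
print(x_plain, smoothed(x_plain, 0.0))
print(x_cont, smoothed(x_cont, 0.0))
```

In this toy setting, gradient descent started in the narrow well stays there, while the continuation run is carried toward the wider, deeper well because heavy smoothing flattens the narrow one. This illustrates the empirical behavior described above; it is not a guarantee, and a different objective or schedule could behave differently.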
