A lifted Bregman formulation for the inversion of deep neural networks

We propose a novel framework for the regularised inversion of deep neural networks. The framework builds on the authors' recent work on training feed-forward neural networks without differentiating activation functions. It lifts the parameter space into a higher-dimensional space by introducing auxiliary variables, and penalises these variables with tailored Bregman distances. We propose a family of variational regularisations based on these Bregman distances, present theoretical results, and support their practical application with numerical examples. In particular, we present what is, to the best of our knowledge, the first convergence result for the regularised inversion of a single-layer perceptron that assumes only that the solution of the inverse problem lies in the range of the regularisation operator. The result shows that the regularised inverse provably converges to the true inverse as the measurement errors converge to zero.
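To make the idea concrete, the following is a minimal sketch of regularised inversion for a single-layer perceptron, not the authors' implementation. It rests on several assumptions that the abstract does not state: a ReLU activation, a squared-norm (Tikhonov-type) regulariser, and plain gradient descent as the solver. All function names are hypothetical. For ReLU, one Bregman-type penalty that vanishes exactly when y = relu(z) is B(y, z) = ½‖y‖² − ⟨y, z⟩ + ½‖relu(z)‖² for y ≥ 0. Its gradient with respect to z is relu(z) − y, so the activation function itself is never differentiated.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def bregman_relu(y, z):
    """Assumed Bregman-type penalty for ReLU (valid for y >= 0).

    It is non-negative and equals zero exactly when y == relu(z).
    """
    return 0.5 * y @ y - y @ z + 0.5 * relu(z) @ relu(z)

def invert_perceptron(W, b, y, alpha, n_iter=20000):
    """Minimise B(y, W x + b) + (alpha / 2) * ||x||^2 over x by gradient descent.

    The objective is convex in x; its gradient uses relu but not relu'.
    """
    # Upper bound on the Lipschitz constant of the gradient (relu is 1-Lipschitz).
    lipschitz = np.linalg.norm(W, 2) ** 2 + alpha
    x = np.zeros(W.shape[1])
    for _ in range(n_iter):
        grad = W.T @ (relu(W @ x + b) - y) + alpha * x
        x -= grad / lipschitz
    return x

# Noise-free toy problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
W = rng.normal(size=(20, 4))
x_true = rng.normal(size=4)
# Bias chosen so that every unit is active at x_true, which keeps the
# toy problem uniquely solvable.
b = np.abs(W) @ np.abs(x_true) + 1.0
y = relu(W @ x_true + b)

x_rec = invert_perceptron(W, b, y, alpha=1e-6)
print(np.linalg.norm(x_rec - x_true))
```

In this toy setting the data are exact and the regularisation parameter is small, so the reconstruction should lie close to `x_true`. With noisy data, `alpha` would have to be chosen in relation to the noise level, which is the regime the convergence result in the abstract addresses.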
