Constrained deep networks: Lagrangian optimization via Log-barrier extensions

This study investigates imposing hard inequality constraints on the outputs of convolutional neural networks (CNNs) during training. Several recent works showed that the theoretical advantages of Lagrangian optimization over simple penalties do not materialize in practice with modern CNNs involving millions of parameters; constrained CNNs are therefore typically handled with penalties. We propose log-barrier extensions, which approximate Lagrangian optimization of constrained-CNN problems with a sequence of unconstrained losses. Unlike standard interior-point and log-barrier methods, our formulation does not require an initial feasible solution. The proposed extension yields an upper bound on the duality gap, generalizing the result of standard log-barriers and providing sub-optimality certificates for feasible solutions. While sub-optimality is not guaranteed for non-convex problems, this result shows that log-barrier extensions are a principled way to approximate Lagrangian optimization for constrained CNNs via implicit dual variables. We report weakly supervised image segmentation experiments, with various constraints, showing that our formulation substantially outperforms existing constrained-CNN methods in terms of accuracy, constraint satisfaction, and training stability, more so when dealing with a large number of constraints.
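As a minimal sketch of the idea, the penalty below follows the usual log-barrier-extension construction for a constraint z ≤ 0: the standard log barrier is used inside the feasible region, and a linear function (matched in value and slope at the switch point z = -1/t²) extends it to infeasible points, so training never needs a feasible starting point. The exact switch point and constants are assumptions based on that standard construction, not quoted from the abstract.

```python
import math

def log_barrier_extension(z: float, t: float) -> float:
    """Penalty for a constraint z <= 0, parameterized by barrier weight t > 0.

    - For z <= -1/t**2: the standard log barrier -(1/t) * log(-z).
    - Otherwise: a linear extension with slope t, chosen so that the
      function stays continuous and differentiable at z = -1/t**2.
    As t grows, the penalty approaches the hard indicator of z <= 0.
    """
    if z <= -1.0 / t**2:
        return -math.log(-z) / t
    # Linear extension: t*z plus a constant matching the barrier's
    # value at the switch point z = -1/t**2.
    return t * z - math.log(1.0 / t**2) / t + 1.0 / t
```

In practice the same piecewise expression would be written with differentiable tensor operations so that gradients flow to the network outputs, and t would be increased over a schedule of training epochs, mimicking the sequence of unconstrained problems in interior-point methods.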
