Homogeneous Linear Inequality Constraints for Neural Network Activations

We propose a method to impose homogeneous linear inequality constraints of the form Ax ≤ 0 on neural network activations. The method allows data-driven training to be combined with prior knowledge about the task. One way to impose such constraints is a projection step at test time after unconstrained training; however, projection is an expensive operation. By incorporating the constraints directly into the architecture, we significantly speed up inference at test time; our experiments show a speed-up of up to two orders of magnitude over a projection method. Our algorithm computes a suitable parameterization of the feasible set at initialization and uses standard variants of stochastic gradient descent to train the constrained network, so the modeling constraints are satisfied throughout training. Crucially, the approach avoids solving an optimization problem at each training step and does not require manually trading off data fidelity against constraint fidelity with additional hyperparameters. We consider constrained generative modeling as an important application domain and demonstrate the method experimentally by constraining a variational autoencoder.
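To make the parameterization idea concrete, here is a minimal PyTorch sketch; the class and variable names are illustrative, not the authors' implementation. It assumes the feasible cone {x : Ax ≤ 0} has been converted once at initialization into a matrix R whose columns are generating rays (such a representation can be computed, e.g., with the double description method). Nonnegative combinations of the rays cover the cone, so every layer output satisfies the constraints by construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConeActivation(nn.Module):
    """Hypothetical layer producing activations inside the polyhedral
    cone {x : A x <= 0}. The cone is represented by a fixed matrix of
    generating rays R (one ray per column), computed at initialization.
    Since A r <= 0 for every ray r, any nonnegative combination R @ c
    also satisfies A (R @ c) <= 0, so the constraint holds by
    construction and no projection is needed at test time."""

    def __init__(self, rays: torch.Tensor):
        super().__init__()
        # rays: (dim, num_rays); registered as a buffer so it moves
        # with the module (CPU/GPU) but receives no gradient updates.
        self.register_buffer("R", rays)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Softplus maps the unconstrained pre-activation z to
        # nonnegative combination coefficients c >= 0.
        c = F.softplus(z)              # (batch, num_rays)
        return c @ self.R.t()          # (batch, dim), satisfies Ax <= 0


# Example: the cone {x : -x <= 0} = {x >= 0} in 3D is generated by the
# standard basis vectors, so R is the identity matrix.
layer = ConeActivation(torch.eye(3))
x = layer(torch.randn(4, 3))
assert (x >= 0).all()
```

Because the ray matrix is fixed, ordinary stochastic gradient descent on the coefficients trains the network while keeping every intermediate output feasible, matching the abstract's claim that constraints are satisfied throughout training.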
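For contrast, the test-time projection baseline the abstract refers to amounts to solving a quadratic program per sample, min_x ‖x − y‖² subject to Ax ≤ 0. The sketch below poses it with the OSQP solver; the exact baseline setup used in the paper is an assumption here.

```python
import numpy as np
import scipy.sparse as sp
import osqp

def project_to_cone(y: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Euclidean projection of y onto {x : A x <= 0}, posed as the QP
    min 0.5 ||x - y||^2  s.t.  A x <= 0. Solving such a QP for every
    sample at test time is what the architectural parameterization
    avoids (hypothetical baseline, assumed setup)."""
    n, d = A.shape
    prob = osqp.OSQP()
    prob.setup(P=sp.identity(d, format="csc"),   # quadratic term: 0.5 x^T x
               q=-y,                             # linear term: -y^T x
               A=sp.csc_matrix(A),
               l=-np.inf * np.ones(n),           # -inf <= A x
               u=np.zeros(n),                    #         A x <= 0
               verbose=False)
    return prob.solve().x

# Example: projecting onto {x >= 0} (A = -I) clips negatives to ~0.
print(project_to_cone(np.array([1.0, -2.0, 3.0]), -np.eye(3)))
```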
