ReLU Networks as Surrogate Models in Mixed-Integer Linear Programs

We consider the embedding of piecewise-linear deep neural networks (ReLU networks) as surrogate models in mixed-integer linear programming (MILP) problems. A MILP formulation of ReLU networks has recently been applied by many authors to probe for various model properties subject to input bounds. The formulation is obtained by modeling each ReLU operator with a binary variable and applying the big-M method. The efficiency of the formulation hinges on the tightness of the bounds defined by the big-M values. When ReLU networks are embedded in a larger optimization problem, output bounds are also present and can be exploited in bound tightening. To this end, we devise and study several bound tightening procedures that consider both input and output bounds. Our numerical results show that bound tightening may reduce solution times considerably, and that small ReLU networks are suitable as surrogate models in mixed-integer linear programs.
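To make the role of the big-M values concrete, the following is a minimal sketch, not the procedure studied in the paper. It assumes the textbook big-M encoding of a single ReLU, y = max(0, s) with pre-activation bounds L <= s <= U and a binary z: y >= 0, y >= s, y <= s - L(1 - z), y <= U z. It also assumes the simplest way of obtaining L and U, namely interval arithmetic over box input bounds. The function names `interval_bounds` and `big_m_feasible` are hypothetical and chosen for illustration only.

```python
def interval_bounds(weights, biases, x_lo, x_hi):
    """Propagate box input bounds through a ReLU network by interval arithmetic.

    weights[k][j][i] is the weight from input i to neuron j of layer k.
    Returns, per layer, a list of (L, U) pre-activation bounds; these are
    valid (though possibly loose) big-M values for each ReLU.
    """
    lo, hi = list(x_lo), list(x_hi)
    all_bounds = []
    for W, b in zip(weights, biases):
        layer = []
        for w_row, b_j in zip(W, b):
            # A positive weight attains its minimum at the lower input bound,
            # a negative weight at the upper input bound (and vice versa).
            L = b_j + sum(w * (lo[i] if w >= 0 else hi[i]) for i, w in enumerate(w_row))
            U = b_j + sum(w * (hi[i] if w >= 0 else lo[i]) for i, w in enumerate(w_row))
            layer.append((L, U))
        all_bounds.append(layer)
        # Bounds on the ReLU outputs become the input bounds of the next layer.
        lo = [max(0.0, L) for L, _ in layer]
        hi = [max(0.0, U) for _, U in layer]
    return all_bounds


def big_m_feasible(s, y, z, L, U, tol=1e-9):
    """Check the assumed big-M constraints for y = max(0, s), z in {0, 1}."""
    return (
        y >= -tol
        and y >= s - tol
        and y <= s - L * (1 - z) + tol
        and y <= U * z + tol
    )


# One layer with two neurons, inputs restricted to the box [-1, 1]^2.
weights = [[[1.0, -1.0], [2.0, 1.0]]]
biases = [[0.0, -1.0]]
bounds = interval_bounds(weights, biases, [-1.0, -1.0], [1.0, 1.0])
```

With these bounds, a point (s, y, z) satisfies the constraints only when y equals max(0, s) and z indicates whether the neuron is active. Smaller |L| and U give a tighter linear relaxation, which is why bound tightening can shorten solution times.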
