Constrained Machine Learning: The Bagel Framework

Machine learning models are widely used in real-world applications such as document analysis and computer vision. Constrained machine learning problems are problems in which learned models must both be accurate and satisfy constraints. For continuous convex constraints, many methods have been proposed, but learning under combinatorial constraints remains a hard problem. The goal of this paper is to broaden the modeling capacity of constrained machine learning problems by incorporating existing work from combinatorial optimization. We first propose a general framework called BaGeL (Branch, Generate and Learn), which applies Branch and Bound to constrained learning problems: at each node of the search tree, a learning problem is generated and trained, and the search continues until only valid models remain. Because machine learning has specific requirements, we also propose an extended table constraint to split the hypothesis space. We validate the approach on two examples: linear regression under configuration constraints and non-negative matrix factorization with prior knowledge for latent semantic analysis.
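To make the Branch-and-Bound loop described above concrete, here is a minimal Python sketch of a generic search over hypothesis subspaces. It is an illustration under stated assumptions, not the paper's implementation: the callables `train`, `loss`, `violates`, and `split` are hypothetical placeholders, and pruning is sound only if `train` solves a relaxation whose loss lower-bounds every constraint-valid model in the subspace.

```python
# Minimal sketch of a Branch-and-Bound loop over hypothesis subspaces in the
# spirit of BaGeL (Branch, Generate and Learn). Hypothetical interface, not
# the authors' API: `train`, `loss`, `violates`, and `split` are placeholders.

def bagel(root_space, train, loss, violates, split):
    """Search hypothesis subspaces, keeping the best constraint-valid model.

    root_space -- the full hypothesis space (e.g. a constrained model family)
    train      -- solves the relaxed learning problem on a subspace; assumed
                  to lower-bound every constrained model in that subspace
    loss       -- objective value of a trained model (lower is better)
    violates   -- True if the model breaks a combinatorial constraint
    split      -- partitions a subspace into smaller subspaces (branching)
    """
    best_model, best_loss = None, float("inf")
    frontier = [root_space]                    # depth-first stack of subspaces
    while frontier:
        space = frontier.pop()
        model = train(space)                   # "Generate and Learn" at a node
        bound = loss(model)                    # lower bound for this subspace
        if bound >= best_loss:
            continue                           # prune: cannot beat incumbent
        if not violates(model):
            best_model, best_loss = model, bound   # new valid incumbent
        else:
            frontier.extend(split(space, model))   # "Branch" on the violation
    return best_model
```

In this reading, the table-constraint splitting mentioned in the abstract would play the role of `split`, carving the hypothesis space into subspaces on which new learning problems are generated.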
