Optimization Models for Machine Learning: A Survey
Joe Naoum-Sawaya | Bissan Ghaddar | Claudio Gambella
[1] F. Sibel Salman,et al. A mixed-integer programming approach to the clustering problem with an application in customer segmentation , 2006, Eur. J. Oper. Res..
[2] Edoardo Amaldi,et al. A distance-based point-reassignment heuristic for the k-hyperplane clustering problem , 2013, Eur. J. Oper. Res..
[3] Qiang Ji,et al. Efficient Structure Learning of Bayesian Networks using Constraints , 2011, J. Mach. Learn. Res..
[4] Pierre Hansen,et al. An improved column generation algorithm for minimum sum-of-squares clustering , 2009, Math. Program..
[5] H. Zou,et al. Regularization and variable selection via the elastic net , 2005 .
[6] Pierre Hansen,et al. Cluster analysis and mathematical programming , 1997, Math. Program..
[7] Erwin Pesch,et al. Fast Clustering Algorithms , 1994, INFORMS J. Comput..
[8] Martin Wistuba,et al. A Survey on Neural Architecture Search , 2019, ArXiv.
[9] Anil K. Jain,et al. Data clustering: a review , 1999, CSUR.
[10] Alan Julian Izenman,et al. Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning , 2008 .
[11] Michael Kearns,et al. On the complexity of teaching , 1991, COLT '91.
[12] Ryuhei Miyashiro,et al. Mixed integer second-order cone programming formulations for variable selection in linear regression , 2015, Eur. J. Oper. Res..
[13] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[14] Mohammad Azad,et al. Minimization of decision tree depth for multi-label decision tables , 2014, 2014 IEEE International Conference on Granular Computing (GrC).
[15] Keinosuke Fukunaga,et al. Introduction to Statistical Pattern Recognition , 1972 .
[16] Stefan Feuerriegel,et al. Deep learning in business analytics and operations research: Models, applications and managerial implications , 2018, Eur. J. Oper. Res..
[17] Robert Tibshirani,et al. 1-norm Support Vector Machines , 2003, NIPS.
[18] O. Mangasarian,et al. Robust linear programming discrimination of two linearly inseparable sets , 1992 .
[19] Balas K. Natarajan,et al. Sparse Approximate Solutions to Linear Systems , 1995, SIAM J. Comput..
[20] Gerhard Widmer,et al. Prediction of Ordinal Classes Using Regression Trees , 2001, Fundam. Informaticae.
[21] Frank Hutter,et al. Neural Architecture Search: A Survey , 2018, J. Mach. Learn. Res..
[22] Heng Tao Shen,et al. Principal Component Analysis , 2009, Encyclopedia of Biometrics.
[23] Xiaojin Zhu,et al. Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education , 2015, AAAI.
[24] Wei-Yin Loh,et al. Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..
[25] Alejandro Toriello,et al. Fitting piecewise linear continuous functions , 2012, Eur. J. Oper. Res..
[26] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[27] Yizhen Wang,et al. Data Poisoning Attacks against Online Learning , 2018, ArXiv.
[28] Andrea Lodi,et al. On learning and branching: a survey , 2017 .
[29] Emilio Carrizosa,et al. Biobjective sparse principal component analysis , 2014, J. Multivar. Anal..
[30] Lucila Ohno-Machado,et al. Logistic regression and artificial neural network classification models: a methodology review , 2002, J. Biomed. Informatics.
[31] Paulo Cortez,et al. Modeling wine preferences by data mining from physicochemical properties , 2009, Decis. Support Syst..
[32] Sheila A. McIlraith,et al. Training Binarized Neural Networks Using MIP and CP , 2019, CP.
[33] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[34] M. Florian,et al. THE NONLINEAR BILEVEL PROGRAMMING PROBLEM: FORMULATIONS, REGULARITY AND OPTIMALITY CONDITIONS , 1993 .
[35] Dimitris Bertsimas,et al. Characterization of the equivalence of robustification and regularization in linear and matrix regression , 2017, Eur. J. Oper. Res..
[36] Ender Özcan,et al. A review on the self and dual interactions between machine learning and optimisation , 2019, Progress in Artificial Intelligence.
[37] Dimitris Bertsimas,et al. OR Forum - An Algorithmic Approach to Linear Regression , 2016, Oper. Res..
[38] Amir Globerson,et al. Nightmare at test time: robust learning by feature deletion , 2006, ICML.
[39] Tommi S. Jaakkola,et al. Learning Bayesian Network Structure using LP Relaxations , 2010, AISTATS.
[40] Lin Bai,et al. Learning More Robust Features with Adversarial Training , 2018, ArXiv.
[41] Velibor V. Misic,et al. Optimization of Tree Ensembles , 2017, Oper. Res..
[42] Shuichi Kawano,et al. Sparse principal component regression for generalized linear models , 2016, Comput. Stat. Data Anal..
[43] Yancong Deng,et al. Few Shot Learning Based on the Street View House Numbers (SVHN) Dataset , 2021 .
[44] Been Kim,et al. Towards A Rigorous Science of Interpretable Machine Learning , 2017, 1702.08608.
[45] Le Song,et al. Learning to Branch in Mixed Integer Programming , 2016, AAAI.
[46] Yoshua Bengio,et al. Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..
[47] Bernhard Schölkopf,et al. A tutorial on support vector regression , 2004, Stat. Comput..
[48] Mohammad Azad,et al. Minimization of Decision Tree Average Depth for Decision Tables with Many-valued Decisions , 2014, KES.
[49] Xiaonan Li,et al. Operations research and data mining , 2008, Eur. J. Oper. Res..
[50] Dimitris Bertsimas,et al. From Predictive to Prescriptive Analytics , 2014, Manag. Sci..
[51] Christopher Meek,et al. Adversarial learning , 2005, KDD '05.
[52] Justo Puerto,et al. Locating hyperplanes to fitting set of points: A general framework , 2018, Comput. Oper. Res..
[53] Andrea Lodi,et al. Learning MILP Resolution Outcomes Before Reaching Time-Limit , 2019, CPAIOR.
[54] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[55] H. Crowder,et al. Cluster Analysis: An Application of Lagrangian Relaxation , 1979 .
[56] Thore Graepel,et al. Large Margin Rank Boundaries for Ordinal Regression , 2000 .
[57] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[58] Yann LeCun,et al. Generalization and network design strategies , 1989 .
[59] Y. LeCun,et al. Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..
[60] Zoubin Ghahramani,et al. Unifying linear dimensionality reduction , 2014, 1406.0873.
[61] Loo Hay Lee,et al. Enhancing transportation systems via deep learning: A survey , 2019, Transportation Research Part C: Emerging Technologies.
[62] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[63] D. Bertsimas,et al. Best Subset Selection via a Modern Optimization Lens , 2015, 1507.03133.
[64] Akihiko Konagaya,et al. Improvements to the cluster Newton method for underdetermined inverse problems , 2015, J. Comput. Appl. Math..
[65] Uri Shaham,et al. Understanding adversarial training: Increasing local stability of supervised models through robust optimization , 2015, Neurocomputing.
[66] Ken Kobayashi,et al. BEST SUBSET SELECTION FOR ELIMINATING MULTICOLLINEARITY , 2017 .
[67] P Baldi,et al. Enhanced Higgs boson to τ+τ− search with deep learning. , 2014, Physical review letters.
[68] Daniel Aloise,et al. A Model for Clustering Data from Heterogeneous Dissimilarities , 2016, Eur. J. Oper. Res..
[69] Ronald L. Rivest,et al. Constructing Optimal Binary Decision Trees is NP-Complete , 1976, Inf. Process. Lett..
[70] Kristin P. Bennett,et al. Model selection for primal SVM , 2011, Machine Learning.
[71] Vince D. Calhoun,et al. A kernel machine method for detecting higher order interactions in multimodal datasets: Application to schizophrenia , 2018, Journal of Neuroscience Methods.
[72] Nuno Vasconcelos,et al. Direct convex relaxations of sparse SVM , 2007, ICML '07.
[73] Jihun Hamm,et al. K-Beam Subgradient Descent for Minimax Optimization , 2018, ICML 2018.
[74] Pierre Bonami,et al. On mathematical programming with indicator constraints , 2015, Math. Program..
[75] G. D. H. Claassen,et al. An application of Special Ordered Sets to a periodic milk collection problem , 2007, Eur. J. Oper. Res..
[76] L. A. Cox,et al. Heuristic least-cost computation of discrete classification functions with uncertain argument values , 1990 .
[77] I. Grossmann. Review of Nonlinear Mixed-Integer and Disjunctive Programming Techniques , 2002 .
[78] Eduardo Sontag,et al. A Comparison of the Computational Power of Sigmoid and Boolean Threshold Circuits , 1994 .
[79] Xiaojin Zhu,et al. Optimal Teaching for Online Perceptrons , 2016 .
[80] Ralf Herbrich,et al. Learning Kernel Classifiers: Theory and Algorithms , 2001 .
[81] Gilles Louppe,et al. Independent consultant , 2013 .
[82] Tobias Scheffer,et al. Stackelberg games for adversarial prediction problems , 2011, KDD.
[83] Jens Lagergren,et al. Learning Bounded Tree-width Bayesian Networks using Integer Linear Programming , 2014, AISTATS.
[84] Blaine Nelson,et al. The security of machine learning , 2010, Machine Learning.
[85] Sergio García,et al. A mixed integer linear model for clustering with variable selection , 2014, Comput. Oper. Res..
[86] Honglak Lee,et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.
[87] Bradley C. Love,et al. Optimal Teaching for Limited-Capacity Human Learners , 2014, NIPS.
[88] Marianthi G. Ierapetritou,et al. Resolution method for mixed integer bi-level linear problems based on decomposition technique , 2009, J. Glob. Optim..
[89] Jonathan F. Bard,et al. An algorithm for the mixed-integer nonlinear bilevel programming problem , 1992, Ann. Oper. Res..
[90] Dimitris Bertsimas,et al. Classification and Regression via Integer Optimization , 2007, Oper. Res..
[91] Bruno Simeone,et al. Clustering heuristics for set covering , 1993, Ann. Oper. Res..
[92] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[93] Sandra Zilles,et al. An Overview of Machine Teaching , 2018, ArXiv.
[94] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[95] Akihiko Konagaya,et al. Cluster Newton Method for Sampling Multiple Solutions of Underdetermined Inverse Problems: Application to a Parameter Identification Problem in Pharmacokinetics , 2014, SIAM J. Sci. Comput..
[96] R. Tibshirani,et al. Extended Comparisons of Best Subset Selection, Forward Stepwise Selection, and the Lasso , 2017, 1707.08692.
[97] T. T. Narendran,et al. CLOVES: A cluster-and-search heuristic to solve the vehicle routing problem with delivery and pick-up , 2007, Eur. J. Oper. Res..
[98] Kristin P. Bennett,et al. Decision Tree Construction Via Linear Programming , 1992 .
[99] Jason Weston,et al. Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.
[100] Fabio Roli,et al. Multiple Classifier Systems under Attack , 2010, MCS.
[101] Hai Zhao,et al. A special ordered set approach for optimizing a discontinuous separable piecewise linear function , 2008, Oper. Res. Lett..
[102] Yoshua Bengio,et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.
[103] Dorit S. Hochbaum,et al. A comparative study of the leading machine learning techniques and two new optimization algorithms , 2019, Eur. J. Oper. Res..
[104] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[105] Haldun Aytug,et al. Feature selection for support vector machines using Generalized Benders Decomposition , 2015, Eur. J. Oper. Res..
[106] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[107] Selwyn Piramuthu. Evaluating feature selection methods for learning in data mining applications , 2004, Eur. J. Oper. Res..
[108] Massih-Reza Amini,et al. Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization , 2009, NIPS.
[109] Milosz Kadzinski,et al. Robust ordinal regression in preference learning and ranking , 2013, Machine Learning.
[110] Dan Boneh,et al. Ensemble Adversarial Training: Attacks and Defenses , 2017, ICLR.
[111] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[112] Roland Vollgraf,et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms , 2017, ArXiv.
[113] T. Klastorin. The p-Median Problem for Cluster Analysis: A Comparative Test Using the Mixture Model Approach , 1985 .
[114] David A. Wagner,et al. Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).
[115] Christodoulos A. Floudas,et al. Global Optimization of Nonlinear Bilevel Programming Problems , 2001, J. Glob. Optim..
[116] Junlong Zhang,et al. A Branch-and-cut Algorithm for Discrete Bilevel Linear Programs , 2017 .
[117] Christian Tjandraatmadja,et al. Bounding and Counting Linear Regions of Deep Neural Networks , 2017, ICML.
[118] Trevor Hastie,et al. An Introduction to Statistical Learning , 2013, Springer Texts in Statistics.
[119] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[120] R. Tibshirani,et al. REJOINDER TO "LEAST ANGLE REGRESSION" BY EFRON ET AL. , 2004, math/0406474.
[121] Martine Labbé,et al. Lagrangian relaxation for SVM feature selection , 2017, Comput. Oper. Res..
[122] Grant Potter,et al. ConvNetJS: Deep Learning in your browser , 2017 .
[123] Stephen P. Boyd,et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..
[124] S. Dempe,et al. Bilevel programming with discrete lower level problems , 2009 .
[125] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[126] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[127] Luca Pulina,et al. Automated Verification of Neural Networks: Advances, Challenges and Perspectives , 2018, ArXiv.
[128] O. Mangasarian,et al. Massive data discrimination via linear support vector machines , 2000 .
[129] Haytham Elghazel,et al. A hybrid algorithm for Bayesian network structure learning with application to multi-label learning , 2014, Expert Syst. Appl..
[130] Laetitia Vermeulen-Jourdan,et al. Synergies between operations research and data mining: The emerging use of multi-objective approaches , 2012, Eur. J. Oper. Res..
[131] H. Robbins. A Stochastic Approximation Method , 1951 .
[132] Aleksander Madry,et al. Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.
[133] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[134] Emilio Carrizosa,et al. Detecting relevant variables and interactions in supervised classification , 2011, Eur. J. Oper. Res..
[135] James Cussens,et al. Integer Linear Programming for the Bayesian network structure learning problem , 2017, Artif. Intell..
[136] Emilio Carrizosa,et al. Optimal randomized classification trees , 2021, Comput. Oper. Res..
[137] Fu Jie Huang,et al. A Tutorial on Energy-Based Learning , 2006 .
[138] J. Paul Brooks,et al. Principal Component Analysis and Optimization: A Tutorial , 2015 .
[139] James Cussens,et al. Bayesian network learning with cutting planes , 2011, UAI.
[140] Abolfazl Keshvari. Segmented concave least squares: A nonparametric piecewise linear regression , 2018, Eur. J. Oper. Res..
[141] Shuichi Kawano,et al. Sparse principal component regression with adaptive loading , 2014, Comput. Stat. Data Anal..
[142] S. Chatterjee,et al. Regression Analysis by Example , 1979 .
[143] Percy Liang,et al. Certified Defenses for Data Poisoning Attacks , 2017, NIPS.
[144] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[145] Kristin P. Bennett,et al. Optimal Decision Trees , 1996 .
[146] Jian Yang,et al. Complete large margin linear discriminant analysis using mathematical programming approach , 2013, Pattern Recognit..
[147] Ken Kobayashi,et al. Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor , 2018, Journal of Global Optimization.
[148] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[149] Mohammad Azad,et al. Multi-stage optimization of decision and inhibitory trees for decision tables with many-valued decisions , 2017, Eur. J. Oper. Res..
[150] Tobias Scheffer,et al. Static prediction games for adversarial learning problems , 2012, J. Mach. Learn. Res..
[151] Ohad Shamir,et al. Learning to classify with missing and corrupted features , 2008, ICML '08.
[152] William S. Meisel,et al. An Algorithm for Constructing Optimal Binary Decision Trees , 1977, IEEE Transactions on Computers.
[153] Fabio Roli,et al. Evasion Attacks against Machine Learning at Test Time , 2013, ECML/PKDD.
[154] Achille Fokoue,et al. An effective algorithm for hyperparameter optimization of neural networks , 2017, IBM J. Res. Dev..
[155] Sunil Tiwari,et al. Big data analytics in supply chain management between 2010 and 2016: Insights to industries , 2018, Comput. Ind. Eng..
[156] Xiaojin Zhu,et al. Machine Teaching for Bayesian Learners in the Exponential Family , 2013, NIPS.
[157] Xiaojin Zhu,et al. The Teaching Dimension of Linear Learners , 2015, ICML.
[158] Qiang Ji,et al. Learning Bounded Tree-Width Bayesian Networks via Sampling , 2015, ECSQARU.
[159] Luca Rigazio,et al. Towards Deep Neural Network Architectures Robust to Adversarial Examples , 2014, ICLR.
[160] Qiang Chen,et al. Network In Network , 2013, ICLR.
[161] B. Jaumard,et al. Cluster Analysis and Mathematical Programming , 2003 .
[162] Pietro Perona,et al. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.
[163] Changhe Yuan,et al. Learning Optimal Bayesian Networks: A Shortest Path Perspective , 2013, J. Artif. Intell. Res..
[164] Yingqian Zhang,et al. Learning Decision Trees with Flexible Constraints and Objectives Using Integer Optimization , 2017, CPAIOR.
[165] Manfred Morari,et al. A clustering technique for the identification of piecewise affine systems , 2001, Autom..
[166] Marwan Mattar,et al. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments , 2008 .
[167] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.
[168] Richard Weber,et al. Feature selection for Support Vector Machines via Mixed Integer Linear Programming , 2014, Inf. Sci..
[169] Marcos Negreiros,et al. The capacitated centred clustering problem , 2006, Comput. Oper. Res..
[170] Katya Scheinberg,et al. Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning , 2017, ArXiv.
[171] Kenneth J. Berry,et al. Permutation-based multivariate regression analysis: The case for least sum of absolute deviations regression , 1997, Ann. Oper. Res..
[172] Martin Wistuba,et al. Adversarial Robustness Toolbox v1.0.0 , 2018, 1807.01069.
[173] Michael I. Jordan,et al. A Robust Minimax Approach to Classification , 2003, J. Mach. Learn. Res..
[174] Chih-Hong Cheng,et al. Maximum Resilience of Artificial Neural Networks , 2017, ATVA.
[175] Vaithilingam Jeyakumar,et al. Simultaneous classification and feature selection via convex quadratic programming with application to HIV-associated neurocognitive disorder assessment , 2010, Eur. J. Oper. Res..
[176] Igor Chikalov,et al. Bi-criteria optimization of decision trees with applications to data analysis , 2018, Eur. J. Oper. Res..
[177] Chris H. Q. Ding,et al. Multi-label Linear Discriminant Analysis , 2010, ECCV.
[178] Diego Klabjan,et al. Activation Ensembles for Deep Neural Networks , 2017, 2019 IEEE International Conference on Big Data (Big Data).
[179] Matteo Fischetti,et al. Deep neural networks and mixed integer linear optimization , 2018, Constraints.
[180] Neil F. Doherty,et al. Operational research from Taylorism to Terabytes: A research agenda for the analytics age , 2015, Eur. J. Oper. Res..
[181] Andrea Lodi,et al. Optimistic MILP modeling of non-linear optimization problems , 2014, Eur. J. Oper. Res..
[182] Marco Fraccaro,et al. Machine learning meets mathematical optimization to predict the optimal production of offshore wind parks , 2018, Comput. Oper. Res..
[183] Mohammad Azad,et al. Classification and optimization of decision trees for inconsistent decision tables represented as MVD tables , 2015, 2015 Federated Conference on Computer Science and Information Systems (FedCSIS).
[184] Edoardo Amaldi,et al. Discrete optimization methods to fit piecewise affine models to data points , 2016, Comput. Oper. Res..
[185] Pierre Hansen,et al. Improving heuristics for network modularity maximization using an exact algorithm , 2011, Discret. Appl. Math..
[186] Marco Zaffalon,et al. Learning Treewidth-Bounded Bayesian Networks with Thousands of Variables , 2016, NIPS.
[187] Premysl Sucha,et al. Accelerating the Branch-and-Price Algorithm Using Machine Learning , 2018, Eur. J. Oper. Res..
[188] Kristin P. Bennett,et al. The Interplay of Optimization and Machine Learning Research , 2006, J. Mach. Learn. Res..
[189] L. N. Vicente,et al. Discrete linear bilevel programming problem , 1996 .
[190] Adil M. Bagirov,et al. A new nonsmooth optimization algorithm for minimum sum-of-squares clustering problems , 2006, Eur. J. Oper. Res..
[191] Tomaso A. Poggio,et al. Fisher-Rao Metric, Geometry, and Complexity of Neural Networks , 2017, AISTATS.
[192] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[193] R. Tibshirani,et al. Generalized additive models for medical research , 1986, Statistical methods in medical research.
[194] Amnon Shashua,et al. Ranking with Large Margin Principle: Two Approaches , 2002, NIPS.
[195] Russ Tedrake,et al. Evaluating Robustness of Neural Networks with Mixed Integer Programming , 2017, ICLR.
[196] Mykel J. Kochenderfer,et al. Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks , 2017, CAV.
[197] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[198] Razvan Pascanu,et al. On the Number of Linear Regions of Deep Neural Networks , 2014, NIPS.
[199] Nenad Mladenovic,et al. Variable neighborhood search for minimum sum-of-squares clustering on networks , 2012, Eur. J. Oper. Res..
[200] Anil K. Jain,et al. NOTE ON DISTANCE-WEIGHTED k-NEAREST NEIGHBOR RULES. , 1978 .
[201] Yingqian Zhang,et al. Learning optimization models in the presence of unknown relations , 2014, ArXiv.
[202] Vladimir Vapnik,et al. Support-vector networks , 2004, Machine Learning.
[203] George L. Nemhauser,et al. Mixed-Integer Models for Nonseparable Piecewise-Linear Optimization: Unifying Framework and Extensions , 2010, Oper. Res..
[204] Andrea Bartolini,et al. Empirical decision model learning , 2017, Artif. Intell..
[205] Yoshua Bengio,et al. Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon , 2018, Eur. J. Oper. Res..
[206] Thomas Pock,et al. Continuous Hyper-parameter Learning for Support Vector Machines , 2015 .
[207] Bart P. G. Van Parys,et al. Sparse Classification and Phase Transitions: A Discrete Optimization Perspective , 2017 .
[208] Samy Bengio,et al. Adversarial Machine Learning at Scale , 2016, ICLR.
[209] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[210] Ayumi Shinohara,et al. Teachability in computational learning , 1990, New Generation Computing.
[211] Xiaojin Zhu,et al. Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners , 2015, AAAI.
[212] Luiz Antonio Nogueira Lorena,et al. Clustering search algorithm for the capacitated centered clustering problem , 2010, Comput. Oper. Res..
[213] Pascal Vincent,et al. The Manifold Tangent Classifier , 2011, NIPS.
[214] Ronald A. Cole,et al. Spoken Letter Recognition , 1990, HLT.
[215] Denis J. Dean,et al. Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables , 1999 .
[216] Rolf Wendolsky,et al. A scatter search heuristic for the capacitated clustering problem , 2006, Eur. J. Oper. Res..
[217] James Cussens,et al. Advances in Bayesian Network Learning using Integer Programming , 2013, UAI.
[218] Blaine Nelson,et al. Poisoning Attacks against Support Vector Machines , 2012, ICML.
[219] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[220] R. Fisher. THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS , 1936 .
[221] Anita Schöbel,et al. Locating least-distant lines in the plane , 1998, Eur. J. Oper. Res..
[222] Ying Daisy Zhuo,et al. Robust Classification , 2019, INFORMS J. Optim..
[223] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[224] Christodoulos A. Floudas,et al. Global optimization of mixed-integer bilevel programming problems , 2005, Comput. Manag. Sci..
[225] T. Hastie,et al. Generalized Additive Model Selection , 2015, 1506.03850.
[226] Emilio Carrizosa,et al. Supervised classification and mathematical optimization , 2013, Comput. Oper. Res..
[227] Joe Naoum-Sawaya,et al. High dimensional data classification and feature selection using support vector machines , 2018, Eur. J. Oper. Res..
[228] Stephen J. Wright. Optimization algorithms for data analysis , 2018, IAS/Park City Mathematics Series.
[229] Stephan Dempe,et al. Discrete bilevel programming: Application to a natural gas cash-out problem , 2005, Eur. J. Oper. Res..
[230] David J. Kriegman,et al. Eigenfaces vs. Fisherfaces , 1997 .
[231] Radu Ioan Bot,et al. Optimization problems in statistical learning: Duality and optimality conditions , 2011, Eur. J. Oper. Res..
[232] P. Taylan,et al. New approaches to regression by generalized additive models and continuous optimization for modern applications in finance, science and technology , 2007 .
[233] Dimitris Bertsimas,et al. Optimal classification trees , 2017, Machine Learning.
[234] Alper Atamtürk,et al. Rank-one Convexification for Sparse Regression , 2019, ArXiv.
[235] Bistra N. Dilkina,et al. Combinatorial Attacks on Binarized Neural Networks , 2019, ICLR.
[236] Emilio Carrizosa,et al. rs-Sparse principal component analysis: A mixed integer nonlinear programming approach with VNS , 2014, Comput. Oper. Res..
[237] Andrew Y. Ng,et al. Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .
[238] Adil M. Bagirov,et al. New diagonal bundle method for clustering problems in large data sets , 2017, Eur. J. Oper. Res..
[239] Jorge Nocedal,et al. Optimization Methods for Large-Scale Machine Learning , 2016, SIAM Rev..
[240] Rüdiger Ehlers,et al. Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks , 2017, ATVA.
[241] H. Zou,et al. The doubly regularized support vector machine , 2006 .
[242] José Miguel Díaz-Báñez,et al. Continuous location of dimensional structures , 2004, Eur. J. Oper. Res..
[243] Christopher Potts,et al. Learning Word Vectors for Sentiment Analysis , 2011, ACL.
[244] Maw-Sheng Chern,et al. Nonlinear integer bilevel programming , 1994 .
[245] Andrea Lodi,et al. Mathematical programming techniques in water network optimization , 2015, Eur. J. Oper. Res..
[246] Jorge Nocedal,et al. A Limited Memory Algorithm for Bound Constrained Optimization , 1995, SIAM J. Sci. Comput..
[247] Gregory Cohen,et al. EMNIST: an extension of MNIST to handwritten letters , 2017, CVPR 2017.
[248] Pierre Hansen,et al. Reformulation of a model for hierarchical divisive graph modularity maximization , 2012, Annals of Operations Research.
[249] Lorenzo Rosasco,et al. Theory of Deep Learning III: explaining the non-overfitting puzzle , 2017, ArXiv.
[250] Emilio Carrizosa,et al. Sparsity in Optimal Randomized Classification Trees , 2020, Eur. J. Oper. Res..
[251] Antonio Criminisi,et al. Measuring Neural Net Robustness with Constraints , 2016, NIPS.
[252] Marc'Aurelio Ranzato,et al. Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[253] Oktay Günlük,et al. Optimal decision trees for categorical data via integer programming , 2021, Journal of Global Optimization.
[254] Emilio Carrizosa,et al. Binarized Support Vector Machines , 2010, INFORMS J. Comput..
[255] Alan J. Miller. Subset Selection in Regression , 1992 .
[256] Paolo Frasconi,et al. Bilevel Programming for Hyperparameter Optimization and Meta-Learning , 2018, ICML.
[257] Dan Roth,et al. Constraint Classification for Multiclass Classification and Ranking , 2002, NIPS.
[258] Michael J. Fry,et al. Model-based capacitated clustering with posterior regularization , 2018, Eur. J. Oper. Res..
[259] Jean-Philippe Vial,et al. Robust Optimization , 2021, ICORES.
[260] Andrea Lodi,et al. Learning a Classification of Mixed-Integer Quadratic Programming Problems , 2017, CPAIOR.
[261] George L. Nemhauser,et al. Learning to Run Heuristics in Tree Search , 2017, IJCAI.
[262] Pierre Baldi,et al. Learning Activation Functions to Improve Deep Neural Networks , 2014, ICLR.
[263] Bart P. G. Van Parys,et al. Sparse high-dimensional regression: Exact scalable algorithms and phase transitions , 2017, The Annals of Statistics.
[264] Mark W. Lewis,et al. Exact Solutions to the Capacitated Clustering Problem: A Comparison of Two Models , 2014 .
[265] David Maxwell Chickering,et al. Learning Bayesian Networks is , 1994 .
[266] Pushmeet Kohli,et al. A Unified View of Piecewise Linear Neural Network Verification , 2017, NeurIPS.
[267] Qiang Ji,et al. Advances in Learning Bayesian Networks of Bounded Treewidth , 2014, NIPS.
[268] Carl Tim Kelley,et al. Iterative methods for optimization , 1999, Frontiers in applied mathematics.
[269] A. Gunasekaran,et al. Big data analytics in logistics and supply chain management: Certain investigations for research and applications , 2016 .
[270] Chris H. Q. Ding,et al. K-means clustering via principal component analysis , 2004, ICML.
[271] Wei Chu,et al. Support Vector Ordinal Regression , 2007, Neural Computation.