PenDer: Incorporating Shape Constraints via Penalized Derivatives

When deploying machine learning models in the real world, system designers may want models to exhibit certain shape behavior, i.e., model outputs that follow a particular shape with respect to input features. Desired shapes include trends such as monotonicity, convexity, and diminishing or accelerating returns. The presence of these shapes makes the model more interpretable for system designers and adequately fair for customers. We observe that many such common shapes are related to derivatives, and propose a new approach, PenDer (Penalizing Derivatives), which incorporates these shape constraints by penalizing the derivatives. We further present an Augmented Lagrangian Method (ALM) to learn the joint unconstrained objective function. Experiments on three real-world datasets illustrate that although PenDer and state-of-the-art Lattice models achieve similar conformance to shape, PenDer better captures the sensitivity of predictions with respect to the intended features. We also demonstrate that PenDer achieves better test performance than Lattice while enforcing more desirable shape behavior.
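
The idea of penalizing derivatives can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty, so everything below is an assumption made for illustration: the hinge-style penalty, the finite-difference derivative estimates (a real implementation would more likely use automatic differentiation), the fixed penalty weight in place of the ALM multiplier updates, and all function names.

```python
# Illustrative sketch only: the hinge penalty, the finite differences, and
# every function name here are assumptions, not the paper's implementation.
import math


def derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def monotonic_penalty(f, xs):
    """Mean hinge penalty on negative estimated slopes at the points xs.

    It is zero when every estimated slope is non-negative, which is the
    behavior an increasing-monotonicity constraint asks for.
    """
    return sum(max(0.0, -derivative(f, x)) for x in xs) / len(xs)


def diminishing_returns_penalty(f, xs):
    """Mean hinge penalty on positive estimated curvature at the points xs."""
    return sum(max(0.0, second_derivative(f, x)) for x in xs) / len(xs)


def penalized_loss(f, data, xs, weight=1.0):
    """Mean squared error plus a weighted sum of the two shape penalties.

    `weight` is a fixed hypothetical coefficient; it stands in for the
    multipliers that an Augmented Lagrangian Method would update.
    """
    mse = sum((f(x) - y) ** 2 for x, y in data) / len(data)
    shape = monotonic_penalty(f, xs) + diminishing_returns_penalty(f, xs)
    return mse + weight * shape


points = [0.5, 1.0, 2.0]
conforming = math.sqrt          # increasing with diminishing returns


def violating(x):
    return -x                   # decreasing, so the slope penalty is active


data = [(x, math.sqrt(x)) for x in points]
loss_ok = penalized_loss(conforming, data, points)
loss_bad = penalized_loss(violating, data, points)
```

In this sketch a function that already has the desired shape incurs no penalty, while a violating function is charged in proportion to how far its derivatives fall on the wrong side of zero, so minimizing the combined loss pushes the model toward the intended shape.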
