Learning with Complex Loss Functions and Constraints

We develop a general approach for solving constrained classification problems in which both the loss and the constraints are defined as general functions of the confusion matrix. The approach handles complex, non-linear loss functions such as the F-measure, G-mean, or H-mean, and constraints ranging from budget limits and fairness requirements to bounds on complex evaluation metrics. It builds on the framework of Narasimhan et al. (2015) for unconstrained classification with complex losses and reduces the constrained learning problem to a sequence of cost-sensitive learning tasks. We provide algorithms for two broad families of problems, involving convex and fractional-convex losses, each subject to convex constraints. The algorithms are statistically consistent, generalize an existing approach for fair classification, and apply readily to multiclass problems. Experiments on a variety of tasks demonstrate the efficacy of our methods.
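To make the setting concrete, the sketch below shows how the metrics named in the abstract can be written as functions of a normalized confusion matrix, and what a single cost-sensitive prediction step could look like. It is only an illustration under stated assumptions, not the paper's algorithm: the function names, the use of class-wise recalls for the G-mean and H-mean, and the plug-in rule over estimated class probabilities are choices made here for the example.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """C[i, j] = fraction of examples with true class i predicted as class j."""
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C / len(y_true)

def f_measure(C):
    """Binary F1 with class 1 as the positive class."""
    tp, fp, fn = C[1, 1], C[0, 1], C[1, 0]
    return 2 * tp / (2 * tp + fp + fn)

def class_recalls(C):
    """Per-class recall: diagonal entry divided by its row sum."""
    return np.diag(C) / C.sum(axis=1)

def g_mean(C):
    """Geometric mean of the class-wise recalls."""
    r = class_recalls(C)
    return float(np.prod(r) ** (1.0 / len(r)))

def h_mean(C):
    """Harmonic mean of the class-wise recalls (assumes every recall > 0)."""
    r = class_recalls(C)
    return float(len(r) / np.sum(1.0 / r))

def cost_sensitive_predict(probs, cost):
    """Assumed plug-in rule: pick the class with the lowest expected cost.

    probs[n, i] is an estimate of P(y = i | x_n); cost[i, j] is the cost of
    predicting class j when the true class is i.
    """
    return np.argmin(probs @ cost, axis=1)

# Toy data, made up for illustration only.
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 0]
C = confusion_matrix(y_true, y_pred, n_classes=2)

f1 = f_measure(C)
gm = g_mean(C)
hm = h_mean(C)
coverage = C[:, 1].sum()  # fraction predicted positive, e.g. for a budget constraint

probs = np.array([[0.6, 0.4], [0.9, 0.1]])
cost = np.array([[0.0, 1.0], [3.0, 0.0]])
preds = cost_sensitive_predict(probs, cost)
```

Because each metric depends on the classifier only through `C`, a constraint such as a cap on `coverage` is expressed over the same object as the loss, which is what allows both to be treated within one formulation.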

[1] Rong Jin, et al. Stochastic Convex Optimization with Multiple Objectives, 2013, NIPS.

[2] Oluwasanmi Koyejo, et al. Consistent Binary Classification with Generalized Performance Metrics, 2014, NIPS.

[3] Prateek Jain, et al. Online and Stochastic Gradient Methods for Non-decomposable Loss Functions, 2014, NIPS.

[4] Prateek Jain, et al. Surrogate Functions for Maximizing Precision at the Top, 2015, ICML.

[5] Xin Yao, et al. Multiclass Imbalance Problems: Analysis and Potential Solutions, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[6] Shuai Li, et al. Online Optimization Methods for the Quantification Problem, 2016, KDD.

[7] Harikrishna Narasimhan, et al. Consistent Multiclass Algorithms for Complex Performance Measures, 2015, ICML.

[8] Martin Jaggi, et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization, 2013, ICML.

[9] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.

[10] Prateek Jain, et al. Optimizing Non-decomposable Performance Measures: A Tale of Two Classes, 2015, ICML.

[11] Seth Neel, et al. A Convex Framework for Fair Regression, 2017, ArXiv.

[12] Maya R. Gupta, et al. Satisfying Real-world Goals with Dataset Constraints, 2016, NIPS.

[13] Jun Sakuma, et al. Fairness-aware Learning through Regularization Approach, 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.

[14] Yue Wang, et al. The Genia Event Extraction Shared Task, 2013 Edition - Overview, 2013, BioNLP@ACL.

[15] George Forman, et al. Quantifying counts and costs via classification, 2008, Data Mining and Knowledge Discovery.

[16] John Langford, et al. A Reductions Approach to Fair Classification, 2018, ICML.

[17] Wei Gao, et al. Tweet sentiment: From classification to quantification, 2015, 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).

[18] Harikrishna Narasimhan, et al. On the Statistical Consistency of Plug-in Classifiers for Non-decomposable Performance Measures, 2014, NIPS.

[20] Andrea Esuli, et al. Optimizing Text Quantifiers for Multivariate Loss Functions, 2015, TKDD.

[21] Yang Wang, et al. Boosting for Learning Multiple Classes with Imbalanced Class Distribution, 2006, Sixth International Conference on Data Mining (ICDM'06).

[22] Krishna P. Gummadi, et al. Fairness Constraints: Mechanisms for Fair Classification, 2015, AISTATS.

[23] Mark D. Reid, et al. Composite Multiclass Losses, 2011, J. Mach. Learn. Res.

[24] Jon M. Kleinberg, et al. Inherent Trade-Offs in the Fair Determination of Risk Scores, 2016, ITCS.

[25] Katrina Ligett, et al. Learning Fair Classifiers: A Regularization-Inspired Approach, 2017, ArXiv.

[26] Avi Feller, et al. Algorithmic Decision Making and the Cost of Fairness, 2017, KDD.

[27] Ah Chung Tsoi, et al. Neural Network Classification and Prior Class Probabilities, 1996, Neural Networks: Tricks of the Trade.

[28] Aditya Krishna Menon, et al. The cost of fairness in classification, 2017, ArXiv.

[29] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[30] Thorsten Joachims, et al. A support vector method for multivariate performance measures, 2005, ICML.

[31] David D. Lewis, et al. Evaluating Text Categorization I, 1991, HLT.

[32] Pavlos Protopapas, et al. Optimizing the Multiclass F-Measure via Biconcave Programming, 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).