On the relation between Loss Functions and T-Norms

Deep learning has been shown to achieve impressive results in several domains, such as computer vision and natural language processing. A key element of this success has been the development of new loss functions, like the popular cross-entropy loss, which has been shown to provide faster convergence and to reduce the vanishing gradient problem in very deep structures. While the cross-entropy loss is usually justified from a probabilistic perspective, this paper shows an alternative and more direct interpretation of this loss in terms of t-norms and their associated generator functions, and derives a general relation between loss functions and t-norms. In particular, the presented work shows intriguing results leading to the development of a novel class of loss functions. These losses can be exploited in any supervised learning task and could lead to faster convergence rates than the commonly employed cross-entropy loss.
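The link between cross-entropy and t-norm generators can be illustrated with a minimal sketch. It assumes the standard additive generators from the t-norm literature, g(x) = -log(x) for the product t-norm and g(x) = 1 - x for the Łukasiewicz t-norm. The function names and the probability values are hypothetical and chosen only for illustration; this is not the paper's implementation. Summing the generator over the predicted probabilities of the correct classes gives a loss, and for the product t-norm that sum coincides with the cross-entropy.

```python
import math

# Additive generator of the product t-norm and its inverse (standard definitions).
def product_generator(x):
    return -math.log(x)

def product_generator_inverse(y):
    return math.exp(-y)

# Additive generator of the Lukasiewicz t-norm and its pseudo-inverse.
def lukasiewicz_generator(x):
    return 1.0 - x

def lukasiewicz_generator_pseudo_inverse(y):
    return max(0.0, 1.0 - y)

def t_norm_from_generator(values, g, g_inv):
    """n-ary t-norm built from an additive generator: g_inv(sum of g(v))."""
    return g_inv(sum(g(v) for v in values))

def generator_loss(values, g):
    """Loss obtained by applying the generator to each value and summing."""
    return sum(g(v) for v in values)

# Hypothetical predicted probabilities of the correct class for three samples.
probs = [0.9, 0.8, 0.95]

# Cross-entropy with one-hot targets: minus the sum of log-probabilities.
cross_entropy = -sum(math.log(p) for p in probs)

# The product-generator loss is the same quantity.
product_loss = generator_loss(probs, product_generator)

# The Lukasiewicz generator yields a different loss: the sum of (1 - p).
lukasiewicz_loss = generator_loss(probs, lukasiewicz_generator)

# The generators also recover the t-norms themselves.
product_conjunction = t_norm_from_generator(
    probs, product_generator, product_generator_inverse
)
lukasiewicz_conjunction = t_norm_from_generator(
    probs, lukasiewicz_generator, lukasiewicz_generator_pseudo_inverse
)
```

In this sketch, minimizing `product_loss` is the same as maximizing the product t-norm of the predictions, since the generator is strictly decreasing. Swapping in a different generator changes the loss while keeping the same construction.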
