Multitask Kernel-based Learning with Logic Constraints

This paper presents a general framework for integrating prior knowledge, in the form of logic constraints among a set of task functions, into kernel machines. The logic propositions provide a partial representation of the environment in which the learner operates, which is exploited by the learning algorithm together with the information available in the supervised examples. In particular, we consider a multi-task learning scheme where multiple unary predicates on the feature space are to be learned by kernel machines, and a higher-level abstract representation consists of logic clauses on these predicates, known to hold for any input. A general approach is presented to convert the logic clauses into a continuous implementation that processes the outputs computed by the kernel-based predicates. The learning task is formulated as a primal optimization problem over a loss function that combines a term measuring the fit to the supervised examples, a regularization term, and a penalty term that enforces the constraints on both supervised and unsupervised examples. The proposed semi-supervised learning framework is particularly suited to learning in high-dimensional feature spaces, where supervised training examples tend to be sparse and generalization is difficult. Unlike standard kernel machines, the cost function to optimize is not generally guaranteed to be convex. However, the experimental results show that it is still possible to find good solutions using a two-stage learning scheme, in which the supervised examples are first learned until convergence and the logic constraints are then enforced. Some promising experimental results on artificial multi-task learning problems are reported, showing how classification accuracy can be effectively improved by exploiting the a priori rules and the unsupervised examples.
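The conversion of logic clauses into a continuous penalty can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes predicate outputs are squashed into [0, 1], uses the product t-norm for conjunction and a Reichenbach-style implication, and all function names are hypothetical.

```python
import numpy as np

def t_norm_and(a, b):
    """Product t-norm: continuous relaxation of logical AND."""
    return a * b

def implies(a, b):
    """Continuous implication a -> b via 1 - a + a*b (Reichenbach S-implication)."""
    return 1.0 - a + a * b

def clause_penalty(pred_a, pred_b, xs):
    """Penalty enforcing 'forall x: a(x) -> b(x)' on a sample of points xs.

    pred_a, pred_b map inputs to [0, 1] truth degrees (e.g. squashed
    kernel-machine outputs). The penalty is 0 when the clause holds
    everywhere on the sample and grows as it is violated.
    """
    truth = implies(pred_a(xs), pred_b(xs))
    return float(np.mean(1.0 - truth))

def total_loss(fit_term, reg_term, constraint_pen, lam=1.0, mu=1.0):
    """Combined objective: supervised fit + regularization + logic penalty,
    mirroring the three-term structure described in the abstract."""
    return fit_term + lam * reg_term + mu * constraint_pen
```

In a two-stage scheme, one would first minimize `fit_term + lam * reg_term` on the labeled data, then continue optimizing with `mu > 0` so the clause penalty, evaluated on both labeled and unlabeled points, shapes the final predicates.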
