Learning deep kernels in the space of monotone conjunctive polynomials

Abstract: Dot-product kernels are a large family of kernel functions based on the dot product between examples. A recent result states that any dot-product kernel can be decomposed as a non-negative linear combination of homogeneous polynomial kernels of different degrees, and that the coefficients of the combination can be learned by exploiting the Multiple Kernel Learning (MKL) paradigm. In this paper we prove that, under mild conditions, any homogeneous polynomial kernel defined on binary-valued data can be decomposed into a parametrized, finite, non-negative linear combination of monotone conjunctive kernels. We employ MKL to learn the parameters of the combination. Furthermore, we show that our solution produces a deep kernel whose feature space consists of hierarchically organized features of increasing complexity. We also emphasize the connection between our solution and existing deep kernel learning frameworks. An extensive empirical assessment evaluates the proposed framework and compares it against baselines on several categorical and binary datasets.
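To make the decomposition claim concrete, the following minimal sketch checks numerically that a homogeneous polynomial kernel on binary vectors can be written as a non-negative combination of monotone conjunctive kernels. It assumes the usual definition of the monotone conjunctive kernel of arity c on binary data, namely the binomial coefficient of the dot product and c, and it uses fixed weights taken from the classical identity n^d = Σ_c c!·S(d, c)·C(n, c), where S(d, c) is a Stirling number of the second kind. These fixed weights are an illustrative assumption, not the paper's parametrization, whose coefficients are learned with MKL. All function names are hypothetical.

```python
from math import comb


def dot(x, z):
    """Dot product of two equal-length binary vectors."""
    return sum(a * b for a, b in zip(x, z))


def hpk(x, z, d):
    """Homogeneous polynomial kernel of degree d: <x, z>^d."""
    return dot(x, z) ** d


def mc_kernel(x, z, c):
    """Monotone conjunctive kernel of arity c (assumed definition).

    Counts the conjunctions of c distinct variables that are true in
    both x and z, i.e. C(<x, z>, c).
    """
    return comb(dot(x, z), c)


def weight(d, c):
    """c! * S(d, c): the number of surjections from a d-set onto a c-set.

    Computed by inclusion-exclusion; the result is always non-negative.
    """
    return sum((-1) ** (c - j) * comb(c, j) * j ** d for j in range(c + 1))


def hpk_from_conjunctive(x, z, d):
    """Rebuild <x, z>^d as a non-negative combination of conjunctive kernels."""
    return sum(weight(d, c) * mc_kernel(x, z, c) for c in range(1, d + 1))


if __name__ == "__main__":
    x = [1, 1, 0, 1]
    z = [1, 0, 0, 1]
    for d in range(1, 6):
        print(d, hpk(x, z, d), hpk_from_conjunctive(x, z, d))
```

In this sketch the weights are fixed by the identity; replacing them with learned non-negative coefficients is where an MKL algorithm would enter.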
