Copula Multi-label Learning

A formidable challenge in multi-label learning is to model the interdependencies between labels and features. Unfortunately, the statistical properties of existing approaches to modeling multi-label dependencies are still not well understood. Copulas are a powerful tool for modeling dependence in multivariate data and have achieved great success in a wide range of applications, such as finance, econometrics and systems neuroscience. This inspires us to develop a novel copula multi-label learning paradigm for modeling label and feature dependencies. The copula-based paradigm reveals new statistical insights in multi-label learning. In particular, the paper first leverages the kernel trick to construct a continuous distribution in the output space, and then estimates the proposed model semiparametrically: the copula is modeled parametrically, while the marginal distributions are modeled nonparametrically. Theoretically, we show that our estimator is unbiased, consistent, and asymptotically normal. Moreover, we bound the mean squared error of the estimator. Experimental results from various domains validate the superiority of the proposed approach.
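The semiparametric idea described above (parametric copula, nonparametric marginals) can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes a bivariate Gaussian copula and uses the normal-scores correlation, a simple rank-based estimate, in place of a full pseudo-likelihood fit. The function names `pseudo_observations`, `gaussian_copula_corr` and `simulate` are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def pseudo_observations(xs):
    """Nonparametric marginal step: rescaled empirical CDF, rank / (n + 1)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)  # strictly inside (0, 1)
    return u


def gaussian_copula_corr(xs, ys):
    """Parametric copula step (assumed Gaussian family): correlation of
    the normal scores computed from the pseudo-observations."""
    std = NormalDist()
    zx = [std.inv_cdf(u) for u in pseudo_observations(xs)]
    zy = [std.inv_cdf(u) for u in pseudo_observations(ys)]
    # Normal scores of a full set of ranks are symmetric around zero,
    # so no explicit centering is needed.
    sxy = sum(a * b for a, b in zip(zx, zy))
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, rho, seed=0):
    """Draw correlated Gaussians; return both the latent pairs and
    monotonically transformed versions with non-Gaussian marginals."""
    rng = random.Random(seed)
    latent_x, latent_y, xs, ys = [], [], [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        latent_x.append(z1)
        latent_y.append(z2)
        xs.append(math.exp(z1))  # log-normal marginal
        ys.append(z2 ** 3)       # heavy-tailed marginal
    return latent_x, latent_y, xs, ys


latent_x, latent_y, xs, ys = simulate(n=2000, rho=0.7)
rho_hat = gaussian_copula_corr(xs, ys)
```

Because the marginals enter only through ranks, the estimate is unchanged by strictly increasing transformations of each variable, which is why the dependence parameter can be recovered without specifying the marginal distributions.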
