Complex heterogeneity learning: A theoretical and empirical study

Abstract: Multiple types of data heterogeneity, such as task heterogeneity, view heterogeneity, and instance heterogeneity, often co-exist in real-world applications, including insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. However, most existing techniques do not take full advantage of this rich heterogeneity. To address this problem, we propose a novel graph-based approach named M3 that simultaneously models triple heterogeneity in a principled framework. The main idea is to employ hybrid graphs to jointly model task relatedness, view consistency, and bag-instance correlation by enhancing the labeling consistency between nearby nodes on the graphs. Furthermore, we analyze the generalization performance of the proposed method based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. The resulting optimization problem is challenging because the objective function is non-smooth and non-convex. We propose an iterative algorithm based on block coordinate descent and a bundle method to solve the problem. Experimental results on various datasets demonstrate the effectiveness of the proposed method.
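
As a rough illustration of what "labeling consistency between nearby nodes" can mean on a graph, the sketch below computes a standard Laplacian smoothness penalty. It is not the M3 objective: the function name `smoothness_penalty`, the toy path graph, and the label vectors are all assumptions introduced here for illustration only.

```python
import numpy as np

def smoothness_penalty(W, f):
    """Return f^T L f, which equals the sum over undirected edges of
    w_ij * (f_i - f_j)^2 when W is a symmetric weight matrix."""
    W = np.asarray(W, dtype=float)
    f = np.asarray(f, dtype=float)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    return float(f @ L @ f)

# Hypothetical toy graph: a path 0 - 1 - 2 - 3 with unit edge weights.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Labels that agree between most neighbors vs. labels that flip on every edge.
consistent = np.array([1.0, 1.0, -1.0, -1.0])
inconsistent = np.array([1.0, -1.0, 1.0, -1.0])

penalty_consistent = smoothness_penalty(W, consistent)
penalty_inconsistent = smoothness_penalty(W, inconsistent)
print(penalty_consistent, penalty_inconsistent)
```

A labeling that agrees across most edges receives a smaller penalty than one that changes sign on every edge, which is the sense in which a graph regularizer of this kind rewards consistent labels on nearby nodes.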
