Paper information - Towards a unified multi-source-based optimization framework for multi-label learning

Abstract: In the era of Big Data, a practical yet challenging task is to make learning techniques more universally applicable to complex learning problems such as multi-source multi-label learning. While earlier work has developed many effective solutions for multi-label classification and for multi-source fusion separately, in this paper we address the two problems together and propose a novel method for the joint learning of multiple class labels and data sources. An optimization framework is constructed to formulate the learning problem, and the multi-label classification result is induced by a weighted combination of the decisions from multiple sources. The proposed method is responsive in exploiting label correlations and in fusing multi-source data, especially long-tail data. Experiments on various multi-source multi-label data sets reveal the advantages of the proposed method.
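The abstract describes the final prediction as a weighted combination of per-source decisions. The sketch below illustrates only that general idea and is not the paper's optimization framework: the function name `fuse_source_decisions`, the fixed source weights, and the 0.5 threshold are all assumptions made for illustration. In the paper the weights would come out of the optimization, whereas here they are simply given.

```python
import numpy as np


def fuse_source_decisions(source_scores, source_weights, threshold=0.5):
    """Fuse per-source label scores by a weighted average (illustrative only).

    source_scores: array-like of shape (n_sources, n_samples, n_labels),
        where each entry is one source's score for one label of one sample.
    source_weights: array-like of shape (n_sources,), non-negative.
        Assumed to be given; they are NOT learned in this sketch.
    threshold: hypothetical cut-off that turns fused scores into labels.
    """
    scores = np.asarray(source_scores, dtype=float)
    weights = np.asarray(source_weights, dtype=float)

    if scores.ndim != 3:
        raise ValueError("source_scores must have shape (sources, samples, labels)")
    if weights.shape != (scores.shape[0],):
        raise ValueError("need exactly one weight per source")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")

    # Normalize so the weights sum to one, then contract over the source axis.
    weights = weights / weights.sum()
    fused = np.tensordot(weights, scores, axes=1)  # shape: (n_samples, n_labels)

    # A label is predicted as relevant when its fused score reaches the threshold.
    labels = (fused >= threshold).astype(int)
    return fused, labels


if __name__ == "__main__":
    # Two hypothetical sources scoring two samples on three labels.
    scores = [
        [[0.9, 0.2, 0.6], [0.1, 0.8, 0.2]],
        [[0.7, 0.4, 0.4], [0.3, 0.6, 0.8]],
    ]
    fused, labels = fuse_source_decisions(scores, source_weights=[3.0, 1.0])
    print(fused)
    print(labels)
```

Giving the first source three times the weight of the second means its decisions dominate the fused score, which is the kind of behavior one would want when a source is more reliable for the labels in question.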
