Paper information - A brief introduction to weakly supervised learning

Supervised learning techniques construct predictive models by learning from a large number of training examples, where each training example has a label indicating its ground-truth output. Though current techniques have achieved great success, it is noteworthy that in many tasks it is difficult to get strong supervision information like fully ground-truth labels due to the high cost of the data-labeling process. Thus, it is desirable for machine-learning techniques to work with weak supervision. This article reviews some research progress of weakly supervised learning, focusing on three typical types of weak supervision: incomplete supervision, where only a subset of training data is given with labels; inexact supervision, where the training data are given with only coarse-grained labels; and inaccurate supervision, where the given labels are not always ground-truth.
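The three types of weak supervision named in the abstract can be illustrated on a toy binary label vector. The following is only a minimal sketch, not a method from the article: the helper names (`make_incomplete`, `make_inexact`, `make_inaccurate`), the toy labels, and the parameters are all hypothetical. It also assumes that a coarse-grained label is a bag-level label that is positive when any instance in the bag is positive, which is one common multi-instance convention rather than the only form of inexact supervision.

```python
import random


def make_incomplete(labels, keep_fraction, rng):
    """Incomplete supervision: only a subset keeps its label; None marks unlabeled."""
    n_keep = int(len(labels) * keep_fraction)
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [y if i in keep else None for i, y in enumerate(labels)]


def make_inexact(labels, bag_size):
    """Inexact supervision: one coarse label per bag of consecutive instances.

    Assumption: a bag is positive if at least one of its instances is positive.
    """
    bags = [labels[i:i + bag_size] for i in range(0, len(labels), bag_size)]
    return [int(any(bag)) for bag in bags]


def make_inaccurate(labels, flip_prob, rng):
    """Inaccurate supervision: each binary label is flipped with probability flip_prob."""
    return [1 - y if rng.random() < flip_prob else y for y in labels]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
truth = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]  # hypothetical ground-truth labels

incomplete = make_incomplete(truth, keep_fraction=0.25, rng=rng)
inexact = make_inexact(truth, bag_size=4)
inaccurate = make_inaccurate(truth, flip_prob=0.2, rng=rng)

print("incomplete:", incomplete)
print("inexact:   ", inexact)
print("inaccurate:", inaccurate)
```

In this sketch the incomplete view keeps 3 of the 12 labels, the inexact view reduces the 12 instance labels to 3 bag labels, and the inaccurate view keeps every label but makes some of them wrong.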

Zhi-Hua Zhou
