A brief introduction to weakly supervised learning

Supervised learning techniques construct predictive models by learning from a large number of training examples, where each training example has a label indicating its ground-truth output. Although current techniques have achieved great success, in many tasks it is difficult to obtain strong supervision information, such as complete ground-truth labels, because the data-labeling process is costly. It is therefore desirable for machine-learning techniques to work with weak supervision. This article reviews research progress in weakly supervised learning, focusing on three typical types of weak supervision: incomplete supervision, where only a subset of the training data is labeled; inexact supervision, where the training data are given only coarse-grained labels; and inaccurate supervision, where the given labels do not always match the ground truth.
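To make the three settings concrete, the following minimal Python sketch derives one toy example of each from a list of ground-truth binary labels. Everything in it is an illustrative assumption rather than part of the article: the function name `make_weak_labels`, its parameters, the use of `None` for a missing label, and in particular the rule that a bag is positive when any of its instances is positive, which is just one common way to form a coarse-grained label.

```python
import random


def make_weak_labels(y, bag_size=3, labeled_fraction=0.3, flip_rate=0.2, seed=0):
    """Derive three weakly supervised views from ground-truth binary labels y.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    n = len(y)

    # Incomplete supervision: only a subset of examples keeps its label.
    # None marks an unlabeled example.
    n_labeled = max(1, int(labeled_fraction * n))
    labeled_idx = set(rng.sample(range(n), n_labeled))
    incomplete = [y[i] if i in labeled_idx else None for i in range(n)]

    # Inexact supervision: one coarse label per bag of consecutive instances.
    # Assumption: a bag is positive if at least one instance in it is positive.
    inexact = [max(y[i:i + bag_size]) for i in range(0, n, bag_size)]

    # Inaccurate supervision: every example has a label, but each label is
    # flipped independently with probability flip_rate.
    inaccurate = [1 - label if rng.random() < flip_rate else label for label in y]

    return incomplete, inexact, inaccurate


if __name__ == "__main__":
    truth = [0, 1, 0, 0, 0, 0, 1, 1, 1, 0]
    incomplete, inexact, inaccurate = make_weak_labels(truth)
    print("incomplete:", incomplete)
    print("inexact:   ", inexact)
    print("inaccurate:", inaccurate)
```

With the ten labels above and `bag_size=3`, the bag-level labels are `[1, 0, 1, 0]`: the learner sees that the first bag contains a positive instance but not which one. The incomplete and inaccurate views depend on the random seed.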
