WeakAL: Combining Active Learning and Weak Supervision

Supervised Learning requires large amounts of labeled data, making efficient labeling one of the most critical components for the success of Machine Learning (ML). One well-known method for obtaining labeled data efficiently is Active Learning (AL), in which the learner interactively asks human experts to label the most informative data points. Nevertheless, even with AL the human labeling effort remains high and should be reduced further. In this paper we therefore propose WeakAL, which incorporates Weak Supervision (WS) techniques directly into the AL cycle. This allows us to reduce the number of annotations requested from human experts while maintaining the same level of ML performance. We investigate different WS strategies as well as different parameter combinations on a wide range of real-world datasets. Our evaluation shows that, for example, in the context of Web table classification, 55% of the labels that would otherwise be retrieved manually can be generated by WS techniques, at a negligible loss in test accuracy of only 0.31%. To further demonstrate the general applicability of our approach, we applied it to six datasets from the AL challenge of Guyon et al. [11], where over 90% of the labels could be computed by the WS techniques while still achieving competitive results in the challenge.
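To make the idea concrete, the sketch below shows one possible instantiation of such a cycle: uncertainty sampling selects the point for which a human label is requested, while a weak-supervision step adopts the classifier's own predictions for pool points above a confidence threshold, so those points never reach a human. This is a minimal illustration under assumptions, not the paper's implementation: the self-labeling heuristic, the 0.95 threshold, the random forest model, and the synthetic data are placeholders for the WS strategies and parameter combinations the paper actually evaluates.

```python
# Minimal sketch of an AL cycle with a weak-supervision step (assumed
# self-labeling heuristic; the paper evaluates several WS strategies).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
y_train = {i: y[i] for i in range(20)}          # small seed set of human labels
pool = [i for i in range(len(X)) if i not in y_train]
human_queries = weak_labels = 0

for _ in range(30):                             # AL cycle
    if not pool:
        break
    idx = sorted(y_train)
    clf = RandomForestClassifier(random_state=0)
    clf.fit(X[idx], [y_train[i] for i in idx])

    proba = clf.predict_proba(X[pool])
    conf = proba.max(axis=1)                    # confidence per pool point

    # WS step: adopt the model's own label where confidence is high.
    # The 0.95 threshold is an assumption, not the paper's parameter.
    high = set(np.where(conf >= 0.95)[0])
    for j in high:
        y_train[pool[j]] = int(proba[j].argmax())
    weak_labels += len(high)

    # AL step: ask the human oracle only for the least confident point.
    rest = [j for j in range(len(pool)) if j not in high]
    if rest:
        q = pool[min(rest, key=lambda j: conf[j])]
        y_train[q] = y[q]                       # simulated human annotation
        human_queries += 1

    pool = [i for i in pool if i not in y_train]

print(f"human labels: {human_queries}, weak labels: {weak_labels}")
```

In the paper's terms, the ratio of `weak_labels` to all acquired labels corresponds to the share of labels generated without human involvement (55% for Web table classification, over 90% on the challenge datasets).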

[1] Craig A. Knoblock et al. Active + Semi-supervised Learning = Robust Multi-View Learning. ICML, 2002.

[2] Stefan C. Kremer et al. Clustering unlabeled data with SOMs improves classification of labeled real-world data. Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN'02), 2002.

[3] D. Sculley. Web-scale k-means clustering. WWW '10, 2010.

[4] C. E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal, 1948.

[5] Wolfgang Lehner et al. Building the Dresden Web Table Corpus: A Classification Approach. 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), 2015.

[6] David A. Cohn et al. Improving generalization with active learning. Machine Learning, 1994.

[7] Burr Settles. Active Learning Literature Survey. Computer Sciences Technical Report 1648, University of Wisconsin-Madison, 2009.

[8] H. J. Scudder. Probability of error of some adaptive pattern-recognition machines. IEEE Trans. Inf. Theory, 1965.

[9] Stefan Wrobel et al. Active Hidden Markov Models for Information Extraction. IDA, 2001.

[10] Christopher Ré et al. Snuba: Automating Weak Supervision to Label Training Data. Proc. VLDB Endow., 2018.

[11] Isabelle Guyon et al. Results of the Active Learning Challenge. Active Learning and Experimental Design @ AISTATS, 2011.

[12] Ran El-Yaniv et al. Online Choice of Active Learning Algorithms. J. Mach. Learn. Res., 2003.

[13] Taghi M. Khoshgoftaar et al. Identifying learners robust to low quality data. 2008 IEEE International Conference on Information Reuse and Integration, 2008.

[14] David D. Lewis. A sequential algorithm for training text classifiers: corrigendum and additional data. SIGIR Forum, 1995.

[15] Zoubin Ghahramani et al. Learning from labeled and unlabeled data with label propagation. Technical report, Carnegie Mellon University, 2002.

[16] Zhi-Hua Zhou. A brief introduction to weakly supervised learning. National Science Review, 2018.

[17] Xiaojin Zhu. Semi-Supervised Learning Literature Survey. Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2006.

[18] Leo Breiman. Random Forests. Machine Learning, 2001.

[19] J. H. Ward. Hierarchical Grouping to Optimize an Objective Function. Journal of the American Statistical Association, 1963.

[20] Andrew McCallum et al. Employing EM and Pool-Based Active Learning for Text Classification. ICML, 1998.

[21] Lehel Csató et al. Active Learning with Clustering. Active Learning and Experimental Design @ AISTATS, 2011.