CrowdRL: An End-to-End Reinforcement Learning Framework for Data Labelling

Data labelling is essential to many database and machine learning applications. Traditional methods rely on humans (workers or experts) to provide labels, but the human cost is high for a large dataset. Active learning based methods label only a small set of data with large uncertainty, train a model on the labelled data, and use the trained model to label the remaining unlabelled data. However, they have two limitations. First, they cannot judiciously select appropriate data (task selection) and assign the tasks to proper humans (task assignment); moreover, they process task selection and task assignment independently, and so cannot capture the correlation between them. Second, they infer the truth of a task from the answers of humans and the trained model (truth inference) by modeling humans and models independently. In other words, they ignore the correlation between the two (the labelled data may contain noise caused by biased humans, and a model trained on the noisy labels may introduce additional biases), which leads to poor inference results.

To address these limitations, we propose CrowdRL, an end-to-end reinforcement learning (RL) based framework for data labelling. To the best of our knowledge, CrowdRL is the first RL framework designed for the data labelling workflow that seamlessly integrates task selection, task assignment and truth inference. CrowdRL combines heterogeneous annotators (experts and crowdsourcing workers) with machine learning models to infer the truth, which substantially improves the quality of data labelling. CrowdRL uses RL to model task selection and task assignment, and designs an agent to judiciously assign tasks to appropriate workers. It jointly models the answers of workers, experts and models, and designs a joint inference model to infer the truths. Experimental results on real datasets show that CrowdRL outperforms state-of-the-art approaches, achieving 5%-20% higher accuracy at the same or even lower monetary cost.
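To make the active learning baseline described above concrete, the following is a minimal sketch of uncertainty-based task selection followed by a simple truth inference step. It is not CrowdRL's method: the entropy criterion, the majority-vote rule, and the names `entropy`, `select_uncertain`, `majority_vote` and the task ids are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy (in nats) of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_uncertain(predictions, budget):
    """Return the ids of the `budget` tasks with the highest predictive entropy.

    `predictions` maps a hypothetical task id to the model's class probabilities.
    """
    ranked = sorted(predictions, key=lambda t: entropy(predictions[t]), reverse=True)
    return ranked[:budget]


def majority_vote(answers):
    """Baseline truth inference: the most frequent answer among annotators."""
    return Counter(answers).most_common(1)[0][0]


# Hypothetical model outputs for three unlabelled tasks (binary labels).
predictions = {
    "t1": [0.95, 0.05],  # confident prediction
    "t2": [0.55, 0.45],  # most uncertain
    "t3": [0.70, 0.30],
}

# Task selection: send the two most uncertain tasks to human annotators.
to_label = select_uncertain(predictions, budget=2)

# Truth inference: aggregate hypothetical human answers for one selected task.
inferred = majority_vote(["cat", "dog", "cat"])
```

In this sketch, selection and inference are independent steps and every annotator is weighted equally, which is the kind of decoupling the abstract identifies as a limitation of existing approaches.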
