Addressing practical challenges in Active Learning via a hybrid query strategy

Active Learning (AL) is a powerful tool for addressing modern machine learning problems with significantly fewer labeled training instances. However, implementing traditional AL methodologies in practical scenarios raises multiple challenges because of their inherent assumptions. Hindrances include the unavailability of labels for the AL algorithm at the beginning, an unreliable external source of labels during the querying process, and incompatible mechanisms for evaluating the performance of the Active Learner. Inspired by these practical challenges, we present a hybrid query strategy-based AL framework that addresses three of them simultaneously: cold-start, oracle uncertainty, and performance evaluation of the Active Learner in the absence of ground truth. A pre-clustering approach is employed to address the cold-start problem, while the uncertainty surrounding the expertise of the labeler and the confidence in the given labels is incorporated to handle oracle uncertainty. The heuristics obtained during the querying process serve as the fundamental premise for assessing the performance of the Active Learner. The robustness of the proposed AL framework is evaluated across three different environments and industrial settings. The results demonstrate the capability of the proposed framework to tackle practical challenges during AL implementation in real-world scenarios.
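As an illustration of the pre-clustering idea for cold-start, the following is a minimal sketch, not the framework's actual implementation. It assumes k-means as the clustering method and assumes that the instance nearest each centroid is sent to the oracle as the initial query; the abstract specifies neither choice. The function names `kmeans` and `cold_start_queries`, the number of clusters, and the synthetic data are all hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering method)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct pool instances.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute assignments so they match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1), dists


def cold_start_queries(X, k, seed=0):
    """Return indices of the pool instance closest to each cluster centroid."""
    _, labels, dists = kmeans(X, k, seed=seed)
    queries = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if len(members) == 0:
            continue  # an empty cluster contributes no query
        queries.append(int(members[dists[members, j].argmin()]))
    return queries


# Synthetic unlabeled pool: three blobs in 2-D (illustrative data only).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(30, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(30, 2)),
    rng.normal(loc=(0.0, 5.0), scale=0.3, size=(30, 2)),
])
queries = cold_start_queries(X, k=3)
print("Initial instances to send to the oracle:", queries)
```

The returned indices would form the first batch of label requests, so the learner starts from instances that represent the structure of the unlabeled pool rather than from a random or pre-labeled seed set.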
