From Active Learning to Dedicated Collaborative Interactive Learning

Active learning (AL) is a machine learning paradigm in which an active learner trains a model (e.g., a classifier) that would, in principle, be trained in a supervised way. The learner starts from a data set in which only a small fraction of the samples (also termed data points or observations) are labeled. To obtain labels for unlabeled samples, it must query an oracle (e.g., a human expert). In most cases, the goal is to maximize some metric of task performance (e.g., classification accuracy) while minimizing the number of queries. In this article, we first briefly discuss the state of the art in the field of AL. Then, we propose the concept of dedicated collaborative interactive learning (D-CIL) and describe some research challenges. With D-CIL, we will overcome many of the harsh limitations of current AL. In particular, we envision scenarios in which the expert may be wrong for various reasons, several or even many experts with different expertise collaborate, the experts not only label samples but also supply higher-level knowledge such as rules, and the labeling costs depend on many conditions. Moreover, human experts may themselves benefit by improving their own knowledge when they get feedback from the active learner.
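The basic query loop described above can be illustrated with a minimal sketch of pool-based AL using uncertainty sampling. This is not the method of the article: the one-dimensional threshold classifier, the feature range [0, 1], the perfectly reliable oracle, and the names `fit_threshold` and `active_learn` are all assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def fit_threshold(labeled: List[Tuple[float, int]]) -> float:
    """Hypothetical model: a decision threshold on features assumed to lie in [0, 1].

    The threshold is the midpoint between the largest sample labeled 0
    and the smallest sample labeled 1 seen so far.
    """
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    lo = max(zeros) if zeros else 0.0
    hi = min(ones) if ones else 1.0
    return (lo + hi) / 2.0


def active_learn(
    pool: List[float],
    oracle: Callable[[float], int],
    budget: int,
) -> Tuple[float, int]:
    """Pool-based AL loop: query the sample the current model is least sure about.

    Returns the learned threshold and the number of oracle queries spent.
    """
    unlabeled = list(pool)
    labeled: List[Tuple[float, int]] = []
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # Uncertainty sampling: the sample closest to the decision boundary.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        # Ask the oracle (assumed here to always answer correctly).
        labeled.append((query, oracle(query)))
    return fit_threshold(labeled), len(labeled)


# Usage: 101 unlabeled samples, a budget of only 10 label queries.
pool = [i / 100 for i in range(101)]
oracle = lambda x: int(x >= 0.37)  # hidden ground truth, known only to the oracle
threshold, n_queries = active_learn(pool, oracle, budget=10)
```

In this toy setting, ten queries are enough for the learned threshold to classify every sample in the pool as the oracle would, whereas labeling the whole pool would cost 101 queries. The sketch deliberately assumes a single, infallible oracle with uniform query cost, which is exactly the kind of limitation D-CIL is meant to address.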
