Integrating Informativeness, Representativeness and Diversity in Pool-Based Sequential Active Learning for Regression

In many real-world machine learning applications, unlabeled samples are easy to obtain, but labeling them is expensive and/or time-consuming. Active learning is a common approach for reducing this labeling effort. It selects the few most useful samples to label, so that a better machine learning model can be trained from the same number of labeled samples. This paper considers active learning for regression (ALR) problems. Three essential criteria (informativeness, representativeness, and diversity) have been proposed for ALR. However, very few approaches in the literature consider all three simultaneously. We propose three new ALR approaches, each with a different strategy for integrating the three criteria. Extensive experiments on 12 datasets from various domains demonstrate their effectiveness.
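To make the three criteria concrete, the following is a minimal sketch of one pool-based selection step. It is not any of the three approaches proposed in the paper, which the abstract does not describe. The function name `select_next`, the scoring rules (committee variance for informativeness, inverse mean distance to the pool for representativeness, distance to the nearest labeled sample for diversity), and the equal-weight sum are all assumptions chosen for illustration.

```python
import numpy as np

def select_next(X_pool, labeled_idx, y_labeled, n_models=5, seed=0):
    """Return the pool index of the next sample to label.

    Hypothetical helper: the scoring rules below are illustrative
    assumptions, not the methods proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    X_pool = np.asarray(X_pool, dtype=float)
    labeled_idx = np.asarray(labeled_idx)
    y_labeled = np.asarray(y_labeled, dtype=float)  # ordered like labeled_idx
    unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled_idx)
    Xb = np.hstack([X_pool, np.ones((len(X_pool), 1))])  # append bias column

    # Informativeness (assumed): disagreement of a bootstrap committee
    # of least-squares linear models on each unlabeled candidate.
    preds = []
    for _ in range(n_models):
        boot = rng.integers(0, len(labeled_idx), size=len(labeled_idx))
        w, *_ = np.linalg.lstsq(Xb[labeled_idx[boot]], y_labeled[boot], rcond=None)
        preds.append(Xb[unlabeled] @ w)
    info = np.var(preds, axis=0)

    # Distances from every unlabeled candidate to every pool sample.
    dists = np.linalg.norm(X_pool[unlabeled, None, :] - X_pool[None, :, :], axis=2)
    # Representativeness (assumed): candidates close to the pool on average.
    rep = 1.0 / (1.0 + dists.mean(axis=1))
    # Diversity (assumed): candidates far from their nearest labeled sample.
    div = dists[:, labeled_idx].min(axis=1)

    def scale(v):
        # Min-max scale to [0, 1] so the three scores are comparable.
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    # Equal-weight sum is an arbitrary choice made for this sketch.
    score = scale(info) + scale(rep) + scale(div)
    return int(unlabeled[np.argmax(score)])
```

In a sequential loop, the returned index would be labeled, appended to `labeled_idx` and `y_labeled`, and the function called again until the labeling budget is spent.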
