Randomness is the Root of All Evil: More Reliable Evaluation of Deep Active Learning

Using deep neural networks for active learning (AL) poses significant challenges for the stability and reproducibility of experimental results. Inconsistent settings remain a root cause of contradictory conclusions and, in the worst case, of incorrect appraisals of methods. The community is in search of a unified framework for exhaustive and fair evaluation of deep active learning. In this paper, we provide such a framework, built upon systematically fixing, containing, and interpreting sources of randomness. We isolate individual influence factors, such as neural-network initialization and hardware specifics, to assess their impact on learning performance. We then use the framework to analyze how basic AL settings, such as the query-batch size and the use of subset selection, as well as the choice of dataset, affect AL performance. Our findings allow us to derive specific recommendations for the reliable evaluation of deep active learning, thus helping the community advance toward a more normative evaluation of results.
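
To make "fixing and containing sources of randomness" concrete, the sketch below shows one common way to pin down the randomness sources named in the abstract (network initialization, data ordering, GPU kernels) in a PyTorch-based pipeline. It is a minimal illustration under assumed tooling, not the authors' released framework; the helper name `fix_randomness` is hypothetical.

```python
# Minimal sketch (assumes a PyTorch + NumPy pipeline; not the paper's code).
import random
import numpy as np
import torch

def fix_randomness(seed: int = 0, deterministic_cudnn: bool = True) -> None:
    """Fix the main RNGs so repeated runs differ only in the factor under study
    (e.g., the query strategy or the initialization seed)."""
    random.seed(seed)                 # Python RNG (e.g., shuffling, pool sampling)
    np.random.seed(seed)              # NumPy RNG (e.g., initial labeled pool)
    torch.manual_seed(seed)           # CPU and CUDA weight initialization
    torch.cuda.manual_seed_all(seed)  # all visible GPUs
    if deterministic_cudnn:
        # Trade speed for reproducible cuDNN kernels; note that some
        # hardware/tooling-level nondeterminism can still remain.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

# Example: vary only the network-initialization seed across repetitions while
# keeping the data seed fixed, to attribute observed variance to initialization.
fix_randomness(seed=42)
```

A design note: isolating an influence factor means holding every other seed constant while sweeping the one of interest, so that the resulting spread in AL performance can be attributed to that factor alone.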
