Deep Active Learning for Image Regression

Image regression is an important problem in computer vision with a variety of applications. However, training a robust regression model requires large amounts of labeled training data, which are time-consuming and expensive to acquire. Active learning algorithms automatically identify the salient and exemplar instances in large amounts of unlabeled data and can substantially reduce the human annotation effort needed to induce a machine learning model. Further, deep learning models such as Convolutional Neural Networks (CNNs) have gained popularity for automatically learning representative features from a given dataset and have shown promising performance in a variety of classification and regression applications. In this chapter, we exploit the feature learning capabilities of deep neural networks and propose a novel framework to address the problem of active learning for regression. We formulate a loss function (based on the expected model output change) relevant to the research task and use gradient descent to optimize the loss and train the deep CNN. To the best of our knowledge, this is the first research effort to learn a discriminative set of features using deep neural networks to actively select informative samples in the regression setting. Our extensive empirical studies on five benchmark regression datasets (from three application domains: rotation angle estimation of handwritten digits, age estimation, and head pose estimation) demonstrate the merit of our framework in substantially reducing the human annotation effort needed to induce a robust regression model.
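To make the idea of selecting samples by expected model output change more concrete, the following is a minimal sketch and not the framework proposed in the chapter. It assumes a linear ridge regressor as a stand-in for the deep CNN, approximates the unknown label of each candidate with predictions from a bootstrap ensemble, and scores each unlabeled sample by how much one gradient step on it would change the model's outputs over the pool. The names fit_ridge, emoc_scores, and select_batch, and the parameters n_models and lr, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_ridge(X, y, reg=1e-3):
    """Closed-form ridge regression; returns the weight vector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)


def emoc_scores(X_lab, y_lab, X_pool, n_models=5, lr=0.1, seed=0):
    """Approximate expected output change for every pool sample.

    Assumption: a linear model f(x) = w.x stands in for the CNN, and
    bootstrap models supply plausible labels for unlabeled samples.
    """
    rng = np.random.default_rng(seed)
    w = fit_ridge(X_lab, y_lab)
    n = len(X_lab)

    # Plausible labels for each pool sample, one row per bootstrap model.
    guesses = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)
        guesses.append(X_pool @ fit_ridge(X_lab[idx], y_lab[idx]))
    residual = np.stack(guesses) - X_pool @ w  # shape (n_models, n_pool)

    # One gradient step on 0.5 * (w.x_i - y')^2 moves w by lr * (y' - w.x_i) * x_i,
    # so the output at pool point x_j changes by lr * (y' - w.x_i) * (x_i.x_j).
    gram = X_pool @ X_pool.T
    mean_abs_sim = np.abs(gram).mean(axis=1)  # average |x_i.x_j| over j

    # Average the absolute output change over guessed labels and pool points.
    return lr * np.abs(residual).mean(axis=0) * mean_abs_sim


def select_batch(scores, k):
    """Indices of the k highest-scoring pool samples."""
    return np.argsort(scores)[::-1][:k]
```

In a pool-based loop, the selected samples would be sent to an annotator, moved from the pool to the labeled set, and the model retrained before the next round of scoring. With a deep network, the closed-form update used here would be replaced by gradients computed through the network.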
