Dynamic Task Weighting Methods for Multi-task Networks in Autonomous Driving Systems

Deep multi-task networks are of particular interest for autonomous driving systems. They can potentially strike an excellent trade-off between predictive performance, hardware constraints and efficient use of information from multiple types of annotations and modalities. However, training such models is non-trivial: learning must be balanced across all tasks, because their respective losses display different scales, ranges and dynamics over the course of training. Several task weighting methods that adaptively adjust the losses have been proposed recently, but they were evaluated on different datasets and combinations of tasks, which makes them difficult to compare. In this work, we review and systematically evaluate nine task weighting strategies on a common footing on three automotive datasets (KITTI, Cityscapes and WoodScape). We then propose a novel method that combines evolutionary meta-learning and task-based selective backpropagation to compute task weights that lead to reliable network training. Our method outperforms state-of-the-art methods by a significant margin on a two-task application.
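To make the notion of task weighting concrete, the following is a minimal sketch of one simple adaptive rule: each task loss is weighted by the inverse of its current magnitude, so that tasks with very different loss scales contribute equally to the total. This rule is an illustrative assumption only. It is not claimed to be one of the nine strategies evaluated here, nor the proposed method, and the function names and the task names `segmentation` and `detection` are hypothetical.

```python
from typing import Dict


def inverse_scale_weights(losses: Dict[str, float], eps: float = 1e-8) -> Dict[str, float]:
    """Illustrative rule (an assumption, not the paper's method):
    weight each task by the inverse of its current loss value,
    normalized so that the weights sum to the number of tasks."""
    inverse = {name: 1.0 / (value + eps) for name, value in losses.items()}
    total = sum(inverse.values())
    num_tasks = len(losses)
    return {name: num_tasks * inv / total for name, inv in inverse.items()}


def weighted_total_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-task losses into a single scalar objective."""
    return sum(weights[name] * losses[name] for name in losses)


# Hypothetical per-task loss values at some training step.
losses = {"segmentation": 2.0, "detection": 0.5}
weights = inverse_scale_weights(losses)
total = weighted_total_loss(losses, weights)
```

With these example values, the larger segmentation loss receives the smaller weight, and both tasks end up contributing the same amount to the total. In a real training loop the weights would be recomputed as the losses evolve, which is what distinguishes dynamic weighting from a fixed weighted sum.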
