Software/Hardware Co-design for Multi-modal Multi-task Learning in Autonomous Systems

Optimizing the quality of result (QoR) and the quality of service (QoS) of AI-empowered autonomous systems simultaneously is very challenging, for three reasons. First, there are multiple input sources, e.g., multi-modal data from different sensors, which require diverse data preprocessing, sensor fusion, and feature aggregation. Second, multiple tasks require various AI models to run simultaneously, e.g., perception, localization, and control. Third, the computing and control system is heterogeneous, composed of hardware components with varied features, such as embedded CPUs, GPUs, FPGAs, and dedicated accelerators. Autonomous systems therefore require multi-modal multi-task (MMMT) learning that is aware of hardware performance and implementation strategies. Although MMMT learning has attracted intense research interest, its applications in autonomous systems remain underexplored. In this paper, we first discuss the opportunities of applying MMMT techniques in autonomous systems and then the unique challenges that must be solved. In addition, we discuss the necessity and opportunities of MMMT model and hardware co-design, which is critical for autonomous systems, especially those with power/resource-limited or heterogeneous platforms. We formulate the co-design of the MMMT model and its heterogeneous hardware implementation as a differentiable optimization problem, with the objective of improving solution quality while reducing overall power consumption and critical path latency. We advocate further exploration of MMMT in autonomous systems and of software/hardware co-design solutions.
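
To make the idea of a differentiable co-design objective more concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that each parallel branch of an MMMT model chooses among a few candidate implementations, that the discrete choice is relaxed with a softmax over learnable logits, and that the critical path latency (a maximum over branches) is smoothed with a log-sum-exp. All names (`codesign_objective`, `latency_table`, `power_table`, `lam_power`, `lam_latency`) and all numbers are hypothetical.

```python
import numpy as np

def softmax_rows(logits):
    """Row-wise softmax: relaxes a discrete choice per branch into probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def smooth_max(values, temperature=0.1):
    """Log-sum-exp relaxation of max; it upper-bounds max and tightens as temperature -> 0."""
    v = np.asarray(values, dtype=float)
    m = v.max()
    return m + temperature * np.log(np.sum(np.exp((v - m) / temperature)))

def codesign_objective(task_loss, arch_logits, latency_table, power_table,
                       lam_power=0.1, lam_latency=0.1):
    """Hypothetical scalar objective: task loss + weighted power + weighted latency.

    arch_logits, latency_table, power_table all have shape
    (n_branches, n_candidates); the tables hold assumed per-candidate costs.
    """
    probs = softmax_rows(np.asarray(arch_logits, dtype=float))
    # Expected cost of each branch under the relaxed (soft) choice.
    branch_latency = (probs * latency_table).sum(axis=1)
    branch_power = (probs * power_table).sum(axis=1)
    # Branches run in parallel, so latency is governed by the slowest one.
    critical_path = smooth_max(branch_latency)
    total_power = branch_power.sum()
    return task_loss + lam_power * total_power + lam_latency * critical_path

# Illustrative values only: two branches, two candidate implementations each.
logits = np.zeros((2, 2))
latency = np.array([[1.0, 3.0], [2.0, 6.0]])
power = np.array([[1.0, 1.0], [2.0, 2.0]])
objective = codesign_objective(0.5, logits, latency, power)
```

Because every step is a smooth function of `arch_logits`, an automatic differentiation framework could in principle update the logits jointly with the model weights. The weights `lam_power` and `lam_latency` trade solution quality against hardware cost and would need tuning per platform.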
