3: Efficient deep neural networks for perception in autonomous driving (Jose M. Alvarez, TRI) in ICML Workshop on Machine Learning for Autonomous Vehicles 2017, 09:00 AM

Abstract: Convolutional neural networks have achieved impressive success in many computer vision tasks, such as image classification, object detection and recognition, and semantic segmentation. While these networks have proven effective in all these applications, they come at a high memory and computational cost, making them infeasible for applications where power and computational resources are limited. In addition, the training process reduces productivity: it not only requires large compute servers but also takes a significant amount of time (several weeks), with the additional cost of engineering the architecture. In this talk, I first introduce our efficient architecture based on filter compositions, and then a novel approach to jointly learn the architecture and explicitly account for compression during the training process. Our results show that we can learn much more compact models and significantly reduce training and inference time.

Bio: Dr. Jose Alvarez is a senior research scientist at Toyota Research Institute. His main research interests are in developing robust and efficient deep learning algorithms for perception, with a focus on autonomous vehicles. Previously, he was a researcher at Data61 / CSIRO (formerly NICTA), a postdoctoral researcher at the Courant Institute of Mathematical Sciences, New York University, and a visiting scholar at the University of Amsterdam and at Group Research Electronics at Volkswagen. Dr. Alvarez completed his Ph.D. in 2012 and received a best Ph.D. thesis award. He serves as an associate editor for IEEE Transactions on Intelligent Transportation Systems.
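The abstract does not spell out how compression is folded into training, but a common way to realize this idea is a group-sparsity (group-lasso) penalty that drives entire convolutional filters to zero during training so they can be pruned afterwards. The PyTorch sketch below illustrates that general technique under that assumption; it is not the speaker's exact formulation, and the penalty weight lam is an illustrative placeholder.

import torch
import torch.nn as nn

def group_sparsity_penalty(model):
    """Sum of L2 norms of each conv filter (a group-lasso term).

    Pushing whole filters toward zero lets them be pruned after
    training, yielding a more compact, faster network.
    """
    penalty = 0.0
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # weight: (out_ch, in_ch, kH, kW); one group per output filter
            flat = module.weight.flatten(start_dim=1)
            penalty = penalty + flat.norm(dim=1).sum()
    return penalty

def training_step(model, criterion, optimizer, images, labels, lam=1e-4):
    # Task loss plus the compression-inducing regularizer.
    optimizer.zero_grad()
    loss = criterion(model(images), labels) + lam * group_sparsity_penalty(model)
    loss.backward()
    optimizer.step()
    return loss.item()

After training, filters whose norm falls below a small threshold can be removed, shrinking both the memory footprint and inference time.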
Abstract 4: Visual 3D Scene Understanding and Prediction for ADAS (Manmohan Chandraker, NEC Labs) in ICML Workshop on Machine Learning for Autonomous Vehicles 2017, 09:30 AM

Abstract: Modern advanced driver assistance systems (ADAS) rely on a range of sensors, including radar, ultrasound, LIDAR, and cameras. Active sensors have found applications in detecting traffic participants (TPs) such as cars or pedestrians and scene elements (SEs) such as roads. However, camera-based systems have the potential to achieve or augment these capabilities at a much lower cost, while enabling new ones such as determination of TP and SE semantics, as well as their interactions in complex traffic scenes.

In this talk, we present several technical advances for vision-based ADAS. A common theme is overcoming the challenges posed by the lack of large-scale annotations in deep learning frameworks. We introduce approaches to correspondence estimation that are trained on purely synthetic data but adapt well to real data at test time. We introduce object detectors that are light enough for ADAS, trained with knowledge distillation to retain the accuracy of deeper architectures. Our semantic segmentation methods are trained with weak supervision that requires only a tenth of conventional annotation time. We propose methods for 3D reconstruction that use deep supervision to recover fine TP part locations while relying on purely synthetic 3D CAD models. We develop deep learning frameworks for multi-target tracking, as well as occlusion reasoning in TP localization and SE layout estimation. Finally, we present a framework for TP behavior prediction in complex traffic scenes that accounts for TP-TP and TP-SE interactions. Our approach allows prediction of diverse multimodal outcomes and aims to account for long-term strategic behaviors in complex scenes.

Bio: Manmohan Chandraker is an assistant professor in the CSE department of the University of California, San Diego and leads the computer vision research effort at NEC Labs America in Cupertino. He received a B.Tech. in Electrical Engineering at the Indian Institute of Technology, Bombay and a PhD in Computer Science at the University of California, San Diego. His research interests are 3D scene understanding and reconstruction, with applications to autonomous driving and human-computer interfaces. His work has received the Marr Prize Honorable Mention for Best Paper at ICCV 2007, the 2009 CSE Dissertation Award for Best Thesis at UCSD, inclusion in a PAMI special issue on the best papers of CVPR 2011, and the Best Paper Award at CVPR 2014.
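The abstract names knowledge distillation as the training technique for the lightweight detectors but gives no formula. Below is a minimal sketch of the standard distillation loss of Hinton et al. (2015), a KL term between softened teacher and student logits plus the usual hard-label term, which is the common starting point for such training; the temperature T and mixing weight alpha are illustrative placeholders, not values from the talk.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """KL divergence between softened distributions plus hard-label CE.

    The T*T factor keeps gradient magnitudes comparable across
    temperatures (Hinton et al., 2015).
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage: the deep teacher is frozen; only the light student is updated.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)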
Abstract 6: 2 x 15 Contributed Talks on Datasets and Occupancy Maps in ICML Workshop on Machine Learning for Autonomous Vehicles 2017, 10:30 AM

Jonathan Binas, Daniel Neil, Shih-Chii Liu and Tobi Delbruck: DDD17: End-To-End DAVIS Driving Dataset

Ransalu Senanayake and Fabio Ramos: Bayesian Hilbert Maps for Continuous Occupancy Mapping in Dynamic Environments
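For context on the second talk: in the Hilbert maps line of work, an occupancy map is a probabilistic classifier over features of continuous coordinates, so occupancy probability can be queried at any point rather than on a fixed grid. The sketch below shows only that basic (non-Bayesian) idea with made-up data; the Bayesian variant presented at the workshop additionally maintains a posterior over the weights to support sequential updates in dynamic environments. All numbers here (grid extent, gamma, sample counts) are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def rbf_features(points, centers, gamma=2.0):
    """Project 2D points onto RBF basis functions at fixed inducing centers."""
    sq = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Fixed grid of inducing points over a 10 m x 10 m patch.
gx, gy = np.meshgrid(np.linspace(0, 10, 15), np.linspace(0, 10, 15))
centers = np.stack([gx.ravel(), gy.ravel()], axis=1)

# Toy training data: hit points (occupied, y=1) and points sampled
# along the sensor beams before the hits (free, y=0).
occupied = np.random.randn(200, 2) * 0.2 + np.array([7.0, 5.0])
free = np.random.rand(800, 2) * np.array([6.0, 10.0])
X = np.vstack([occupied, free])
y = np.hstack([np.ones(len(occupied)), np.zeros(len(free))])

clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(rbf_features(X, centers), y)

# The map is continuous: occupancy can be queried at arbitrary points.
query = np.array([[7.0, 5.0], [2.0, 2.0]])
print(clf.predict_proba(rbf_features(query, centers))[:, 1])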
Abstract 7: Beyond Hand Labeling: Simulation and Self-Supervision for Self-Driving Cars (Matt Johnson, University of Michigan) in ICML Workshop on Machine Learning for Autonomous Vehicles 2017, 11:00 AM

Abstract: Self-driving cars now deliver vast amounts of sensor data from large unstructured environments. In attempting to process and interpret this data, there are many unique challenges in bridging the gap between prerecorded datasets and the field. This talk will present recent work on applying deep learning techniques to robotic perception. We focus on solutions to several pervasive problems that arise when deploying such techniques on fielded robotic systems. The themes of the talk revolve around alternatives to gathering and curating datasets for training: are there ways of avoiding the labor-intensive human labeling required for supervised learning? These questions give rise to several lines of research based on self-supervision, adversarial learning, and simulation. We will show how these approaches, applied to self-driving car problems, have great potential to change the way we train, test, and validate machine learning-based systems.

Bio: Matthew Johnson-Roberson is an assistant professor of engineering in the Department of Naval Architecture & Marine Engineering and the Department of Electrical Engineering and Computer Science at the University of Michigan. He received a PhD from the University of Sydney in 2010. He has held prior postdoctoral appointments with the Centre for Autonomous Systems (CAS) at KTH Royal Institute of Technology in Stockholm and the Australian Centre for Field Robotics at the University of Sydney. He is a recipient of the NSF CAREER award (2015). He has worked in robotic perception since the first DARPA Grand Challenge, and his group focuses on enabling robots to better see and understand their environment.

Abstract 8: Learning Affordance for Autonomous Driving (Jianxiong Xiao, AutoX) in ICML Workshop on Machine Learning for Autonomous Vehicles 2017, 11:30 AM

Abstract: Today, there are two major paradigms for vision-based autonomous driving systems: mediated perception approaches that parse an entire scene to make a driving decision, and behavior reflex approaches that directly map an input image to a driving action with a regressor. In this talk, we propose a third paradigm: a direct perception approach to estimate the affordance for driving. We propose to map an input image to a small number of key perception indicators that directly relate to the affordance of a road/traffic state for driving. Our representation provides a compact yet complete description of the scene that enables a simple controller to drive autonomously. Falling in between the two extremes of mediated perception and behavior reflex, we argue that our direct perception representation provides the right level of abstraction. We evaluate our approach in a virtual racing game as well as real-world driving, and show that our model can drive a car well in a very diverse set of virtual and realistic environments.
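As a concrete illustration of the direct perception idea, here is a minimal PyTorch sketch of a CNN that regresses a handful of affordance indicators from an image, which a hand-written controller then turns into steering and throttle commands. This is not the authors' actual network (the published direct perception system regressed 13 such indicators from a larger ConvNet); the architecture, indicator set, and controller gains below are all illustrative assumptions.

import torch
import torch.nn as nn

# Illustrative affordance indicators (the real system used more):
# angle to road tangent, lateral offset from lane centre, distance to car ahead.
N_AFFORDANCES = 3

class AffordanceNet(nn.Module):
    """CNN regressor from an image to affordance indicators."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, N_AFFORDANCES)

    def forward(self, image):
        return self.head(self.features(image).flatten(1))

def simple_controller(affordances, k_angle=2.0, k_offset=0.5):
    """Toy controller: steer from angle and lane offset; coast if a car is near."""
    angle, lane_offset, dist_ahead = affordances.unbind(-1)
    steering = -k_angle * angle - k_offset * lane_offset
    throttle = torch.where(dist_ahead < 10.0,
                           torch.zeros_like(dist_ahead),
                           torch.full_like(dist_ahead, 0.5))
    return steering, throttle

net = AffordanceNet()
steer, throttle = simple_controller(net(torch.randn(1, 3, 96, 96)))

In the full system, the network would be trained to regress these indicators against ground truth recorded from the driving simulator (e.g., with an MSE loss), and the controller converts the predictions into driving commands.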