Deep Reinforcement Learning Meets Structured Prediction

Abstracts (11):

Abstract 2: A New Perspective on Adversarial Perturbations in Debugging Machine Learning Models, Madry 08:00 AM
The widespread susceptibility of current ML models to adversarial perturbations is an intensely studied but still mystifying phenomenon. A popular view is that these perturbations are aberrations that arise due to statistical fluctuations in the training data and/or the high-dimensional nature of our inputs. But is this really the case? In this talk, I will present a new perspective on the phenomenon of adversarial perturbations. This perspective ties the phenomenon to the existence of "non-robust" features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. Such patterns turn out to be prevalent in real-world datasets and also shed light on previously observed phenomena in adversarial robustness, including the transferability of adversarial examples and the properties of robust models. Finally, this perspective suggests that we may need to recalibrate our expectations of how models should make their decisions and how we should interpret them.

Abstract 3: Similarity of Neural Network Representations Revisited in Debugging Machine Learning Models, Kornblith 08:30 AM
Recent work has sought to understand the behavior of neural networks by comparing representations between layers and between different trained models. We introduce a similarity index that measures the relationship between representational similarity matrices. We show that this similarity index is equivalent to centered kernel alignment (CKA) and analyze its relationship to canonical correlation analysis. Unlike other methods, CKA can reliably identify correspondences between representations of layers in networks trained from different initializations. Moreover, CKA can reveal network pathology that is not evident from test accuracy alone.
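As a concrete illustration of the index described in the Kornblith abstract above, the linear form of CKA compares two activation matrices directly. The sketch below is my own minimal NumPy version, not the authors' released implementation; the activation matrices are hypothetical.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two activation matrices.

    X: (n_examples, d1) activations from one layer or model.
    Y: (n_examples, d2) activations from another layer or model.
    Returns a similarity score in [0, 1].
    """
    # Center each feature so the corresponding Gram matrices are centered.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

# Hypothetical usage: compare two layers' activations on the same 512 inputs.
acts_a = np.random.randn(512, 256)
acts_b = np.random.randn(512, 128)
print(linear_cka(acts_a, acts_b))
```

Because the score depends only on the Gram matrices, the two representations may have different widths, which is what allows cross-layer and cross-model comparisons.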
Abstract 6: Verifiable Reinforcement Learning via Policy Extraction in Debugging Machine Learning Models, Bastani 09:10 AM
While deep reinforcement learning has successfully solved many challenging control tasks, its real-world applicability has been limited by the inability to ensure the safety of learned policies. We propose VIPER, an approach to verifiable reinforcement learning that trains decision tree policies, which can represent complex policies (since they are nonparametric) yet can be efficiently verified using existing techniques (since they are highly structured). We use VIPER to learn a decision tree policy for a toy game based on Pong that provably never loses.

Abstract 7: Debugging Machine Learning via Model Assertions in Debugging Machine Learning Models, Kang 09:40 AM
Machine learning models are being deployed in mission-critical settings, such as self-driving cars. However, these models can fail in complex ways, so it is imperative that application developers find ways to debug them. We propose adapting software assertions, i.e., boolean statements about the state of a program that must be true, to the task of debugging ML models. With model assertions, ML developers can specify constraints on model outputs, e.g., cars should not disappear and reappear in successive frames of a video. We propose several ways to use model assertions in ML debugging, including runtime monitoring, performing corrective actions, and collecting "hard examples" to further train models with human labeling or weak supervision. We show that, for a video analytics task, simple assertions can effectively find errors and correction rules can effectively correct model output (up to 100% and 90%, respectively). We additionally collect and label parts of video where assertions fire (as a form of active learning) and show that this procedure can improve model performance by up to 2×.
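The VIPER abstract above centers on distilling a neural policy into a decision tree that imitates it. The sketch below shows that core idea under simplifying assumptions: plain behavioral cloning of an oracle policy on states gathered by rolling it out, rather than the full VIPER algorithm; `env` and `oracle_policy` are hypothetical stand-ins rather than anything from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def distill_policy(env, oracle_policy, episodes=50, max_depth=8):
    """Fit a decision tree that imitates a (hypothetical) neural oracle policy.

    env          : object with reset() -> state and step(action) -> (state, reward, done)
    oracle_policy: function mapping a state to a discrete action (the trained neural policy)
    """
    states, actions = [], []
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = oracle_policy(state)      # query the oracle for a label
            states.append(state)
            actions.append(action)
            state, _, done = env.step(action)  # roll out under the oracle
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(np.array(states), np.array(actions))
    return tree  # a small, structured policy that can be inspected or verified
```

The payoff claimed in the abstract comes after this step: because the extracted policy is a shallow tree, existing verification tools can reason about it exhaustively.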
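To make the model-assertion idea from the Kang abstract concrete, here is a minimal sketch (my own illustration, not the authors' library) of an assertion that flags frames in which a tracked object disappears and then reappears; the per-frame set of object IDs is a hypothetical detector output format.

```python
def assert_no_flicker(detections_per_frame):
    """Model assertion: a tracked object ID should not vanish and later reappear.

    detections_per_frame: list of sets of object IDs, one set per video frame
                          (a hypothetical output format for an object tracker).
    Returns the frame indices at which the assertion fires.
    """
    violations = []
    seen, gone = set(), set()
    for t, ids in enumerate(detections_per_frame):
        reappeared = ids & gone          # IDs that vanished earlier but are back now
        if reappeared:
            violations.append((t, reappeared))
        gone |= (seen - ids)             # IDs seen before but missing in this frame
        gone -= ids
        seen |= ids
    return violations

# Hypothetical usage: car 7 flickers out in frame 2 and returns in frame 3.
frames = [{7, 9}, {7, 9}, {9}, {7, 9}]
print(assert_no_flicker(frames))  # -> [(3, {7})]
```

Frames where the assertion fires could then be routed to the corrective actions or active-labeling pipelines the abstract describes.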
Abstract 10: Discovering Natural Bugs Using Adversarial Data Perturbations in Debugging Machine Learning Models, Singh 10:10 AM
Determining when a machine learning model is "good enough" is challenging since held-out accuracy metrics significantly overestimate real-world performance. In this talk, I will describe automated techniques to detect bugs that can occur naturally when a model is deployed. I will start with approaches to identify "semantically equivalent" adversaries that should not change the meaning of the input, yet lead to a change in the model's predictions. Then I will present our work on evaluating the consistency of the model's behavior by exploring its performance on new instances that are "implied" by its predictions. I will also describe a method to understand and debug models by adversarially modifying the training data to change the model's predictions. The talk will include applications of these ideas to a number of NLP tasks, such as reading comprehension, visual QA, and knowledge graph completion.

Abstract 11: "Debugging" Discriminatory ML Systems in Debugging Machine Learning Models, Raji 10:40 AM
If a machine learning (ML) model is illegally discriminatory towards vulnerable and underrepresented populations, can we really say it works? Of course not! That illegal behaviour negates the functionality of the ML model just as much as overfitting or other typically acknowledged ML "bugs". This talk explores the redefinition of what it means for a model to "work" well enough to deploy, and draws on the analogy to debugging practice in software engineering to explain current strategies for diagnosing, reporting, addressing, and preventing the further development of discriminatory ML models.

Abstract 12: NeuralVerification.jl: Algorithms for Verifying Deep Neural Networks in Debugging Machine Learning Models, Arnon, Lazarus 11:00 AM
Deep neural networks (DNNs) are widely used for nonlinear function approximation, with applications ranging from computer vision to control. Although DNNs involve only the composition of simple arithmetic operations, it can be very challenging to verify whether a particular network satisfies certain input-output properties. This work introduces NeuralVerification.jl, a software package that implements recently proposed methods for soundly verifying such properties. These methods borrow insights from reachability analysis, optimization, and search. We present the formal problem definition and briefly discuss the fundamental differences between the implemented algorithms. In addition, we provide a pedagogical example of how to use the library.

Abstract 15: Safe and Reliable Machine Learning: Preventing and Identifying Failures in Debugging Machine Learning Models, Saria 01:30 PM
Machine-learning-driven decision-making systems are increasingly used to decide bank loans, make hiring decisions, perform clinical decision-making, and more. As we march towards a future in which these systems underpin most of society's decision-making infrastructure, it is critical for us to understand the principles that will help us engineer for reliability. Drawing from reliability engineering, we will briefly outline three principles to group and guide technical solutions for ensuring reliability in machine learning systems: 1) failure prevention, 2) failure identification, and 3) maintenance. In particular, we will discuss a framework (https://arxiv.org/abs/1904.07204) for preventing failures due to differences between the training and deployment environments, one that proactively addresses the problem of dataset shift. We will contrast this with typical reactive solutions, which require deployment-environment data, and discuss relations to similar problems such as robustness to adversarial examples.

Abstract 16: Better Code for Less Debugging with AutoGraph in Debugging Machine Learning Models, Moldovan 02:00 PM
The fast-paced nature of machine learning research and development, with many ideas advancing rapidly from research to production, puts it at increased risk of programming errors, which can be particularly insidious when combined with machine learning. In this talk we discuss defensive design as a way to reduce the chance of such errors occurring in the first place, and present AutoGraph, a tool which facilitates defensive design by allowing more legible code that is still efficient and portable.
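The NeuralVerification.jl abstract refers to reachability-style analyses. As a hedged illustration of that family of methods, the sketch below implements interval bound propagation for a ReLU network in Python; it is a generic simplification written for this summary, not the package's Julia API, and the example network and property are assumptions.

```python
import numpy as np

def interval_bound_propagation(weights, biases, lower, upper):
    """Propagate an input box [lower, upper] through a ReLU MLP.

    weights: list of weight matrices; biases: list of bias vectors.
    Returns elementwise lower/upper bounds on the network outputs,
    which can then be checked against an input-output property.
    """
    lo, hi = np.asarray(lower, float), np.asarray(upper, float)
    for i, (W, b) in enumerate(zip(weights, biases)):
        center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
        new_center = W @ center + b
        new_radius = np.abs(W) @ radius        # worst-case spread of the box
        lo, hi = new_center - new_radius, new_center + new_radius
        if i < len(weights) - 1:               # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# Hypothetical property check: the output stays below 1.0 for inputs in [-0.1, 0.1]^2.
W1, b1 = np.array([[1.0, -1.0], [0.5, 0.5]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)
lo, hi = interval_bound_propagation([W1, W2], [b1, b2], [-0.1, -0.1], [0.1, 0.1])
print(hi[0] <= 1.0)  # True means the property holds (soundly, if conservatively)
```

Bounds computed this way are sound but loose; the algorithms the library implements trade off this conservativeness against computational cost in different ways.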
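To show the kind of code the AutoGraph abstract has in mind, here is a minimal sketch using TensorFlow 2's tf.function, which applies AutoGraph to turn plain Python control flow into graph operations; the specific function is my own example, not one from the talk.

```python
import tensorflow as tf

@tf.function  # AutoGraph rewrites the Python while/if below into graph ops
def count_halvings(x):
    """Count how many times x can be halved before dropping below 1."""
    count = tf.constant(0)
    while x >= 1.0:            # converted to a tf.while_loop under the hood
        x = x / 2.0
        count += 1
    return count

print(count_halvings(tf.constant(20.0)))  # tf.Tensor(5, ...)
```

The point made in the abstract is that the function stays legible, ordinary-looking Python while still executing as an efficient, portable graph.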
