Fixes That Fail: Self-Defeating Improvements in Machine-Learning Systems

Machine-learning systems such as self-driving cars or virtual assistants are composed of a large number of machine-learning models that recognize image content, transcribe speech, analyze natural language, infer preferences, rank options, and so on. These systems can be represented as directed acyclic graphs in which each vertex is a model and each edge carries the output of one model to the input of another. The models are often developed and trained independently, which raises an obvious concern: can improving a machine-learning model make the overall system worse? We answer this question affirmatively by showing that improving a model can degrade the performance of downstream models, even after those downstream models are retrained. Such self-defeating improvements are the result of entanglement between the models. We identify different types of entanglement and demonstrate via simple experiments how they can produce self-defeating improvements. We also show that self-defeating improvements emerge in a realistic stereo-based object detection system.

"The first rule of systems engineering is: If you optimize the components you will probably ruin the system performance." (Richard Hamming)
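The directed-acyclic-graph view in the abstract can be made concrete with a minimal sketch. It is only an illustration, not the authors' code: the names run_system, models, and parents are hypothetical, and the "models" are plain functions standing in for trained networks. Each vertex is evaluated once, in an order that respects the edges.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_system(models, parents, external_inputs):
    """Evaluate every model once, parents before children.

    models:          name -> callable (stand-in for a trained model)
    parents:         name -> list of names whose outputs feed this model
    external_inputs: name -> value for vertices that are raw inputs
    """
    outputs = dict(external_inputs)
    # static_order() yields each vertex after all of its predecessors.
    for name in TopologicalSorter(parents).static_order():
        if name in outputs:
            continue  # raw input, nothing to compute
        args = [outputs[p] for p in parents.get(name, ())]
        outputs[name] = models[name](*args)
    return outputs


# Hypothetical two-model system: a "depth" model feeds a "detector".
models = {
    "depth": lambda image: [pixel * 2 for pixel in image],
    "detector": lambda image, depth: sum(image) + sum(depth),
}
parents = {
    "depth": ["image"],
    "detector": ["image", "depth"],
}

result = run_system(models, parents, {"image": [1, 2, 3]})
```

In this picture, replacing models["depth"] with an "improved" version changes the inputs that "detector" receives, which is the kind of coupling the abstract calls entanglement.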
