AI Safety Subproblems for Software Engineering Researchers

In this 4-page manuscript we discuss the problem of long-term AI Safety from a Software Engineering (SE) research viewpoint. We briefly summarize long-term AI Safety and the challenge of avoiding harms from AI as systems meet or exceed human capabilities, including software engineering capabilities (and approach AGI / "HLMI"). We perform a quantified literature review suggesting that AI Safety discussions are not common at SE venues. We make conjectures about how software might change with rising capabilities, and categorize "subproblems" which fit into traditional SE areas, proposing how work on similar problems might improve the future of AI and SE.
