Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems

[1] Sandra Wachter, et al. The Unfairness of Fair Machine Learning: Levelling down and strict egalitarianism by default, 2023, arXiv.

[2] Tom B. Brown, et al. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned, 2022, arXiv.

[3] Joshua Achiam, et al. A Hazard Analysis Framework for Code Synthesis Large Language Models, 2022, arXiv.

[4] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.

[5] Geoffrey Irving, et al. Red Teaming Language Models with Language Models, 2022, EMNLP.

[6] Amandalynne Paullada, et al. AI and the Everything in the Whole Wide World Benchmark, 2021, NeurIPS Datasets and Benchmarks.

[7] Elchanan Mossel, et al. Information Spread with Error Correction, 2021, arXiv.

[8] Christy Dennison, et al. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets, 2021, NeurIPS.

[9] David Krueger, et al. Goal Misgeneralization in Deep Reinforcement Learning, 2021, ICML.

[10] Emily M. Bender, et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, 2021, FAccT.

[11] Christo Wilson, et al. Building and Auditing Fair Algorithms: A Case Study in Candidate Screening, 2021, FAccT.

[12] Joshua A. Kroll. Outlining Traceability: A Principle for Operationalizing Accountability in Computing Systems, 2021, FAccT.

[13] Mohit Bansal, et al. Robustness Gym: Unifying the NLP Evaluation Landscape, 2021, NAACL.

[14] Scott Niekum, et al. Value Alignment Verification, 2020, ICML.

[15] Peter Henderson, et al. Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims, 2020, arXiv.

[16] Inioluwa Deborah Raji, et al. Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing, 2020, FAT*.

[17] Timnit Gebru, et al. Lessons from archives: strategies for collecting sociocultural data in machine learning, 2019, FAT*.

[18] N. Leveson. Improving the Standard Risk Matrix using STPA, 2019, Journal of System Safety.

[19] Adi Shamir, et al. A Simple Explanation for the Existence of Adversarial Examples with Small Hamming Distance, 2019, arXiv.

[20] Inioluwa Deborah Raji, et al. Model Cards for Model Reporting, 2018, FAT*.

[21] Sandra Wachter, et al. A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI, 2018.

[22] Eric Thorn, et al. A Framework for Automated Driving System Testable Cases and Scenarios, 2018.

[23] Timnit Gebru, et al. Datasheets for datasets, 2018, Commun. ACM.

[24] Timnit Gebru, et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, 2018, FAT*.

[25] Jérémie Guiochet, et al. Can Robot Navigation Bugs Be Found in Simulation? An Exploratory Study, 2017, IEEE International Conference on Software Quality, Reliability and Security (QRS).

[26] Philip Koopman, et al. Robustness Testing of Autonomy Software, 2017, IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).

[27] Michael P. Wellman, et al. Towards the Science of Security and Privacy in Machine Learning, 2016, arXiv.

[28] Arvind Narayanan, et al. Semantics derived automatically from language corpora contain human-like biases, 2016, Science.

[29] John Schulman, et al. Concrete Problems in AI Safety, 2016, arXiv.

[30] Oliver Zendel, et al. CV-HAZOP: Introducing Test Data Validation for Computer Vision, 2015, IEEE International Conference on Computer Vision (ICCV).

[31] R. Sealy, et al. Equality and Human Rights Commission, 2013.

[32] Nancy G. Leveson. Engineering a Safer World: Systems Thinking Applied to Safety, 2012.

[33] T. C. Sorensen, et al. Humans in the Loop, 2016.

[34] Peter G. Bishop, et al. Safety and Assurance Cases: Past, Present and Possible Future - an Adelard Perspective, 2010, SSS.

[35] Philip Koopman, et al. Better Embedded System Software, 2010.

[36] Adam Shostack, et al. Experiences Threat Modeling at Microsoft, 2008, MODSEC@MoDELS.