Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems

[1] Sandra Wachter, et al. The Unfairness of Fair Machine Learning: Levelling down and strict egalitarianism by default, 2023, arXiv.

[2] Tom B. Brown, et al. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned, 2022, arXiv.

[3] Joshua Achiam, et al. A Hazard Analysis Framework for Code Synthesis Large Language Models, 2022, arXiv.

[4] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.

[5] Geoffrey Irving, et al. Red Teaming Language Models with Language Models, 2022, EMNLP.

[6] Amandalynne Paullada, et al. AI and the Everything in the Whole Wide World Benchmark, 2021, NeurIPS Datasets and Benchmarks.

[7] Elchanan Mossel, et al. Information Spread with Error Correction, 2021, arXiv.

[8] Christy Dennison, et al. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets, 2021, NeurIPS.

[9] David Krueger, et al. Goal Misgeneralization in Deep Reinforcement Learning, 2021, ICML.

[10] Emily M. Bender, et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, 2021, FAccT.

[11] Christo Wilson, et al. Building and Auditing Fair Algorithms: A Case Study in Candidate Screening, 2021, FAccT.

[12] Joshua A. Kroll. Outlining Traceability: A Principle for Operationalizing Accountability in Computing Systems, 2021, FAccT.

[13] Mohit Bansal, et al. Robustness Gym: Unifying the NLP Evaluation Landscape, 2021, NAACL.

[14] Scott Niekum, et al. Value Alignment Verification, 2020, ICML.

[15] Peter Henderson, et al. Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims, 2020, arXiv.

[16] Inioluwa Deborah Raji, et al. Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing, 2020, FAT*.

[17] Timnit Gebru, et al. Lessons from archives: strategies for collecting sociocultural data in machine learning, 2019, FAT*.

[18] N. Leveson. Improving the Standard Risk Matrix using STPA, 2019, Journal of System Safety.

[19] Adi Shamir, et al. A Simple Explanation for the Existence of Adversarial Examples with Small Hamming Distance, 2019, arXiv.

[20] Inioluwa Deborah Raji, et al. Model Cards for Model Reporting, 2018, FAT*.

[21] Sandra Wachter, et al. A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI, 2018.

[22] Eric Thorn, et al. A Framework for Automated Driving System Testable Cases and Scenarios, 2018.

[23] Timnit Gebru, et al. Datasheets for datasets, 2018, Commun. ACM.

[24] Timnit Gebru, et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, 2018, FAT*.

[25] Jérémie Guiochet, et al. Can Robot Navigation Bugs Be Found in Simulation? An Exploratory Study, 2017, IEEE International Conference on Software Quality, Reliability and Security (QRS).

[26] Philip Koopman, et al. Robustness Testing of Autonomy Software, 2017, IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).

[27] Michael P. Wellman, et al. Towards the Science of Security and Privacy in Machine Learning, 2016, arXiv.

[28] Arvind Narayanan, et al. Semantics derived automatically from language corpora contain human-like biases, 2016, Science.

[29] John Schulman, et al. Concrete Problems in AI Safety, 2016, arXiv.

[30] Oliver Zendel, et al. CV-HAZOP: Introducing Test Data Validation for Computer Vision, 2015, IEEE International Conference on Computer Vision (ICCV).

[31] R. Sealy, et al. Equality and Human Rights Commission, 2013.

[32] Nancy G. Leveson. Engineering a Safer World: Systems Thinking Applied to Safety, 2012.

[33] T. C. Sorensen, et al. Humans in the Loop, 2016.

[34] Peter G. Bishop, et al. Safety and Assurance Cases: Past, Present and Possible Future - an Adelard Perspective, 2010, SSS.

[35] Philip Koopman, et al. Better Embedded System Software, 2010.

[36] Adam Shostack, et al. Experiences Threat Modeling at Microsoft, 2008, MODSEC@MoDELS.