Risk assessment at AGI companies: A review of popular risk assessment techniques from other safety-critical industries
[1] Gillian K. Hadfield,et al. Frontier AI Regulation: Managing Emerging Risks to Public Safety , 2023, ArXiv.
[2] An Overview of Catastrophic AI Risks , 2023, ArXiv.
[3] Andrew Critch,et al. TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI , 2023, ArXiv.
[4] Nancy J. Cooke,et al. Managing the risks of artificial general intelligence: A human factors and ergonomics perspective , 2023, Human Factors and Ergonomics in Manufacturing & Service Industries.
[5] Jonas Schuett. AGI labs need an internal audit function , 2023, ArXiv.
[6] Sebastian Farquhar,et al. Model evaluation for extreme risks , 2023, ArXiv.
[7] M. Choudhury,et al. Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks , 2023, ArXiv.
[8] Emma Bluemke,et al. Towards best practices in AGI safety and governance: A survey of expert opinion , 2023, ArXiv.
[9] Julian Hazell. Large Language Models Can Be Used To Effectively Scale Spear Phishing Campaigns , 2023, ArXiv.
[10] Chunyuan Li,et al. Instruction Tuning with GPT-4 , 2023, ArXiv.
[11] Hannah Rose Kirk,et al. Assessing Language Model Deployment with Risk Cards , 2023, ArXiv.
[12] Markus Anderljung,et al. Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted? , 2023, ArXiv.
[13] Henrique Pondé de Oliveira Pinto,et al. GPT-4 Technical Report , 2023, 2303.08774.
[14] A. Stolzer,et al. Safety Management Systems in Aviation , 2023 .
[15] Dmitrii Krasheninnikov,et al. Harms from Increasingly Agentic Algorithmic Systems , 2023, FAccT.
[16] Yue Liu,et al. Towards Concrete and Connected AI Risk Assessment (C2AIRA): A Systematic Mapping Study , 2023, 2023 IEEE/ACM 2nd International Conference on AI Engineering – Software Engineering for AI (CAIN).
[17] Chris Ventura,et al. Examining the Differential Risk from High-level Artificial Intelligence and the Question of Control , 2022, Futures.
[18] Negar Rostamzadeh,et al. Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm Reduction , 2022, AIES.
[19] Richard Ngo. The alignment problem from a deep learning perspective , 2022, ArXiv.
[20] Michael K Cohen,et al. Advanced Artificial Agents Intervene in the Provision of Reward , 2022, AI Mag..
[21] Richard Yuanzhe Pang,et al. What Do NLP Researchers Believe? Results of the NLP Community Metasurvey , 2022, ACL.
[22] Tom B. Brown,et al. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned , 2022, ArXiv.
[23] Shiri Dori-Hacohen,et al. Current and Near-Term AI as a Potential Existential Risk Factor , 2022, AIES.
[24] Joshua Achiam,et al. A Hazard Analysis Framework for Code Synthesis Large Language Models , 2022, ArXiv.
[25] Jess Whittlestone,et al. A Survey of the Potential Long-term Impacts of AI: How AI Could Lead to Long-term Changes in Science, Cooperation, Power, Epistemics and Values , 2022, AIES.
[26] Inioluwa Deborah Raji,et al. The Fallacy of AI Functionality , 2022, FAccT.
[27] Lisa Anne Hendricks,et al. Taxonomy of Risks posed by Language Models , 2022, FAccT.
[28] Dan Hendrycks,et al. Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks , 2022, ArXiv.
[29] Joseph Carlsmith. Is Power-Seeking AI an Existential Risk? , 2022, ArXiv.
[30] Dan Hendrycks,et al. X-Risk Analysis for AI Research , 2022, ArXiv.
[31] Michael C. Horowitz,et al. Forecasting AI Progress: Evidence from a Survey of Machine Learning Researchers , 2022, ArXiv.
[32] Jamy J. Li,et al. FMEA-AI: AI fairness impact assessment using failure mode and effects analysis , 2022, AI and Ethics.
[33] S. Ekins,et al. Dual use of artificial-intelligence-powered drug discovery , 2022, Nature Machine Intelligence.
[34] Roel Dobbe. System Safety and Artificial Intelligence , 2022, FAccT.
[35] Geoffrey Irving,et al. Red Teaming Language Models with Language Models , 2022, EMNLP.
[36] S. Kauffman,et al. How Organisms Come to Know the World: Fundamental Limits on Artificial General Intelligence , 2021, Frontiers in Ecology and Evolution.
[37] Haoran Sun,et al. Towards artificial general intelligence via a multimodal foundation model , 2021, Nature Communications.
[38] Ben Buchanan,et al. Truth, Lies, and Automation: How Language Models Could Change Disinformation , 2021 .
[39] Jess Whittlestone,et al. Artificial Canaries: Early Warning Signs for Anticipatory and Democratic Governance of AI , 2021, Int. J. Interact. Multim. Artif. Intell..
[40] Emily M. Bender,et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 , 2021, FAccT.
[41] S. Baum. Quantifying the probability of existential catastrophe: A reply to Beard et al. , 2020, Futures.
[42] S. Beard,et al. Existential risk assessment: A reply to Baum , 2020 .
[43] B. Bolwell. Good Judgment , 2020, Oncology Times.
[44] Cfp,et al. What can we learn from COVID-19? , 2020 .
[45] Andrew Critch,et al. AI Research Considerations for Human Existential Safety (ARCHES) , 2020, ArXiv.
[46] Peter Henderson,et al. Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims , 2020, ArXiv.
[47] Owen Cotton-Barratt,et al. Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter , 2020, Global policy.
[48] Vanessa J. Schweizer,et al. Reflections on cross-impact balances, a systematic method constructing global socio-technical scenarios for climate change research , 2020, Climatic Change.
[49] Inioluwa Deborah Raji,et al. Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing , 2020, FAT*.
[50] Stuart Russell. Human Compatible: Artificial Intelligence and the Problem of Control , 2019 .
[51] Mingguo Zhao,et al. Towards artificial general intelligence with hybrid Tianjic chip architecture , 2019, Nature.
[52] J. Guttag,et al. A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle , 2019, EAAMO.
[53] Inioluwa Deborah Raji,et al. Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products , 2019, AIES.
[54] David B. Paradice,et al. Forecasting Transformative AI: An Expert Survey , 2019, ArXiv.
[55] Emily M. Bender,et al. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science , 2018, TACL.
[56] Inioluwa Deborah Raji,et al. Model Cards for Model Reporting , 2018, FAT.
[57] M. Maas,et al. Governing Boring Apocalypses: A new typology of existential vulnerabilities and exposures for existential risk research , 2018, Futures.
[58] Seán Ó hÉigeartaigh,et al. Classifying global catastrophic risks , 2018, Futures.
[59] R. MacDonell-Yilmaz. At the precipice , 2018, Pediatric blood & cancer.
[60] Timnit Gebru,et al. Datasheets for datasets , 2018, Commun. ACM.
[61] Hyrum S. Anderson,et al. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation , 2018, ArXiv.
[62] Timnit Gebru,et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification , 2018, FAT.
[63] Anthony Michael Barrett,et al. Value of Global Catastrophic Risk (GCR) Information: Cost-Effectiveness-Based Approach for GCR Reduction , 2017, Decis. Anal..
[64] John Salvatier,et al. When Will AI Exceed Human Performance? Evidence from AI Experts , 2017, ArXiv.
[65] C. Robert. Superintelligence: Paths, Dangers, Strategies , 2017 .
[66] Doug Miller,et al. Intelligent, automated red team emulation , 2016, ACSAC.
[67] J. Lawler,et al. Viral agents of human disease: biosafety concerns. , 2016 .
[68] Anthony Michael Barrett,et al. A model of pathways to artificial superintelligence catastrophe for risk and decision analysis , 2016, J. Exp. Theor. Artif. Intell..
[69] Adam Tauman Kalai,et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings , 2016, NIPS.
[70] Sangsung Park,et al. A Hybrid Method of Analyzing Patents for Sustainable Technology Management in Humanoid Robot Industry , 2016 .
[71] Oliver Zendel,et al. CV-HAZOP: Introducing Test Data Validation for Computer Vision , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[72] Roman V. Yampolskiy,et al. Taxonomy of Pathways to Dangerous AI , 2015, ArXiv.
[73] Seth D. Baum,et al. Risk Analysis and Risk Management for the Artificial Superintelligence Research and Development Process , 2015 .
[74] John Quigley,et al. Systemic risk elicitation: Using causal maps to engage stakeholders and build a comprehensive view of risks , 2014, Eur. J. Oper. Res..
[75] Stuart Armstrong,et al. The errors, insights and lessons of famous AI predictions – and what they mean for the future , 2014, J. Exp. Theor. Artif. Intell..
[76] M. G. Morgan. Use (and abuse) of expert elicitation in support of decision making for public policy , 2014, Proceedings of the National Academy of Sciences.
[77] Faisal Aqlan,et al. Integrating lean principles and fuzzy bow-tie analysis for risk assessment in chemical industry , 2014 .
[78] Lacey Colligan,et al. Assessing the validity of prospective hazard analysis methods: a comparison of two techniques , 2014, BMC Health Services Research.
[79] Michael G. Mitchell. Taxonomy , 2013, Viruses and the Lung.
[80] Nick Bostrom,et al. Existential Risk Prevention as Global Priority , 2013 .
[81] Lee T. Ostrom,et al. Risk Assessment: Tools, Techniques, and Their Applications , 2012 .
[82] Rick Parente,et al. A case study of long-term Delphi accuracy , 2011 .
[83] Marvin Rausand,et al. Risk Assessment: Theory, Methods, and Applications , 2011 .
[84] Thomas J. Chermack,et al. Scenario Planning in Organizations: How to Create, Use, and Assess Scenarios , 2011 .
[85] Andrew Lakoff,et al. Are we Prepared for the Next Disaster? , 2007 .
[86] J. Bryson,et al. Visible Thinking: Unlocking Causal Mapping for Practical Business Results , 2004 .
[87] Carl L. Pritchard,et al. Risk Management: Concepts and Guidance , 2001 .
[88] J. Reason. Human error: models and management , 2000, BMJ : British Medical Journal.
[89] George Wright,et al. The Delphi technique as a forecasting tool: issues and analysis , 1999 .
[90] J. Crisp,et al. The Delphi method? , 1997, Nursing research.
[91] R. Fildes. Scenarios: The Art of Strategic Conversation , 1996, J. Oper. Res. Soc..
[92] R. Schifter. White House , 1996 .
[93] Alan L. Porter,et al. Cross-impact analysis , 1990 .
[94] John R. Searle,et al. Minds, brains, and programs , 1980, Behavioral and Brain Sciences.
[95] Jeffrey L. Johnson,et al. A ten-year Delphi forecast in the electronics industry , 1976 .
[96] R. Korotev. Method , 1966, Understanding Religion.
[97] M. R. Leadbetter,et al. Hazard Analysis , 2018, System Safety Engineering and Risk Assessment.
[98] Our Principles , 1913, Texas medical journal.
[99] D. Manheim. Building a Culture of Safety for AI: Perspectives and Challenges , 2023, SSRN Electronic Journal.
[100] James Fox,et al. An analysis and evaluation of methods currently used to quantify the likelihood of existential hazards , 2020 .
[101] Jade Leung. Who will govern artificial intelligence? Learning from the history of strategic politics in emerging technologies, 2019.
[102] M. Westerlund. The Emergence of Deepfake Technology: A Review , 2019, Technology Innovation Management Review.
[103] Roland Müller,et al. Fundamentals and Structure of Safety Management Systems in Aviation , 2014 .
[104] Nick Bostrom,et al. Future Progress in Artificial Intelligence: A Survey of Expert Opinion , 2013, PT-AI.
[105] R. Penrose,et al. How Long Until Human-Level AI ? Results from an Expert Assessment , 2011 .
[106] Tom Ritchey,et al. Modelling Society's Capacity to Manage Extraordinary Events: Developing a Generic Design Basis (GDB) Model for Extraordinary Societal Events Using Computer-Aided Morphological Analysis, 2011.
[107] Ben Goertzel,et al. How long until human-level AI? Results from an expert assessment , 2011 .
[108] N. Bostrom,et al. Global Catastrophic Risks , 2008 .
[109] Hannah Kosow,et al. Methods of Future and Scenario Analysis: Overview, Assessment, and Selection Criteria , 2008 .
[110] Eliezer Yudkowsky. Artificial Intelligence as a Positive and Negative Factor in Global Risk , 2006 .
[111] 知秋 (Zhiqiu). Microsoft: Microsoft's "About-Face", 2006.
[112] Martin Davies,et al. Safety First - Scenario Analysis under Basel II , 2006 .
[113] Steve Lewis,et al. Lessons Learned from Real World Application of the Bow-tie Method , 2005 .
[114] Richard A. Posner,et al. Catastrophe: Risk and Response , 2004 .
[115] Martin J. Rees,et al. Our final hour : a scientist's warning : how terror, error, and environmental disaster threaten humankind's future in this century-- on earth and beyond , 2003 .
[116] N. Bostrom. Existential risks: analyzing human extinction scenarios and related hazards , 2002 .
[117] U.S. EPA. Guidelines for Ecological Risk Assessment, 1998.
[118] Theodore Jay Gordon,et al. Cross-Impact Method, 1994.
[119] Richard Wilson. Risk analysis , 1986, Nature.
[120] J. Voelkel. Guide to Quality Control , 1982 .
[121] Heidy Khlaaf. Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems , 2022 .