Resilience Engineering: New directions for measuring and maintaining safety in complex systems. Final Report, November 2008

Executive summary

Resilience Engineering represents a new way of thinking about safety. Whereas established risk management approaches are based on hindsight and emphasise error tabulation and the calculation of failure probabilities, Resilience Engineering looks for ways to enhance the ability of organisations to create processes that are robust yet flexible, to monitor and revise risk models, and to use resources proactively in the face of disruptions or ongoing production and economic pressures. In Resilience Engineering, failures do not stand for a breakdown or malfunctioning of normal system functions; rather, they represent the converse of the adaptations necessary to cope with real-world complexity. Individuals and organisations must always adjust their performance to the current conditions, and because resources and time are finite, it is inevitable that such adjustments are approximate. Success has been ascribed to the ability of groups, individuals, and organisations to anticipate the changing shape of risk before damage occurs; failure is simply the temporary or permanent absence of that ability.

In Resilience Engineering, assuring safety does not mean tighter monitoring of performance, more counting of errors, or reducing violations, since these measures rest on a faulty assumption: that safety should be defined as the absence of something because systems are already safe. The corollary of this wrong assumption is that safety-critical systems need protection from unreliable humans, through more procedures, tighter monitoring, and automation. We are not custodians of already safe systems. These systems always have to meet multiple opposing goals at the same time, and always with limited resources. It's only people who can reconcile these conflicting demands, who can hold together such inherently imperfect systems. People, at all levels of an organization, create safety through practice. So safety is not about the absence of something; it is about the presence of something.

But the presence of what? When we see things go right under difficult circumstances, we've found that it's mostly because of people's adaptive capacity: their ability to recognize, absorb, and adapt to changes and disruptions, some of which may even fall outside of what the system has been trained or designed to do. This is why we call it resilience: the ability to accommodate change, conflict, and disturbance without breaking down and without catastrophic failure. Resilience is not about reducing negatives (incidents, errors, violations). It's about identifying and then enhancing the positive capabilities of people and organizations that allow them to adapt effectively and safely under pressure. Resilience …
