Towards a Conceptual Framework for Resilience Engineering

As systems grow in size and complexity, they pose ever greater challenges for safety and risk management. Today, when a complex system fails and a mishap occurs, the initial tendency is to attribute the failure to human error. Yet research has repeatedly shown that, more often than not, the cause is not human error but organizational factors that create adverse conditions and increase the likelihood of system failure. Resilience engineering is concerned with building systems that can circumvent accidents through anticipation, survive disruptions through recovery, and grow through adaptation. This paper defines resilience from different perspectives, provides a conceptual framework for understanding and analyzing disruptions, and presents principles and heuristics, based on lessons learned, that can be employed to build resilient systems.
