Understanding and Managing Complexity Risk

In the past, companies have tried to manage risks by focusing on potential threats outside the organization: competitors, shifts in the strategic landscape, natural disasters or geopolitical events. They are generally less adept at detecting internal vulnerabilities that creep into organizations and other human-designed systems. Indeed, as companies increase the complexity of their systems (products, processes, technologies, organizational structures, legal contracts and so on), they often fail to pay sufficient attention to the introduction and proliferation of loopholes and flaws. Ericsson, Barings Bank and Comair are but a few examples of companies that have suffered disastrous breakdowns in their complex internal systems. A crucial point to remember is that the probability of random failure rises with the number of combinations of things that can go wrong, and so does the opportunity for acts of malicious intent. Build new applications on top of legacy systems, and errors creep in between the lines of code. Merge two companies, and weaknesses sprout between the organizational boundaries. Build Byzantine corporate structures and processes, and obscure pockets are created where bad behavior can hide. Furthermore, the enormous complexity of large systems like communications networks means that even tiny glitches can cascade into catastrophic events. In fact, catastrophic events are almost guaranteed to occur in many complex systems, much as big earthquakes are bound to happen. So, without the benefit of perfect foresight, how can businesses uncover and forestall the fatal flaws lurking within their organizations? There are three complementary strategies: (1) assess the risk to make better-informed decisions, such as purchasing an insurance policy to cover the risk; (2) spot vulnerabilities and fix them before catastrophic events occur; and (3) design out weaknesses through resilience.
These ideas have been around for years, but researchers have recently had to reinvent them in the context of extremely complex, interconnected, cascade-prone systems.
