chapter 1. Why Do We Need Something Different? This book presents a new approach to building safer systems that departs in impor- tant ways from traditional safety engineering. While the traditional approaches worked well for the simpler systems of the past for which they were devised, signifi- cant changes have occurred in the types of systems we are attempting to build today and the context in which they are being built. These changes are stretching the limits of safety engineering.. point one. •Fast pace of technological change. Although learning from past accidents is still an important part of safety engineering, lessons learned over centuries about designing to prevent accidents may be lost or become ineffective when older technologies are replaced with new ones. Technology is changing much faster than our engineering techniques are responding to these changes. New technology introduces unknowns into our systems and creates new paths to losses.. point two. •Reduced ability to learn from experience. At the same time that the develop- ment of new technology has sprinted forward, the time to market for new products has greatly decreased, and strong pressures exist to decrease this time even further. The average time to translate a basic technical discovery into a commercial product in the early part of this century was thirty years. Today our technologies get to market in two to three years and may be obsolete in five. We no longer have the luxury of carefully testing systems and designs to understand all the potential behaviors and risks before commercial or scientific use.. point three. •Changing nature of accidents. As our technology and society change, so do the causes of accidents. System engineering and system safety engineering techniques have not kept up with the rapid pace of technological innovation. Digital technology, in particular, has created a quiet revolution in most fields of engineering. Many of the approaches to prevent accidents that worked on electromechanical components—such as replication of components to protect against individual component failure—are ineffective in controlling accidents that arise from the use of digital systems and software.. point four. •New types of hazards. Advances in science and societal changes have created new hazards. For example, the public is increasingly being exposed to new man- made chemicals or toxins in our food and our environment. Large numbers of people may be harmed by unknown side effects of pharmaceutical products. Misuse or overuse of antibiotics has given rise to resistant microbes. The most common safety engineering strategies have limited impact on many of these new hazards. point five. •Increasing complexity and coupling. Complexity comes in many forms, most of which are increasing in the systems we are building. Examples include interactive complexity (related to interaction among system components), dynamic complexity (related to changes over time), decompositional complex- ity (where the structural decomposition is not consistent with the functional decomposition), and nonlinear complexity (where cause and effect are not related in a direct or obvious way). The operation of some systems is so complex that it defies the understanding of all but a few experts, and some- times even they have incomplete information about the system’s potential behavior. The problem is that we are attempting to build systems that are beyond our ability to intellectually manage; increased complexity of all types makes it difficult for the designers to consider all the potential system states or for operators to handle all normal and abnormal situations and distur- bances safely and effectively. In fact, complexity can be defined as intellectual unmanageability. This situation is not new. Throughout history, inventions and new technology have often gotten ahead of their scientific underpinnings and engineering knowledge, but the result has always been increased risk and accidents until science and engineering caught up.1 We are now in the position of having to catch up with our technological advances by greatly increasing the power of current approaches to controlling risk and creating new improved risk management strategies. Footnote to point five. As an example, consider the introduction of high-pressure steam engines in the first half of the nine- teenth century, which transformed industry and transportation but resulted in frequent and disastrous explosions. While engineers quickly amassed scientific information about thermodynamics, the action of steam in the cylinder, the strength of materials in the engine, and many other aspects of steam engine operation, there was little scientific understanding about the buildup of steam pressure in the boiler, the effect of corrosion and decay, and the causes of boiler explosions. High-pressure steam had made the current boiler design obsolete by producing excessive strain on the boilers and exposing weaknesses in the materials and construction. Attempts to add technological safety devices were unsuccessful because engineers did not fully understand what went on in steam boilers. It was not until well after the middle of the century that the dynamics of steam generation was understood. point six. •Decreasing tolerance for single accidents. The losses stemming from acci- dents are increasing with the cost and potential destructiveness of the systems we build. New scientific and technological discoveries have not only created new or increased hazards (such as radiation exposure and chemical pollution) but have also provided the means to harm increasing numbers of people as the scale of our systems increases and to impact future generations through envi- ronmental pollution and genetic damage. Financial losses and lost potential for scientific advances are also increasing in an age where, for example, a space- craft may take ten years and up to a billion dollars to build, but only a few minutes to lose. Financial system meltdowns can affect the world’s economy in our increasingly connected and interdependent global economy. Learning from accidents or major losses (the fly-fix-fly approach to safety) needs to be supplemented with increasing emphasis on preventing the first one. point seven •Difficulty in selecting priorities and making tradeoffs. At the same time that potential losses from single accidents are increasing, companies are coping with aggressive and competitive environments in which cost and productivity play a major role in short-term decision making. Government agencies must cope with budget limitations in an age of increasingly expensive technology. Pres- sures are great to take shortcuts and to place higher priority on cost and sched- ule risks than on safety. Decision makers need the information required to make these tough decisions. point eight. •More complex relationships between humans and automation. Humans are increasingly sharing control of systems with automation and moving into positions of higher-level decision making with automation implementing the decisions. These changes are leading to new types of human error—such as various types of mode confusion—and a new distribution of human errors, for example, increasing errors of omission versus commission. Inade- quate communication between humans and machines is becoming an increas- ingly important factor in accidents. Current approaches to safety engineering are unable to deal with these new types of errors. All human behavior is influenced by the context in which it occurs, and operators in high-tech systems are often at the mercy of the design of the auto- mation they use or the social and organizational environment in which they work. Many recent accidents that have been blamed on operator error could more accurately be labeled as resulting from flaws in the environment in which they operate. New approaches to reducing accidents through improved design of the workplace and of automation are long overdue. point nine. •Changing regulatory and public views of safety. In today’s complex and interrelated societal structure, responsibility for safety is shifting from the individual to government. Individuals no longer have the ability to control the risks around them and are demanding that government assume greater respon- sibility for ensuring public safety through laws and various forms of oversight and regulation as companies struggle to balance safety risks with pressure to satisfy time-to-market and budgetary pressures. Ways to design more effective regulatory strategies without impeding economic goals are needed. The alter- native is for individuals and groups to turn to the courts for protection, which has many potential downsides, such as stifling innovation through fear of law- suits as well as unnecessarily increasing costs and decreasing access to products and services. end of points. Incremental improvements in traditional safety engineering approaches over time have not resulted in significant improvement in our ability to engineer safer systems. A paradigm change is needed in the way we engineer and operate the types of systems and hazards we are dealing with today. This book shows how systems theory and systems thinking can be used to extend our understanding of accident causation and provide more powerful (and surprisingly less costly) new accident analysis and prevention techniques. It also allows a broader definition of safety and accidents that go beyond human death and injury and includes all types of major losses including equipment, mission, financial, and information. Part I of this book presents the foundation for the new approach. The first step is to question the current assumptions and oversimplifications about the cause of accidents that no longer fit today’s systems (if they ever did) and create new assump- tions to guide future progress. The new, more realistic assumptions are used to create goals to reach for and criteria against which new approaches can be judged. Finally, the scientific and engineering foundations for a new approach are outlined. Part II presents a new, more inclusive model of causality, followed by part III, which describes how to take advantage of the expanded accident causality model to better manage safety in the twenty-first century.