Saturday, October 22, 2005

Further Thoughts on Engineering Disasters

My TiVo managed to save a few more episodes of Modern Marvels. You may remember I discussed engineering disasters last month. This episode of the show of the same title took a broader look at the problem. Three experts provided comments that resonated with me.

First, Dr. Roger McCarthy of Exponent, Inc. offered the following story about problems with the Hubble Space Telescope. When Hubble was built on earth, engineers did not sufficiently address issues with the weight of the lens on Earth and deflections caused by gravity. When Hubble was put in orbit, the lens no longer deflected and as a result it was not the proper shape. Engineers on Earth had never tested the lens because they could not figure out a way to do it.

So, they launched and hoped for the best -- only to encounter a disaster that required a $50 million orbital repair mission. Dr. McCarthy's comment was "A single test is worth a thousand expert opinions." This is an example of management by fact instead of management by belief, mentioned previously on this blog.

Second, Dr. Charles Perrow, author of Normal Accidents: Living With High-Risk Technologies, explained the makings of a disaster. Essentially, he said disasters are caused by the unforeseen consequences of multiple, individually non-devastating, failures in complex systems. Most catastrophes could be prevented if any one of the small failures had not occurred. Third, Mary Schiavo commented on the Challenger disaster. She described the well-known problems with operating the Shuttle's rocket O-rings in temperatures below 53 degrees F. The Shuttle had launched at lower temperatures prior to the Challenger explosion, but NASA knew they were risking catastrophe. Ms. Schiavo said NASA engineers begged their managers not to let Challenger launch, seeing that chunks of ice covered the launch pad and Shuttle. They were overruled and disaster occurred.

This struck a chord with me, because a few days earlier I read a new story in Time about how Steve Jobs gets Apple to bring innovative products to market:

Apple CEO Steve Jobs [will] tell you an instructive little story. Call it the Parable of the Concept Car. "Here's what you find at a lot of companies," he says, kicking back in a conference room at Apple's gleaming white Silicon Valley headquarters, which looks something like a cross between an Ivy League university and an iPod. "You know how you see a show car, and it's really cool, and then four years later you see the production car, and it sucks? And you go, What happened? They had it! They had it in the palm of their hands! They grabbed defeat from the jaws of victory!

"What happened was, the designers came up with this really great idea. Then they take it to the engineers, and the engineers go, 'Nah, we can't do that. That's impossible.' And so it gets a lot worse. Then they take it to the manufacturing people, and they go, 'We can't build that!' And it gets a lot worse."

When Jobs took up his present position at Apple in 1997, that's the situation he found. He and Jonathan Ive, head of design, came up with the original iMac, a candy-colored computer merged with a cathode-ray tube that, at the time, looked like nothing anybody had seen outside of a Jetsons cartoon. "Sure enough," Jobs recalls, "when we took it to the engineers, they said, 'Oh.' And they came up with 38 reasons. And I said, 'No, no, we're doing this.' And they said, 'Well, why?' And I said, 'Because I'm the CEO, and I think it can be done.'"


Would Steve Jobs have overruled the NASA engineers and launched Challenger? Who knows.

From what I have learned, disasters are prone to happen in complex, tightly-coupled systems. The only way to try to avoid them is to test and monitor their operation, exercise response, and then implement those plans when catastrophe occurs. Anything less is like launching a defective, untested Hubble and hoping for the best, and then paying through the nose to clean up the mess.

Here are a few footnotes to this post. Dr. McCarthy's company offers security engineering services, including services for information systems. They are described thus: "We have assembled one of the largest private collections of computerized accident and incident data in the world. Our web-based solutions put this information at your disposal, giving you comprehensive risk data quickly and at low cost." Dr. McCarthy was recently elected to the National Academy of Engineering, which has a Computer Science and Telecommunications Board with a Improving Cybersecurity Research in the United States project. My research for this story also led me to the System Safety Society.

No comments: