Watching more man-made failures on The History Channel's "Engineering Disasters," I realized lessons learned the hard way by safety, mechanical, and structural engineers and operators can be applied to those practicing digital security. >In 1983, en route from Virginia to Massachusetts, the World War II-era bulk carrier SS Marine Electric sank in high seas. The almost forty year-old ship was ferrying 24,000 tons of coal and 34 Merchant Mariners, none of whom had survival suits to resist the February chill of the Atlantic. All but three died.
The owner of the ship, Marine Transport Lines (MTL), blamed the crew and one of the survivors, Chief Mate Bob Cusick, for the disaster. Investigations of the wreck and a trial revealed the Marine Electric's coal hatch covers were in disrepair, as reported by Cusick prior to the disaster. Apparently the American Bureau of Shipping (ABS), an inspection organization upon which the Coast Guard relied, but funded by ship operators like MTL, had faked reports on the Marine Electric's status. With gaping holes in the coal hatches, the ship's coal containers filled with water in high seas and doomed the crew.
In the wake of the disaster, the Coast Guard recognized that ABS could not be an impartial investigator because ship owners could essentially pay to have their vessels judged seaworthy. Widespread analysis of ship inspections revealed many similar ships and others were unsound, and they were removed from service. Unreliable Coast Guard inspectors were removed. Finally, the Coast Guard created its rescue swimmer team (dramatized by the recent movie "The Guardian") to act as a rapid response unit.
The lessons from the Marine Electric disaster are numerous.
- First, be prepared for incidents and have an incident response team equipped and trained for rapid and effective "rescue."
- Second, be suspicious of reports done by parties with conflicts of interest. Stories abound of vulnerability assessment companies who find all of their clients "above average." To rate them otherwise would be to potentially lose future business.
- Third, understand how to perform forensics to discover root causes of security incidents, and be willing to act decisively if those findings demonstrate problems applicable to other business assets.
Despite the crude state of crash forensics in 1931, the Civil Aeronautics Authority (CAA) determined the plane crashed because its wood wing separated from its steel body during bad weather. TWA, operator of the doomed flight, removed all F-10s from service and burned them. Public pressure forced the CAA, forerunner of today's Federal Aviation Administration, to remove the veil of secrecy applied to its investigation and reporting processes. TWA turned to Donald Douglas for a replacement aircraft, and the very successful DC-3 was born.
The crash of TWA flight 599 provides several sad lessons for digital security.
- First, few seem to care about disasters involving new technologies until a celebrity dies. While no one would like to see such an event occur, it's possible real change of opinion and technology will not happen until a modern Knute Rockne suffers at the hands of a security incident.
- Second, authorities often do not have a real incentive to fix processes and methods until a tragedy like this occurs. Out of this incident came pressure to deploy flight data recorders and more robust aviation organizations.
- Third, real inspection regulations and technological innovation followed the crash, so such momentum may appear after digital wrecks.
Investigators decided to model the entire facility in a computer simulation, then monitor for the highest levels of sunlight over the course of a year. Using this data, 2% of the building's skin was discovered to be causing the reflection problems. The remediation plan, implemented in March 2005, resulted in sanding problematic panels to remove their sheen. The six-week, $60,000 effort fixed the glare.
The lessons from the concert hall involve complexity and unexpected consequences. Architect Frank Geary wanted to push the envelope of architecture with his design. His innovation caused a building that no one, prior to its construction, really understood. Had the system been modeled before being built, it's possible problems could have been avoided. This situation is similar to those involving enterprise network and software architects who design systems that no single person truly understands. Worse, the system may expose services or functionality never expected by its creators. Explicitly taking steps to simulate and test a new design prior to deployment is critical.
Digital security engineers should not ignore the lessons their analog counterparts have to offer. A commitment to learn from the past is the best way to avoid disasters in the future.