The day AT&T’s lines went dead
The Y2K bug was the long-awaited disaster that didn’t happen; the AT&T crash 10 years earlier was the software disaster everyone thought couldn’t happen. Ma Bell had one of the world’s largest and most famously reliable networks: hurricanes and earthquakes couldn’t shake it, a 1989 U.S. Congressional report on the general unreliability of government software lauded the dependability of AT&T’s, and the company’s ads impugned the glitches that pestered upstart competitors Sprint and MCI. Then, on Jan. 15, 1990, a single switch at one of AT&T’s 114 switching centers suffered a minor mechanical malfunction, momentarily shutting down that center. When it came back up, it sent out a signal that made other centers trip and reset, and send out similar signals in turn. The centers crashed, writes Leonard Lee in The Day the Phones Stopped, “like a hundred mud wrestlers crowded into a too-small arena,” each pulling himself up by pulling down the others. American Airlines estimated it lost 200,000 reservation calls, and CBS couldn’t even reach its local bureaus to check on the story. The culprit proved to be a single line of faulty code in a complex software upgrade recently implemented to speed up calling. AT&T’s much-touted backup switching system carried the same fault and suffered the same crash. “The condition spread,” AT&T chairman Robert Allen confessed afterward, “because of our own redundancy.” The company did not keep that redundancy sufficiently insulated from the main system; it could have retained the old software in its backup system until it had thoroughly road-tested the new. But just maybe, the company’s programmers had come to believe their own good press.