Between the publication of Shannon’s paper and the early 1990s, researchers proposed better and better codes and also better and better decoding algorithms. But a practical capacity-approaching code remained elusive. “There used to be a saying among coding theorists,” says Forney, “that almost any code is a good one–except for all the ones we can think of.”
The codes that Gallager presented in his 1960 doctoral thesis were an attempt to preserve some of the randomness of Shannon’s hypothetical system without sacrificing decoding efficiency. Like many earlier codes, Gallager’s used so-called parity bits, each of which indicates whether some other group of bits has an even or odd sum. But earlier codes generated the parity bits in a systematic fashion: the first parity bit might indicate whether the sum of message bits one through three was even; the next parity bit might do the same for message bits two through four, the third for bits three through five, and so on. In Gallager’s codes, by contrast, the correspondence between parity bits and message bits was random: the first parity bit might describe, say, the sum of message bits 4, 27, and 83; the next might do the same for message bits 19, 42, and 65.
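The contrast between the two ways of choosing parity bits can be made concrete with a small Python sketch. It is only an illustration of the description above, not Gallager's actual construction: the function names, the group width of three, and the fixed random seed are all assumptions made for the example.

```python
import random

def parity(bits, positions):
    """Return 0 if the selected bits have an even sum, 1 if the sum is odd."""
    return sum(bits[p] for p in positions) % 2

def sliding_checks(n_message, width=3):
    """Systematic pattern: bits 0-2, then bits 1-3, then bits 2-4, and so on."""
    return [list(range(i, i + width)) for i in range(n_message - width + 1)]

def random_checks(n_message, n_checks, width=3, seed=0):
    """Random pattern: each parity bit covers a randomly chosen group of message bits.
    The seed is fixed only so that sender and receiver would agree on the groups."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_message), width)) for _ in range(n_checks)]

def encode(message, checks):
    """Append one parity bit per group to the message bits."""
    return list(message) + [parity(message, group) for group in checks]

message = [1, 0, 1, 1, 0, 0, 1, 0]
print(encode(message, sliding_checks(len(message))))
print(encode(message, random_checks(len(message), n_checks=6)))
```

In both cases the encoder is the same; only the list of bit groups differs, which is why the randomness costs nothing at the sending end.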
Gallager was able to demonstrate mathematically that for long messages, his “pseudo-random” codes were capacity-approaching. “Except that we knew other things that were capacity-approaching also,” he says. “It was never a question of which codes were good. It was always a question of what kinds of decoding algorithms you could devise.”
That was where Gallager made his breakthrough. His codes used iterative decoding, meaning that the decoder would pass through the data several times, making increasingly refined guesses about the identity of each bit. If, for example, the parity bits described triplets of bits, then reliable information about any two bits might convey information about a third. Gallager’s iterative-decoding algorithm is the one most commonly used today, not only to decode his own codes but, frequently, to decode turbo codes as well. It has also found application in the type of statistical reasoning used in many artificial-intelligence systems.
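The triplet idea is easiest to see in a simplified setting where some bits are missing outright rather than merely doubtful. The Python sketch below is a toy under stated assumptions, not Gallager's algorithm: it assumes every listed triplet of bits must have an even sum, marks unknown bits as `None`, and uses a hypothetical function name, `fill_in`. Each pass fills in any bit that is the only unknown in its triplet, and later passes build on what earlier ones recovered.

```python
def fill_in(received, checks, max_passes=10):
    """Repeatedly recover unknown bits (None) from groups that must have an even sum."""
    bits = list(received)
    passes_used = 0
    for _ in range(max_passes):
        progress = False
        for group in checks:
            unknown = [p for p in group if bits[p] is None]
            if len(unknown) == 1:
                # The known bits in the group determine the missing one.
                known_sum = sum(bits[p] for p in group if bits[p] is not None)
                bits[unknown[0]] = known_sum % 2
                progress = True
        if not progress:
            break
        passes_used += 1
    return bits, passes_used

# Three overlapping triplets, each required to have an even sum (an assumed toy code).
checks = [[0, 1, 2], [2, 3, 4], [4, 5, 0]]
sent = [1, 1, 0, 1, 1, 0]
received = [1, None, None, 1, 1, 0]   # bits 1 and 2 were lost in transit

decoded, passes_used = fill_in(received, checks)
print(decoded, passes_used)
```

On the first pass the first triplet has two unknowns and cannot be used, but the second triplet recovers bit 2. Only on the second pass does the first triplet have enough information to recover bit 1.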
“Iterative techniques involve making a first guess of what a received bit might be and giving it a weight according to how reliable it is,” says Forney. “Then maybe you get more information about it because it’s involved in parity checks with other bits, and so that gives you an improved estimate of its reliability.” Ultimately, Forney says, the guesses should converge toward a consistent interpretation of all the bits in the message.
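A rough Python sketch of the weighted guessing Forney describes follows. It uses the "min-sum" rule, a common simplified approximation of this style of decoding, rather than Gallager's exact algorithm. The conventions are assumptions made for the example: each bit carries a score whose sign is the guess (positive for 0, negative for 1) and whose size is the confidence, every listed group of bits must have an even sum, and the channel scores are made up.

```python
def min_sum_decode(scores, checks, n_iter=10):
    """Refine per-bit scores using groups of bits that must have an even sum.
    scores[i] > 0 means bit i is probably 0; the magnitude is the confidence."""
    # msg[(c, i)] is the latest opinion that check c has sent to bit i.
    msg = {(c, i): 0.0 for c, group in enumerate(checks) for i in group}
    total = list(scores)
    hard = [0 if t >= 0 else 1 for t in total]
    for _ in range(n_iter):
        new_msg = {}
        for c, group in enumerate(checks):
            for i in group:
                # What the other bits in the group believe, leaving out
                # whatever this same check told them on the previous pass.
                others = [total[j] - msg[(c, j)] for j in group if j != i]
                sign = -1 if sum(v < 0 for v in others) % 2 else 1
                # The opinion is only as reliable as the least reliable other bit.
                new_msg[(c, i)] = sign * min(abs(v) for v in others)
        msg = new_msg
        total = list(scores)
        for (c, i), m in msg.items():
            total[i] += m
        hard = [0 if t >= 0 else 1 for t in total]
        if all(sum(hard[p] for p in group) % 2 == 0 for group in checks):
            break  # every parity check is satisfied: a consistent interpretation
    return hard, total

checks = [[0, 1, 2], [2, 3, 4], [4, 5, 0]]
sent = [1, 1, 0, 1, 1, 0]
# Made-up channel scores: bit 1 is weakly, and wrongly, guessed to be 0.
scores = [-2.0, 0.5, 1.5, -2.0, -1.8, 2.2]

first_guess = [0 if s >= 0 else 1 for s in scores]
decoded, refined = min_sum_decode(scores, checks)
print(first_guess, decoded)
```

The weak first guess about bit 1 is overruled by the two confident neighbors it shares a parity check with, while the confident guesses elsewhere are reinforced.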
Although Gallager hadn’t been able to muster the courage to ask Shannon to be his advisor, he says that he did talk to Shannon “three or four times” while writing his thesis. “Except that talking to Claude three or four times was like talking to most people 50 times,” he says. “He was somebody who really caught on to the ideas very, very fast. He was not great at all the technical details. But to see the structure of something, to see why it ought to work, and to see what might make it better–well, he was certainly the smartest person I’ve ever met.”