Shannon’s distaste for the limelight bordered on reclusiveness. According to Joel West ’79, a professor at San José State University’s College of Business who’s writing a book about the development of information theory, Shannon advised only seven graduate students during his 22 years at MIT. “He was quite shy and retiring, so if you wanted to get him as a supervisor, you really had to be quite aggressive about it,” says Gallager. “I was shy and retiring also, and didn’t have enough self-confidence to even go in and talk to the guy.”
As a teacher, Shannon had little patience for the tedium of the familiar. “He was much more interested in the new than in the old,” says Elwyn Berlekamp ’62, SM ’62, PhD ’64, a professor emeritus of mathematics at the University of California, Berkeley, who (with Gallager) was a coauthor on Shannon’s final published paper.
“He did not teach a whole lot,” says Gallager. “But when he taught, it was like giving research talks. I remember once he gave a course, which was about 25 lectures during the term, and every lecture was a new research result. He would do them one after another and never failed to come up with something interesting. It was a really fantastic period.”
“Shannon was, in my opinion, a little bit out of place in academia,” says James L. Massey, SM ’60, PhD ’62, an information theorist and professor emeritus at ETH Zurich. “His real genre was to be an independent researcher and do things in his own highly individualistic way.”
It may be, too, that Shannon was simply uncomfortable with adulation. Berlekamp recalls when the IEEE Information Theory Society invited Shannon to deliver a lecture and receive its inaugural Shannon Award in Israel in 1973. “I’ve never seen anybody with more butterflies than him,” he says. “Five minutes before the talk’s to start, he’s at the bar, and he’s pretty depressed. He’s really afraid of going on stage and disappointing everybody. Because of course they expect God, which is true, and he knows he can’t perform like God.”
But if Shannon was rarely a direct mentor to young students of information theory, he had set them an irresistible challenge. Random coding would never work in practice: the size of Shannon’s hypothetical codebook doubled with each additional bit in the message. The codebook for a single 1,000-bit packet of data traveling over the Internet would require more entries than there are atoms in the universe. But any more practical coding mechanism, such as repeating the original message or adding extra bits that described message bits, was the equivalent of some random coding scheme, in that it would generate the same code words. And by demonstrating that the vast majority of random coding schemes were capacity-approaching, Shannon offered hope that one of the practical ones was as well.
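The arithmetic behind that codebook claim is easy to check. The short Python sketch below is illustrative only: the function name is made up for this example, and the figure of 10^80 atoms is a commonly quoted order-of-magnitude estimate, not a number taken from Shannon’s work.

```python
def codebook_entries(message_bits: int) -> int:
    """One codebook entry per possible message: 2 ** message_bits."""
    return 2 ** message_bits


# Assumption: a commonly quoted rough estimate of atoms in the observable universe.
ATOMS_IN_UNIVERSE_ESTIMATE = 10 ** 80

# Each additional message bit doubles the codebook.
assert codebook_entries(11) == 2 * codebook_entries(10)

entries_1000 = codebook_entries(1000)
print("decimal digits in 2**1000:", len(str(entries_1000)))
print("exceeds atom estimate:", entries_1000 > ATOMS_IN_UNIVERSE_ESTIMATE)
```

Python’s integers have arbitrary precision, so the exact value of 2^1000 can be computed directly; it runs to roughly 300 decimal digits, against about 80 for the atom estimate.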
Instead of using a codebook to match code words and messages, a practical coding scheme would provide a way to extract the message from the code words computationally. A series of mathematical operations could, with a high probability of accuracy, identify and correct errors in a possibly corrupted bit sequence received over a noisy channel.
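To make the idea of decoding by computation concrete, here is a minimal Python sketch of one classic scheme, the Hamming(7,4) code. The choice of this code is an assumption made for illustration; the text above does not single out any particular code. It adds three parity bits to every four message bits and can repair one flipped bit per seven-bit block, which is far from capacity-approaching, but it shows how a few XOR operations can stand in for a codebook lookup.

```python
def hamming74_encode(data):
    """Encode 4 data bits as 7 bits: positions 1, 2 and 4 hold parity bits."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(received):
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(received)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three checks, read as a binary number, give the 1-based position
    # of the flipped bit; 0 means every check passed.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


message = [1, 0, 1, 1]
codeword = hamming74_encode(message)
corrupted = list(codeword)
corrupted[4] ^= 1  # simulate channel noise flipping one bit
print("recovered:", hamming74_decode(corrupted) == message)
```

The decoder never consults a table of code words: it recomputes the parity checks and uses the pattern of failures to locate the error. If two bits in the same block are flipped, this particular code decodes to the wrong message.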
It’s one of the peculiarities of error-correcting codes that a good encoding algorithm doesn’t necessarily imply a good decoding algorithm. Using statistical analyses similar to Shannon’s, coding theorists were able to show that a given code was capacity-approaching, meaning that it would maximize the difference between code words. But that didn’t mean they had an efficient way to decode it.