
Akamai's Algorithms

Tom Leighton has the formula for going from MIT math professor to Internet gazillionaire.

You do the math. Tom Leighton, a professor at MIT’s Laboratory for Computer Science, or LCS, holds nearly 10 million shares in Akamai Technologies, a company he co-founded in August 1998. Last October, Akamai went public, with an initial public offering (IPO) price of $26 a share; by the end of the day, investors had bid the price up to $145 a share. A month later the stock was selling at $327 a share. No matter how much math anxiety you might have, you get the point: Tom Leighton had become a very rich man.

An academic whose expertise is in parallel algorithms and applied mathematics, Leighton is at first glance an unlikely candidate for an Internet tweeds-to-riches success story. But on closer examination, it makes perfect sense. For years, Leighton has been scrutinizing how complex networks operate, and how they can be optimized. So, five years ago, when Tim Berners-Lee (the inventor of the World Wide Web) came down the hall at LCS looking for ways to better manage the escalating traffic flow on the Internet, Leighton and his crew of graduate students were an obvious place to drop in.

During the next several years, Leighton and a mix of MIT graduate students and undergrads tried to figure out a better way to manage and distribute content over the Web. In early 1998, the group, which included grad student Daniel Lewin (who along with Leighton and Jonathan Seelig, a student at MIT’s Sloan School, went on to found Akamai), entered the MIT $50K Entrepreneurship Competition. The team was a finalist but didn’t win. Still, the venture capitalists came knocking. And the rest is Internet history. Today the company runs a worldwide network of more than 4,000 servers that distributes Web content for such customers as Yahoo!, CNN and C-SPAN; if a PC user requests, for example, videostreaming from C-SPAN’s Web site, the Akamai system of servers helps to deliver that content, thereby avoiding bottlenecks at C-SPAN’s centralized site. The distributed network makes content delivery over the Web quicker and more reliable.

Despite hitting the IPO jackpot, the soft-spoken MIT professor (currently on a leave of absence from LCS) displays few overt signs of material success. At Akamai’s new headquarters adjacent to the MIT campus, Leighton, the company’s chief scientist, occupies a modest corner office overseeing a maze of cubicles. It’s very much the office of a professor, and Leighton speaks in the patient and precise words of someone used to explaining how things work. TR Senior Editor David Rotman recently went over for a lesson on managing traffic on today’s Internet.

TR: When did it occur to you that you could use algorithms to optimize content delivery on the Web?
LEIGHTON: The first time I ever thought about the Internet was in 1995. My office [at MIT’s LCS] is down the hall from Tim Berners-Lee and the Web Consortium. Over time we talked about some of the issues facing the Internet. These are the kinds of large-scale networking problems that our group was working on and that I have a long-term interest in. So we took on some of them as research projects.

TR: In a sense, the Internet is really the ultimate networking challenge, isn’t it?
LEIGHTON: Yes. That’s right.

TR: What was the problem that you started with in ‘95?
LEIGHTON: We were looking at ways to deal with flash crowding and hot-spotting. That’s where a lot of people go to one site at one time and swamp the site and bring down the network around it, and make everyone unhappy.

TR: Can you explain the technologies you’ve developed?
LEIGHTON: Today we’re probably one of the world’s largest distributed networks. At a high level, we’re serving content or handling applications for end users, and we’re doing that from servers that are close to the end users. “Close” is something that changes dynamically, based on network conditions, server performance and load. Because we’re close, we can avoid a lot of the hangups, delays and packet loss that you might experience if you’re far away. Before, you typically got your interaction with a central Web site. And typically that was far away. Now you typically have a lot of your interactions (not all, but a lot) with an Akamai server that is near you and is selected in real time.

TR: What are the tricks and challenges to making this distributed system work?
LEIGHTON: It’s an extremely hard area; you can’t go and just throw a bunch of servers out there and have them all work with each other. The servers themselves are going to fail. Processors are going to fail. The Internet has all sorts of its own issues and failure modes. So all these kinds of things have to be built into the algorithmic approach. How do you develop a decentralized algorithm with imperfect information that is still going to work? That’s a huge challenge. But it’s clearly what you have to do. You can’t have any central point of failure or the system will come down. I can’t think of a component or a piece of hardware that hasn’t failed at some point or some place. So, it’s a given [that you need a distributed system].

When a client comes to one of our customers looking for content, we have to figure out where that client is, which of our locations at that moment is the best to serve the client from, and what load conditions are, so we don’t overload anything. We have got to handle flash crowds that are both geographic and content specific. We have got to replicate the content immediately to handle any of those kinds of issues, but you can’t afford to have copies of everything everywhere. You’ve got to make these decisions and respond back to the clients in milliseconds. We’ve got to be automatic. And when pieces fail, you’ve got to compensate automatically for that.
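The server-selection step Leighton describes can be sketched roughly as follows. This is a hypothetical illustration, not Akamai’s actual algorithm: the scoring weights, the load threshold and the data layout are all assumptions made for the sake of the example.

```python
# Hypothetical sketch: score each candidate server by measured latency and
# current load, and route the client to the best one with spare capacity.
# Weights, threshold and field names are illustrative assumptions.

def pick_server(servers, latency_ms, max_load=0.9,
                latency_weight=1.0, load_weight=100.0):
    """Return the server with the lowest combined latency/load score.

    servers    -- list of dicts with 'name' and 'load' (0.0-1.0 utilization)
    latency_ms -- dict mapping server name -> measured round-trip time
    """
    candidates = [s for s in servers if s["load"] < max_load]
    if not candidates:
        raise RuntimeError("no server has spare capacity")
    return min(
        candidates,
        key=lambda s: latency_weight * latency_ms[s["name"]]
                      + load_weight * s["load"],
    )

servers = [
    {"name": "boston", "load": 0.95},   # nearby, but nearly saturated
    {"name": "chicago", "load": 0.40},
    {"name": "seattle", "load": 0.10},
]
rtt = {"boston": 5, "chicago": 30, "seattle": 80}
best = pick_server(servers, rtt)        # boston is excluded by the load cap
```

Note that the nearest server (lowest latency) is not always the winner; the saturated Boston server is skipped, reflecting the point that routing must weigh load conditions as well as proximity.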

TR: That’s what you call fault tolerant?
LEIGHTON: Yes, and you have to be fault tolerant across all aspects. Then there are also the non-obvious things. Like billing. We’re serving billions of hits a day, and we’re billing for every single hit. We’ve got to figure out whose content it was and how many bytes it has, and bill them for it. On top of that, we have a service that we offer our customers, where they can see within 60 seconds how many hits we served for them in the last 60 seconds. In addition, we can break down for our customers where the hits are coming from by country or state. It’s a challenging algorithmic problem. How do you actually do that? And make it work with a finite amount of hardware and resources?
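The real-time reporting Leighton describes, counting hits, bytes and regions per customer in 60-second windows, might be sketched like this. The class, field names and bucketing scheme are illustrative assumptions, not Akamai’s design.

```python
# Illustrative sketch of per-customer, per-minute hit aggregation, the kind
# of rollup that could back a "hits in the last 60 seconds" report.
from collections import defaultdict

class HitCounter:
    def __init__(self):
        # (customer, minute_bucket) -> running totals for that window
        self.buckets = defaultdict(
            lambda: {"hits": 0, "bytes": 0, "by_region": defaultdict(int)})

    def record(self, customer, timestamp, nbytes, region):
        minute = int(timestamp // 60)   # bucket into 60-second windows
        b = self.buckets[(customer, minute)]
        b["hits"] += 1
        b["bytes"] += nbytes
        b["by_region"][region] += 1

    def report(self, customer, now):
        """Totals for a customer in the most recent complete minute."""
        return self.buckets.get((customer, int(now // 60) - 1))

counter = HitCounter()
counter.record("cnn", 120, 1000, "US")
counter.record("cnn", 130, 500, "US")
recent = counter.report("cnn", 190)     # window covering t in [120, 180)
```

In a real deployment each edge server would keep counters like these locally and ship periodic rollups to a central collector, so that billing and reporting scale with the number of servers rather than the number of hits.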

TR: Hardware isn’t really the key to this, is it?
LEIGHTON: It’s not even a major component. I don’t want to belittle our hardware partners, but the key here is the algorithmic and software infrastructure. It’s critical.

TR: What is your competition in offering a distributed network for content delivery?
LEIGHTON: There’s not really much out there. We’re at a time when there are a lot of business plans and a lot of stories. There’s not much in the way of real services available today. Pretty much the only competitor in our space is Digital Island, which recently acquired Sandpiper [Networks]. There are others that have announced [business plans] but are not actively carrying traffic yet. One of the things that distinguishes Akamai is the amount of research and engineering effort that went into designing the system. It’s not just throwing a bunch of boxes out there. There are companies that have tried to do that with no distributed system. The companies that announced services based on that approach two or three years ago are no longer in business. Doing that didn’t work.

TR: What are the upcoming challenges for the technology? Is it to deliver content faster?
LEIGHTON: That’s a component. We’re trying to deliver on the promise of the Internet. There is the idea that a tremendous revolution is happening with regard to the Internet. At the same time, there’s frustration because of the limitations. What we’re trying to do is make the Internet more useful. A component of that is making it faster and more reliable. Another, somewhat related component is enabling the delivery of richer content. If we can make streaming better (and in this case speed is not so much the issue; it’s bandwidth and avoiding packet loss), you’re going to get a much better image on your screen; you’ll do more with it, and more people are going to use it to convey content and information. And that’s invaluable in enriching the power of the Internet.

But not everything is pushing bits. Akamai offers services for capabilities such as Internet conferencing that enable, for example, distance learning. With these services, content providers or enterprise customers can effectively deliver content and interact with small or large audiences on the Web through live audio and video; there are features for sharing presentations, audience polling and moderating messaging.

TR: When you introduce a new function like conferencing, for example, what demands does it place on the network?
LEIGHTON: How are you going to implement it? How are you going to integrate it into this massive distributed platform? How are you going to maintain it for thousands of customers? You have thousands of customers and hundreds of millions of people accessing those customers, and we’re sitting in between. And it all has to work by itself. You can’t be monkeying around. Delivering conferencing sounds simple. But it’s not so simple when you’re talking this kind of scale. When people think about streaming they think of a single source where the content comes from, and then it branches out in a tree through the Internet. Those places can break down and then all those people downstream are out of luck. We’ve developed an entirely new way of going about it so that there’s no critical point of failure. If the source dies, then you’re stuck. But once [the content] is out of the source, we replicate it and spread it throughout the system. So, it’s not a tree.

TR: What does it look like?
LEIGHTON: It’s hard to describe. The way to think about it is that between the source and destination, you have multiple transmissions going on. You can lose content on those paths; you can have packet loss on any or all of them, but at the endpoint you have enough information coming in from those locations that you can reconstruct the signal. So if a path gets killed along the way, nobody’s affected.
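The multipath idea can be illustrated with a toy example. Real systems use proper erasure codes; this sketch uses simple XOR parity across two data chunks, which lets the receiver rebuild the stream after losing any one of three paths.

```python
# Toy illustration of multipath redundancy: send two data chunks plus an
# XOR parity chunk over three paths, so the receiver can reconstruct the
# signal even if any single path dies. Not a production erasure code.

def encode(chunk_a: bytes, chunk_b: bytes):
    """Produce three transmissions: two data chunks and their XOR parity."""
    parity = bytes(a ^ b for a, b in zip(chunk_a, chunk_b))
    return [chunk_a, chunk_b, parity]

def decode(received):
    """Rebuild (chunk_a, chunk_b) from any two of the three transmissions.

    received -- list [a, b, parity]; a lost path is represented as None.
    """
    a, b, p = received
    if a is not None and b is not None:
        return a, b
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, p))   # a = b XOR parity
    else:
        b = bytes(x ^ y for x, y in zip(a, p))   # b = a XOR parity
    return a, b

chunks = encode(b"hell", b"o wo")
chunks[0] = None          # simulate the first path going down
a, b = decode(chunks)     # the receiver still recovers both chunks
```

The design trade-off is the one Leighton hints at: the extra parity traffic costs bandwidth, but it removes every intermediate path as a critical point of failure.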

TR: We’ve all experienced frustrations with videostreaming. In terms of the technology, what will it take to make it more reliable? When will we be able to watch webcasts as easily as TV on a full screen?
LEIGHTON: For videostreaming to be more reliable, you need a content distribution service to deliver the bits reliably to the edge of the network, and then you have to have a reliable last-mile connection to the Internet. If you want high-quality video, then you had better have a high-bandwidth connection to the Internet. It will still be some time before you can get TV-quality videostreams on a widespread basis.

We’ve demonstrated a megabit-per-second live stream. In fact, just recently we carried thousands of one-megabit-per-second streams to live customers accessing a conference keynote address by Steve Jobs [CEO of Apple Computer]. This is a major milestone for the Internet. With that technology you get a very high quality videostream. If the last mile is broadband, then you’re all set to go. One thing we’re working on is bandwidth profiling. The idea is to automatically detect the bandwidth of the last mile. Does the client have a broadband connection, a 28K modem, or narrowband, such as a cell phone? Then we deliver the content as a function of that. So if you detect that the client has high bandwidth, they get the high-bandwidth version: the streamed version as opposed to the static version. Or in the case of narrow bandwidth, you get a printed version as opposed to the graphics.
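At its simplest, the bandwidth profiling Leighton describes reduces to mapping a measured last-mile throughput onto a content variant. The thresholds and variant names below are assumptions for illustration only.

```python
# Hypothetical sketch of bandwidth profiling: pick a content variant based
# on the client's measured downstream bandwidth. Thresholds and variant
# names are illustrative assumptions.

def choose_variant(kbps: float) -> str:
    """Map measured downstream bandwidth (kilobits/sec) to a content version."""
    if kbps >= 1000:        # broadband: serve the full video stream
        return "streaming-video"
    if kbps >= 56:          # dial-up modem: static page with graphics
        return "static-graphics"
    return "text-only"      # narrowband (e.g. a cell phone): printed version
```

A real system would also have to measure the bandwidth itself, for example by timing the delivery of the first few objects on a page, and would cache the result per client so the measurement cost is paid only once.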

TR: The very nature of the Web seems to be changing with such functions as videostreaming and conferencing. What will Akamai be working on in five years? What do you think the Internet will be like then?
LEIGHTON: Things move so fast, it’s really hard to predict. People who try to predict end up eating their words. I think we’re just at the beginning of the Internet revolution. I don’t think we’ve even begun to think of all the things that we can be doing on the Internet. I can’t tell you what will be the hot service five years from now. I don’t know. I would hope by then that, for example, the quality of streaming is much better. That it’s part of daily life. At the least, I would expect the typical Web experience to become richer, more efficient and more reliable than it is today.

TR: You are seen by many as a model of an academic making it big as an entrepreneur in the new economy. What do you tell those looking to emulate your success?
LEIGHTON: I never had an aspiration to be an entrepreneur. I love academics and co-founded Akamai because we felt it was the best way to transfer our technology from a research environment into practice. It felt really nice to be taking technology, especially technology out of a university, and making a difference with it. That’s probably the biggest reward. It often takes 10 to 20 years for a technology in a university to really manifest itself in practice. And this time we’re able to decrease that time dramatically. I’m perfectly happy writing a paper that only five people read. Pretty smart people will read it, and I get a kick out of that. It’s what I spent all my life doing. But this is something with a chance to make a difference.

TR: Do you ever miss the days when, as you put it, you spent your time writing papers that maybe five people were able to read and understand?
LEIGHTON: Yes, although I don’t have much time to think about it.
