Hello,

We noticed you're browsing in private or incognito mode.

To continue reading this article, please exit incognito mode or log in.

Not an Insider? Subscribe now for unlimited access to online articles.

How to Speed Up Movie Downloads

Researchers have designed a new way to get the most out of peer-to-peer file-sharing networks, decreasing the time it takes to download movies and music.

Let’s face it: peer-to-peer file transfers on the Internet are slow. More than half of all downloads fail, and the average transfer time for a 100-megabyte file is more than 24 hours. But now, a team of computer scientists led by Himabindu Pucha at Purdue University, in Indiana, say that they can double the speed of these transfers by taking advantage of overlap in data chunks contained within nonidentical multimedia files posted on peer-to-peer distribution networks. This would improve the likelihood of success of these transfers.

Speedy downloads: A new file-sharing approach can decrease the time it takes to download movies and music on peer-to-peer networks.

Peer-to-peer distribution networks such as BitTorrent and Kazaa allow people to download individual files from others’ computers. These systems first locate the copies of the requested file in the network’s global lookup table using its “hash”–a unique identifier computed from the file’s data sequence. Then, the file is divided into chunks so that each user’s computer only has to upload a small piece of it. This technique speeds up file transfers because home users typically have greater bandwidth allocated to downloads compared with uploads. Of course, the overall speed of the transfer will depend on the number of file sources and how much spare upload capacity they have. The more popular a file is, the faster it is to download and the greater the chance of success.

Computer scientist David Andersen, a professor of computer science at Carnegie Mellon University, worked with the Purdue group to develop a way to increase the size of the pool of uploaders called similarity-enhanced transfer (SET). The approach takes advantage of multiple variants of the same music files, video clips, and software, which are often floating around file-distribution networks. “We hope that SET gives you access to a larger pool of people to download from,” says Andersen. “And by doing so, we think you’re more likely to find one of these people who have more spare capacity.”

Before Andersen and his colleagues conducted their study, it was not at all clear how much redundancy existed in file-sharing networks and whether it could be exploited, says Cornell University computer scientist Emin Gün Sirer, who was not involved in the study. The SET team analyzed almost two terabytes of music and video files from file-sharing networks, and it discovered that similar files typically shared anywhere between 20 and 99 percent of their content. With music files, even misspellings in user-defined header labels that identify artist and song titles are enough to throw off BitTorrent, despite the fact that 99 percent of the file is the same. Similarly, multiple versions of the same video are often available with different language tracks.

One challenge in devising a distribution system that can locate similar files is that the system must search not just for each file but also for every chunk within that file. A 700-megabyte video clip may be divided into 40,000 chunks, which means that the system must make several billion comparisons. SET is a hybrid system that first locates users with identical files before searching for requested chunks in file variants. SET’s innovation in the latter task is what the researchers call handprinting, which efficiently identifies similar files using a constant number of search queries regardless of the file size. SET divides the requested file into 16-kilobyte chunks, which are then distilled into 160-bit-chunk hashes, or fingerprints. These fingerprints are sorted based on their numeric value, and the system selects the first few to form the handprint. Comparing handprints, says Andersen, “gives you a 90 percent chance of discovering a file that is 10 percent or more similar.”

Locating that file with just 10 percent similarity could speed up downloads by 8 percent. For music files with greater than 90 percent similarity, a five-minute download on BitTorrent would take just over two minutes with SET. For a single user, the savings could be even greater if he or she happens to be downloading an unpopular variant of a common file. Andersen proposes a scenario in which a U.S.-based user downloads a German version of a popular movie. Currently, the movie would most likely be transferred from a slower overseas connection. But with SEC, users could take advantage of faster local sources for video and receive only the audio from German peers.

“It’s a very clever scheme for finding the chunks in common,” says Sirer. However, he says that “for the most popular content, [SET] won’t make too much of a difference because there are already plenty of other peers who host that content. But I can imagine that other content which would otherwise be slow to get from a single swarm might actually be easier to download.”

Although the researchers have released the source code for the SET system, they have no plans to build a graphical user interface for it or to deploy it in current file-sharing networks. “The math behind it was complex to analyze,” Andersen says, “but the idea is relatively straightforward, and the implementation won’t be bad.” He says he wouldn’t be surprised if someone deployed the SET system in the next year.

Cut off? Read unlimited articles today.

Become an Insider
Already an Insider? Log in.
Want more award-winning journalism? Subscribe to Insider Plus.
  • Insider Plus {! insider.prices.plus !}*

    {! insider.display.menuOptionsLabel !}

    Everything included in Insider Basic, plus the digital magazine, extensive archive, ad-free web experience, and discounts to partner offerings and MIT Technology Review events.

    See details+

    What's Included

    Unlimited 24/7 access to MIT Technology Review’s website

    The Download: our daily newsletter of what's important in technology and innovation

    Bimonthly print magazine (6 issues per year)

    Bimonthly digital/PDF edition

    Access to the magazine PDF archive—thousands of articles going back to 1899 at your fingertips

    Special interest publications

    Discount to MIT Technology Review events

    Special discounts to select partner offerings

    Ad-free web experience

/3
You've read of three free articles this month. for unlimited online access. You've read of three free articles this month. for unlimited online access. This is your last free article this month. for unlimited online access. You've read all your free articles this month. for unlimited online access. You've read of three free articles this month. for more, or for unlimited online access. for two more free articles, or for unlimited online access.