Emerging Technology

Embrace the collective power of file sharing on the World Wide Web

By Steven Johnson|Thursday, March 31, 2005
Illustration by John Hersey

Chances are, whether you’re aware of it or not, you’ve participated in a spontaneous mass audience on the World Wide Web. Someone somewhere decides to share something that catches your interest: a home video, say, of the Indian Ocean tsunami taking out a beach resort. At first that sort of offering gathers an audience slowly. A few people send a link to friends, but soon the links become part of a positive feedback loop, and before long big media news sites have noticed the file, and the initial cluster of visitors becomes a swarm.

That cycle is a ubiquitous part of the Web’s information ecosystem, and many different terms have been coined to describe it, including tipping point, idea virus, and peer-to-peer marketing. The fundamental principle is that buzz spreads in a distributed, rather than centralized, fashion. Instead of turning to The New York Times or the network news to find out about online happenings, we hear about them from friends, or friends of friends. The Internet is so densely interconnected that a distributed mode of communication can quickly build into a spontaneous mass audience.

There’s a catch. Buzz about a new video clip spreads in a distributed way, but the process of viewing the clip remains defiantly centralized. The millions of people who heard about a tsunami video got word of it from thousands of different sources, but they all descended on a single Web server that hosted it. And when a million people all try to request a file from a server ill-prepared for the traffic, the result is like a thousand people showing up to take a ferry designed to hold a hundred passengers: Either most of the visitors get turned away, or the ferry sinks.

Online cognoscenti call this the Slashdot effect, after the popular technology site known for linking to interesting new online files and instantly swamping the servers that host them. Think of it as the digital-age version of the old Oscar Wilde line “Each man kills the thing he loves.” Slashdot links to a new file because it contains something laudable or interesting, and thanks to that link, the file instantly disappears from view.

In our everyday world, that kind of popularity curse makes sense. Every holiday season some hot toy frustrates parents when demand exceeds supply. But in a world of pure information, the traditional rules of scarcity don’t necessarily apply: When you see signs of scarcity online, it’s a design flaw rather than a natural phenomenon. Ten years after the Web became a mass medium, a few software pioneers are finally fixing that flaw. And they’re doing it by embracing the power of swarms.

For some time now, there has been an ad hoc means of dealing with logjams created by spontaneous mass audiences—mirror sites containing copies of the original file. So when someone spreads news of a hot link, they typically offer a supplementary list of mirror sites in case the original is down. The idea is to manage the swarm by dispersing it.

An even better idea has emerged recently. Instead of creating mirror sites, some Webheads create a so-called torrent when a file is in great demand. That enables others to download the file using BitTorrent, a small but elegant program that actively encourages swarm formation and has a paradoxical effect: The more popular the file, the easier it becomes to download.

A few days after the Indian Ocean tsunami struck in December, a semianonymous user named Camiseta created a compilation of home video recordings of the disaster. The file itself was huge—21 megabytes, hundreds of times larger than a typical Web page. Within a matter of hours, as word got out, demand became overwhelming. But because Camiseta had created a torrent file for the video, it began circulating across the Net effortlessly. Meanwhile, another blogger called PunditGuy published a few large video files the old-fashioned way on his site—and within days was hit with a thousand-dollar bill from his service provider for handling all the traffic.

The secret to BitTorrent is collaboration. Every user trying to download a file also participates in sharing the file with other users. Instead of the traditional client and server relationship in which multiple clients request a file from a single server, every client in a BitTorrent system is also a server. These networks are called peer-to-peer. Older file-sharing systems like Napster also involved peer-to-peer networks, but BitTorrent takes extraordinary measures to ensure collaboration within the swarm. First, each file is divided into smaller pieces that can be freely distributed by peers before being assembled again into the complete file. When I downloaded the tsunami videos, I relied on a swarm of more than 50 peers, each sending me small bits of the larger video. As I downloaded the file, I was distributing the pieces that I had assembled so far to other peers. The more peers in the swarm, the faster the download.
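The piece-by-piece idea can be sketched in a few lines of Python. This is a toy illustration, not BitTorrent's actual code: the tiny piece size, the function names, and the sample data are all made up for the demo, and the SHA-1 check on each piece is included here as an assumption about how a downloader could confirm that a piece from a stranger is intact.

```python
import hashlib
import random

PIECE_SIZE = 4  # bytes; deliberately tiny, a hypothetical value for the demo


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut the file into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def reassemble(received: dict[int, bytes], piece_hashes: list[str]) -> bytes:
    """Check each piece against its expected hash, then join them in index order."""
    for index, expected in enumerate(piece_hashes):
        if hashlib.sha1(received[index]).hexdigest() != expected:
            raise ValueError(f"piece {index} is corrupt")
    return b"".join(received[i] for i in range(len(piece_hashes)))


original = b"tsunami video compilation"
pieces = split_into_pieces(original)
piece_hashes = [hashlib.sha1(p).hexdigest() for p in pieces]

# Simulate pieces arriving out of order, as they would from many different peers.
arrival_order = list(range(len(pieces)))
random.Random(0).shuffle(arrival_order)
received = {index: pieces[index] for index in arrival_order}

rebuilt = reassemble(received, piece_hashes)
print(rebuilt == original)
```

Because every piece carries its own index, the order of arrival does not matter; the file is complete as soon as the last missing piece shows up.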

In a traditional peer-to-peer system, the other users would have to wait until I’d downloaded the entire file before they could start requesting it from me—and if I logged off my computer or quit the application in the middle of their request, too bad for them. With BitTorrent, peers come and go without disturbing the process. If one of the peers logs off before you can get 15 seconds of video near the end of the clip, no problem. There’s usually another peer with that piece handy. To speed downloads, BitTorrent also uses an ingenious, if slightly counterintuitive, principle in selecting which piece of a file to request from another peer. The software scans all the available pieces from the connected peers and selects the scarcest piece in the swarm. If all peers simply downloaded pieces in order, the swarm would quickly accumulate an unnecessarily large supply of initial pieces and a much smaller supply of later ones. BitTorrent’s approach, called rarest first, creates a more even distribution of available pieces. If a given piece is in short supply, peers will start downloading it more often, thereby creating more available copies.
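Rarest first is easy to express in code. The sketch below is a simplified illustration under stated assumptions: the peer names and piece numbers are invented, and ties are broken by lowest piece index only to keep the result predictable, whereas a real client might pick at random among equally rare pieces.

```python
from collections import Counter
from typing import Optional


def pick_rarest_piece(my_pieces: set[int],
                      peer_pieces: dict[str, set[int]]) -> Optional[int]:
    """Return the piece we still need that the fewest connected peers hold."""
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)
    candidates = [p for p in counts if p not in my_pieces]
    if not candidates:
        return None  # nothing useful is on offer right now
    # Fewest copies wins; the piece index is only a tie-breaker for the demo.
    return min(candidates, key=lambda p: (counts[p], p))


# Hypothetical swarm: which pieces each connected peer currently holds.
swarm = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
choice = pick_rarest_piece({0}, swarm)
print(choice)
```

Here pieces 2 and 3 each exist on only one peer, so one of them is requested before piece 1, which two peers already hold.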

BitTorrent’s creator, Bram Cohen, also borrowed a crucial game-theory strategy for persuading peers to engage in unselfish behavior: Reward cooperation and punish cheating. The BitTorrent software is designed to bring cooperative peers together. If you have a track record of uploading information, and not just sucking down data from other peers without returning the favor, the software will automatically connect to other, equally magnanimous users, allowing you to download files faster. Cohen calls this “leech resistance.”
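One way to picture the reward-and-punish rule is as a ranking: each peer keeps a tally of how much data every other peer has sent it and saves its limited upload capacity for the most generous ones. The sketch below is a loose illustration of that idea, not Cohen's actual algorithm; the function name, the number of upload slots, and the byte counts are all hypothetical.

```python
def choose_peers_to_serve(bytes_received_from: dict[str, int],
                          slots: int = 2) -> list[str]:
    """Give our upload slots to the peers that have uploaded the most to us."""
    ranked = sorted(bytes_received_from,
                    key=lambda peer: (-bytes_received_from[peer], peer))
    return ranked[:slots]


# Hypothetical tally of bytes each peer has sent us so far.
history = {"generous": 900, "fair": 400, "leech": 0, "stingy": 50}
served = choose_peers_to_serve(history)
print(served)
```

A peer that only takes never climbs the ranking, so it is left waiting while cooperative peers trade with one another. A practical client would also need some way to give newcomers, who have nothing to offer yet, a first chance to join in.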

As you might expect, BitTorrent has become the technology du jour for all sorts of illicit file-sharing activities, particularly for files that would have been prohibitively large for services like Napster. Entire seasons of current television shows are available, as are many new movies and video games. The Motion Picture Association filed suit late last year against several sites that track BitTorrent files. Much of the public discussion of the software has focused on the potential for intellectual-property abuse. In the long run, however, the real significance of BitTorrent may be the future it suggests for the architecture of the Web. For someone trying to get a large collection of bits out to the largest number of people, swarm collaboration is clearly the best approach. Instead of paying the cost of delivering those bits yourself, you borrow small chunks of bandwidth from each member of your audience. Computer game developers now regularly use torrents to distribute demo versions of their products, which can sometimes be hundreds of megabytes in size.

Ultimately, the entertainment industry might be better off embracing the BitTorrent approach by releasing torrent files that contain a pay-as-you-go copy protection. You’d download the latest Matrix movie, for example, via a swarm of other Matrix fans, but when you launched the assembled file, you’d have to pay a fee to watch it. End users would benefit by having a reliable supply of the latest media available at the click of the mouse, and the entertainment companies would benefit by off-loading all their distribution costs to the swarm.

Some aficionados believe torrents should be extended beyond online happenings and entertainment to the entire Web. For all the rhetoric of decentralization that surrounds the Internet, a limited number of popular Web servers do most of the heavy lifting when it comes to processing individual requests for information. That’s fine if the popular servers are all run by giant corporations. But in a world where spontaneous mass audiences are increasingly commonplace, we need a more collaborative model. Right now, BitTorrent is optimized for large files, but it’s conceivable that a true peer-to-peer system could distribute almost all the information online. The result would be fewer bottlenecks and a more even allocation of bandwidth costs. The small fries could get their data to a spontaneous mass audience, thanks to the swarm’s golden rule: The more you give, the easier it becomes to take.
