In 2009 singer Susan Boyle’s wildly popular YouTube video of the Les Misérables song “I Dreamed a Dream” racked up 140 million hits in just four days, a digital tsunami of requests. Because the Internet was created more than four decades ago primarily as a point-to-point communications network, few content providers other than Google could have served that storm of requests for the video without crashing.
The Internet was designed for “computers to make phone calls to other computers, and that’s a really inefficient way of distributing content,” Van Jacobson, a former research fellow at Palo Alto Research Center (PARC), said in a 2011 video interview on the company’s Web site. YouTube successfully handled the inundation of requests for Boyle’s video “because they’re a big, distributed content source spread out all over the planet, so they wouldn’t get the kind of traffic concentration that would prevent something from working. Basically, Google’s the only place that can do that.”
The Susan Boyle YouTube phenomenon underscored Jacobson’s contention, shared by many computer scientists worldwide, that the Internet badly needs a makeover: a shift from a network organized around where data is located to one organized around the data itself.
From his arrival at the Xerox subsidiary PARC in 2006 until his departure in October, Jacobson led the organization’s Project CCNx effort to overcome the current Internet architecture’s shortcomings as a media distribution platform. “The goal of content-centric networking is to get out of this phone-call world and instead ask the network for what you want,” says Jacobson, who in the 1980s devised the algorithms that let TCP/IP (Transmission Control Protocol/Internet Protocol), the Internet’s core protocol suite, cope with network congestion. Jacobson is still involved in the Named Data Networking (NDN) Project, an 11-university research effort funded under the National Science Foundation’s Future Internet Architectures program.
Changing times
Since its inception as a system of interconnected computer networks in the late 1960s and early ’70s, the Internet has grown into a global electronic backbone for commerce, entertainment, finance, health care and nearly every other facet of life. In addition, the emergence of cloud computing, social networks and the mobile Web has turned the Internet into a distributed system for posting and accessing information from a variety of devices in ways that existing Internet protocols were not designed to accommodate. The location of a given piece of data, expressed by its IP address, has become less relevant because that data can reside on several different servers (known colloquially as “the cloud”) and be stored in short-term cache memory at various locations throughout a network.
Of course, the Internet still functions quite well despite the changing ways in which we use it, but only because the basic model for enabling communication between source and destination addresses has been heavily modified over time, a team of researchers noted in the July issue of IEEE Communications Magazine.
Focusing the Internet’s routing infrastructure on content rather than addresses would better accommodate today’s speed and security needs. “The interesting thing about allowing routers to use bits in the packets that are not addresses [is] that you can configure a network or network of networks around something other than formal address structures,” says Vint Cerf, the Internet pioneer who co-developed TCP/IP and now serves as Google’s chief Internet evangelist. In this scenario, content-centric identifiers would tell the routers to forward a particular packet in several directions because there are multiple parties on the Net that are interested in seeing that content, he adds.
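A minimal sketch in Python may make Cerf’s point concrete. Here a router keys its forwarding table on content-name prefixes rather than on destination addresses, so a single incoming packet fans out to every interface behind which interest has been registered. The names, interface labels and longest-prefix-match policy below are illustrative assumptions, not the workings of any particular ICN implementation.

```python
from collections import defaultdict

class NameRouter:
    """Toy router that forwards by content name, not destination address."""

    def __init__(self):
        # name prefix -> set of outgoing interfaces ("faces")
        self.interests = defaultdict(set)

    def register_interest(self, prefix, face):
        """Record that a party reachable via `face` wants this prefix."""
        self.interests[prefix].add(face)

    def forward(self, content_name):
        """Longest-prefix match on the name, then return *all* interested
        faces: one packet in can mean several packets out."""
        matches = [p for p in self.interests if content_name.startswith(p)]
        if not matches:
            return []
        best = max(matches, key=len)
        return sorted(self.interests[best])

router = NameRouter()
router.register_interest("/video/boyle", "face-1")
router.register_interest("/video/boyle", "face-7")
print(router.forward("/video/boyle/i-dreamed-a-dream/seg0"))
# ['face-1', 'face-7']: the same packet goes out in several directions
```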
Information-centric
There is no shortage of research underway to improve the Internet’s performance and security mechanisms. Some projects aim to boost network speeds or reroute data to meet the growing demands of bandwidth-hogging multimedia content. Others focus on better protecting Net-connected computers, servers and other devices from malicious software and other digital threats. An emerging area of research that includes Project CCNx is information-centric networking (ICN), which seeks to cover all of these bases, and then some. An information-centric version of the Internet would include several fundamental changes. For starters, data packets would be labeled according to the information they contain rather than by IP address. Ideally, this change would give Internet users more direct control over their personal information, allowing them to restrict access to their data and monitor how and when it is accessed. “Such control is achieved by tying the security of the content to the identification of it,” says Dirk Trossen, a senior researcher at the University of Cambridge Computer Laboratory’s Networks and Operating Systems group.
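To illustrate what tying security to identification could look like, here is a hedged sketch: each data packet carries a name, a payload and a signature computed over both, so any consumer or cache can verify the packet no matter which machine delivered it. The HMAC key stands in for a publisher’s signing key (real ICN proposals such as CCNx use public-key signatures), and every name below is invented.

```python
import hashlib
import hmac

PUBLISHER_KEY = b"demo-signing-key"  # stand-in for a publisher's real key

def make_packet(name, payload):
    """Bind the payload to its name: the signature covers both."""
    tag = hmac.new(PUBLISHER_KEY, name.encode() + payload, hashlib.sha256)
    return {"name": name, "payload": payload, "sig": tag.hexdigest()}

def verify(packet):
    """Check integrity without caring which host served this copy."""
    expected = hmac.new(PUBLISHER_KEY,
                        packet["name"].encode() + packet["payload"],
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, packet["sig"])

pkt = make_packet("/bbc/news/headlines", b"<story bytes>")
print(verify(pkt))   # True: valid wherever the packet was cached
pkt["payload"] = b"<tampered>"
print(verify(pkt))   # False: the name-to-content binding is broken
```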
The ICN model also proposes that users retrieve information from the locations closest to them, a far more efficient process than routing every request across the Internet to the content’s original source. If an Internet user in the U.S. is looking for the latest BBC news, for example, this information is likely to exist in cache memory on computers within the U.S., Trossen explains. Serving that data domestically, rather than routing the same content from computers in the U.K., keeps traffic off a much larger expanse of the network.
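Trossen’s BBC example can be sketched as a chain of caches ordered nearest-first from the reader: the request is answered by the first node holding a copy, and copies are deposited along the way back so the transatlantic hop happens only once. The node names and data structures here are hypothetical.

```python
def fetch(name, caches, origin):
    """`caches` are ordered nearest-first from the consumer."""
    for node in caches:
        if name in node["store"]:
            return node["id"], node["store"][name]   # nearest copy wins
    data = origin[name]                              # fall back to the source
    for node in caches:
        node["store"][name] = data                   # seed caches on the way back
    return "uk-origin", data

us_edge = {"id": "us-edge", "store": {}}
us_core = {"id": "us-core", "store": {}}
uk_origin = {"/bbc/news/latest": b"<headlines>"}

print(fetch("/bbc/news/latest", [us_edge, us_core], uk_origin)[0])  # uk-origin
print(fetch("/bbc/news/latest", [us_edge, us_core], uk_origin)[0])  # us-edge
```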
Hot PURSUIT
Trossen is the lead researcher on an ICN project called Publish Subscribe Internet Technology (PURSUIT), which espouses a variation on the “publish-and-subscribe” model already popular with Internet users who sign up for RSS feeds and e-mail distribution lists, to name a few. In principle, a network built using the PURSUIT model would ensure that users receive only content in which they have explicitly expressed an interest. This specificity would go far toward cutting down on spam and computer viruses as well as speeding up network traffic.
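The following toy sketch, which assumes nothing about PURSUIT’s actual implementation, shows why the model squeezes out unsolicited traffic: a publication is delivered only where a matching subscription exists, so data nobody asked for simply has nowhere to go.

```python
from collections import defaultdict

class PubSubNet:
    """Toy publish-subscribe core: delivery requires prior interest."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # content id -> interested parties

    def subscribe(self, who, content_id):
        self.subscribers[content_id].add(who)

    def publish(self, content_id, payload):
        # Only explicit subscribers receive the payload; anything
        # unsolicited is dropped at the source.
        return {who: payload for who in self.subscribers[content_id]}

net = PubSubNet()
net.subscribe("alice", "/bbc/news")
print(net.publish("/bbc/news", b"<headlines>"))  # {'alice': b'<headlines>'}
print(net.publish("/spam/offer", b"<junk>"))     # {} -- no takers, no traffic
```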
The three-year, $6.7-million PURSUIT project, essentially a continuation of the Publish-Subscribe Internet Routing Paradigm (PSIRP) Project that ran from January 2008 to June 2010, wraps up in February. Based in Europe, it includes eight research organizations across Finland, Germany, Greece and the U.K. After February, Trossen and his colleagues plan to demonstrate a prototype PURSUIT network known as Blackadder to technology companies and other researchers who might be interested in continuing and/or funding this work.
One of these demos is designed to show the PURSUIT network’s resilience in delivering data to subscribers even if part of the network becomes disconnected. A second demo illustrates PURSUIT’s ability to adjust the delivery of streaming video based on a particular country’s constraints on what it considers objectionable content. “Since the various pieces of data are individually identified, they are automatically retrieved from the nearest cache storage where they are available,” Trossen says. “You can then replace any offending content piece with an alternative clip while leaving the rest of the content intact.”
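A small sketch suggests how per-piece naming enables that second demo: because every segment of a stream has its own identifier, a per-country policy can remap a single offending segment to an alternate clip while the surrounding segments are still pulled from the nearest caches. All names and the policy table here are invented for illustration.

```python
# Each segment of the stream is individually named.
stream = [f"/movie/clip{i}" for i in range(5)]

# Hypothetical per-country policy: swap out one offending piece only.
substitutions = {"/movie/clip3": "/movie/clip3-alternate"}

def resolve(segment_name):
    """Return the segment to fetch, applying any local substitution."""
    return substitutions.get(segment_name, segment_name)

print([resolve(s) for s in stream])
# ['/movie/clip0', '/movie/clip1', '/movie/clip2',
#  '/movie/clip3-alternate', '/movie/clip4']
```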
Trossen and his team want to expand the Blackadder prototype beyond its current 40 nodes so that it includes hundreds of devices sending and receiving data. Only at that scale and beyond can the researchers determine how well their publish-and-subscribe model holds up and whether the idea merits further investment.
Whether PURSUIT, PARC’s CCNx and the various other ICN projects under development will operate in conjunction with one another or independently remains to be seen. It is likely, however, that one or more of them will be needed if the Internet is to evolve to meet the ever-increasing demands placed on it.