Turning Augmented Reality into an Open Standard

A team at Georgia Tech hopes to make it easier to create and use augmented reality applications.

Christopher Mims

March 7, 2011

A research team at Georgia Tech hopes to make augmented reality (AR) on smart phones more useful by developing an open standard for it.

**Good Kharma:** The Argon browser in action on an iPhone.

Currently, there is no standard way to create or render AR applications, which overlay information on the live video feed from a phone’s camera. Companies such as Layar help app developers create AR functions, but they use proprietary technologies. That means, among other things, that different AR apps may be unable to talk to each other or share data. The Georgia Tech team hopes that its open standard, an enhancement of existing Web protocols, will yield a common way for every Web browser to store, transmit, and manipulate data for augmented reality services. If it does, you won’t need a separate app for each AR function on your phone; one browser could show them all.

“We’re the only people who are trying to piggyback intimately on top of Web architecture,” says Blair Macintyre, director of the Augmented Environments Lab at Georgia Tech. The standard developed by his group is known as KML/HTML Augmented Reality Mobile Architecture, or KHARMA. It combines the Keyhole Markup Language (KML) used by the Google Earth mapping program with existing HTML and a handful of other protocols invented by Macintyre’s team. The group also has built a reference browser—a sample of the technology in action—called Argon.
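To make the idea of pairing KML with HTML concrete, here is a minimal sketch in Python that builds a plain KML 2.2 placemark whose description carries an HTML snippet. It uses only standard KML elements, not KHARMA's own extensions, which the article does not describe. The function name, the sample text, and the approximate Atlanta coordinates are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of standard KML 2.2 (not a KHARMA-specific namespace).
KML_NS = "http://www.opengis.net/kml/2.2"


def make_placemark_kml(name, html, lon, lat, alt=0.0):
    """Return a KML string with one Placemark whose description holds HTML.

    Hypothetical helper for illustration only; it is not part of Argon or KHARMA.
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    # The HTML is stored as text; ElementTree escapes the markup on output.
    ET.SubElement(placemark, f"{{{KML_NS}}}description").text = html
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML orders coordinates as longitude,latitude,altitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},{alt}"
    return ET.tostring(kml, encoding="unicode")


# Illustrative values only: a marker near Atlanta with a small HTML label.
kml_text = make_placemark_kml(
    "Campus marker", "<p>Hello from an AR overlay</p>", -84.3963, 33.7756
)
print(kml_text)
```

In this sketch the KML says where the content sits in the world and the HTML says what to show there. That is the general division of labor the name KHARMA suggests, though the real architecture may differ.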

KHARMA is an evolution of the Web, rather than a wholesale invention of standards specifically designed for augmented reality. In contrast, almost all previous attempts to create a standard platform for augmented reality either in academia or commercially have been proprietary or purpose-built. This difference, argues Macintyre, could be the key to Argon and KHARMA’s success. Argon is not yet open-source, but once it’s stabilized, Macintyre’s team will release the code base.

“My hope is that three years from now, all we’ve done is folded back into Webkit [the open-source engine that runs the Safari and Chrome browsers],” says Macintyre. “I joked at one of the meetings at one point that we know we’ve succeeded if the project is gone in three years.”

Not everyone is sold on this vision. Avideh Zakhor, a professor of electrical engineering at the University of California, Berkeley, and a leading AR creator, believes it may be premature to work on standards when the technology itself is still in its infancy. “To me, the research problems are still wide open,” she says.

Zakhor’s work focuses on creating truly believable augmented reality experiences. She and her team use image-recognition algorithms to chase the ultimate goal: pixel-perfect imposition of objects, surfaces, images, and textures onto the reality viewed through a smart phone or headset. For now, AR apps do not so much recognize objects as simply know where in the world the phone is, by relying on GPS and the location of cell-phone towers.

Argon is a browser built with expansion in mind, however. Its basic framework should be able to incorporate the kind of image-recognition algorithms that Zakhor is working on, Macintyre says.

One of the potential barriers to adoption of Argon—which has thwarted all augmented reality browsers to date—is the lack of a “killer app” for AR. But just as a Web browser is valuable mainly for its universality—it allows for e-commerce and e-mail and the consumption of media and a hundred other functions—a flexible AR browser could be greater than the sum of its parts.

“One of the goals of Argon is to find out what’s the killer app for AR,” says Hafez Rouzati, the lead developer of Argon and a doctoral student at Georgia Tech. “By releasing something like Argon, we’re allowing people to show us. There’s only so much we can imagine ourselves.”
