Can Microsoft’s “Hologram” Maker Become the New Sears Portrait Studio?

The volumetric videos could one day take those awkward family photos to a whole new level.

Rachel Metz

October 25, 2017


A day before I was due at Microsoft’s new Mixed Reality Capture Studio in San Francisco, I got an e-mail about the dress code. I was going to be recorded as a “hologram”—Microsoft’s term for volumetric videos, which you can view through augmented-reality or virtual-reality headsets, or on a flat screen. Apparently, much as is the case on TV, not everything looks good when you’re being digitally preserved in multiple dimensions.

What can’t you wear? Hats, glasses, super-dark clothes, super-white clothes, or any “high-frequency patterns” (houndstooth was mentioned specifically as one to avoid). I showed up in a polka-dotted navy dress and left my glasses and hat in my bag.

I stood in the center of a large, white room and made some silly dance moves while 106 cameras and four microphones recorded me. It took about 10 seconds to get the cringe-worthy result you see on this page.

Chances are most of us aren’t going to be heading down to the studio to immortalize ourselves, or make holiday-themed volumetric family videos, anytime soon. Microsoft wouldn’t say how much they cost to make (rates are shown under nondisclosure agreements and can be negotiated privately with partners), and there aren’t all that many virtual-reality or augmented-reality headsets out there right now with which you could watch them, anyway. (There are a handful of VR headsets and so-called Windows Mixed Reality headsets for viewing VR, but Microsoft’s HoloLens headset is still a $3,000 developer device.)

Still, after working on the technology for seven years and capturing thousands of performances in this manner at its headquarters in Redmond, Washington, Microsoft is trying to make the medium more popular by opening more such studios.

The San Francisco capture studio, which is located within a Microsoft technical event space in the city’s tech-heavy SoMa neighborhood, welcomes outsiders to visit and make videos. Studio general manager Steve Sullivan says the idea is to have all kinds of people and companies come in and record things—ranging from celebrity or circus performances to virtual patients for doctors to train on. Then this content can be viewed in a number of ways: on headsets like HoloLens that mix the digital and real worlds, on totally immersive virtual-reality headsets, or on flat screens.

“It’s a kind of medium where it looks like video when you’re looking from any particular point of view, but you can change the point of view during the performance,” Sullivan says.

I watched a video of two break-dancers, who were making all kinds of moves on a sidewalk. They were captured in one of Microsoft’s studios, while the background was filmed elsewhere, but I couldn’t tell while watching it with a virtual-reality headset: the dancers and background fit together flawlessly, with proper shadows on the ground, and the images looked sharp as I moved around.

In its studio, Microsoft depends on an array of cameras—half regular color models, half infrared—to shoot many different views of the subject. The color cameras are used for image texture, while infrared emitters and cameras are used to help reconstruct 3-D shapes. The footage is used to make a texturized mesh that can then be used by, say, video-game designers to build a game.

Recording with all those cameras at the standard video rate of 30 frames per second requires 10 gigabytes per second, which is whittled down during the production process to 10 megabits per second—the kind of thing you could stream via Wi-Fi and watch on a range of devices. For now, the longest pieces recorded run about three and a half to four minutes, Sullivan says.
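To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 gigabyte = 10^9 bytes), assumes the 106 cameras mentioned earlier contribute equally to the raw stream (the color and infrared cameras may well differ), and uses a four-minute clip as the example length. The function and constant names are illustrative and do not come from Microsoft.

```python
# Back-of-the-envelope check of the data rates quoted in the article.
# Assumptions (not from Microsoft): decimal units, every camera contributes
# an equal share of the raw stream, and a four-minute example clip.

CAMERAS = 106                      # cameras in the San Francisco studio
FPS = 30                           # standard video frame rate
RAW_BYTES_PER_SEC = 10 * 10**9     # 10 gigabytes per second, raw capture
STREAM_BITS_PER_SEC = 10 * 10**6   # 10 megabits per second, after production
CLIP_SECONDS = 4 * 60              # roughly the longest clip recorded so far


def summarize():
    """Return the derived figures as a dictionary."""
    raw_bits_per_sec = RAW_BYTES_PER_SEC * 8
    return {
        # How many times smaller the streamed video is than the raw capture.
        "reduction_factor": raw_bits_per_sec / STREAM_BITS_PER_SEC,
        # Average raw data rate per camera, in megabytes per second.
        "mb_per_camera_per_sec": RAW_BYTES_PER_SEC / CAMERAS / 10**6,
        # Average raw data per camera per frame, in megabytes.
        "mb_per_camera_per_frame": RAW_BYTES_PER_SEC / CAMERAS / FPS / 10**6,
        # Raw footage for the whole clip, in terabytes.
        "raw_clip_tb": RAW_BYTES_PER_SEC * CLIP_SECONDS / 10**12,
        # Finished clip at the streaming rate, in megabytes.
        "stream_clip_mb": STREAM_BITS_PER_SEC * CLIP_SECONDS / 8 / 10**6,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the production process shrinks the data by a factor of about 8,000, and a four-minute performance goes from roughly 2.4 terabytes of raw footage to about 300 megabytes.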

Sullivan believes that in “a time frame of years,” this kind of video capture will be available to the average person. In fact, he told me he has been recording his two kids for several years (when I expressed surprise, he added that they would be coming in this week to take their annual hologram). The results, he says, are more powerful than a photo to look back on.

“If I put on HoloLens and see my now seven-year-old [son], walking around as a four-year-old telling knock-knock jokes, it’s a really visceral, engaging kind of thing,” he says.

Not too surprisingly, though, he says his son is not nearly as fond of viewing this kind of footage.
