A View from Kate Greene
Photosynth for Video and Other TechFest Treats
More highlights from Microsoft’s annual research event in Redmond, WA.
Every year, Microsoft hosts an open house called TechFest to showcase some of its flashier research projects. This year, the event in Redmond, WA, boasted 37 demos, ranging from gesture-based interfaces to augmented reality and better image search. Below is a brief summary of some of the projects showcased on Tuesday that caught my eye.
1. Photosynth for Video
Based on the popularity of Photosynth, it’s not surprising that Microsoft researchers are now trying to extend the technology to video. The regular software seamlessly stitches together pictures taken at a certain location, from different cameras, to create a zoomable, pannable, panoramic image. The new idea is that multiple people (eyewitnesses at a news event or fans at a music concert, for instance) record video of the event on their cell phones and stream it to a central server. Using the phones’ locations and image-recognition algorithms, software organizes and pieces together the mobile streams into a larger scene.
Ayman Kaheel, an engineer in Microsoft’s Cairo lab, demonstrated the software in the convention hall by holding up two cell phones, in camera mode, at different heights and pointed in slightly different directions. On his laptop, the two video feeds were merged in real time to create a larger, more complete video. Impressive stuff.
2. Writing in the Air
One of the problems with video game systems is that it’s hard for a player to enter text, whether to name a game character or to chat with other players on a network. Since some of these systems are played on computers with Web cameras or infrared cameras (like the Wii), researchers at Microsoft’s research center in Beijing reasoned that hand-waving gestures could replace the traditional and clunky text input.
The researchers wrote software that tracks the movement of a colorful object, such as an apple or a ball, in a user’s hand and interprets, based on the path of the object, the character that the user outlines in the air. Hsiao-Wuen Hon, the director of Microsoft Research Asia, says that the system works well for Chinese characters, and should be even better with English ones because there are far fewer of them.
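To make the tracking idea concrete, here is a minimal sketch of how software might follow a brightly colored object from frame to frame and record its path. It is not the researchers' code: the frame format (nested lists of RGB tuples), the function names, and the color threshold are all assumptions chosen for illustration, and the character-recognition step that would interpret the path is left out entirely.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]          # (red, green, blue), 0-255 each
Frame = List[List[Pixel]]             # rows of pixels (assumed format)
Point = Tuple[float, float]           # (x, y) position in the frame


def color_distance_sq(a: Pixel, b: Pixel) -> int:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_centroid(frame: Frame, target: Pixel, max_dist: int = 60) -> Optional[Point]:
    """Average position of all pixels close to the target color.

    Returns None when no pixel is within max_dist of the target,
    i.e. the object is not visible in this frame.
    """
    limit = max_dist ** 2
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if color_distance_sq(pixel, target) <= limit:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def trace_path(frames: List[Frame], target: Pixel) -> List[Point]:
    """Collect the object's position in each frame where it is visible."""
    path = []
    for frame in frames:
        centroid = find_centroid(frame, target)
        if centroid is not None:
            path.append(centroid)
    return path
```

The resulting list of points is the stroke the user drew in the air; a real system would pass it to a handwriting recognizer, which is a separate and harder problem.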
3. A Color Palette for Better Image Search
Today’s image search engines do a decent job, up to a point. Search for a “tiger” and you’ll generally get a collection of orange, white, and black big cats. But, right now, it’s nearly impossible to tweak a search to find, say, a tiger on a white background, or a black-and-white tiger against blue sky.
Xian-Sheng Hua, a researcher at the Beijing center and a 2008 TR35, thinks he’s found a better way to home in on the right image. His search interface provides a color palette to the side of the results, and a gridded square that a user can fill with colors from the palette to winnow the search. For instance, if you’d like to search for a tiger with a blue sky, simply fill a few squares at the top of the grid with blue and search for tiger again. This color-based filtering system eliminates the need to use extra metadata or tags describing the scene. Hon says that such an interface would be relatively simple to integrate into today’s search engines.
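One way to picture this kind of filter is as a comparison between the colors a user paints into the grid and the average color of the matching region in each result. The sketch below is only an illustration under stated assumptions, not Hua's method: the image format (nested lists of RGB tuples), the function names, and the distance threshold are all hypothetical, and it assumes each image is at least as large as the grid in both dimensions.

```python
from typing import Dict, List, Optional, Tuple

Pixel = Tuple[int, int, int]                  # (red, green, blue), 0-255 each
Image = List[List[Pixel]]                     # rows of pixels (assumed format)
Layout = List[List[Optional[Pixel]]]          # square grid; None = cell left blank


def cell_average(image: Image, row: int, col: int, grid: int) -> Tuple[float, ...]:
    """Average RGB color of one grid cell of the image."""
    height, width = len(image), len(image[0])
    y0, y1 = row * height // grid, (row + 1) * height // grid
    x0, x1 = col * width // grid, (col + 1) * width // grid
    pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return tuple(sum(p[i] for p in pixels) / len(pixels) for i in range(3))


def matches_layout(image: Image, layout: Layout, max_dist: float = 80.0) -> bool:
    """True if every painted cell is close in color to the image's cell."""
    grid = len(layout)
    for r, cells in enumerate(layout):
        for c, wanted in enumerate(cells):
            if wanted is None:
                continue  # the user left this cell blank, so anything goes
            avg = cell_average(image, r, c, grid)
            dist = sum((a - b) ** 2 for a, b in zip(avg, wanted)) ** 0.5
            if dist > max_dist:
                return False
    return True


def filter_results(images: Dict[str, Image], layout: Layout) -> List[str]:
    """Names of the images whose color layout matches the user's grid."""
    return [name for name, image in images.items() if matches_layout(image, layout)]
```

Painting only the top row blue, as in the tiger example, would keep results with a blue upper region and discard the rest, without consulting any tags.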
4. Surface Goes 3-D
Surface seems to be the darling of Microsoft Research. The multi-touch tabletop is the high-profile project that took the fast track from the lab to consumers. And now, Andy Wilson, one of the researchers who worked on Surface, is directing his attention to a touch interface in the sky.
At TechFest, Wilson demonstrated a projector-and-infrared-camera system that produces images inside a dome and can recognize gestures made by people’s hands. In the demo, images from Microsoft’s World Wide Telescope, which provides a virtual tour of the night sky, were projected onto the dome; a researcher panned and zoomed across the stars with the wave of a hand and a pinch of the thumb. Wilson believes that his team could build the system inexpensively enough for it to be used in school planetariums, or anywhere people want to interact with large, panoramic projections.