Atop an elevated train barreling through downtown, the masked man in the red and blue suit is in trouble.
Not only is he fighting a lunatic scientist who’s trying to kill him with robotic tentacles, but he also needs to save the passengers on the train. It’s all in a day’s work for superhero Peter Parker, also known as Spider-Man – but it means months of work for an elite team of graphics gurus.
This action sequence from the megahit Spider-Man 2 has dazzled millions of moviegoers this summer. Look closely at the villain, Doctor Octopus (played by actor Alfred Molina), and you’ll see him grin maniacally and yell while ambulating along the top and sides of the train. Later in the scene, Spider-Man (Tobey Maguire) loses his mask as he braces himself against the front of the train, bringing it to a screeching halt. His frantic facial expressions appear convincingly natural.
With all the swift cross-cutting between images of superhero and villain, the audience probably does not suspect that the faces and figures appearing on the screen much of the time are not the real thing. Rather, they are digital concoctions created inside a computer at Sony Pictures Imageworks in Culver City, CA.
“We’ve reached a point where we can make every single thing computer-generated. Everything from the shattered glass in the train to all the buildings, tracks, and people,” says Mark Sagar, a graphics supervisor on Spider-Man 2 and Imageworks’ resident expert on digital human faces. He points to the train sequence as it plays on his computer screen. “Watch what the camera does. Here it’s going fast along the train, then underneath the performers, then in front of them, then a wide shot. How could you do that with a real camera?” But the digital faces on the people are the last, crucial piece of this puzzle. In the past, splicing footage of real actors into a digital scene required real cameras, difficult stunt work, and tweaking to get the look of the real and digital images to match; the ability to do computer-generated everything, including human faces, opens a wealth of creative possibilities.
Photorealistic digital faces – ones that can pass for real in still pictures or on the big screen – are among the last frontiers of computer graphics. Until recently, digital faces have looked fake when examined closely and have therefore been relegated to quick cuts and background shots. The problem is that we’re extraordinarily sensitive to how human faces should look; it’s much easier to fool people with a computer-generated T. rex than with a digital human. But advances in rendering skin, lighting digital scenes, and analyzing footage of real actors for reference are now allowing artists and programmers to control the texture and movement of every tiny patch of pixels in a computerized face. The Sony team hopes that audiences will be unable to tell the difference between Tobey Maguire and his digital double – perhaps the first time such verisimilitude has been achieved.
The stakes are huge. Digital effects are a billion-dollar business and growing fast; these days, a typical blockbuster film’s budget can be $150 million, with half of that going to effects companies. Indeed, Spider-Man 2 is just one example of Hollywood’s increasing use of cutting-edge graphics research to create better digital actors, from stunt doubles of Neo and the multitude of Agent Smiths in the Matrix films to Gollum in the Lord of the Rings series. It’s reached the point where the industry jokes about replacing actors with computers (the premise of the 2002 film S1m0ne). And Sony Pictures Imageworks, founded in 1992 and with more than 40 feature films to its credit, is in the vanguard of effects houses that vie for big studios’ business (see “Making Faces”).
But the real benefit of digital actors isn’t replacing live ones: it’s delivering scenes that take viewers to places no real actor, or camera setup, could go. “It’s giving directors more flexibility, and it allows them to do actions they can’t do with real stunt people,” says Scott Stokdyk, the visual-effects supervisor at Imageworks in charge of the Spider-Man series. “In the past, directors and editors have basically massaged the cut around different quick actions and camera angles to convey a story,” Stokdyk says. “Now they don’t have those kinds of limits.” Thus liberated, directors can follow synthetic actors as they swoop around skyscrapers and dodge bullets in sweeping slow motion. What’s more, actors can be digitally aged, or de-aged, without having to spend hours in makeup. Long-deceased movie stars could even be digitally resurrected.
And movies are just the beginning. Techniques for creating digital humans are pushing the broader frontier of computer graphics and interfaces. These efforts could enable strikingly realistic medical training simulations, lifelike avatars for e-mail and Internet chat rooms, and soon, much more compelling characters in games and interactive films. The technology, Sagar says, “is absolutely ready for prime time.”
Mark Sagar has always been torn between art and science. After college, he spent three years traveling the world, sketching portraits for a living. But the tug of technology made him return to graduate school in his native New Zealand to study engineering. “I never thought I’d spend years of my life studying the human face,” he admits, sitting in his office at Imageworks, surrounded by books and papers on visual perception.
Hearing Sagar describe the human face as “a multichannel signaling device” suggests that the science and engineering side of him has won out. Understanding the science behind faces, he says, enables him to make a digital character’s message come through more effectively on the screen. Expressions like downcast eyes, a furrowed brow, or a curled lip signify a person’s emotional state and give clues to his or her intent.
Sagar’s path to Hollywood opened almost by accident. In the mid-1990s, as a graduate student at the University of Auckland and as a postdoctoral fellow at MIT, he developed computer simulations of the human eye and face that could help doctors-in-training learn surgical techniques. His simulations looked so real that a team of dot-com entrepreneurs convinced him to cofound a graphics startup called LifeFX in Newton, MA. Its mission: commercialize software that everyone from filmmakers to Web businesses and e-mail providers could use to produce photorealistic likenesses of people.
Sagar soon became a leading authority on digital faces for entertainment. In 1999, he came to Los Angeles to work on computer-generated face animations for films, including one of actor Jim Carrey. Paul Debevec, a graphics researcher who made his name creating virtual environments and advancing digital lighting techniques, saw Sagar’s films at a conference and was intrigued: he had never seen faux faces that looked so convincing up close. “That was the moment that made me cross the threshold of truly believing that a photoreal computer-graphics face would happen in the next five years,” says Debevec, who is now at the Institute for Creative Technologies at the University of Southern California (see “Hollywood’s Master of Light,” TR March 2004).
The two scientists struck up a collaboration, using Debevec’s lighting techniques to render Sagar’s digital faces – a combination that quickly catapulted them to the forefront of the field. It turns out that if you’re trying to simulate a face, getting the lighting right is a big deal. Unlike previous computer simulations that looked odd in different contexts and had to be adjusted by trial and error, Sagar and Debevec’s faces could be tailored to match the lighting in any scene. That’s because they were built using a rich database of real faces photographed from different angles and illuminated by many different combinations of light. When LifeFX folded in 2002, Imageworks snatched up Sagar specifically for his expertise in faces.
He immediately began working on the first feature-film test of these techniques: Spider-Man 2. The action scenes in the film required detailed and expressive simulations of the faces of well-known actors – a particularly tough problem, says Sagar. Not only are audiences quick to reject ersatz human faces in general, but they are particularly sensitive to faces they recognize; any discrepancy between digital and real could be perceived as fake. To make the simulations work, the researchers needed lots of reference footage of the real actors under different lighting conditions.
So Maguire and Molina each spent a day in Debevec’s lab. Supervised by research programmer Tim Hawkins, they sat in a special apparatus called a “light stage” while four still cameras captured hundreds of images of their heads and faces making a variety of expressions and illuminated by strobes from every possible angle. The actors also had laser scans and plaster casts made of their heads and faces, so that high-resolution digital 3-D models of their likenesses could be built on computers.
At Imageworks, Sagar and his team wrote user-friendly software so that dozens of artists could use the gigabytes of image data without getting bogged down in technical details. To make the train sequence look right, for example, Sagar’s software combined images from Debevec’s setup into composites that matched the real-world lighting on the movie set, then mapped the composites onto 3-D computer models of the actors. To make the faces move, animators manipulated the models frame by frame, using existing pictures and video of the actors as a rough guide. The software calculated lighting changes based on how the face models deformed – and illuminated the digital skin accordingly. The result: synthetic actors who look like Maguire and Molina (intercut with the flesh-and-blood ones) zoom through the air, around skyscrapers, over trains, and underwater, emoting all the while.
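For readers curious how photographs taken under single strobes can be turned into a composite that matches a movie set, the core idea can be sketched in a few lines. Because light adds linearly, a face lit by many sources at once looks like a weighted sum of photographs of that face lit by each source alone. The sketch below is only an illustration of that principle, not Imageworks’ software: the function name `relight`, the list-of-rows image layout, and the sample pixel values and weights are all assumptions made for the example.

```python
# Illustrative sketch only: the function name, data layout, and numbers
# are assumptions for this example, not the production pipeline.

def relight(basis_images, light_weights):
    """Blend per-light photographs into one composite image.

    basis_images:  one image per strobe direction; each image is a list
                   of rows, and each row is a list of pixel intensities.
    light_weights: one non-negative weight per strobe direction, saying
                   how bright the target scene is from that direction.
    """
    if not basis_images:
        raise ValueError("need at least one basis image")
    if len(basis_images) != len(light_weights):
        raise ValueError("need exactly one weight per basis image")

    height = len(basis_images[0])
    width = len(basis_images[0][0])
    composite = [[0.0] * width for _ in range(height)]

    # Light adds linearly, so the composite is a weighted sum of the photos.
    for image, weight in zip(basis_images, light_weights):
        for y in range(height):
            for x in range(width):
                composite[y][x] += weight * image[y][x]
    return composite


# Two tiny 1x2 "photographs": a face lit from the left, then from the right.
lit_from_left = [[0.75, 0.25]]
lit_from_right = [[0.25, 0.5]]

# A hypothetical set lit fully from the left and at half strength from the right.
composite = relight([lit_from_left, lit_from_right], [1.0, 0.5])
print(composite)
```

In practice the weights would come from measurements of the lighting on the set, and there would be hundreds of photographs rather than two, but the blending step keeps the same shape.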
Imageworks is a prime example of how effects houses are integrating new research into their production pipelines more quickly than they did just a few years ago. (While audiences might be wowed by what has shown up at the multiplex lately, the fundamental graphics technology in films didn’t change much in the 1990s.) “Before, there was a very long lag. Something would get developed, and then you’d wait ten years for a software company to commercialize it,” says J. P. Lewis, an expert on graphics and animation at the University of Southern California’s Computer Graphics and Immersive Technology lab. “Now, I think companies are much more aware of research, and they tend to jump on it much more quickly.”
A walk through the darkened hallways of Imageworks this spring finds the team scrambling to put the finishing touches on the more than 800 effects shots for Spider-Man 2. It’s a young, hip crowd sporting fashionable glasses and displaying mementos from the film on their desks – photos, action figures, a cast of Tobey Maguire’s face. On the day he ships his last shot for the film, visual-effects supervisor Stokdyk laments that there isn’t more time. The biggest challenge, he says, was blending Molina’s sometimes real, sometimes digital, face with his “Doc Ock” costume and comic-book-style surroundings. “To match reality,” he sighs, “is almost impossible.”
Put Your Game Face On
Indeed, despite the millions of dollars thrown at the problem, digital human faces still have a ways to go. What remains to be done may seem like incremental steps – making eye movements less robotic, capturing changes in blood flow so cheeks flush, getting skin to wrinkle just the right way during a smile – but they add up. “The last 20 percent could take 80 percent of our time to get right – but we’re definitely in that last 20 percent,” says Darin Grant, director of technology at Digital Domain in Venice, CA, which did character animations for this summer’s I, Robot.
In the end, commercial audiences will decide the value of these digital doubles. “The ultimate test of what we do is how it looks on-screen and how it translates to production,” says Grant. His colleague Brad Parker, a visual-effects supervisor and director at Digital Domain, maintains that digital humans will pay increasing dividends for filmmakers – and for the graphics community. “It’s a big deal,” he says. “It combines everything that’s difficult about computer graphics.”
Why it’s such a hard problem – exactly what our eyes detect as “wrong” in a digital human – isn’t yet well understood. But University of Southern California graphics researchers Lewis and Ulrich Neumann are trying to find out. In recent experiments, their group showed glimpses of real and digital faces to volunteers to see if they could tell the difference. The results were striking – and frustrating. “We spent a year working on these faces, but we couldn’t fool people for a quarter of a second,” Lewis says. He predicts that this work will lead to statistical models of how real human faces behave, which in turn will yield software tools that artists can use to make characters move their eyes just so or change expressions in other subtle ways that could be vital to believability.
Such advances should have a dramatic impact. Says Steve Sullivan, director of research and development at Industrial Light and Magic in San Rafael, CA, “We’ll probably look back in 10 years and think today’s digital doubles look horribly primitive.”
And it won’t only be movies that get a facelift. The same graphical simulation tools that filmmakers are starting to master will also help fuel the next big market for digital faces: video games. Today’s games boast dazzling creatures and scenery, but their human characters are not even close to being photorealistic. It’s just not practical to program in every viewing angle and expression that may arise during the course of a multilevel, interactive game.
That’s where George Borshukov comes in. Borshukov, a computer scientist who designed state-of-the-art digital humans for the Matrix films (all those Smiths in Reloaded and Revolutions are his team’s), is now applying face technology to games. A former technology supervisor at ESC Entertainment in Alameda, CA, Borshukov recently moved to video-game powerhouse Electronic Arts in Redwood City, CA. He says that next-generation gaming hardware will come close to demonstrating techniques for photorealistic faces in real time, but that trade-offs, approximations, and data compression will be needed to make it happen.
The problem is that with games, everything has to happen on the fly. Yet it still takes a few hours to render a single frame of today’s best digital faces. That’s workable if you have months to produce the scenes, as in a movie. In a game or interactive film, however, the particular image called for may not exist until the user orders it up with the flick of a joystick. Making this practical will require software that’s thousands of times faster.
Five years down the road, experts say, a hybrid between a game and a movie could allow viewers/players to design and direct their own films and even put themselves into the action. You might first “cast” the film by scanning photos of real people – you and your friends, for instance – and running software that would create photoreal 3-D models of those people. Then, in real time, you could direct the film’s action via a handheld controller or keyboard – anything from zooming the camera around the characters to making the lead actor run in a certain direction. Interactive entertainment, Borshukov says, “is where the real future is.”
Facing the Future
Back at Imageworks, a storm of activity swirls around Mark Sagar. Artists are in crunch mode for another digital-actor project, this fall’s The Polar Express, based on the popular children’s book. But Sagar, who is not directly involved with that effort, is entranced by what’s farther down the road – a more elegant approach to digital faces based on underlying scientific principles. “I see today’s work as an interim stage where we still have to capture a lot of data,” he says. “Eventually everything will use mathematical models of how things move and how they reflect light.”
Sagar also sees much broader applications of digital humans in medical graphics, cooperative training simulations for rescue workers, and human-computer interfaces that could help users communicate more effectively with both machines and other people. Outside the entertainment industry, large organizations like Microsoft and Honda are pursuing research on advanced graphics and human modeling, including software that could allow you to create realistic virtual characters and digital avatars based on just a photo. Related algorithms could also help computers recognize faces and interpret expressions, either for security purposes or to predict a user’s needs.
“We’re at an interesting age when we’re starting to be able to simulate humans down to the last detail,” says Sagar. There’s a certain irony in his statement. For once digital humans are done right, they’ll be indistinguishable from the real thing; audiences won’t even realize that artists and scientists like Sagar have changed the face of entertainment – and society.
Gregory T. Huang is a TR associate editor.