Virtual Extras

Giving each member of a digital crowd its own personality could make animated mob scenes more realistic.

Duncan Graham-Rowe

December 19, 2007

The behavior of computer-generated crowds in movies and video games could soon appear much more realistic, thanks to new software that gives each character a complex personality of its own.

**Places to go, people to see:** Giving animated characters their own personalities could make computer-generated crowd scenes in movies and video games more realistic.

The software has been demonstrated in a simulation of Pennsylvania Station, in New York City, depicting more than 1,000 commuters, law-enforcement officers, entertainers, and tourists going about their business. Each individual demonstrates complex, rational behaviors that collectively create a much more lifelike representation of human activity, says Demetri Terzopoulos, a professor of computer science at the University of California, Los Angeles.

This sort of realism is important in games and motion pictures, says Norm Badler, director of the Center for Human Modeling and Simulation at the University of Pennsylvania. Even though simulated crowds tend to form part of the backdrop to the foreground action, they will stand out if their behavior is unrealistic, he says. “The whole animation should look plausible.”

Until recently, crowd-animation algorithms have typically been based on some form of flocking activity, in which each character moves in a particular way depending upon how its neighbors move. This works fine for depictions of animal behavior, such as the well-known wildebeest stampede in Disney’s The Lion King, says Terzopoulos. But for simulated humans, who should show some cognitive capacity, this sort of movement can appear aimless and random, says Badler.
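To make the flocking idea concrete, here is a minimal sketch in Python. It is not code from any production system: the function name `flocking_step`, the `radius` and `weight` parameters, and the alignment-only rule are all assumptions chosen for illustration. Each character simply nudges its velocity toward the average velocity of nearby characters, which is why the result has no individual purpose behind it.

```python
import math

def flocking_step(positions, velocities, radius=5.0, weight=0.5):
    """One alignment-only flocking update (illustrative assumption).

    Each agent steers toward the average velocity of the neighbors
    found within `radius`, then moves by its new velocity.
    """
    new_velocities = []
    for i, (px, py) in enumerate(positions):
        # Collect velocities of every other agent close enough to matter.
        neighbors = [
            velocities[j]
            for j, (qx, qy) in enumerate(positions)
            if j != i and math.hypot(qx - px, qy - py) <= radius
        ]
        vx, vy = velocities[i]
        if neighbors:
            avg_x = sum(v[0] for v in neighbors) / len(neighbors)
            avg_y = sum(v[1] for v in neighbors) / len(neighbors)
            # Blend the current velocity with the neighborhood average.
            vx += weight * (avg_x - vx)
            vy += weight * (avg_y - vy)
        new_velocities.append((vx, vy))
    new_positions = [
        (px + vx, py + vy)
        for (px, py), (vx, vy) in zip(positions, new_velocities)
    ]
    return new_positions, new_velocities
```

In this sketch an agent with no neighbors keeps its own heading, while agents close together converge on a shared direction over repeated steps.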


The “autonomous pedestrians” designed by Terzopoulos and graduate student Wei Shao, on the other hand, are governed by three different layers of behavior. A motion layer handles basic movement, such as walking, running, standing, and sitting. On top of this sits a reactive layer, which allows the characters to respond to obstacles or other characters they encounter; it also enables them to perform simple behaviors that people normally take for granted, such as walking around a bench in order to sit on it.

But the real complexity comes from the top, cognitive layer. “This is where the agent is able to think ahead about what it’s going to do in the future,” says Terzopoulos. “It’s a comprehensive cognitive model of people from the ground up.”
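The three-layer arrangement can be pictured with a toy sketch in Python. Everything here is a hypothetical simplification, not the researchers’ implementation: the `Pedestrian` class, its method names, and the one-dimensional corridor with integer positions are assumptions made only to show how a cognitive layer picks a target, a reactive layer checks for obstacles, and a motion layer carries out the step.

```python
from dataclasses import dataclass, field

@dataclass
class Pedestrian:
    """Toy agent on a 1-D corridor (hypothetical simplification)."""
    position: int = 0
    goals: list = field(default_factory=list)  # target positions, in order

    def cognitive_layer(self):
        """Think ahead: drop goals already reached, return the next one."""
        while self.goals and self.goals[0] == self.position:
            self.goals.pop(0)
        return self.goals[0] if self.goals else None

    def reactive_layer(self, target, obstacles):
        """Propose a step toward the target, or wait if the way is blocked."""
        if target is None:
            return 0
        step = 1 if target > self.position else -1
        if self.position + step in obstacles:
            return 0
        return step

    def motion_layer(self, step):
        """Carry out the basic movement and report what the body is doing."""
        self.position += step
        return "walking" if step else "standing"

    def tick(self, obstacles=frozenset()):
        """Run one update, from the top layer down to the bottom."""
        target = self.cognitive_layer()
        step = self.reactive_layer(target, obstacles)
        return self.motion_layer(step)
```

Calling `tick` repeatedly moves the agent toward its goal; passing a set of blocked positions makes it stand and wait instead of walking into them.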

For example, a character may be charged with the simple task of catching a train. But it knows that, in order to perform this task, it must carry out a number of subgoals, such as purchasing a ticket and finding the train platform. In fact, even these subgoals can have further subgoals, such as finding the ticket office and choosing the shortest ticket line to stand in.

This is a complex planning problem that can be exacerbated by a character’s failure or success in meeting each of its subgoals, says Terzopoulos. “If you want to catch a train, but there are no tickets left, then you have to replan and maybe buy a ticket for a later train.”
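The goal-and-subgoal structure described above, including the replanning when tickets run out, can be sketched in a few lines of Python. The goal names come from the example in the text, but the `SUBGOALS` table, the `expand` and `plan_trip` functions, and the train times are illustrative assumptions rather than details of the actual software.

```python
# Hypothetical goal hierarchy based on the train-catching example.
SUBGOALS = {
    "catch train": ["buy ticket", "find platform"],
    "buy ticket": ["find ticket office", "choose shortest line"],
}

def expand(goal, subgoals=SUBGOALS):
    """Flatten a goal into an ordered list of primitive actions."""
    if goal not in subgoals:
        return [goal]  # no further subgoals: this is a primitive action
    actions = []
    for sub in subgoals[goal]:
        actions.extend(expand(sub, subgoals))
    return actions

def plan_trip(trains, tickets_left):
    """Pick the earliest train that still has tickets.

    Falling through to a later train stands in for replanning after a
    subgoal (buying a ticket) fails.
    """
    for train in trains:
        if tickets_left.get(train, 0) > 0:
            return train, expand("catch train")
    return None, []
```

For instance, `plan_trip(["9:00", "9:30"], {"9:00": 0, "9:30": 4})` skips the sold-out 9:00 train and returns a plan for the 9:30 one.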

To make characters’ behavior still richer, animators can also give them desires, which might make them stop off to buy a soda from a vending machine or pause to watch some street entertainers. Terzopoulos’s software even manages to capture the way in which two crowds of people, moving through a narrow corridor, naturally form two opposing lanes.

Software developed by Massive Software, based in New Zealand, which was used to animate the huge battle scenes in the Lord of the Rings trilogy, is also capable of giving animated agents cognitive behaviors, says the company’s founder, Stephen Regelous. “As far as I can tell, there’s nothing particularly new about this work,” he says.

But Badler disagrees: “Ultimately, Massive’s software puts considerable burden on the animator or programmer to create the behaviors.” In contrast, he says, Terzopoulos’s autonomous pedestrians can be created with great ease. “You can assign individual goals, or you can assign them randomly,” says Terzopoulos.

Also, according to Badler, Massive’s software is practical for animating only relatively short scenes; after that, the amount of time the animator has to spend on the characters becomes prohibitive.

With the autonomous-pedestrian software, says Terzopoulos, it is possible to animate relatively long scenes, dictating the movement and behavior of 1,400 characters in real time. As demanding as this might seem, he says, it is possible because the same set of mechanisms is used for each character; only the parameters vary. “The biggest computational expense is the simulation of their perception, because they have to look at other objects from their field of view,” he says.

The end result is pretty realistic, says Terzopoulos. You can follow and scrutinize an individual character within the Penn Station animation at close range for periods of up to 20 minutes. And what you’ll find, Terzopoulos says, is behavior quite typical of someone at a train station.

Beyond movies and games, there’s increasing interest in using crowd simulation to help conduct fire and disaster assessments of large public spaces, says Jian Zhang, director of the Computer Animation Research Center at Bournemouth University, in England.

In fact, Terzopoulos has already used his software to help archeologists analyze and learn more about the usage of an ancient building, the Great Temple of Petra, in Jordan. “They overestimated the capacity of the theater,” he says.

Terzopoulos is now working on using simulations to help design smart surveillance networks. The logistical problems of creating huge networks of security cameras, along with privacy concerns, make it difficult for vision researchers to carry out practical experiments in this field, says Terzopoulos. So the current trend is to start using simulated public spaces instead.
