Skip to Content

Virtual Eyes Train Deep Learning Algorithm to Recognize Gaze Direction

Gaze estimation is a classic problem of machine vision, which can now be solved by one computer training another.

Eye contact is one of the most powerful forms of nonverbal communication. If avatars and robots are ever to exploit it, computer scientists will need to better monitor, understand, and reproduce this behavior.

But eye tracking is easier said than done. Perhaps the most promising approach is to train a machine-learning algorithm to recognize gaze direction by studying a large database of images of eyes in which the gaze direction is already known.

The problem here is that large databases of this kind do not exist. And they are hard to create: imagine photographing a person looking in a wide range of directions, using all kinds of different camera angles under many different lighting conditions. And then doing it again for another person with a different eye shape and face and so on. Such a project would be vastly time-consuming and expensive.

Today, Erroll Wood at the University of Cambridge in the U.K. and a few pals say they have solved this problem by creating a huge database of just the kind of images of eyes that a machine learning algorithm requires. That has allowed them to train a machine to recognize gaze direction more accurately than has ever been achieved before.

So how have they done this? Their trick is to create the database entirely artificially. They begin by building a highly detailed virtual model of an eye, an eyelid, and the region around it. They then build this model into various different faces representing people of different ages, skin colors, and eye types and photograph them—virtually.

The photographs can be described by four different variables. These are: camera position, gaze direction, lighting environment, and eye model. To create the database, Wood and co begin with a particular eye model and lighting environment and start with the eyes pointing in a specific direction. They then vary the camera position, taking photographs from a wide range of angles around the head.

Next, they move the eyes to another position and repeat the variations in camera position. And so on.

The result is a database of more than 11,000 images covering 40 degree variations in camera angle and changes in gaze variation over 90 degrees. They chose eye color and environmental lighting conditions randomly for each image.

Finally, Wood and co used the data set to train a deep convolutional neural network to recognize gaze direction. And they tested the resulting algorithm on a set of natural images taken from the wild. “We demonstrated that our method outperforms state-of-the-art methods for cross-data set appearance-based gaze estimation in the wild,” they say.

That’s interesting work. Deep learning techniques are currently taking the word of computer science by storm thanks to two advances. The first is a better understanding of neural networks themselves which has allowed computer scientists to significantly improve them.

The second is the creation of huge annotated data sets that can be used to train these networks. Many of these new data sets have been created using crowd sourcing methods such as Amazon’s Mechanical Turk.

But Wood and co have taken a different approach. Their data set is entirely synthetic, created inside a computer. So it will be interesting to see where else they can apply this synthetic method to create data sets for other types of deep learning.

Ref: : Rendering of Eyes for Eye-Shape Registration and Gaze Estimation

Keep Reading

Most Popular

transplant surgery
transplant surgery

The gene-edited pig heart given to a dying patient was infected with a pig virus

The first transplant of a genetically-modified pig heart into a human may have ended prematurely because of a well-known—and avoidable—risk.

open sourcing language models concept
open sourcing language models concept

Meta has built a massive new language AI—and it’s giving it away for free

Facebook’s parent company is inviting researchers to pore over and pick apart the flaws in its version of GPT-3

Muhammad bin Salman funds anti-aging research
Muhammad bin Salman funds anti-aging research

Saudi Arabia plans to spend $1 billion a year discovering treatments to slow aging

The oil kingdom fears that its population is aging at an accelerated rate and hopes to test drugs to reverse the problem. First up might be the diabetes drug metformin.

images created by Google Imagen
images created by Google Imagen

The dark secret behind those cute AI-generated animal images

Google Brain has revealed its own image-making AI, called Imagen. But don't expect to see anything that isn't wholesome.

Stay connected

Illustration by Rose WongIllustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Thank you for submitting your email!

Explore more newsletters

It looks like something went wrong.

We’re having trouble saving your preferences. Try refreshing this page and updating them one more time. If you continue to get this message, reach out to us at with a list of newsletters you’d like to receive.