Skip to Content

Virtual Eyes Train Deep Learning Algorithm to Recognize Gaze Direction

Gaze estimation is a classic problem of machine vision, which can now be solved by one computer training another.

Eye contact is one of the most powerful forms of nonverbal communication. If avatars and robots are ever to exploit it, computer scientists will need to better monitor, understand, and reproduce this behavior.

But eye tracking is easier said than done. Perhaps the most promising approach is to train a machine-learning algorithm to recognize gaze direction by studying a large database of images of eyes in which the gaze direction is already known.

The problem here is that large databases of this kind do not exist. And they are hard to create: imagine photographing a person looking in a wide range of directions, using all kinds of different camera angles under many different lighting conditions. And then doing it again for another person with a different eye shape and face and so on. Such a project would be vastly time-consuming and expensive.

Today, Erroll Wood at the University of Cambridge in the U.K. and a few pals say they have solved this problem by creating a huge database of just the kind of images of eyes that a machine learning algorithm requires. That has allowed them to train a machine to recognize gaze direction more accurately than has ever been achieved before.

So how have they done this? Their trick is to create the database entirely artificially. They begin by building a highly detailed virtual model of an eye, an eyelid, and the region around it. They then build this model into various different faces representing people of different ages, skin colors, and eye types and photograph them—virtually.

The photographs can be described by four different variables. These are: camera position, gaze direction, lighting environment, and eye model. To create the database, Wood and co begin with a particular eye model and lighting environment and start with the eyes pointing in a specific direction. They then vary the camera position, taking photographs from a wide range of angles around the head.

Next, they move the eyes to another position and repeat the variations in camera position. And so on.

The result is a database of more than 11,000 images covering 40 degree variations in camera angle and changes in gaze variation over 90 degrees. They chose eye color and environmental lighting conditions randomly for each image.

Finally, Wood and co used the data set to train a deep convolutional neural network to recognize gaze direction. And they tested the resulting algorithm on a set of natural images taken from the wild. “We demonstrated that our method outperforms state-of-the-art methods for cross-data set appearance-based gaze estimation in the wild,” they say.

That’s interesting work. Deep learning techniques are currently taking the word of computer science by storm thanks to two advances. The first is a better understanding of neural networks themselves which has allowed computer scientists to significantly improve them.

The second is the creation of huge annotated data sets that can be used to train these networks. Many of these new data sets have been created using crowd sourcing methods such as Amazon’s Mechanical Turk.

But Wood and co have taken a different approach. Their data set is entirely synthetic, created inside a computer. So it will be interesting to see where else they can apply this synthetic method to create data sets for other types of deep learning.

Ref: : Rendering of Eyes for Eye-Shape Registration and Gaze Estimation

Keep Reading

Most Popular

A Roomba recorded a woman on the toilet. How did screenshots end up on Facebook?

Robot vacuum companies say your images are safe, but a sprawling global supply chain for data from our devices creates risk.

A startup says it’s begun releasing particles into the atmosphere, in an effort to tweak the climate

Make Sunsets is already attempting to earn revenue for geoengineering, a move likely to provoke widespread criticism.

10 Breakthrough Technologies 2023

Every year, we pick the 10 technologies that matter the most right now. We look for advances that will have a big impact on our lives and break down why they matter.

These exclusive satellite images show that Saudi Arabia’s sci-fi megacity is well underway

Weirdly, any recent work on The Line doesn’t show up on Google Maps. But we got the images anyway.

Stay connected

Illustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Thank you for submitting your email!

Explore more newsletters

It looks like something went wrong.

We’re having trouble saving your preferences. Try refreshing this page and updating them one more time. If you continue to get this message, reach out to us at with a list of newsletters you’d like to receive.