
The machine vision challenge to better analyze satellite images of Earth

Machine vision has revolutionized many areas of technology, but satellite image analysis isn’t one of them. That may be about to change.

Machine vision technology has revolutionized the way we see the world. Machines now outperform humans on tasks such as facial recognition and many types of object recognition. And this technology is employed in a wide range of applications today, from security systems to self-driving vehicles.

But there are still areas where machine vision techniques have yet to make such a strong impact. One of them is in analyzing satellite images of the Earth.

That’s something of a surprise since satellite images are numerous, relatively consistent in the way they are taken, and crammed full of data of one kind or another. They are ideal for machines to make sense of. And yet most satellite image analysis is done by human experts trained to recognize relatively obvious things such as roads, buildings, and the way land is used.

That looks set to change, thanks to the DeepGlobe Satellite Challenge organized by researchers at Facebook, the satellite imagery company DigitalGlobe, and academic partners at MIT and other universities. For participants in the challenge, the goal is to use machine vision techniques to automate the process of satellite image analysis. The results of the competition are due to be announced next month. 

The DeepGlobe organizers invited entrants to devise ways to automatically identify three types of information in satellite images: road networks, buildings, and land use. So the task is to take an image as input and produce one of the following outputs: a mask showing the road network; an overlaid set of polygons representing buildings; or a color-coded map showing how the land is being used (agriculture, urban life, forestry, and so forth).
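To make that input-to-output mapping concrete, here is a minimal sketch in Python of the land-use task framed as pixel-wise classification. The class names follow the seven land-use categories the challenge defines; the tiny network is a hypothetical stand-in for the far deeper models entrants actually train, not the challenge's reference code.

```python
# Minimal sketch: land-use segmentation as per-pixel classification.
# TinySegmenter is a hypothetical toy model, not a competitive entry.
import torch
import torch.nn as nn

LAND_USE_CLASSES = ["urban", "agriculture", "rangeland",
                    "forest", "water", "barren", "unknown"]

class TinySegmenter(nn.Module):
    """Maps a 3-channel RGB image to a per-pixel score for each class."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, num_classes, kernel_size=1),  # per-pixel logits
        )

    def forward(self, x):
        return self.net(x)

model = TinySegmenter(num_classes=len(LAND_USE_CLASSES))
image = torch.rand(1, 3, 256, 256)      # stand-in for a satellite tile
logits = model(image)                   # shape: (1, 7, 256, 256)
land_use_map = logits.argmax(dim=1)     # one class index per pixel,
print(land_use_map.shape)               # ready to be color-coded
```

The road task has the same shape with a two-class (road versus background) mask as output; the building task instead asks for vectorized polygon outlines.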

For each of these three tasks, researchers created a database of annotated images for entrants to use in training their machine vision systems. Entrants would later be evaluated according to how well their systems performed on a separate test database.

The data sets are comprehensive. The one for road identification includes some 9,000 images with a ground resolution of 50 centimeters, spanning a total area of more than 2,000 square kilometers in Thailand, Indonesia, and India. The images include urban and rural areas with paved and unpaved roads. The training data set also includes a mask for each image showing the road network in that area.

The buildings data set contains over 24,000 images, each showing a 200 meter by 200 meter area of land in Las Vegas, Paris, Khartoum, or Shanghai. More than 300,000 buildings are depicted in the training data set, each one marked by human experts as an overlaid polygon.

The land use data set consists of more than 1,000 RGB (or true-color) images with 50-centimeter resolution, paired with a mask showing land use as determined by human experts. The use designations include urban, agriculture, rangeland, forest, water, barren, and unknown (that is, covered by clouds). 

The DeepGlobe Challenge organizers have developed metrics for scoring the machine-generated road masks, building polygons, and land-use maps against the expert annotations, which they can use to assess each of the entrants. And there are plenty of entrants to assess: some 950 teams have registered to take part. The winners will be announced at a conference in Salt Lake City on June 18.
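To give a flavor of the scoring: one standard way to compare a predicted mask with an expert-drawn one is pixel-wise intersection over union (IoU, also called the Jaccard index), which the challenge paper uses for the road and land-use tasks. The sketch below computes it with NumPy; the binary-mask layout is an assumption made for illustration.

```python
# Sketch of pixel-wise IoU between a predicted mask and a reference mask.
import numpy as np

def pixel_iou(predicted: np.ndarray, reference: np.ndarray) -> float:
    """IoU between two binary masks of the same shape."""
    pred = predicted.astype(bool)
    ref = reference.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    return 1.0 if union == 0 else intersection / union

# Example: compare a predicted road mask against the expert-drawn one.
pred = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
ref  = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
print(f"IoU = {pixel_iou(pred, ref):.2f}")  # 2 pixels overlap, 4 in union: 0.50
```

An IoU of 1.0 means the predicted mask matches the reference exactly; 0.0 means no overlap at all.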

There appears to be plenty of low-hanging fruit here. The major benefits are likely to be for people in remote areas where the road networks have not yet been mapped. One of the sponsors of the challenge is Uber, which may be able to use this type of data to extend its services. Automated satellite-image analysis should also be useful when natural disasters strike and emergency services must reach the affected areas quickly. Additionally, if the data is made widely available at low cost, it could be helpful for climate change research and for urban planning.

And that should be just the beginning. This kind of analysis is a stepping-stone to a more detailed understanding of the world around us. It will be interesting to see how well the participants perform.

Ref: arxiv.org/abs/1805.06561: DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images
