Emerging Technology from the arXiv

How Google Cracked House Number Identification in Street View

Google can identify and transcribe all the views it has of street numbers in France in less than an hour, thanks to a neural network that’s just as good as human operators. Now its engineers reveal how they developed it.

  • January 6, 2014

Google Street View has become an essential part of the online mapping experience. It allows users to drop down to street level to see the local area in photographic detail.

But it is also a useful resource for Google itself. The company uses the images to read house numbers and match them to their geolocation, which pins down the physical position of each building in its database.

That’s particularly useful in places where street numbers are otherwise unavailable, or in places such as Japan and South Korea, where buildings are rarely numbered sequentially along a street but in other ways, such as the order in which they were constructed. That system makes many buildings extremely hard to find, even for locals.

But spotting and identifying these numbers is hugely time-consuming. Google’s Street View cameras have recorded hundreds of millions of panoramic images that together contain tens of millions of house numbers. Searching these images manually is not a job anybody would approach with relish.

So, naturally, Google has solved the problem by automating it. And today, Ian Goodfellow and pals at the company reveal how they’ve done it. Their method relies on a neural network with 11 hidden layers of neurons, trained to spot numbers in images.

To start off with, Goodfellow and co place some limits on the task at hand to keep it as simple as possible. For example, they assume that the building number has already been spotted and the image cropped so that the number is at least one-third the width of the resulting frame. They also assume that the number is no more than five digits long, a reasonable assumption in most parts of the world.
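The two simplifying assumptions above can be written down as a quick check. This is a minimal sketch only: the function name `meets_assumptions` and its pixel-width parameters are hypothetical, chosen for illustration, and are not taken from the paper.

```python
MAX_DIGITS = 5  # the article's stated upper bound on number length


def meets_assumptions(number_text: str, number_width_px: int, crop_width_px: int) -> bool:
    """Return True if a cropped image satisfies both stated assumptions.

    number_text     -- the house number as a string of digits (hypothetical input)
    number_width_px -- width of the number inside the crop, in pixels
    crop_width_px   -- width of the whole cropped frame, in pixels
    """
    if number_width_px <= 0 or crop_width_px <= 0:
        return False
    # Assumption 1: the number is no more than five digits long.
    short_enough = number_text.isdigit() and len(number_text) <= MAX_DIGITS
    # Assumption 2: the number spans at least one-third of the frame width.
    # Integer arithmetic avoids floating-point edge cases at exactly 1/3.
    wide_enough = number_width_px * 3 >= crop_width_px
    return short_enough and wide_enough
```

For instance, a three-digit number 40 pixels wide passes in a 120-pixel crop but fails in a 121-pixel one.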

But the team does not divide the number into single digits, as many other groups have done. Their approach is to localize the entire number within the cropped image and to identify it in one go—all with a single neural network. 
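One way a single network can output a whole number, rather than one digit at a time, is to predict the number’s length alongside a digit for each of up to five positions. The article does not spell out the output format, so the layout below (one length distribution plus five per-position digit distributions) and the names `decode_number`, `length_probs` and `digit_probs` are assumptions made for illustration, not Google’s code.

```python
from typing import Sequence

MAX_DIGITS = 5  # the article's stated upper bound on number length


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def decode_number(
    length_probs: Sequence[float],
    digit_probs: Sequence[Sequence[float]],
) -> str:
    """Turn hypothetical network outputs into a house-number string.

    length_probs -- MAX_DIGITS + 1 probabilities, for lengths 0..5
    digit_probs  -- MAX_DIGITS rows of 10 probabilities, one row per position
    """
    if len(length_probs) != MAX_DIGITS + 1:
        raise ValueError("expected one probability per length 0..5")
    if len(digit_probs) != MAX_DIGITS or any(len(row) != 10 for row in digit_probs):
        raise ValueError("expected five rows of ten digit probabilities")
    # Pick the most likely length, then the most likely digit at each
    # position up to that length; positions beyond it are ignored.
    n = argmax(length_probs)
    return "".join(str(argmax(digit_probs[i])) for i in range(n))
```

The point of the design is that the whole number comes out of one forward pass, with no separate step that cuts the image into individual digits.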

They train this net using images drawn from the Street View House Numbers data set, a publicly available collection of some 200,000 numbers photographed by Google’s Street View cameras. The training takes about six days to complete, they say.

Goodfellow and co say there is no point in using an automated system that cannot match or beat the performance of human operators, who can generally spot numbers accurately 98 percent of the time. So this is the team’s goal.

However, that doesn’t mean spotting 98 percent of the numbers in 100 percent of the images. Instead, Goodfellow and co say it is acceptable to spot 98 percent of the numbers in a certain subset of images, which in this case turn out to cover around 95 percent of the total.
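The trade-off between accuracy and the share of images covered can be sketched as follows. The article does not say how the subset is chosen; the sketch assumes each prediction comes with a confidence score and that the least confident ones are set aside first. The function name `coverage_at_accuracy` and the input format are hypothetical.

```python
from typing import Iterable, Tuple


def coverage_at_accuracy(
    predictions: Iterable[Tuple[float, bool]],
    target: float = 0.98,
) -> float:
    """Largest fraction of images that can be kept while staying at or
    above the target accuracy, keeping the most confident ones first.

    predictions -- (confidence, was_correct) pairs, one per image
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    best = 0.0
    correct = 0
    for kept, (_, was_correct) in enumerate(ranked, start=1):
        correct += int(was_correct)
        # Accuracy over the `kept` most confident predictions so far.
        if correct / kept >= target:
            best = kept / len(ranked)
    return best
```

With a 98 percent target, a result of 0.95 would correspond to the situation the article describes: the system handles about 95 percent of images at human-level accuracy and leaves the rest for other treatment.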

But even this is significantly better than any other team has been able to achieve. “Worldwide, we automatically detected and transcribed close to 100 million physical street numbers at [human] operator level accuracy,” they say, describing this as an “unprecedented success.”

And they can do it at considerable speed. “We can transcribe all the views we have of street numbers in France in less than an hour using our Google infrastructure,” they say. Yep, that’s less than one hour.

One interesting question is whether the same technique might help extract other numbers such as telephone numbers on business signs or even number plates.

However, Goodfellow and co are not optimistic. They say the success of their technique rests heavily on the assumption that street numbers are never more than five digits long. “For large [numbers of digits] our method is unlikely to scale well,” they say.

And of course, the system is not yet perfect. That 2 percent of misidentified numbers is still a thorn in the team’s side.

But in the meantime, Google can rest assured that it has made a significant step forward in character extraction and recognition: the localization and identification of numbers by a single neural network.

The big question, of course, is what’s next. And Goodfellow and co oblige with a hint: “This approach of using a single neural network as an entire end-to-end system could be applicable to other problems such as general text transcription or speech recognition.”

So there you have it!

Ref: arxiv.org/abs/1312.6082 : Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
