When I was an undergraduate 20 years ago, I was so excited about computer vision that I chose to implement a cutting-edge paper on recognizing machine parts as my final-year project. Even though those parts were simple silhouettes of basic shapes like triangles and cogs, my project barely worked at all. Computer vision was a long way from being good enough to use in most real applications, and there was no clear path to dramatic improvements.
My college years weren’t entirely focused on algorithms, though. I was a peaceful participant in the protests that turned into the Criminal Justice Act riots in the U.K., I attended plenty of outdoor rave festivals, and I ended up at parties where lots of the attendees were high as kites. Thankfully there’s no record of any of this, apart from a few photos buried in friends’ drawers. Teenagers today won’t be as lucky, thanks to the explosion of digital images and the advances in computer vision that have happened since that final-year project of mine.
I’ve spent my professional career building software that makes sense of images, but each project was highly custom, more art than science. If I wanted to detect red-eye in photos, for example, I’d program in rules about the exact hue I expected and then look for two spots of that color in positions that might be eyes. A couple of years ago, I came across a technique that changed my world completely. Alex Krizhevsky and his team won the prestigious ImageNet image recognition contest with a deep convolutional neural network. Their approach had an error rate of 15 percent. The next best contestant’s approach had an error rate of 26 percent.
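To make the red-eye example concrete, a handcrafted detector of that kind might look something like the sketch below. It is only an illustration under stated assumptions, not code from any real product: the function names, the thresholds, and the choice of a plain nested list of (r, g, b) tuples as the image are all invented here. It checks each pixel against a hard-coded hue rule, groups neighboring red pixels into spots, and keeps pairs of spots that sit roughly side by side.

```python
import colorsys
from itertools import combinations

# Every threshold here is an illustrative guess, tuned by hand in the
# spirit of the rule-based approach; none comes from a real system.
MAX_HUE_FROM_RED = 0.04  # allowed distance from pure red on the hue circle (0-1)
MIN_SATURATION = 0.6
MIN_VALUE = 0.4


def is_red_eye_color(pixel):
    """Return True if an (r, g, b) pixel with 0-255 channels is a strong red."""
    r, g, b = (channel / 255.0 for channel in pixel)
    hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    hue_distance = min(hue, 1.0 - hue)  # hue wraps around at red
    return (hue_distance <= MAX_HUE_FROM_RED
            and saturation >= MIN_SATURATION
            and value >= MIN_VALUE)


def find_red_spots(image):
    """Group 4-connected red pixels and return each group's (x, y) centroid."""
    height, width = len(image), len(image[0])
    seen = set()
    centroids = []
    for y in range(height):
        for x in range(width):
            if (x, y) in seen or not is_red_eye_color(image[y][x]):
                continue
            # Flood fill outward from this pixel to collect one spot.
            stack, spot = [(x, y)], []
            seen.add((x, y))
            while stack:
                cx, cy = stack.pop()
                spot.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and (nx, ny) not in seen
                            and is_red_eye_color(image[ny][nx])):
                        seen.add((nx, ny))
                        stack.append((nx, ny))
            centroids.append((sum(p[0] for p in spot) / len(spot),
                              sum(p[1] for p in spot) / len(spot)))
    return centroids


def find_eye_pairs(image, min_gap=3, max_gap=40, max_tilt=2):
    """Return pairs of red spots positioned roughly side by side, like eyes."""
    pairs = []
    for (x1, y1), (x2, y2) in combinations(find_red_spots(image), 2):
        if min_gap <= abs(x1 - x2) <= max_gap and abs(y1 - y2) <= max_tilt:
            pairs.append(((x1, y1), (x2, y2)))
    return pairs
```

Every constant in a detector like this has to be tuned by hand, and none of it carries over to the next kind of object, which is what made each project so custom.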
More importantly, the same technique turned out to be useful for all sorts of problems that require computers to make sense of images, from guessing what kind of environment a photo was taken in to recognizing faces. Before, it would take months of my time to build a classifier for just one kind of object. Now any competent engineer with a bit of training can do the same thing in days. Image analysis algorithms used to be rare handcrafted Fabergé eggs, but now they’re cheap off-the-shelf components made on a production line.
These advances have huge implications for our privacy, since we now document our lives with so many pictures. Facebook alone already has over 200 billion photos. So far this hasn’t had a massive impact on privacy because there’s been no good way to search and analyze those pictures, but advances in image recognition are changing all that. It’s now possible not only to spot you reliably in photos but also to tell what you’re doing. Creating an algorithm to spot common objects, whether they’re bikes or bongs, is now easy. Imagine all your photos being processed into a data profile for advertisers or law enforcement, showing how much you party, who you’re with, and which demonstrations you attended.
You might think this is science fiction, but the mayor of Peoria managed to justify a raid on the apartment of a critic by citing a Twitter photo appearing to depict cocaine. Police departments across the country monitor YouTube videos that gang members upload of themselves threatening rivals and posing with guns. Right now, this is done manually, but it could be taken much further with easy-to-use object recognition software.
All of us have become used to uploading photos and videos safe in the knowledge that we have privacy through obscurity, but as data-mining images becomes easy, those uploads could come back to haunt us. I’ve been able to move on from my youthful missteps, but it could have been different if all the landlords, potential employers, and bureaucrats I’ve dealt with since then could have summoned photographic evidence of them all at the touch of a button.