AI Is Taking Over the Cloud

Cloud storage company Box is using Google’s vision technology to make its service considerably smarter.
August 17, 2017
Box CEO Aaron Levie.

The cloud is getting smarter by the minute. In fact, it will soon know more about the photos you’ve uploaded than you do.

Cloud storage company Box announced today that it is adding computer-vision technology from Google to its platform. Users will be able to search through photos, images, and other documents by their visual content, rather than by file name or tag. “As more and more data goes into the cloud, we’re seeing they need more powerful ways to organize and understand their content,” says CEO Aaron Levie.

Computer-vision technology has improved remarkably over the past few years thanks to a machine-learning approach known as deep learning (see “10 Breakthrough Technologies 2013: Deep Learning”). A deep neural network—loosely inspired by the way neurons process and store information—can learn to recognize categories of objects, such as a “red sweater” or a “pickup truck.” Ongoing research, including work from Google’s researchers, is improving the ability of algorithms to describe what’s happening in images.

Box’s computer-vision feature could be a good way for companies to dip their toes into AI and machine learning. It removes the need to manually annotate thousands of images, and it will make it possible to search through older files in ways that might not have occurred to anyone during tagging. Levie says one company testing the technology is using it to search images for particular people.
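
To make the idea of searching by visual content concrete, here is a minimal sketch in Python. It is an illustration only, not Box's or Google's implementation: it assumes a vision service has already returned a list of text labels for each file, and the file names, labels, and function names (`build_label_index`, `search`) are all hypothetical.

```python
from collections import defaultdict

def build_label_index(file_labels):
    """Map each lowercase label to the set of file names that carry it."""
    index = defaultdict(set)
    for file_name, labels in file_labels.items():
        for label in labels:
            index[label.lower()].add(file_name)
    return index

def search(index, query):
    """Return the files whose labels include every comma-separated term."""
    terms = [t.strip().lower() for t in query.split(",") if t.strip()]
    if not terms:
        return set()
    # Start with the first term's matches, then intersect with the rest.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical labels, standing in for the output of a vision service.
FILE_LABELS = {
    "IMG_0001.jpg": ["pickup truck", "road"],
    "IMG_0002.jpg": ["red sweater", "person"],
    "scan_17.png": ["person", "pickup truck"],
}
INDEX = build_label_index(FILE_LABELS)

print(search(INDEX, "pickup truck"))
print(search(INDEX, "person, pickup truck"))
```

Because the labels come from the images themselves, a file such as `scan_17.png` can be found under “pickup truck” even though nobody tagged it and its name says nothing about its contents.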

The announcement is the latest sign that cloud computing is being reinvented through machine learning and artificial intelligence. AI is already the weapon of choice in the battle to dominate cloud computing, with companies that offer on-demand computing—Google, Amazon, and Microsoft among them—all increasingly touting added machine-learning features.

Fei-Fei Li, chief scientist of Google Cloud and a professor at Stanford University who specializes in computer vision and machine learning, said in a statement that the announcement shows how broadly available AI technology is becoming. “Ultimately it will democratize AI for more people and businesses,” Li said.

Levie says his company is looking at adding machine learning for other types of content. This could include audio and video, but also text, for which an algorithm could add semantic analysis, making it possible to search by the meaning of a document rather than specific keywords.

It’s also significant that Box is relying on computer vision from Google, rather than technology developed in-house. This reflects the fact that a few big players have come to dominate the more fundamental aspects of AI like computer vision, voice recognition, and natural-language processing. “If you think about the strength that Google has in image recognition, it would just be strategically unwise for us to try to compete with them,” Levie says. He says his company’s researchers are exploring ways of applying machine learning to the behavior of its customers. This process might reveal ways to optimize the Box service, or help identify tasks that could be ripe for automation, Levie says.

Google’s Cloud Vision API can recognize many thousands of everyday objects in images. However, some customers might need the ability to recognize and search through specific types of images, for example medical or architectural images. So Box’s researchers are exploring ways for customers to train their own vision systems if necessary.
