
Humans Generate Most of the World’s Data, but Machines Are Catching Up

The world’s trove of information is already expanding incredibly fast. Now automated applications will quickly enlarge it even further.
January 9, 2013

A global proliferation of devices like the ones many manufacturers are showcasing at this week’s annual Consumer Electronics Show has made the number of bytes in the world balloon. Since 2005, when analysts at market research firm IDC began publishing an annual estimate of all the bytes added to the “digital universe,” defined as “all the information created, replicated, and consumed in a single year,” the number has grown from 130 billion gigabytes to 2.8 trillion gigabytes in 2012. IDC’s latest projection is that by 2020 the number will reach 40 trillion gigabytes.
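To put those figures in perspective, here is a rough back-of-the-envelope sketch of the compound annual growth rates they imply. It assumes the 130 billion gigabyte figure refers to 2005, and the `cagr` helper and variable names are illustrative only; IDC does not describe its estimates this way.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1

# Figures quoted in the article, in gigabytes.
# Assumption: the 130 billion figure is the 2005 estimate.
GB_2005 = 130e9
GB_2012 = 2.8e12
GB_2020 = 40e12  # IDC projection, not a measurement

historical = cagr(GB_2005, GB_2012, 2012 - 2005)
projected = cagr(GB_2012, GB_2020, 2020 - 2012)

print(f"2005-2012: {historical:.0%} per year")
print(f"2012-2020: {projected:.0%} per year")
```

Under those assumptions, the digital universe grew by a bit more than half each year from 2005 to 2012, and the 2020 projection implies somewhat slower annual growth from a much larger base.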

Consumers have accounted for around 70 to 75 percent of that total each year so far—creating and consuming roughly 1.9 trillion gigabytes in 2012. Of the new data created by consumers in 2012, roughly 80 percent came from digital televisions, as shown in the chart below.

Data Created by Consumers in 2012, by Source

Source: IDC

While in absolute terms the amount of data created by consumers will continue to grow quickly, the pool of data generated by things like industrial machines, vehicles, medical devices, sensors, and security cameras is expanding faster. This is shown in the chart below. In 2012, according to IDC, “machine-generated” data represented 30 percent of all data created, up from 24 percent the year before and 16 percent five years earlier.

Digital Information Created or Replicated Annually

Source: IDC


The numbers show us that the world’s supply of what is commonly called “big data” (pools of analyzable and potentially useful digital information) is still relatively small. Much of the data generated by consumers, like that episode of your favorite sitcom saved on your DVR, isn’t very useful for analysis and is eventually deleted. More promising for big-data analysis are the readings from machines monitoring our world, such as surveillance equipment and medical devices. And that pool of data is just getting started.
