
Data Mining 200 Years of Patent Office Records To Reveal The Nature of Invention

The elaborate records kept by the US Patent Office since 1790 are allowing researchers to study the nature of invention and how it has changed in 200 years.

One way to think about invention is as a process that combines technologies to fulfill some human need or purpose. In other words, inventions never come out of nowhere. They always build on earlier advances to create something new.

So, for example, the incandescent light bulb uses electricity, a heated filament, inert gas and a glass bulb; an inkjet printer relies on the ability to position matter with extreme precision and to pump ink in very small droplets; and the laser is based on the ability to make highly reflective optical cavities and so on. All these inventions stand on the shoulders of previous advances.

That’s why many technologists think about invention as a combinatorial process—a walk through the entire space of technological permutations. To find a new invention, simply combine various old technologies in a new way.

At least, that’s the idea. But how to test the extent to which it is true? Today, we find out thanks to the work of Hyejin Youn at the University of Oxford in the UK and a few pals. These guys have studied the nature of invention and say that there is good evidence that it is indeed a combinatorial process, at least in part.

Their work relies on data gathered by the US Patent Office, which uses an elaborate system of technology codes to classify the technologies responsible for an invention’s novelty. Inventions that rely on a single technology have a single code. But those that rely on several technologies are given a combination of codes.

That opens up the possibility of an interesting study, they say. Since the US Patent Office records go back to 1790, it ought to be possible to see how the combination of codes has changed over time. In particular, these records should reveal to what extent invention is the refinement of existing combinations of technologies and to what extent it is the result of new combinations of technologies.

And that’s exactly what these guys have done. “To do this we treat patented inventions as carriers of technologies and avail ourselves of the elaborate system of technology codes used by the US Patent Office to classify the technologies responsible for an invention’s novelty,” say Youn and co.

For each invention, they count the number of technology codes associated with it. That allows them to study how the number of inventions, and the combinations of technologies they rely on, have changed with time.

So, to what extent do inventions rely on completely new combinations of technology codes? If most inventions were entirely new, the percentage should be high.

On the other hand, if most inventions are merely revised and improved versions of existing technologies, then they would depend on previously existing combinations of technologies.

The results give an interesting insight into this question. They suggest that some 40 per cent of new inventions rely on previously existing combinations of technologies while about 60 per cent introduce entirely new combinations of technologies.
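To make the measurement concrete, here is a minimal sketch of how such a split could be computed. It is not the researchers' actual pipeline: the function name `novelty_share`, the toy records and the single-letter codes are all made up for illustration, and it assumes each patent is simply a year plus a set of technology codes.

```python
from typing import Iterable

def novelty_share(patents: Iterable[tuple[int, frozenset[str]]]) -> float:
    """Return the fraction of patents whose code combination is new.

    Patents are processed in chronological order; a combination counts as
    new only the first time that exact set of codes appears.
    """
    seen: set[frozenset[str]] = set()
    new_count = 0
    total = 0
    for _year, codes in sorted(patents, key=lambda p: p[0]):
        total += 1
        if codes not in seen:
            new_count += 1
            seen.add(codes)
    return new_count / total if total else 0.0

# Toy records (hypothetical): (year, set of invented technology codes).
toy_patents = [
    (1900, frozenset({"A"})),
    (1901, frozenset({"A", "B"})),
    (1902, frozenset({"A", "B"})),  # reuses an existing combination
    (1903, frozenset({"B", "C"})),
    (1904, frozenset({"A"})),       # reuses an existing combination
]

share = novelty_share(toy_patents)
print(f"Share of patents with a new combination: {share:.0%}")
```

In this toy data, three of the five patents introduce a combination that has not been seen before, so the share of new combinations is 60 per cent.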

That has important implications. One idea is that new inventions can come about through a random walk through the space of all possible permutations of technologies. But the fact that 40 per cent reuse previously existing combinations suggests that invention is not the result of this kind of random search.

Indeed, Youn and co point out that certain parts of the combinatorial space are excluded for reasons of practicality, thereby ruling out inventions such as exploding prosthetics or espresso-making toothbrushes.

What’s more, certain technology “phenotypes” (particular operating systems, dimensions of roads and so on) limit the types of technologies that can later be useful. And this places another important bound on the types of inventions likely to be useful.

For these and other reasons, the number of inventions is vastly smaller than the almost infinite space allowed by combinations of technologies. “The huge gap between the possible and the actual number of combinations indicates that only a small subset of combinations become inventions,” they say.
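A rough back-of-the-envelope calculation shows how quickly the space of possible combinations grows. The figures below are purely illustrative assumptions, not numbers from the study: a hypothetical pool of 1,000 technology codes, with inventions using at most three of them.

```python
from math import comb

def possible_combinations(n_codes: int, max_size: int) -> int:
    """Count the distinct non-empty code sets of size 1..max_size."""
    return sum(comb(n_codes, k) for k in range(1, max_size + 1))

# Hypothetical figures, chosen only to illustrate the scale.
n_codes = 1_000
max_size = 3

total = possible_combinations(n_codes, max_size)
print(f"{total:,} possible combinations of up to {max_size} codes")
```

Even with these modest assumptions the count runs to well over a hundred million, and it grows rapidly as more codes or larger combinations are allowed.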

There is an interesting comparison here between the way inventions and DNA-based organisms have evolved. Biological evolution is another combinatorial process that relies on only a small number of building blocks, the protein-coding genes, combined together in lots of different ways. That’s not unlike the way inventions rely on a relatively small number of technologies combined in different ways.

What’s more, biological evolution is path-dependent since the success of an adaptation depends on the order in which it follows other changes. And it is a process whose outcome is ultimately determined by selection.

Youn and co say there is more work to be done in studying the link between these combinatorial processes. “Studying patent, comparative and systemic records of inventions, will open a way to make quantitative assessments for a counterpart of these features of biological evolution in technological evolution,” they say.

Perhaps. But either way, the use of big data to study the nature of invention has significant potential. There are surely more jewels to be found in them thar hills.

Ref: Invention as a Combinatorial Process: Evidence from U.S. Patents
