
Googling for Code

Will Google’s new search tool improve software?
October 10, 2006

Last week, Google launched a tool for sifting through the billions of lines of software code available on the Web. The free resource could help programmers design new software projects, test code, and fix bugs, says Tom Stocky, product manager at Google. And, ultimately, it might help them build better products.

An efficient code-searching tool is invaluable for programmers, from computer science students to professionals, Stocky says. At the foundation of all software are lines of instructions that direct the program to perform certain tasks, such as searching through a list or rearranging values. Although tasks such as these are commonly used, the code for them varies by software language and can differ slightly depending on the application. Being able to search for code allows people to find the worked-out solutions to these common problems as well as solutions to more obscure coding challenges.

“The first thing someone does when writing a new piece of software is to search for existing things that are related,” says Stocky. In the past, some programmers have used Google’s standard search bar to find code, but it’s inefficient because a lot of code resides in databases inaccessible to it.

Programmers can also turn to online code repositories, Stocky says, but the existing methods for downloading software code and searching it for specific functions are time-consuming. Instead, Google Code Search crawls and indexes open-source archives that contain code in file formats, such as .zip and .tar, that general Web crawlers can’t investigate. In addition, the tool indexes code held in publicly accessible repositories managed by two common version-control systems, CVS and Subversion. One of the goals of Google Code Search, Stocky says, is to make the searching process easier: “We’re trying to give people one place where they can do that quickly.”
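To make the idea of searching inside archives concrete, here is a minimal sketch in Python. It is not Google's implementation, which the article does not describe; the function names `search_archive`, `make_sample_zip`, and `make_sample_tar` are hypothetical, and the sample files are invented for illustration. The sketch opens a .zip or .tar archive held in memory and reports every line that matches a pattern.

```python
import io
import re
import tarfile
import zipfile


def search_archive(data: bytes, pattern: str):
    """Yield (member_name, line_number, line) for each matching line.

    Hypothetical helper: accepts the raw bytes of a .zip or .tar archive.
    """
    regex = re.compile(pattern)
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        with zipfile.ZipFile(buffer) as archive:
            members = [
                (name, archive.read(name))
                for name in archive.namelist()
                if not name.endswith("/")  # skip directory entries
            ]
    else:
        buffer.seek(0)
        with tarfile.open(fileobj=buffer) as archive:
            members = [
                (m.name, archive.extractfile(m).read())
                for m in archive.getmembers()
                if m.isfile()  # skip directories and links
            ]
    for name, raw in members:
        text = raw.decode("utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield name, number, line


def make_sample_zip() -> bytes:
    """Build a tiny in-memory .zip archive (invented sample data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("src/sort.c", "void quicksort(int *a, int n) {}\n")
        archive.writestr("README", "A sorting demo.\n")
    return buffer.getvalue()


def make_sample_tar() -> bytes:
    """Build a tiny in-memory .tar archive (invented sample data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        payload = b"def binary_search(items, target):\n    pass\n"
        info = tarfile.TarInfo(name="lib/search.py")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    for sample in (make_sample_zip(), make_sample_tar()):
        for hit in search_archive(sample, r"quicksort|binary_search"):
            print(hit)
```

A real code-search service would build an index ahead of time instead of scanning each archive per query; the scan above only illustrates why the archive has to be unpacked before its contents can be searched at all.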

The Google Code Search tool lets people search for code using not only keywords, as in a typical Google search, but also “regular expressions,” which describe patterns of characters to match. For instance, a search for “do.” would return “dot” or “dog,” Stocky says. Using this, “programmers can create really advanced queries that can search for obscure function definitions,” he says. In addition, searches can be narrowed down to one of 33 different programming languages and 18 different licenses.
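In standard regular-expression syntax, a period stands for any single character, while a question mark makes the preceding character optional. The short Python sketch below illustrates both, along with a query for function definitions of the kind Stocky describes. It uses Python's `re` module, whose syntax may differ in details from the one Google Code Search accepts, and the word list and C snippet are invented for illustration.

```python
import re

words = ["dot", "dog", "do", "door", "cat"]

# "." matches exactly one character, so "do." matches
# three-character strings that begin with "do".
any_char = re.compile(r"do.")
matches = [w for w in words if any_char.fullmatch(w)]

# "?" makes the preceding "o" optional, so "do?" matches only "d" or "do".
optional_o = re.compile(r"do?")
optional_matches = [w for w in words if optional_o.fullmatch(w)]

# A more advanced query: C-style definitions of functions
# whose names start with "parse_" (calls are not matched).
source = """\
int parse_header(const char *buf) { return 0; }
static void parse_body(char *buf, int len) { }
int main(void) { return parse_header("x"); }
"""
definition = re.compile(
    r"^\s*(?:static\s+)?\w+\s+(parse_\w+)\s*\(", re.MULTILINE
)
found = definition.findall(source)

print(matches)           # ['dot', 'dog']
print(optional_matches)  # ['do']
print(found)             # ['parse_header', 'parse_body']
```

The last pattern anchors at the start of a line, so the call to `parse_header` inside `main` is ignored and only the two definitions are reported.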

Of course there’s the issue of ownership and licensing. “For each piece of code, we do our best to detect licenses,” Stocky says; but in some cases, a license can’t be identified. “For anyone who didn’t mean for their code to be posted publicly, we have methods for them to remove it.” It is similar to the way Google handles Web pages or images whose owners would like them to be unsearchable.

Stocky adds that the tool could actually help to prevent code plagiarism. By searching for code you’ve written, he says, you could see who has implemented it and how.

Google isn’t the first company to offer code search. Santa Monica, CA-based Koders was launched in April 2005, and Krugle in Menlo Park, CA, went live in February 2006. Although the features of these engines differ–Krugle, for instance, allows people to search for code by project, unlike Google’s tool–their goal is the same: to allow programmers to reuse code that’s already been written, to make better software more quickly.

The rising popularity of code search is important, says Ken Krugler, founder and chief technology officer of Krugle. In surveying programmers, his company found that 20-27 percent of their time was spent searching for reusable code. “Everyone talks of code reuse as being the silver bullet to the problems of improving the software creation process,” he says. “To me, search is a key part of that.”

Google Code Search began as an internal tool for the company’s engineers, many of whom already participate in open-source software projects, Stocky says. The engineers were constantly searching for chunks of free code to complete their software, and used the tool to do it.

In fact, open-source developers have been using the general Google search to try to find code for a while, says Karsten Wade, a senior developer at Red Hat, a provider of open-source technology. Google’s Code Search tool “gives a friendly face to code snippets,” he says, adding that it will likely spur open-source development further by allowing more code to be found more easily. People can simply post pieces of code or how-to programs on their blogs, he says, and Google will find them. Moreover, an increase in code sharing could produce other benefits, he says, such as helping people find common mistakes.

Google Code Search currently resides in Google Labs, where the company’s latest product ideas are tested. The tool isn’t perfect, admits Stocky–it can’t yet find all the source code that’s available (Google has a form that allows people to submit code they’ve missed). The company plans to add support for more repositories of source code. But aside from improving code access, it’s not clear how exactly the new tool will evolve. “We want to get a lot of feedback to know what features people want that aren’t there,” says Stocky. “I’ve thought a lot about the potential directions to go, but [we want to know] what people are asking for.”
