Googling for Code
Will Google's new search tool improve software?
Last week, Google launched a tool for sifting through the billions of lines of software code available on the Web. The free resource could help programmers design new software projects, test code, and fix bugs, says Tom Stocky, product manager at Google. And, ultimately, it might help them build better products.
An efficient code-searching tool is invaluable for programmers, from computer science students to professionals, Stocky says. At the foundation of all software are lines of instructions that direct the program to perform certain tasks, such as searching through a list or rearranging values. Although such tasks are common, the code for them varies by programming language and can differ slightly depending on the application. Being able to search for code allows people to find worked-out solutions to these common problems as well as solutions to more obscure coding challenges. "The first thing someone does when writing a new piece of software is to search for existing things that are related," says Stocky.
In the past, some programmers have used Google's standard search bar to find code, but that approach is inefficient because a lot of code resides in databases that the standard search cannot reach. Programmers can also turn to online code repositories, Stocky says, but the previous methods for downloading software code and searching it for specific functions are time-consuming.
Instead, Google Code Search crawls and indexes open-source archives that contain code in file formats, such as .zip and .tar, that general Web crawlers can't investigate. In addition, the tool indexes code held in publicly hosted repositories managed by two common version-control systems, CVS and Subversion. One of the goals of Google Code Search, Stocky says, is to make the searching process easier: "We're trying to give people one place where they can do that quickly."
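To give a rough sense of what searching inside an archive involves, here is a minimal Python sketch. It is not Google's implementation, which the article does not describe; the sample file names, their contents, and the helper functions `build_sample_archive` and `search_archive` are all made up for illustration. It builds a small .tar archive in memory and then scans every file in it, line by line, for a pattern.

```python
import io
import re
import tarfile


def build_sample_archive() -> bytes:
    """Create a small in-memory .tar archive holding two made-up source files."""
    files = {
        "project/sort.py": "def bubble_sort(values):\n    return sorted(values)\n",
        "project/find.c": "int find(int *list, int n, int key);\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def search_archive(archive_bytes: bytes, pattern: str):
    """Return (file name, line number, line) for each line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories and other non-file entries
            text = archive.extractfile(member).read().decode("utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    hits.append((member.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Look for Python function definitions anywhere in the archive.
    for hit in search_archive(build_sample_archive(), r"def \w+\("):
        print(hit)
```

A real service would presumably build an index ahead of time rather than rescan every archive for each query; the linear scan here only shows why code packed inside an archive is invisible to a crawler that never opens it.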
The Google Code Search tool lets people search for code using not only keywords, as in a typical Google search, but also "regular expressions," which match patterns within words. For instance, a search for the pattern "do." (in which the period stands for any single character) would return "dot" or "dog," Stocky says. Using this, "programmers can create really advanced queries that can search for obscure function definitions," he says. In addition, searches can be narrowed down to one of 33 different programming languages and 18 different licenses.
Of course, there's the issue of ownership and licensing. "For each piece of code, we do our best to detect licenses," Stocky says; but in some cases, a license can't be identified. "For anyone who didn't mean for their code to be posted publicly, we have methods for them to remove it." This is similar to the way Google handles Web pages or images whose owners would like them to be unsearchable. Stocky adds that the tool could actually help to prevent code plagiarism. By searching for code you've written, he says, you could see who has implemented it and how.
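The regular-expression searches described above can be tried out with Python's standard `re` module. This is only a small sketch: the word list, the sample source text, and the helper names `matching_words` and `find_definitions` are invented for illustration, and Google Code Search's own regular-expression dialect may differ in its details. In standard syntax a period matches any single character, so the pattern `do.` matches "dot" and "dog", while a question mark makes the preceding character optional, so `do?` matches "d" and "do".

```python
import re

WORDS = ["do", "dot", "dog", "d", "door"]

SAMPLE_SOURCE = """\
def quick_sort(values):
    return sorted(values)

def find(items, key):
    return key in items

def sort_by_name(rows):
    return sorted(rows)
"""


def matching_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words that the pattern matches in full."""
    regex = re.compile(pattern)
    return [word for word in words if regex.fullmatch(word)]


def find_definitions(source: str, fragment: str) -> list[str]:
    """Return names of Python functions whose name contains the fragment."""
    regex = re.compile(
        r"^[ \t]*def[ \t]+(\w*" + re.escape(fragment) + r"\w*)[ \t]*\(",
        re.MULTILINE,
    )
    return regex.findall(source)


if __name__ == "__main__":
    print(matching_words(r"do.", WORDS))   # "d", "o", then any one character
    print(matching_words(r"do?", WORDS))   # "d", then an optional "o"
    print(find_definitions(SAMPLE_SOURCE, "sort"))
```

The last call shows the kind of "function definition" query Stocky mentions: it finds functions whose names contain "sort" while ignoring lines that merely call `sorted`.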


Comments
ms
10/10/2006
Posts:188
For Krugle, we use the containing project's score to adjust the static (non-query specific) boost for files. The project score depends on factors like downloads, references on tech web pages, hoster reputation, etc.
This doesn't weed out all the cruft, but it does increase the odds that a highly ranked file is coming from a popular project, which in turn is a fuzzy indicator of code quality.
kkrugler
10/10/2006
Posts:2
What I do believe, and know from using Krugle internally, is that code search can make programmers better. Every time I quickly find a working example of code using an API that I'm struggling with, or a component that implements functionality I need, I'm working faster and writing better code.
This isn't the solution to all programming problems, but it is a solution that will grow stronger over time as the quantity and quality of publicly available code continue to increase.
kkrugler
10/10/2006
Posts:2
It seems that Google has now provided a way to find security flaws in existing code as well. Take this article: http://www.securityfocus.com/news/11417. An example from it: a search like 'todo +security' allows you to find existing security flaws in open source software. It would be nice to think that in the long run this will help improve software, but will people continue writing bad code without the proper training?
Dale Beermann
beermann
10/13/2006
Posts:1