Technology Review - Published By MIT

Confusing Osama bin Laden with Johnny Rotten

By Mark Williams

Wednesday, April 04, 2007

Sweeney means that the U.S. government watch lists, besides containing common names like Ted Kennedy, depend on variations of a phonetic algorithm called Soundex. As she says, "Soundex is an old patent that's been used for a long time, whenever they have two databases where they're trying to match up records." Indeed, Soundex dates back to a time when Hollerith punch cards were the newest thing in computing technology. Developed to index and retrieve soundalike surnames with different spellings (like Rogers and Rodgers) scattered throughout an alphabetical list, Soundex was first used so that U.S. government clerks could retroactively analyze the 1890 U.S. census results. Soundex works by keeping the first letter of a name, dropping all vowels (along with h, w, and y), assigning a digit to each of the next three consonants (with similar-sounding consonants like s and c getting the same digit, and adjacent consonants that share a digit counted only once), then dropping any remaining consonants. If fewer than three consonants remain, the code is padded with zeros. The algorithm thereby reduces every name to a letter followed by three digits.

Consequently, Soundex assigns to the name Laden the code L350, as it does Lydon, Lawton, and Leedham. This is, in other words, an algorithm so deficient for identification purposes that it confuses al Qaeda's Osama bin Laden and the Sex Pistols' Johnny (Lydon) Rotten. To see for yourself how poorly Soundex performs, go to nofly.s3.com, where S3 Matching Technologies has combined the algorithm with a list of potential-terrorist names recorded in U.S. government databases. "The U.S. government obviously updates its lists every day, so we don't suggest this is up-to-date," says James Moore, a company spokesperson. "But we got the best available data on who'd be on terrorist watch lists from various private intelligence agencies." Using Soundex and S3 Matching Technologies' version of the watch list reveals that the names Jesus Christ and George Bush resemble terrorists' names enough that they're assigned to the no-fly or selectee list.
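To make the collision concrete, here is a minimal sketch of the standard American Soundex coding in Python. It is an illustration only: the article says the watch lists depend on "variations" of Soundex, and nothing here is claimed to match the code used by any agency or by S3 Matching Technologies. The function name `soundex` and the table `CODES` are names chosen for this sketch.

```python
# Letter-to-digit table for standard American Soundex.
# Vowels and H, W, Y carry no digit and are absent from the table.
CODES = {}
for digit, group in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                     "4": "L", "5": "MN", "6": "R"}.items():
    for letter in group:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for a name (sketch)."""
    # Keep only the letters A-Z, uppercased.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = CODES.get(first, "")  # the first letter's digit also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate repeated digits
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this to "", so a repeat after a vowel counts
    # Pad with zeros, then keep the letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


for surname in ("Laden", "Lydon", "Lawton", "Leedham"):
    print(surname, soundex(surname))
```

All four surnames come out as L350, which is the collision the article describes: once the vowels are gone, each name is an L followed by a d or t sound and then an m or n sound.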

How does the U.S. government rationalize using such error-prone technology for its watch lists? Sweeney says, "Whomever I ask--whether it's DHS, DARPA, the Department of Justice--everybody essentially says, 'We're just going to plow ahead.' At the DOJ, the answer I get is, 'It'll get solved when we use biometrics.' Their belief is that the current problem will disappear because you'll show your driver's license and match your fingerprint against your fingerprint's stored image on your license." Sweeney half-seriously proposes a hypothetical solution to the watch-list problem. "I've told ChoicePoint that they ought to go into the watch-list business."

Alongside LexisNexis and Acxiom, ChoicePoint is one of the big-three data-brokerage corporations and in many ways the most interesting of them. Evan Hendricks, editor-publisher of the Washington-based Privacy Times, says, "Though most Americans don't know about ChoicePoint, it's a company that knows a lot about hundreds of millions of Americans." Would ChoicePoint have a minimum of four data points--name, address, Social Security number, and birth date--for almost every adult U.S. citizen, and therefore have enough information to differentiate among, say, any five people whose names produce the same Soundex code? Hendricks answers, "That's certainly true. So would the three main credit-reporting companies." However, Hendricks continues, whereas the big-three credit-reporting agencies--Experian, TransUnion, and Equifax--calculate individuals' credit scores, ChoicePoint defines itself as a data-aggregation company in the business of selling actionable intelligence to both industry and government, with credit-related information being only a subset of that whole.
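The idea that extra data points can separate people who share a Soundex code can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the `Record` type, its fields, the sample entries, and the rule that a phonetic hit must also agree on exact name and birth date are assumptions made for illustration, not a description of how ChoicePoint, the credit-reporting agencies, or any government system works. Each record is assumed to arrive with its Soundex code already computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout; field names are assumptions for this sketch."""
    name: str
    code: str        # precomputed Soundex code, assumed to be supplied
    birth_date: str  # ISO date string
    address: str


def phonetic_hits(traveler, watch_list):
    """Every watch-list entry that shares the traveler's Soundex code."""
    return [entry for entry in watch_list if entry.code == traveler.code]


def corroborated_hits(traveler, watch_list):
    """Phonetic hits that also agree on exact name and birth date."""
    return [
        entry
        for entry in phonetic_hits(traveler, watch_list)
        if entry.name.lower() == traveler.name.lower()
        and entry.birth_date == traveler.birth_date
    ]


# Invented sample data: two different people whose surnames both code to L350.
watch_list = [Record("Example Laden", "L350", "1900-01-01", "1 Hypothetical Way")]
traveler = Record("Example Lydon", "L350", "1950-06-15", "2 Placeholder Road")

print(len(phonetic_hits(traveler, watch_list)))      # flagged on the code alone
print(len(corroborated_hits(traveler, watch_list)))  # cleared once other fields are compared
```

With the phonetic code alone, the traveler is flagged; once name and birth date must also agree, the false match drops out. A real system would need to tolerate typos and missing fields, which this sketch ignores.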

MIT Massachusetts Institute of Technology © 2009 Technology Review. All Rights Reserved.