Technology Review - Published By MIT

The Voice of Osama bin Laden


By Richard A. Muller

January 23, 2004


Voice recognition is a rapidly developing technology, thanks to the availability of cheap computing power. You've probably seen a "voice print," a plot of frequency density vs. time; music editing software makes them on personal computers. Older voice recognition methods matched sets of such plots against one another. Modern voice identification systems, which seek to have low false-alarm rates even in the presence of noise, tend to depend more heavily on a technique known as "feature analysis." A feature is a peculiar twist in the voice, often a tell-tale transition between phonemes with different pitches. These features are not readily heard by listeners, but they can be picked out in a digital analysis. Patterns of such glitches are unique identifiers, much as ridge bifurcations and other minutiae are the keys to fingerprint identification.
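To make the idea of a "voice print" concrete, here is a minimal sketch of how such a plot can be computed: the recording is cut into short overlapping frames, and the strength of each frequency is measured in every frame. Everything in it is an illustrative assumption rather than the method of any particular product: the function names, the 64-sample frame, the 8,000-samples-per-second rate (roughly telephone quality), and the synthetic two-tone "recording" standing in for a voice.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        magnitudes = []
        # Naive discrete Fourier transform, kept simple for readability.
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


def dominant_frequencies(frames, sample_rate, frame_size):
    """Return the strongest frequency (in Hz) found in each frame."""
    return [
        max(range(len(m)), key=m.__getitem__) * sample_rate / frame_size
        for m in frames
    ]


def tone(freq_hz, n_samples, sample_rate):
    """Generate a pure sine tone (a stand-in for real speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


sample_rate = 8000  # assumed, roughly telephone quality
signal = tone(500, 256, sample_rate) + tone(1500, 256, sample_rate)
frames = spectrogram(signal)
peaks = dominant_frequencies(frames, sample_rate, 64)
print(peaks[0], peaks[-1])
```

Plotting the frames as columns, with brightness for magnitude, gives the familiar picture; the jump from the first tone to the second is a crude analogue of the pitch transitions that feature analysis looks for.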

Voice identification systems are already in widespread use around the world. They are employed at the Canadian border to identify and track frequent travelers, and in Britain to verify the compliance of young parolees. U.S. companies, including Chase Manhattan Bank, Charles Schwab, and Prudential Securities, use voice identification to control access to secure areas and records. Visa is hoping to replace the personal identification numbers used for credit card verification with voice recognition; a computer will compare features of your voice with those stored in the credit card chip.

With such a success record, shouldn't voice recognition software work reliably to identify Osama, or to reject an imitator? Unfortunately, the Al Jazeera tapes are not high quality, probably no better than telephone sound. That's good enough to detect some kinds of deception, but not all. Here are three possibilities:

1. The tape was made by an impressionist trying to imitate bin Laden's voice. Good impressionists can mimic the tone and pacing of their subject, but they often overemphasize obvious quirks, much as a caricaturist exaggerates dominant physical features. That makes it amusing to hear, but it won't fool an analyst. Impressionists are not good at catching the more subtle features that even simple voice recognition software uses. This kind of counterfeit can almost certainly be ruled out.

2. The tape was made by cutting and pasting true excerpts from bin Laden's past speeches. Much of the tape could be unchanged from a prior recording. The tough part for the counterfeiter was adding mention of Saddam's capture, where words and phrases had to be rearranged. To detect such a forgery, a good analyst would listen for discontinuities in the background noise, or small blips indicating the tape was spliced. Digital processing by the tape maker can remove such artifacts, but it leaves behind artifacts of its own; low-pass filters, for example, create easily detected changes in the spectrum of the background hiss. (That's why true audiophiles dislike noise suppression filters. Their effect is readily noticed by a trained ear.) Digital processing can be detected in other ways as well; for example, it sometimes generates false frequencies (called aliases). Such cutting and pasting, even with digital filtering, would have raised suspicions and been detected by the CIA. Therefore this scenario can probably be ruled out as well.

3. The tape was a recording of one of Osama bin Laden's sons, who was deliberately trying to sound like his father. This is, in my mind, the most likely hypothesis.
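The check for discontinuities in background noise described in the second scenario can be illustrated with a toy example. This is only a sketch under stated assumptions, not a description of what intelligence analysts actually do: the function name `find_level_jumps`, the 400-sample frame, the factor-of-two threshold, and the synthetic "tape" (two stretches of random hiss at different levels joined together) are all hypothetical choices made for illustration.

```python
import math
import random


def frame_rms(samples, frame_size):
    """Root-mean-square level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return levels


def find_level_jumps(samples, frame_size=400, ratio_threshold=2.0):
    """Return sample offsets where the hiss level changes abruptly.

    A jump is flagged when adjacent frames differ in RMS level by more
    than ratio_threshold (an assumed, illustrative value).
    """
    levels = frame_rms(samples, frame_size)
    jumps = []
    for i in range(1, len(levels)):
        low = min(levels[i - 1], levels[i])
        high = max(levels[i - 1], levels[i])
        if high > 0 and (low == 0 or high / low > ratio_threshold):
            jumps.append(i * frame_size)
    return jumps


# Synthetic tape: hiss from one "recording session" spliced onto hiss
# from another that has three times the noise level.
rng = random.Random(0)
session_a = [rng.uniform(-0.01, 0.01) for _ in range(2000)]
session_b = [rng.uniform(-0.03, 0.03) for _ in range(2000)]
tape = session_a + session_b

print(find_level_jumps(tape))  # prints [2000], the splice point
```

A careful forger would match the noise levels before splicing, which is why the analysis described above also looks at the spectrum of the hiss and not just its loudness.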

Saad bin Osama bin Laden is the third of Osama's 23 to 50 children; he is known to be in his early twenties. He has been active in al Qaeda since his pre-teen years, and was probably being groomed for eventual leadership. He is reported to be fluent in English and skilled in the use of computers. The Washington Post reported that Saad was a key organizer of the May 12, 2003, al Qaeda bombing in Riyadh, Saudi Arabia. There have been reports that he is hiding along the Afghanistan-Pakistan border; others say that he is in Iran close to the Afghanistan border, in a region not controlled by the Iranian government. The Arab newspaper Asharq Al Awsat says that Saad is now one of the principal leaders of al Qaeda, but I'm skeptical of that. Al Qaeda is too sophisticated to let such a young and inexperienced person take over. But he likely has an extremely useful talent: sounding like his dad.

I like to consider myself an expert in the voices of my wife and my two daughters. I notice them even in a crowded and noisy room. When one of them telephones me, I instantly recognize her, but often incorrectly. The one I name is the one I expect, not the one who called. (They find this very amusing.) I don't know if the similarity of their voices is genetic or learned, but I know that others have similar problems. Parents and children tend to sound alike, and that effect is exaggerated when bandwidth is poor, such as in a telephone call or on a cassette recording. In fact, commercial speech recognition software that is "trained" to respond to a particular person's voice often will have a hard time distinguishing the voice of a family member. The more sophisticated systems that intelligence agencies presumably use may of course be less prone to such confusion, but I suspect that this vulnerability to child and sibling spoofing remains. And I doubt that the U.S. government has a recording of Saad to use for comparison.

Here is my scenario:

Osama bin Laden was killed at Tora Bora, or his dialysis machine was destroyed and he died shortly afterwards. The strongest evidence for this is the absence of new videos. Al Qaeda fears that news of his death will shock and discourage many of its supporters. There is no other leader who can hold together this diverse and contentious organization, so they believe that they need to keep the news secret. The initial tapes they released were old recordings of former speeches. But many supporters were concerned. They, like me, noticed the absence of videos, and of speeches with clear date indicators. Al Qaeda knew a video counterfeit would be detected, but they noticed that Saad sounded a lot like his father. They had him listen to his father's speeches, and practice enunciating them with a similar style. It took many attempts, but Saad's voice on the final tape was good enough to deceive not only al Qaeda's foreign legions, but even some analysts at the CIA.

And if my personal experience is indicative, the tapes may even have fooled one or more of Osama bin Laden's wives.

MIT Massachusetts Institute of Technology © 2009 Technology Review. All Rights Reserved.