In a Washington, DC, conference room soundproofed to thwart eavesdropping, five linguists working for the government, speaking on condition that their names not be published, describe the monumental task they face analyzing foreign-language intercepts in the age of terror.
Around the table are experts in Arabic, Russian, Chinese, and Italian, and a woman who is one of the government’s few speakers of Dari, a language used in Afghanistan. They are young; three are in their 20s, the others in their 30s and 40s. But they are increasingly vital to U.S. national security: they are the front-line translators analyzing language that is messy, complicated, and fragmented but may give clues to an impending terrorist attack.
“Analyze” is the operative word here, not just “translate.” Poring over documents and audio clips, the five, along with thousands of other government or contract linguists who do similar work, struggle to pull out single words, isolate fragments of information, weave intelligence out of the fragments, and generally perform linguistic triage on the flood of raw material collected daily by the CIA, FBI, Department of Defense, and other sources. One linguist tells of having to dig through a filthy box of documents, reeking of gasoline, that had just come off a plane from Afghanistan. Another describes decoding a handwritten note whose signature, a key clue to its intelligence value, was half ripped off. Another recounts listening to an intercepted cell-phone conversation, in Russian, between two men in a noisy outdoor marketplace. One man was stuttering; that could have indicated he was nervous, which might or might not reflect the importance of the conversation. “It’s like looking at the pieces of a jigsaw puzzle,” without the box-top that shows what the picture is supposed to look like, says the Chinese linguist. “And maybe the pieces don’t fit together. You have to brush off the dust and say, what do I do next?”
Beyond these physical and contextual stumbling blocks, analysts face challenges from the languages themselves. Al-Qaeda members tend to speak an Arabic saturated with cultural and historical allusions; that makes it tough to distinguish religious dialogue from attack plans. And some of the terror group’s members aren’t native speakers of the language, which means they make unusual word choices, pronounce words differently, and commit many grammatical errors. “We have a lot of practice dealing with the Soviet model or the European model of conversation,” but not as much with cultures in which direct, plain language is rare, says Everette Jordan, a former National Security Agency linguist who arranged for the five linguists to meet with Technology Review. “It’s not the where, what, how, and when. It’s the why, and the why not. That’s what we’re encountering a lot.”
The costs of failing to clarify what adversaries mean in a timely manner are high. That was made clear during Congressional investigations into the intelligence lapses that led up to the September 11 attacks. In perhaps the most glaring example, on Sept. 10, 2001, according to June 2002 news reports, the NSA intercepted two Arabic-language messages, one that said “Tomorrow is zero hour” and another that said “The match is about to begin.” The sentences weren’t translated until Sept. 12, 2001. The revelation underscored the fact that the U.S. government faces a serious crisis in its ability to store, analyze, search, and translate data in dozens of foreign languages.
It’s a crisis that’s getting worse, literally by the hour. The backlog of unexamined material is so large, it’s measured not in mere pages but in cubic meters. Consider that every three hours, NSA satellites sweep up enough information to fill the Library of Congress. And the NSA is only one intelligence agency. Somewhere in that massive haystack might be a needle about two kilobytes in size (the amount of data in a single typewritten page) in which terrorists let slip their plans.
And although there’s a well-reported shortage of qualified translators to help search for that needle, there’s a systemic problem, too. This deluge of intelligence is absorbed by a federal intelligence-gathering bureaucracy that is sprawling and balkanized. Four branches of the military, 13 intelligence agencies, and the State Department’s diplomatic corps all have their own creaky systems built up over decades. Each agency houses (some say hoards) its own set of translators, analysts, and databases. Indeed, well before September 11, experts knew that the government’s translation infrastructure wasn’t only overwhelmed; it was obsolete. But the attacks provided the motivation to rethink, from the ground up, how translation gets done. “We’re going through a cultural change right now,” the Chinese linguist says. “We have to find the tools for the job.”