A View from Christopher Mims

Automated Processing of Wikileaks Cables Reveals U.S. Friends, Foes

Natural Language Processing of nearly 4,000 U.S. diplomatic cables reveals fraying relations with traditional allies, and a few other surprises

  • April 11, 2011

Software capable of determining the positive or negative sentiment of sentences written by humans has been unleashed on 3,891 U.S. diplomatic cables released by WikiLeaks, and the result is a systematic, if preliminary, analysis of which countries are our besties and which are in the doghouse.

The analysis was part of a class project (pdf) by a pair of computer science undergraduates at Stanford, Xuwen Cao and Beyang Li. By looking at how often a country was mentioned, and whether it was cast in a positive or negative light, Cao and Li identified four clusters to which countries could belong: countries we don’t like that aren’t mentioned very often (red), countries we sort-of don’t like that aren’t mentioned very often (teal), countries spoken of positively that also aren’t mentioned very often (blue), and countries that are groused about fairly frequently (green).

Since these cables were supposed to be classified, we can assume they are candid. No country was mentioned frequently in a strongly negative or especially positive light; the frequently mentioned countries (green) were merely groused about.
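To make the two axes concrete, here is a minimal Python sketch of how mention frequency and average sentiment per location could be computed. It is not Cao and Li's code: the tiny word lists, the function names, and the fixed frequency threshold are all assumptions made for illustration, and the threshold is only a stand-in for whatever clustering method the authors actually used.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon; a real system would use a trained sentiment model.
POSITIVE = {"welcomed", "cooperation", "progress", "support", "helpful"}
NEGATIVE = {"concern", "tensions", "failed", "corrupt", "hostile", "crisis"}


def sentence_sentiment(sentence):
    """Return (positive words - negative words) / total words for a sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    if not tokens:
        return 0.0
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    return score / len(tokens)


def location_stats(sentences, locations):
    """Map each mentioned location to (mention count, mean sentence sentiment)."""
    totals = defaultdict(lambda: [0, 0.0])
    for sentence in sentences:
        lowered = sentence.lower()
        score = sentence_sentiment(sentence)
        for loc in locations:
            # Whole-word match so 'iran' does not fire inside another word.
            if re.search(r"\b" + re.escape(loc) + r"\b", lowered):
                totals[loc][0] += 1
                totals[loc][1] += score
    return {loc: (n, total / n) for loc, (n, total) in totals.items()}


def label(count, sentiment, count_threshold=2):
    """Crude stand-in for clustering: bucket by frequency and sign of sentiment."""
    freq = "frequent" if count >= count_threshold else "infrequent"
    if sentiment > 0:
        tone = "positive"
    elif sentiment < 0:
        tone = "negative"
    else:
        tone = "neutral"
    return f"{freq}/{tone}"


if __name__ == "__main__":
    cables = [
        "Norway welcomed the cooperation.",
        "Iran raised concern over the talks.",
        "Tensions with Iran eased after progress.",
    ]
    for loc, (count, sentiment) in location_stats(cables, ["norway", "iran"]).items():
        print(loc, count, round(sentiment, 3), label(count, sentiment))
```

In this toy input, Iran is mentioned more often but only mildly negatively, while Norway is mentioned once and positively, which mirrors the shape of the findings reported below.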

Here’s a further breakdown of what each cluster represents:

Green locations (cast in a somewhat negative light, and frequently):

[‘london’, ‘paris’, ‘cuba’, ‘africa’, ‘brasilia’, ‘cairo’, ‘eu’, ‘brazil’,
‘afghanistan’, ‘egypt’, ‘europe’, ‘iran’, ‘china’, ‘iraq’, ‘libya’, ‘syria’,
‘pakistan’, ‘washington’, ‘turkey’, ‘israel’, ‘moscow’, ‘spain’, ‘uk’, ‘russia’,
‘madrid’, ‘india’, ‘tripoli’, ‘kabul’, ‘iceland’, ‘france’]

Red locations (countries talked about infrequently, and in the most negative context):

[‘djibouti’, ‘taiwan’, ‘tajikistan’, ‘islam’, ‘mumbai’, ‘zimbabwe’, ‘dubai’, ‘goa’,
‘tibet’, ‘armenia’, ‘yar’, ‘ecuador’, ‘benghazi’, ‘algiers’, ‘yemen’, ‘paraguay’,
‘caracas’, ‘south africa’, ‘ouagadougou’, ‘xxxxxxxxxxxx’, ‘guinea’]

(It’s worth noting that, because of the way natural language processing works, a country like Taiwan can pick up negative sentiment that is really about its circumstances rather than the country itself, for example the cross-strait tensions with mainland China.)
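A small Python sketch shows how this kind of misattribution can arise when a sentence's sentiment is credited to every place it names. The word list and function names are hypothetical, and the article does not say whether the authors' pipeline scores sentences this way; this is only one plausible mechanism.

```python
import re

# Hypothetical toy list of negative words, for illustration only.
NEGATIVE = {"tensions", "threat", "crisis"}


def negative_hits(sentence):
    """Count negative-lexicon words in a sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sum(t in NEGATIVE for t in tokens)


def blame_by_location(sentence, locations):
    """Attribute the sentence's negative-word count to every location it names."""
    lowered = sentence.lower()
    hits = negative_hits(sentence)
    return {
        loc: hits
        for loc in locations
        if re.search(r"\b" + re.escape(loc) + r"\b", lowered)
    }


if __name__ == "__main__":
    sentence = "Cross-strait tensions remain a threat to Taiwan."
    print(blame_by_location(sentence, ["taiwan", "china"]))
```

Here the negative words describe the tensions, yet Taiwan is the only location named in the sentence, so it absorbs the whole negative score.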

Teal locations (mentioned in a somewhat negative context, but relatively infrequently):

[‘kosovo’, ‘north korea’, ‘damascus’, ‘argentina’, ‘latin america’, ‘netherlands’,
‘uruzgan’, ‘switzerland’, ‘reykjavik’, ‘lebanon’, ‘qatar’, ‘sudan’, ‘somalia’,
‘venezuela’, ‘guantanamo’, ‘colombia’, ‘sao paulo’, ‘saudi arabia’, ‘america’,
‘peru’, ‘gaza’, ‘bolivia’, ‘ukraine’, ‘geneva’, ‘jordan’, ‘tehran’, ‘georgia’,
‘sweden’, ‘portugal’, ‘mexico’, ‘lula’, ‘kenya’, ‘italy’, ‘ethiopia’, ‘canada’,
‘germany’, ‘havana’, ‘algeria’]

Blue locations (mentioned in the most positive context, but not very often):

[‘azerbaijan’, ‘japan’, ‘chechnya’, ‘norway’, ‘australia’, ‘ankara’, ‘baghdad’,
‘poland’, ‘haiti’, ‘kazakhstan’, ‘honduras’, ‘belgrade’, ‘copenhagen’, ‘kuwait’,
‘karzai’, ‘amazon’, ‘burma’, ‘tunisia’, ‘west bank’, ‘doha’, ‘west’, ‘new york’,
‘nigeria’, ‘serbia’, ‘darfur’, ‘chile’, ‘morocco’, ‘vatican’, ‘uae’, ‘new delhi’,
‘middle east’, ‘brussels’]

Here’s what the authors say about the seeming outliers in the blue group:

The blue cluster has the highest sentiment score, which means that US is relatively happy with this group. As one may notice, there are a few notable anomalies such as ‘burma’ and ‘sudan’. In the case of ‘burma’, the positive sentiment is mainly caused by Aung San Suu Kyi’s release from house arrest from multiple cables. In the case of ‘sudan’, it’s also a special case because the darfur cables discuss mostly the international help darfur received, instead of its dire situation.

And here are the findings the authors found most interesting:

Given our model, we made a few interesting discoveries:
1. In general, the US diplomats are critical of other countries, as we observe the majority of the data points is in the negative
2. Surprisingly, US’s most important ally is spain (seen lower right quadrant)
3. US is most friendly with Norway (right-most point), although it’s relatively unimportant
4. Iran appeared most frequently, with a small negative sentiment (which means the attitude is not always hostile)
5. US is least happy with Zimbabwe and Paraguay, although it doesn’t care too much about them either
6. US doesn’t actually have good relations with its traditional allies such as France, UK and Germany. Canada, Italy and Germany even scored lower than China.

Number six is a zinger. It’s a stretch to say that cables that talk about our traditional allies in a negative light indicate that we have poor relations with them – maybe we have good relations, and that means we’re more willing to be critical, the way siblings are wont to fight. It’s also important to note that these results, which aren’t peer reviewed, are just a first approximation of what a full-fledged Natural Language Processing analysis of these cables would look like.

As much as this study says something about the nature of diplomacy, it’s possible it says something more about the nature of gossip: good news is never as important as news of what’s going wrong.
