Automated Processing of WikiLeaks Cables Reveals U.S. Friends, Foes
Software capable of determining the positive or negative sentiment of sentences written by humans has been unleashed on 3,891 U.S. diplomatic cables released by WikiLeaks, and the results are a systematic, if preliminary, analysis of which countries are our besties and which are in the doghouse.
The analysis was part of a class project by a pair of computer science undergraduates at Stanford, Xuwen Cao and Beyang Li. By looking at how often a country was mentioned, as well as whether it was cast in a positive or negative light, Cao and Li identified four clusters to which countries could belong. Three of them contain countries that aren’t mentioned very often: countries we don’t like (red), countries we sort-of don’t like (teal), and countries spoken of positively (blue).

Since these cables were supposed to be classified, we can assume they are candid. No country was mentioned frequently in an especially negative or especially positive light. The fourth cluster simply holds countries that were groused about fairly frequently (green).
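To make the two axes concrete, here is a minimal Python sketch of how mention frequency and average sentiment could be computed per location. It is not Cao and Li's code: the word lists, the function names, and the fixed thresholds in bucket() are all assumptions made for illustration, and the article does not say which sentiment model or clustering algorithm the students used.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon; a real system would use a trained sentiment model.
POSITIVE = {"welcomed", "cooperation", "progress", "praised"}
NEGATIVE = {"tensions", "concern", "failed", "corruption"}


def sentence_score(sentence):
    """Crude polarity: +1 for each positive word, -1 for each negative word."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def summarize(sentences, locations):
    """Return {location: (mention_count, mean_sentiment)} for mentioned locations."""
    totals = defaultdict(lambda: [0, 0])  # location -> [count, summed score]
    for sentence in sentences:
        score = sentence_score(sentence)
        lowered = sentence.lower()
        for loc in locations:
            # Whole-word match, so 'uk' does not fire on 'ukraine'.
            if re.search(r"\b" + re.escape(loc) + r"\b", lowered):
                totals[loc][0] += 1
                totals[loc][1] += score
    return {loc: (n, total / n) for loc, (n, total) in totals.items()}


def bucket(count, mean, frequent_at=3):
    """Assumed stand-in for clustering: fixed cutoffs on both axes."""
    freq = "frequent" if count >= frequent_at else "infrequent"
    tone = "positive" if mean > 0 else "negative" if mean < 0 else "neutral"
    return freq + "/" + tone


# Invented sentences, not text from the cables.
sentences = [
    "Norway welcomed the cooperation.",
    "Tensions with Iran remain a concern.",
    "Iran praised the progress.",
    "Iran failed to respond.",
]
summary = summarize(sentences, ["norway", "iran", "uk"])
for loc, (count, mean) in summary.items():
    print(loc, count, round(mean, 2), bucket(count, mean))
```

In this toy input, Iran is mentioned three times with a slightly negative average, while Norway is mentioned once and positively, which mirrors the kind of frequency-versus-sentiment placement the article describes.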
Here’s a further breakdown of what each cluster represents:
Green locations (cast in a somewhat negative light, and frequently):
[‘london’, ‘paris’, ‘cuba’, ‘africa’, ‘brasilia’, ‘cairo’, ‘eu’, ‘brazil’,
‘afghanistan’, ‘egypt’, ‘europe’, ‘iran’, ‘china’, ‘iraq’, ‘libya’, ‘syria’,
‘pakistan’, ‘washington’, ‘turkey’, ‘israel’, ‘moscow’, ‘spain’, ‘uk’, ‘russia’,
‘madrid’, ‘india’, ‘tripoli’, ‘kabul’, ‘iceland’, ‘france’]
Red locations (countries talked about infrequently, and in the most negative context):
[‘djibouti’, ‘taiwan’, ‘tajikistan’, ‘islam’, ‘mumbai’, ‘zimbabwe’, ‘dubai’, ‘goa’,
‘tibet’, ‘armenia’, ‘yar’, ‘ecuador’, ‘benghazi’, ‘algiers’, ‘yemen’, ‘paraguay’,
‘caracas’, ‘south africa’, ‘ouagadougou’, ‘xxxxxxxxxxxx’, ‘guinea’]
(It’s worth noting that, because of the way natural language processing works, a country like Taiwan could be mentioned in sentences that express negative sentiment about its circumstances rather than about the country itself, e.g. the cross-strait tensions with mainland China.)
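A short Python sketch shows how this can happen. It assumes, purely for illustration, that a sentence's negativity is charged to every location the sentence names. The word list and the attribute() helper are hypothetical and are not taken from the students' project.

```python
import re

# Hypothetical list of negative words, for illustration only.
NEGATIVE = {"tensions", "threat", "crisis"}


def negative_hits(sentence):
    """Count negative words in the sentence."""
    return sum(w in NEGATIVE for w in re.findall(r"[a-z]+", sentence.lower()))


def attribute(sentence, locations):
    """Charge the sentence's negativity to every location it names."""
    hits = negative_hits(sentence)
    lowered = sentence.lower()
    return {
        loc: -hits
        for loc in locations
        if re.search(r"\b" + re.escape(loc) + r"\b", lowered)
    }


# Invented sentence, not text from the cables.
sentence = "Cross-strait tensions with China remain a threat to Taiwan."
scores = attribute(sentence, ["taiwan", "china", "norway"])
print(scores)
```

Taiwan and China receive the same negative score here, even though the sentence is about a situation rather than a judgment of either one.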
Teal locations (mentioned in a somewhat negative context, but relatively infrequently):
[‘kosovo’, ‘north korea’, ‘damascus’, ‘argentina’, ‘latin america’, ‘netherlands’,
‘uruzgan’, ‘switzerland’, ‘reykjavik’, ‘lebanon’, ‘qatar’, ‘sudan’, ‘somalia’,
‘venezuela’, ‘guantanamo’, ‘colombia’, ‘sao paulo’, ‘saudi arabia’, ‘america’,
‘peru’, ‘gaza’, ‘bolivia’, ‘ukraine’, ‘geneva’, ‘jordan’, ‘tehran’, ‘georgia’,
‘sweden’, ‘portugal’, ‘mexico’, ‘lula’, ‘kenya’, ‘italy’, ‘ethiopia’, ‘canada’,
‘germany’, ‘havana’, ‘algeria’]
Blue locations (mentioned in the most positive context, but not very often):
[‘azerbaijan’, ‘japan’, ‘chechnya’, ‘norway’, ‘australia’, ‘ankara’, ‘baghdad’,
‘poland’, ‘haiti’, ‘kazakhstan’, ‘honduras’, ‘belgrade’, ‘copenhagen’, ‘kuwait’,
‘karzai’, ‘amazon’, ‘burma’, ‘tunisia’, ‘west bank’, ‘doha’, ‘west’, ‘new york’,
‘nigeria’, ‘serbia’, ‘darfur’, ‘chile’, ‘morocco’, ‘vatican’, ‘uae’, ‘new delhi’,
‘middle east’, ‘brussels’]
Here’s what the authors say about the seeming outliers in the blue group:
The blue cluster has the highest sentiment score, which means that US is relatively happy with this group. As one may notice, there are a few notable anomalies such as ‘burma’ and ‘sudan’. In the case of ‘burma’, the positive sentiment is mainly caused by Aung San Suu Kyi’s release from house arrest from multiple cables. In the case of ‘sudan’, it’s also a special case because the darfur cables discuss mostly the international help darfur received, instead of its dire situation.
And here are the findings the authors found most interesting:
Given our model, we made a few interesting discoveries:
1. In general, the US diplomats are critical of other countries, as we observe the majority of the data points is in the negative
2. Surprisingly, US’s most important ally is spain (seen lower right quadrant)
3. US is most friendly with Norway (right-most point), although it’s relatively unimportant
4. Iran appeared most frequently, with a small negative sentiment (which means the attitude is not always hostile)
5. US is least happy with Zimbabwe and Paraguay, although it doesn’t care too much about them either
6. US doesn’t actually have good relations with its traditional allies such as France, UK and Germany. Canada, Italy and Germany even scored lower than China.
Number six is a zinger. It’s a stretch to say that cables that talk about our traditional allies in a negative light indicate that we have poor relations with them – maybe we have good relations, and that means we’re more willing to be critical, the way siblings are wont to fight. It’s also important to note that these results, which aren’t peer reviewed, are just a first approximation of what a full-fledged Natural Language Processing analysis of these cables would look like.
As much as this study says something about the nature of diplomacy, it’s possible it says something more about the nature of gossip: good news is never as important as news of what’s going wrong.