MIT Technology Review

Software capable of determining the positive or negative sentiment of sentences written by humans has been unleashed on 3,891 U.S. diplomatic cables released by WikiLeaks, and the results are a systematic, if preliminary, analysis of which countries are our besties and which are in the doghouse.

The analysis was part of a class project (pdf) by a pair of computer science undergraduates at Stanford, Xuwen Cao and Beyang Li. By looking at how often a country was mentioned, as well as whether it was cast in a positive or negative light, Cao and Li identified four clusters to which countries could belong. Three of them cover countries that aren’t mentioned very often: countries we don’t like (red), countries we sort-of don’t like (teal), and countries spoken of positively (blue).

Since these cables were supposed to be classified, we can assume they are candid. No country was mentioned frequently in an especially negative or especially positive light; the fourth cluster (green) holds countries that were groused about fairly frequently.
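For readers who want to see the shape of such an analysis, here is a minimal sketch of grouping places by mention count and average sentiment. It is not Cao and Li's code: the article does not say which clustering method they used, so plain k-means stands in as an assumption, and the sample scores and the names `place_features` and `kmeans` are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (place, sentence_sentiment) pairs. These values are invented
# for illustration and are not taken from the cables or the class project.
mentions = [
    ("iran", -0.2), ("iran", -0.1), ("iran", -0.3), ("iran", 0.0),
    ("norway", 0.6),
    ("zimbabwe", -0.8),
]

def place_features(mentions):
    """Return {place: (mention_count, mean_sentiment)}."""
    scores = defaultdict(list)
    for place, sentiment in mentions:
        scores[place].append(sentiment)
    return {place: (len(s), mean(s)) for place, s in scores.items()}

def kmeans(points, centers, rounds=10):
    """Plain k-means (an assumed stand-in for the authors' method)."""
    groups = [[] for _ in centers]
    for _ in range(rounds):
        # Assign every point to its nearest center (squared distance).
        groups = [[] for _ in centers]
        for pt in points:
            dists = [(pt[0] - c[0]) ** 2 + (pt[1] - c[1]) ** 2 for c in centers]
            groups[dists.index(min(dists))].append(pt)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            (mean(x for x, _ in g), mean(y for _, y in g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups

features = place_features(mentions)
centers, groups = kmeans(list(features.values()), centers=[(1, 0.0), (4, 0.0)])
```

With real data the two axes would probably need rescaling first, since raw mention counts are far larger than sentiment scores and would otherwise dominate the distance.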

Here’s a further breakdown of what each cluster represents:

Green locations (cast in a somewhat negative light, and frequently):

[‘london’, ‘paris’, ‘cuba’, ‘africa’, ‘brasilia’, ‘cairo’, ‘eu’, ‘brazil’,
‘afghanistan’, ‘egypt’, ‘europe’, ‘iran’, ‘china’, ‘iraq’, ‘libya’, ‘syria’,
‘pakistan’, ‘washington’, ‘turkey’, ‘israel’, ‘moscow’, ‘spain’, ‘uk’, ‘russia’,
‘madrid’, ‘india’, ‘tripoli’, ‘kabul’, ‘iceland’, ‘france’]

Red locations (countries talked about infrequently, and in the most negative context):

[‘djibouti’, ‘taiwan’, ‘tajikistan’, ‘islam’, ‘mumbai’, ‘zimbabwe’, ‘dubai’, ‘goa’,
‘tibet’, ‘armenia’, ‘yar’, ‘ecuador’, ‘benghazi’, ‘algiers’, ‘yemen’, ‘paraguay’,
‘caracas’, ‘south africa’, ‘ouagadougou’, ‘xxxxxxxxxxxx’, ‘guinea’]

(It’s worth noting that because the software scores the sentiment of sentences, a place like Taiwan could pick up a negative score from the situation it is mentioned in, and not from sentiment about the country itself – e.g. the cross-strait tensions with mainland China.)
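A toy example shows how that can happen. The sketch below assumes a simple word-list scorer, which is not necessarily what the students used; the word lists and the function name `sentence_sentiment` are hypothetical. Every place named in a sentence inherits the whole sentence's score, whether or not the sentiment is aimed at it.

```python
# Hypothetical word lists, for illustration only.
NEGATIVE = {"tensions", "crisis", "threat"}
POSITIVE = {"progress", "cooperation", "welcome"}
PLACES = {"taiwan", "china"}

def sentence_sentiment(sentence):
    """Score a sentence and credit that score to every place it mentions."""
    words = [w.strip(".,;:()").lower() for w in sentence.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return {w: score for w in words if w in PLACES}

result = sentence_sentiment(
    "Cross-strait tensions between Taiwan and China remain a threat."
)
print(result)
```

Both 'taiwan' and 'china' come out with the same negative score here, even though the sentence is about the tension between them and not about either country's standing with the U.S.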

Teal locations (mentioned in a somewhat negative context, but relatively infrequently):

[‘kosovo’, ‘north korea’, ‘damascus’, ‘argentina’, ‘latin america’, ‘netherlands’,
‘uruzgan’, ‘switzerland’, ‘reykjavik’, ‘lebanon’, ‘qatar’, ‘sudan’, ‘somalia’,
‘venezuela’, ‘guantanamo’, ‘colombia’, ‘sao paulo’, ‘saudi arabia’, ‘america’,
‘peru’, ‘gaza’, ‘bolivia’, ‘ukraine’, ‘geneva’, ‘jordan’, ‘tehran’, ‘georgia’,
‘sweden’, ‘portugal’, ‘mexico’, ‘lula’, ‘kenya’, ‘italy’, ‘ethiopia’, ‘canada’,
‘germany’, ‘havana’, ‘algeria’]

Blue locations (mentioned in the most positive context, but not very often):

[‘azerbaijan’, ‘japan’, ‘chechnya’, ‘norway’, ‘australia’, ‘ankara’, ‘baghdad’,
‘poland’, ‘haiti’, ‘kazakhstan’, ‘honduras’, ‘belgrade’, ‘copenhagen’, ‘kuwait’,
‘karzai’, ‘amazon’, ‘burma’, ‘tunisia’, ‘west bank’, ‘doha’, ‘west’, ‘new york’,
‘nigeria’, ‘serbia’, ‘darfur’, ‘chile’, ‘morocco’, ‘vatican’, ‘uae’, ‘new delhi’,
‘middle east’, ‘brussels’]

Here’s what the authors say about the seeming outliers in the blue group:

The blue cluster has the highest sentiment score, which means that US is relatively happy with this group. As one may notice, there are a few notable anomalies such as ‘burma’ and ‘sudan’. In the case of ‘burma’, the positive sentiment is mainly caused by Aung San Suu Kyi’s release from house arrest from multiple cables. In the case of ‘sudan’, it’s also a special case because the darfur cables discuss mostly the international help darfur received, instead of its dire situation.

And here are the findings the authors found most interesting:

Given our model, we made a few interesting discoveries:
1. In general, the US diplomats are critical of other countries, as we observe the majority of the data points is in the negative
2. Surprisingly, US’s most important ally is spain (seen lower right quadrant)
3. US is most friendly with Norway (right-most point), although it’s relatively unimportant
4. Iran appeared most frequently, with a small negative sentiment (which means the attitude is not always hostile)
5. US is least happy with Zimbabwe and Paraguay, although it doesn’t care too much about them either
6. US doesn’t actually have good relations with its traditional allies such as France, UK and Germany. Canada, Italy and Germany even scored lower than China.

Number six is a zinger. It’s a stretch to say that cables that talk about our traditional allies in a negative light indicate that we have poor relations with them – maybe we have good relations, and that means we’re more willing to be critical, the way siblings are wont to fight. It’s also important to note that these results, which aren’t peer reviewed, are just a first approximation of what a full-fledged Natural Language Processing analysis of these cables would look like.

As much as this study says something about the nature of diplomacy, it’s possible it says something more about the nature of gossip: good news is never as important as news of what’s going wrong.

Follow Mims on Twitter or contact him via email.


Tagged: Computing, WikiLeaks, natural language processing, Stanford
