Parlez-vous artificial intelligence? Two new research papers detail unsupervised machine-learning methods that can translate between languages without relying on existing dictionaries, as reported in Science. The methods also work without parallel text, meaning identical text that already exists in both languages.
The papers, completed independently of one another, use similar methods. Both projects start by building bilingual dictionaries without a human to check whether the entries are right. Each takes advantage of the fact that relationships between certain words, like tree and leaves or shoes and socks, are similar across languages. This lets the AI look at clusters and connections in one language and use them to learn how another language works.
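To make the dictionary-building idea concrete, here is a minimal toy sketch in Python. It is not the method from either paper. It assumes each word is a made-up 2D vector and that the second language's word space is just a rotated copy of the first, which is far cleaner than real data. The function names and the vectors are hypothetical. Each word is described only by how similar it is to the other words in its own language, and words with matching descriptions are paired up.

```python
import math


def rotate(vec, degrees):
    """Rotate a 2D vector; stands in for 'the same structure, different space'."""
    t = math.radians(degrees)
    x, y = vec
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))


def cosine(a, b):
    """Cosine similarity between two 2D vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    return dot / (math.hypot(*a) * math.hypot(*b))


def similarity_profile(word, embeddings):
    """Sorted similarities from `word` to every other word in its own language."""
    vec = embeddings[word]
    return sorted(
        (cosine(vec, other) for w, other in embeddings.items() if w != word),
        reverse=True,
    )


def induce_dictionary(src, tgt):
    """Pair each source word with the target word whose profile is closest."""
    tgt_profiles = {w: similarity_profile(w, tgt) for w in tgt}
    dictionary = {}
    for word in src:
        profile = similarity_profile(word, src)
        dictionary[word] = min(
            tgt_profiles, key=lambda t: math.dist(profile, tgt_profiles[t])
        )
    return dictionary


# Made-up English vectors: tree/leaves sit close together, as do shoes/socks.
english = {
    "tree": (1.0, 0.2),
    "leaves": (0.9, 0.5),
    "shoes": (-0.3, 1.0),
    "socks": (-0.6, 0.8),
}

# Assumption: the French space is the English one rotated by 40 degrees.
# The matching code never sees which French word came from which English word.
french = {
    "chaussettes": rotate(english["socks"], 40),
    "arbre": rotate(english["tree"], 40),
    "chaussures": rotate(english["shoes"], 40),
    "feuilles": rotate(english["leaves"], 40),
}

print(induce_dictionary(english, french))
```

In this toy setup the pairing comes out right because rotation leaves every within-language similarity unchanged. Real word spaces only roughly line up, so an actual system would need a far more robust alignment step than this.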
When it comes to translating sentences, the new dictionaries are put to the test with some additional help from two methods called back translation and denoising. Back translation converts a sentence into the new language and then translates it back. If the result doesn’t match the original sentence, the AI tweaks its next attempt and tries to get closer. Denoising works similarly, but first moves or takes out a word here or there and asks the AI to recover the original sentence, which keeps it learning useful structure instead of just copying sentences.
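The two training signals described above can be sketched in a few lines of Python. This is an illustration only, under loud assumptions: the "translator" is a hypothetical word-by-word dictionary lookup rather than a neural network, the noise settings (`drop_prob`, `max_shift`) are made-up defaults, and nothing here is taken from the papers' code. The sketch shows how a sentence gets corrupted for denoising and how a round trip through the other language can be scored.

```python
import random


def add_noise(tokens, drop_prob=0.1, max_shift=3, rng=None):
    """Corrupt a sentence: randomly drop words, then shuffle the rest locally."""
    rng = rng or random.Random()
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:
        kept = tokens[:1]  # keep at least one word when the input is non-empty
    # Each word gets a jittered position; sorting by it moves words only a
    # short distance, with the reach controlled by max_shift.
    keys = [i + rng.uniform(0, max_shift) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]


def translate(tokens, dictionary):
    """Toy word-by-word translation; unknown words pass through unchanged."""
    return [dictionary.get(t, t) for t in tokens]


def round_trip_error(tokens, forward, backward):
    """Fraction of words that differ after translating there and back."""
    restored = translate(translate(tokens, forward), backward)
    mismatches = sum(a != b for a, b in zip(tokens, restored))
    return mismatches / max(len(tokens), 1)


sentence = ["the", "cat", "sleeps"]
en_fr = {"the": "le", "cat": "chat", "sleeps": "dort"}
fr_en_good = {"le": "the", "chat": "cat", "dort": "sleeps"}
fr_en_bad = {"le": "the", "chat": "dog", "dort": "sleeps"}  # one wrong entry

print(round_trip_error(sentence, en_fr, fr_en_good))  # 0.0
print(round_trip_error(sentence, en_fr, fr_en_bad))   # one word in three is wrong
print(add_noise(sentence, rng=random.Random(0)))      # a corrupted copy of the sentence
```

In a real system the round-trip mismatch would be turned into a training loss that nudges the model's parameters. Here it is only measured, to show where the learning signal comes from when no human-made translations are available.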
Improving language translation has been a goal for companies like Google and Facebook, which have had some recent successes. Other attempts, like Google’s recent Pixel Buds earbuds that are meant to translate on the fly, are still a work in progress.