Baidu has developed a new technique for smoother real-time machine translation.
The challenge: Simultaneous translation has been a tough nut to crack because word order differs across languages. Consider this sentence in English, then Chinese: “The US president meets with the British prime minister.” “美国总统与英国首相会晤.” In English, the verb “meets” comes early, right after the subject; its Chinese counterpart, “会晤” (huiwu), comes at the very end. Because of this, commercial “real-time” translation systems wait for the speaker to finish a sentence before translating it into the target language. The result is a clunky user experience with awkward delays.
How it works: Baidu’s new approach shortens the delay by starting to translate before the sentence is finished. If the system were translating the sentence above from Chinese to English, for example, it would predict the English word “meets” after hearing only the first part of the Chinese sentence, based on the likelihood that the US president would meet with someone. The idea was inspired by the anticipation technique that human interpreters commonly use to keep up with a speaker.
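To make the timing concrete, here is a minimal sketch of a fixed-lag schedule, in which the translator stays a set number of words behind the speaker and writes one target word for each new source word it hears. This is an illustration under stated assumptions, not Baidu’s implementation: the function name `wait_k_schedule`, the parameter `k`, and the READ/WRITE action labels are hypothetical, and the sketch covers only when words are read and written, not how the model predicts them.

```python
from typing import List


def wait_k_schedule(source_len: int, target_len: int, k: int) -> List[str]:
    """Return the READ/WRITE order for a translator that lags k words behind.

    Hypothetical helper for illustration only: READ means "listen to one more
    source word", WRITE means "emit one target word".
    """
    actions: List[str] = []
    read = 0      # source words heard so far
    written = 0   # target words emitted so far
    while written < target_len:
        # Stay k source words ahead of the output until the source runs out.
        if read < min(source_len, written + k):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions


if __name__ == "__main__":
    # A 4-word source and 4-word target with a 2-word lag.
    print(" ".join(wait_k_schedule(source_len=4, target_len=4, k=2)))
```

With a lag of two words, the first target word is written after only two source words have been heard, so the model has to anticipate anything, such as a sentence-final verb, that has not been spoken yet. A larger `k` gives the model more context at the cost of a longer delay.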
Work in progress: Like any predictive machine-learning model, the system works with reasonable accuracy only when it has been trained on copious amounts of language data with similar sentence structures. As a result, there are still significant limitations to deploying it at scale. In experiments on Chinese-to-English translation with a five-word delay, Baidu researchers found that the system doesn’t perform as well as full-sentence translators. But the researchers are excited about the new path forward. “Simultaneous translating for human interpreters is extremely challenging and burdensome,” says Liang Huang, the principal scientist at Baidu Research, “so we’re hopeful machines can step in and really make this service more accessible for professionals and consumers.”