FCC rules require TV stations to provide closed captions that convey speech, sound effects, and audience reactions such as laughter to deaf and hard of hearing viewers. YouTube isn’t subject to those rules, but thanks to Google’s machine-learning technology, it now offers similar assistance.
YouTube has used speech-to-text software to automatically caption speech in videos since 2009 (the automatic captions are used 15 million times a day). Today it rolled out algorithms that indicate applause, laughter, and music in captions. More sounds could follow, since the underlying software can also identify noises like sighs, barks, and knocks.
The company says user tests indicate that the feature significantly improves the experience of the deaf and hard of hearing (and anyone who needs to keep the volume down). “Machine learning is giving people like me that need accommodation in some situations the same independence as others,” says Liat Kaver, a product manager at YouTube who is deaf.
Indeed, YouTube’s project is one of several efforts creating new accessibility tools by building on progress in the power and practicality of machine learning. The computing industry has been driven to advance software that can interpret images, text, or sound primarily by the prospect of profits in areas such as ads, search, or cloud computing. But software with some ability to understand the world has many uses.
Last year, Facebook launched a feature that uses the company’s research on image recognition to create text descriptions of images from a person’s friends, for example.
Researchers at IBM are using language-processing software developed under the company’s Watson project to make a tool called Content Clarifier to help people with cognitive or intellectual disabilities such as autism or dementia. It can replace figures of speech such as “raining cats and dogs” with plainer terms, and trim or break up lengthy sentences with multiple clauses and indirect language.
The University of Massachusetts Medical School is helping to test how the system could help people with reading or cognitive disabilities. Will Scott, an IBM researcher who worked on the project, says the company is talking with an organization that helps autistic high schoolers transition to college life about testing the system as a way of helping people understand administrative and educational documents. “The computing power and algorithms and cloud services like Watson weren’t previously available to perform these kinds of things,” he says.
Ineke Schuurman, a researcher at the University of Leuven in Belgium, says inventing new kinds of accessibility tools is important to prevent some people from being left behind as society relies more and more on communication through computers and mobile devices.
She is one of the leaders of an EU project testing its own text simplification software for people with intellectual disabilities. The technology has been built into apps that integrate with Gmail and social networks such as Facebook. “People with intellectual disabilities, or any disability, want to do what their friends and sisters and brothers do—use smartphones, tablets, and social networking,” says Schuurman.
Austin Lubetkin, who has autism spectrum disorder, has worked with Florida nonprofit Artists with Autism to help others on the spectrum become more independent. He welcomes research like IBM’s but says it will be a challenge to ensure that such tools perform reliably. A machine-learning algorithm recommending a movie you don’t care for is one thing; an error that causes you to misunderstand a friend is another.
Still, Lubetkin, who is working at a startup while pursuing a college degree, is optimistic that machine learning will open up many new opportunities for people with disabilities in the next few years. He recently drew on image-recognition technology from startup Clarifai to prototype a navigation app that offers directions in the form of landmarks, inspired by his own struggles to interpret the text and diagram information from conventional apps while driving. “Honestly, AI can level the playing field,” says Lubetkin.