FCC rules require TV stations to provide closed captions that convey speech, sound effects, and audience reactions such as laughter to deaf and hard of hearing viewers. YouTube isn’t subject to those rules, but thanks to Google’s machine-learning technology, it now offers similar assistance.
YouTube has used speech-to-text software to automatically caption speech in videos since 2009, and those automatic captions are used 15 million times a day. Today it rolled out algorithms that indicate applause, laughter, and music in captions. More sounds could follow, since the underlying software can also identify noises like sighs, barks, and knocks.
The company says user tests indicate that the feature significantly improves the experience of the deaf and hard of hearing (and anyone who needs to keep the volume down). “Machine learning is giving people like me that need accommodation in some situations the same independence as others,” says Liat Kaver, a product manager at YouTube who is deaf.
Indeed, YouTube’s project is one of several that are creating new accessibility tools by building on progress in the power and practicality of machine learning. The computing industry has been driven to advance software that can interpret images, text, or sound primarily by the prospect of profits in areas such as ads, search, or cloud computing. But software with some ability to understand the world has many uses.
Last year, for example, Facebook launched a feature that uses the company’s research on image recognition to create text descriptions of images posted by a person’s friends.
Researchers at IBM are using language-processing software developed under the company’s Watson project to make a tool called Content Clarifier to help people with cognitive or intellectual disabilities such as autism or dementia. It can replace figures of speech such as “raining cats and dogs” with plainer terms, and trim or break up lengthy sentences with multiple clauses and indirect language.
The University of Massachusetts Medical School is helping to test how the system could help people with reading or cognitive disabilities. Will Scott, an IBM researcher who worked on the project, says the company is talking with an organization that helps autistic high schoolers transition to college life about testing the system as a way of helping people understand administrative and educational documents. “The computing power and algorithms and cloud services like Watson weren’t previously available to perform these kinds of things,” he says.
Ineke Schuurman, a researcher at the University of Leuven in Belgium, says inventing new kinds of accessibility tools is important to prevent some people from being left behind as society relies more and more on communication through computers and mobile devices.
She is one of the leaders of an EU project testing its own text simplification software for people with intellectual disabilities. The technology has been built into apps that integrate with Gmail and social networks such as Facebook. “People with intellectual disabilities, or any disability, want to do what their friends and sisters and brothers do—use smartphones, tablets, and social networking,” says Schuurman.
Austin Lubetkin, who has autism spectrum disorder, has worked with Florida nonprofit Artists with Autism to help others on the spectrum become more independent. He welcomes research like IBM’s but says it will be a challenge to ensure that such tools perform reliably. A machine-learning algorithm recommending a movie you don’t care for is one thing; an error that causes you to misunderstand a friend is another.
Still, Lubetkin, who is working at a startup while pursuing a college degree, is optimistic that machine learning will open up many new opportunities for people with disabilities in the next few years. He recently drew on image-recognition technology from startup Clarifai to prototype a navigation app that offers directions in the form of landmarks, inspired by his own struggles to interpret the text and diagram information from conventional apps while driving. “Honestly, AI can level the playing field,” says Lubetkin.