MIT Technology Review

Machines can spot mental health issues—if you hand over your personal data

Digital diagnosis could transform psychiatry by mining your most intimate data for clues. But is the privacy cost worth it?

August 13, 2020
Neguine Rezaii. Photo: Jake Belcher

When Neguine Rezaii first moved to the United States a decade ago, she hesitated to tell people she was Iranian. Instead, she would use Persian. “I figured that people probably wouldn’t know what that was,” she says. 

The linguistic ambiguity was useful: she could conceal her embarrassment at the regime of Mahmoud Ahmadinejad while still being true to herself. “They just used to smile and go away,” she says. These days she’s happy to say Iranian again. 

We don't all choose to use language as consciously as Rezaii did, but the words we use matter. Poets, detectives, and lawyers have long sifted through people's language for clues to their motives and inner truths. Psychiatrists, too: perhaps psychiatrists especially. After all, while medicine now has a battery of tests and technical tools for diagnosing physical ailments, the chief tool of psychiatry is the same one employed centuries ago: the question “So how do you feel today?” Simple to ask, maybe, but not to answer.

“In psychiatry we don’t even have a stethoscope,” says Rezaii, who is now a neuropsychiatry fellow at Massachusetts General Hospital. “It’s 45 minutes of talking with a patient and then making a diagnosis on the basis of that conversation. There are no objective measures. No numbers.” 

There’s no blood test to diagnose depression, no brain scan that can pinpoint anxiety before it happens. Suicidal thoughts cannot be diagnosed by a biopsy, and even if psychiatrists are deeply concerned that the covid-19 pandemic will have severe impacts on mental health, they have no easy way to track that. In the language of medicine, there is not a single reliable biomarker that can be used to help diagnose any psychiatric condition. The search for shortcuts to finding corruption of thought keeps coming up empty—keeping much of psychiatry in the past and blocking the road to progress. It makes diagnosis a slow, difficult, subjective process and stops researchers from understanding the true nature and causes of the spectrum of mental maladies or developing better treatments.

But what if there were other ways? What if we didn’t just listen to words but measure them? Could that help psychiatrists follow the verbal clues that could lead back to our state of mind?

“That is basically what we’re after,” Rezaii says. “Finding some behavioral features that we can assign some numbers to. To be able to track them in a reliable manner and to use them for potential detection or diagnosis of mental disorders.”

In June 2019, Rezaii published a paper about a radical new approach that did exactly that. Her research showed that the way we speak and write can reveal early indications of psychosis, and that computers can help us spot those signs with unnerving accuracy. She followed the breadcrumbs of language to see where they led. 

People who are prone to hearing voices, it turns out, tend to talk about them. They don’t mention these auditory hallucinations explicitly, but they do use associated words—“sound,” “hear,” “chant,” “loud”—more often in regular conversation. The pattern is so subtle you wouldn’t be able to spot the spikes with the naked ear. But a computer can find them. And in tests with dozens of psychiatric patients, Rezaii found that language analysis could predict which of them were likely to develop schizophrenia with more than 90% accuracy, before any typical symptoms emerged. It promised a huge leap forward.

In the past, capturing information about somebody or analyzing a person’s statements to make a diagnosis relied on the skill, experience, and opinions of individual psychiatrists. But thanks to the omnipresence of smartphones and social media, people’s language has never been so easy to record, digitize, and analyze. And a growing number of researchers are sifting through the data we produce—from our choice of language or our sleep patterns to how often we call our friends and what we write on Twitter and Facebook—to look for signs of depression, anxiety, bipolar disorder, and other syndromes. 

To Rezaii and others, the ability to collect this data and analyze it is the next great advance in psychiatry. They call it “digital phenotyping.”

Weighing your words

In 1908, the Swiss psychiatrist Eugen Bleuler announced the name for a condition that he and his peers were studying: schizophrenia. He noted how the condition’s symptoms “find their expression in language” but added, “The abnormality lies not in language itself but what it has to say.”

Bleuler was among the first to focus on what are called the “negative” symptoms of schizophrenia, the absence of something seen in healthy people. These are less noticeable than the so-called positive symptoms, which indicate the presence of something extra, such as hallucinations. One of the most common negative symptoms is alogia, or speech poverty. Patients either speak less or say less when they speak, using vague, repetitive, stereotypical phrases. The result is what psychiatrists call low semantic density.

Low semantic density is a telltale sign that a patient might be at risk of psychosis. Schizophrenia, a common form of psychosis, tends to develop in the late teens to early 20s for men and the late 20s to early 30s for women, but a preliminary stage with milder symptoms usually precedes the full-blown condition. A lot of research is carried out on people in this “prodromal” phase, and psychiatrists like Rezaii are using language and other measures of behavior to try to identify which prodromal patients go on to develop full schizophrenia and why. Building on other research projects suggesting, for example, that people at high risk of psychosis tend to use fewer possessive pronouns like “my,” “his,” or “ours,” Rezaii and her colleagues wanted to see if a computer could spot low semantic density.

The researchers used recordings of conversations made over the last decade or so with two groups of schizophrenia patients at Emory University. They broke each spoken sentence down into a series of core ideas so that a computer could measure the semantic density. The sentence “Well, I think I do have strong feelings about politics” gets a high score, thanks to the words “strong,” “politics,” and “feelings.”

But a sentence like “Now, now I know how to be cool with people because it’s like not talking is like, is like, you know how to be cool with people it’s like now I know how to do that” has a very low semantic density. 
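
To make the contrast between those two sentences concrete, here is a minimal sketch in Python. It is not the method used in the study, whose details the article does not give. It swaps in a much cruder proxy: the share of words in a sentence that are neither filler nor a repeat of an earlier word. The `FILLER_WORDS` list and the `content_word_ratio` function are hypothetical names invented for this illustration.

```python
import re

# Hypothetical, hand-picked filler list for illustration only.
# The study did not use this list; a real analysis would need far more care.
FILLER_WORDS = {
    "a", "an", "the", "i", "you", "it", "s", "is", "be", "to", "of", "and",
    "do", "have", "how", "that", "like", "now", "well", "with", "about",
    "because", "not", "know", "think",
}


def content_word_ratio(sentence: str) -> float:
    """Crude stand-in for semantic density.

    Counts words that are not filler and have not already appeared in the
    sentence, then divides by the total number of words.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    if not words:
        return 0.0
    seen = set()
    content = 0
    for word in words:
        if word not in FILLER_WORDS and word not in seen:
            content += 1
        seen.add(word)
    return content / len(words)


dense = "Well, I think I do have strong feelings about politics"
sparse = (
    "Now, now I know how to be cool with people because it's like not "
    "talking is like, is like, you know how to be cool with people it's "
    "like now I know how to do that"
)

# The first sentence should score higher than the second under this proxy.
print(round(content_word_ratio(dense), 3), round(content_word_ratio(sparse), 3))
```

Under this toy measure the first sentence scores higher because a few distinct, meaningful words carry it, while the second spends many words circling the same few ideas.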

In a second test, they got the computer to count the number of times each patient used words associated with sound—looking for the clues about voices that they might be hearing but keeping secret. In both cases, the researchers gave the computer a baseline of “normal” speech by feeding it online conversations posted by 30,000 users of Reddit.
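
The second test can be sketched in the same spirit. The snippet below is an illustrative assumption, not the researchers' code: it counts how often a speaker uses the four sound-related words quoted earlier in this article and compares that rate with the rate in a baseline text. The function names `sound_word_rate` and `excess_over_baseline` are hypothetical, and the example texts are made up.

```python
import re
from collections import Counter

# The four example words quoted in the article; the study's full list is not given here.
SOUND_WORDS = {"sound", "hear", "chant", "loud"}


def sound_word_rate(text: str) -> float:
    """Fraction of words in the text that are in SOUND_WORDS."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[word] for word in SOUND_WORDS) / len(words)


def excess_over_baseline(speaker_text: str, baseline_text: str) -> float:
    """How much more often the speaker uses sound words than the baseline does."""
    return sound_word_rate(speaker_text) - sound_word_rate(baseline_text)


# Made-up example texts, for illustration only.
speaker = "I hear a loud sound when I walk"
baseline = "I walk to the store every day"
print(excess_over_baseline(speaker, baseline))
```

A positive result means the speaker uses these words more often than the baseline text does. Exact word matching like this misses inflected forms such as “hearing,” which is one reason a real analysis would need more than a fixed word list.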

When psychiatrists meet people in the prodromal phase, they use a standard set of interviews and cognitive tests to predict which will go on to develop psychosis. They usually get it right 80% of the time. By combining the two analyses of speech patterns, Rezaii’s computer scored at least 90%.

She says there’s a long way to go before the discovery could be used in the clinic to help predict what will happen to patients. The study looked at the speech of just 40 people; the next step would be to increase the sample size. But she’s already working on software that could quickly analyze the conversations she has with patients. “So you hit the button and it gives you numbers. What is the semantic density of the speech of the patient? What were the subtle features that the patient talked about but did not necessarily express in an explicit way?” she says. “If it’s a way to get into the deeper, more subconscious layers, that would be very cool.” 

The results also have an obvious implication: If a computer can reliably detect such subtle changes, why not continuously monitor those at risk? 

More than just schizophrenia

Around one in four people across the world will suffer from a psychiatric syndrome during their lifetime. Two in four now own a smartphone. Using the gadgets to capture and analyze speech and text patterns could act as an early warning system. That would give doctors time to intervene with those at highest risk, perhaps to watch them more closely—or even to try therapies to reduce the chance of a psychotic event.

Patients could also use technology to monitor their own symptoms. Mental-health patients are often unreliable narrators when it comes to their health—unable or unwilling to identify their symptoms. Even digital monitoring of basic measurements like the number of hours of sleep somebody is getting can help, says Kit Huckvale, a postdoctoral fellow who works on digital health at the Black Dog Institute in Sydney, because it can warn patients when they might be most vulnerable to a downturn in their condition.

“Using these computers that we all carry around with us, maybe we do have access to information about changes in behavior, cognition, or experience that provide robust signals about future mental illness,” he says. “Or indeed, just the earliest stages of distress.”

And it’s not just schizophrenia that could be spotted with a machine. Probably the most advanced use of digital phenotyping is to predict the behaviors of people with bipolar disorder. By studying people’s phones, psychiatrists have been able to pick up the subtle signs that precede an episode. When a downswing in mood is coming, the GPS sensors in bipolar patients’ phones show that they tend to be less active. They answer incoming calls less, make fewer outgoing calls, and generally spend more time looking at the screen. In contrast, before a manic phase they move around more, send more text messages, and spend longer talking on the phone. 

Starting in March 2017, hundreds of patients discharged from psychiatric hospitals around Copenhagen have been loaned customized phones so doctors can remotely watch their activity and check for signs of low mood or mania. If the researchers spot unusual or worrying patterns, the patients are invited to speak with a nurse. By watching for and reacting to early warning signs in this way, the study aims to reduce the number of patients who experience a serious relapse.

Such projects seek consent from participants and promise to keep the data confidential. But as details on mental health get sucked into the world of big data, experts have raised concerns about privacy.

“The uptake of this technology is definitely outpacing legal regulation. It’s even outpacing public debate,” says Piers Gooding, who studies mental-health law and policies at the Melbourne Social Equity Institute in Australia. “There needs to be a serious public debate about the use of digital technologies in the mental-health context.”

Already, scientists have used videos posted by families to YouTube—without seeking explicit consent—to train computers to find distinctive body movements of children with autism. Others have sifted Twitter posts to help track behaviors associated with the transmission of HIV, while insurance companies in New York are officially allowed to study people’s Instagram feeds before calculating their life insurance premiums.

As technology tracks and analyzes our behaviors and lifestyles with ever more precision, sometimes with our knowledge and sometimes without, the opportunities for others to remotely monitor our mental state are growing fast.

Privacy protections

In theory, privacy laws should prevent mental-health data from being passed around. In the US, the 24-year-old HIPAA statute regulates the sharing of medical data, and Europe’s data protection act, the GDPR, should theoretically stop it too. But a 2019 report from surveillance watchdog Privacy International found that popular websites about depression in France, Germany, and the UK shared user data with advertisers, data brokers, and large tech companies, while some websites offering depression tests leaked answers and test results to third parties.

Gooding points out that for several years Canadian police would pass details on people who attempted suicide to US border officials, who would then refuse them entry. In 2017, an investigation concluded that the practice was illegal, and it was stopped. 

Few would dispute that this was an invasion of privacy. Medical information is, after all, meant to be sacrosanct. Even when diagnoses of mental illness are made, laws around the world are supposed to prevent discrimination in the workplace and elsewhere. 

But some ethicists worry that digital phenotyping blurs the lines on what could or should be classed, regulated, and protected as medical data. 

If the minutiae of our daily lives are sifted for clues to our mental health, then our “digital exhaust” (data on which words we choose, how quickly we respond to texts and calls, how often we swipe left, which posts we choose to like) could tell others at least as much about our state of mind as what’s in our confidential medical records. And it’s almost impossible to hide.

“The technology has pushed us beyond the traditional paradigms that were meant to protect certain types of information,” says Nicole Martinez-Martin, a bioethicist at Stanford. “When all data are potentially health data then there’s a lot of questions about whether that sort of health-information exceptionalism even makes sense anymore.”

Health-care information, she adds, used to be simple to classify—and therefore protect—because it was produced by health-care providers and held within health-care institutions, each of which had its own regulations to safeguard the needs and rights of its patients. Now, many ways of tracking and monitoring mental health using signals from our everyday actions are being developed by commercial firms, which don’t.

Facebook, for example, claims to use AI algorithms to find people at risk of suicide, by screening language in posts and concerned comments from friends and family. The company says it has alerted authorities to help people in at least 3,500 cases. But independent researchers complain it has not revealed how its system works or what it does with the data it gathers.

“Although suicide prevention efforts are vitally important, this is not the answer,” says Gooding. “There is zero research as to the accuracy, scale, or effectiveness of the initiative, nor information on what precisely the company does with the information following each apparent crisis. It’s basically hidden behind a curtain of trade secrecy laws.” 

The problems are not just in the private sector. Although researchers working in universities and research institutes are subject to a web of permissions to ensure consent, privacy, and ethical approval, some academic practices could actually encourage and enable the misuse of digital phenotyping, Rezaii points out.

“When I published my paper on predicting schizophrenia, the publishers wanted the code to be openly accessible, and I said fine because I was into liberal and free stuff. But then what if someone uses that to build an app and predict things on weird teenagers? That’s risky,” she says. “Journals have been advocating free publication of the algorithms. It has been downloaded 1,060 times so far. I do not know for what purpose, and that makes me uncomfortable.” 

Beyond privacy concerns, some worry that digital phenotyping is simply overhyped.

Serife Tekin, who studies the philosophy of psychiatry at the University of Texas at San Antonio, says psychiatrists have a long history of jumping on the latest technology as a way to try to make their diagnoses and treatments seem more evidence-based. From lobotomies to the colorful promise of brain scans, the field tends to move with huge surges of uncritical optimism that later proves to be unfounded, she says—and digital phenotyping could be simply the latest example. 

“Contemporary psychiatry is in crisis,” she says. “But whether the solution to the crisis in mental-health research is digital phenotyping is questionable. When we keep putting all of our eggs in one basket, that’s not really engaging with the complexity of the problem.”

Making mental health more modern?

Neguine Rezaii knows that she and others working on digital phenotyping are sometimes blinded by the bright potential of the technology. “There are things I haven’t thought about because we’re so excited about getting as much data as possible about this hidden signal in language,” she says.

But she also knows that psychiatry has relied for too long on little more than informed guesswork. “We don’t want to make some questionable inferences about what the patient might have said or meant if there is a way to objectively find out,” she says. “We want to record them, hit a button, and get some numbers. At the end of the appointment, we have the results. That’s the ideal. That’s what we’re working on.” 

To Rezaii, it’s natural that modern psychiatrists should want to use smartphones and other available technology. Discussions about ethics and privacy are important, she says, but so is an awareness that tech firms already harvest information on our behavior and use it—without our consent—for less noble purposes, such as deciding who will pay more for identical taxi rides or wait longer to be picked up. 

“We live in a digital world. Things can always be abused,” she says. “Once an algorithm is out there, then people can take it and use it on others. There’s no way to prevent that. At least in the medical world we ask for consent.”