Microsoft’s solution to Zoom fatigue is to trick your brain

Will sitting in a fixed virtual setting make videoconferencing better?

Tanya Basu

July 9, 2020


There’s a certain routine to logging on to the now-ubiquitous videoconference: join a screen of Brady Bunch–like squares, ping-ponging your gaze between speakers but mostly staring self-consciously at your own face. What started as a novelty of working at home is now an exhausting ordeal that can leave us feeling mentally wiped out.

Microsoft thinks it’s got a solution. On Wednesday it launched “Together Mode” for its Teams business software, as part of a new suite of updates. The videoconferencing tool uses artificial intelligence to take a cutout of your live video image and place it into a fixed position within a setting. In a demo I participated in, it was a seat in a virtual auditorium, not unlike those found in lecture halls. The idea is that when you can see people in a fixed position, nonverbal cues like looking at or pointing at a speaker become clearer and more like what would happen in a natural in-person meeting.

Microsoft is playing catch-up in the lucrative videoconferencing arena. It has been outflanked during the pandemic-fueled remote-work boom by Zoom, which has become the envy of Silicon Valley, a cultural phenomenon, and a verb practically overnight.

Microsoft thinks its Teams platform and Together Mode can match Zoom’s reach. It’s got the benefit of a key demographic in its grip: students and educators. Microsoft said 183,000 educational institutions in 175 countries were using Teams, with about 150 million students and faculty actively using Microsoft Education products as a hub for remote learning.

But we’re increasingly fed up with video calls. Months of videoconferencing—not just for work and school, but for dating, happy hours, holiday gatherings, appointments, and chats with loved ones—have led to “Zoom fatigue.” Why is this the case? Jeremy Bailenson, a professor of communication at Stanford University who consulted on Together Mode, says that the faces presented on a typical videoconferencing grid reflect the dimensions of a person standing about two feet from you.

“Very rarely are you standing two feet away from a person and staring at them for an hour like that, unless you’re getting in conflict or about to mate,” he says. “When you have faces staring at you like this, the arousal response kicks in, that fight-or-flight mode. If you’re in fight-or-flight mode all day, it’s taxing to do these meetings.”

Microsoft hopes Together Mode will make videoconferencing feel less taxing. In Zoom’s gallery mode, for example, it can be hard to figure out who is speaking and who wants to speak. Together Mode solves this, in theory, by putting each participant in the same seat on everyone’s screen. That means if a person pipes up in the upper right-hand corner of the virtual room, everyone’s gaze moves at about the same time to focus on that person; if someone interrupts from the middle, heads and eyes shift in that direction. In internal tests, Microsoft claims, users felt less fatigued and more focused in Together Mode.

But whether Together Mode encourages quieter or often ignored members of a team to speak up remains to be seen. Despite the promise of a more democratic virtual platform, women have had a more difficult time making their voices heard on video calls. Teams, Meet, and Zoom have all incorporated a hand-raising function to help, and guides for supporting female colleagues in this setting have become increasingly common.

“I think Together Mode gives people the tools to do better, but it doesn’t guarantee people will be better,” says Jaron Lanier, a research scientist at Microsoft who’s considered a world expert in mixed reality.

Another nagging issue is, well, your face. In a demo, I found myself trying to adjust my seating so that I would not appear too big or too small, something Lanier claims is helpful in democratizing participation. But it meant I often lost track of the discussion as I self-consciously corrected my posture. And perhaps self-consciousness is the root of the problem to begin with.

“The ideal technology would let you disappear so you would stop being aware of yourself,” says Amber Davisson, an associate professor of communication at Keene State College. Davisson, who researches the intersection of intimacy, communication, and technology, says the way videoconferencing tries to emulate meetings contrasts with how humans normally interact.

“When I’m sitting in my class and teaching, I’m not looking at myself,” she says. “[Videoconferencing] is anxiety-causing and we can never relax; we’re way too aware of ourselves. The best technology would eliminate your face so you can look at everyone and they can look at you, and you don’t see yourself.”

One thing Davisson says is working in Together Mode’s favor is the non-personal background and predictable seating layout. Personalized Zoom backgrounds can be fun, but Davisson says having an agreed-upon, bland background like the ones offered by Together Mode eliminates the brain’s confusion of personal and professional.

That concept—of delineating work and home as the two spaces coalesce—is ultimately the challenge of videoconferencing during a pandemic. We need a safe place to fall apart, “and we used to think that was home,” she says. “But we do work and school in our homes now. Our only private space has been invaded, and it’s a lot for our brains to compute.”