The software giant wants to better understand how people interact on the Internet.

Marc Smith, the senior research sociologist at Microsoft Research, believes that now is a good time to practice his trade. Thanks to the Internet, there is unprecedented access to sociological data. And thanks to computers, sociologists are better able to sift through that data, find trends, and test models.

At Microsoft, Smith uses public Internet data to look at the social phenomenon of online communities, and he tries to make them better for people and better for business. He recently gave a presentation regarding his work at Microsoft’s TechFest in Redmond, WA, an annual event at which Microsoft researchers from around the world share their latest work. Technology Review caught up with Smith to ask him about the field of cybersociology.

Technology Review: What’s your background, and why does it make you a good sociologist for Microsoft?

Marc Smith: I was trained in collective action dilemma theory–the study of how we all get along. I looked at all the problems and opportunities that occur when groups of people get together to get things done. It really appealed to me because I wondered what the Internet would do to human societies. If you look out on the Net, almost all of the interesting phenomena are the result of groups of people coming together: news groups, e-mail, blogs–people coming together to answer questions and to create software and huge collections of photography. But it doesn’t always work. The question becomes, What makes some collective actions more successful than others?

TR: What sort of projects do you work on at Microsoft?

MS: It could be anything. What I’ve made it about is tool building and data analysis; the effort is largely around making better tools for seeing online communities. We’ve gathered a lot of public behavioral data from online communities, and we analyze it to find patterns that we can associate with interesting social-science phenomena and that are relevant to business.

TR: Why are online communities important to Microsoft?

MS: We care about the online communities around any number of Microsoft products, such as Vista or Xbox. These online communities are the lifeblood of software applications. It’s where people go for support and where advanced users talk about advancing these tools. In many ways, companies like Microsoft and others who have leveraged online phenomena are flying blind with their online product communities–they don’t see the patterns and structures of their communities. The questions are: Is the community healthier now than last month? Is it growing? Is the number of top contributors, or of people who ask questions, growing?

Our collection of this data is called Netscan, and it’s accessible to the public. We encourage social scientists to use this tool for social-scientific purposes. We’ve been able to take this data and create a consortium of 16 universities that use our data to do research.

TR: Give me an example of a project that uses Netscan data.

MS: We have a project called Community Buzz that uses the data to help us see the network structure of the community and the main ideas that are generated from the community. The idea is that Community Buzz takes data from Netscan and indexes it according to the behavior of people in the community, [based on] the patterns of message creation. There are roles people play in communities. One of the roles, which only a few people play but which is the most important, is called the answer person. And particularly in a technical question-and-answer environment, the answer person is carrying the community forward. They are the experts who care a lot about this product and combine this care with deep knowledge. In Community Buzz, we can segment the population into answer people, newcomers, and spammers. We identify these roles based on patterns of who talks to whom and how often, instead of looking at actual content. Then we can analyze the text and apply information-visualization tools such as tag clouds and trend lines to determine the common topics that are being discussed over time.

TR: Why do you think sociology is important for technology companies to consider?

MS: Pretty much all of the future of computing is social computing. What makes people come back to a keyboard? The answer is many things, but I’ll argue that it’s other people. [Studies have shown that] people who didn’t send or receive much e-mail did not keep using the Net as much as those who sent and received more. When someone’s message is waiting in your inbox for you to reply to it, there’s an enormous moral force that trumps advertisements for cheap airplane tickets or other impersonal messages. If you look out on the Net, it’s all about people who are brought together. Name the really interesting thing on the Net that’s not made out of people. At the moment, I think the world of the Internet is all about sociology.

TR: Companies like Yahoo, Intel, and Google are snatching up sociologists and economists in order to develop new products and optimize existing technology. You’re the only main sociology researcher at Microsoft. Are there plans to hire more?

MS: I believe so. I think one of the challenges is that many of the social scientists who’ve been gobbled up by other companies have been computer scientists who also do sociology. The challenge is the discipline of sociology: cybersociologists must do everything sociologists do, but also be computer savvy.

TR: How is the Internet changing sociology?

MS: It’s not that you collect data from the world and run it through a computer; it’s that most of the world runs through a computer. It’s a revolutionary thing. It’s a shift from an ephemeral society to an archival society. Six or seven billion humans have come and gone over the course of history, and most of them didn’t leave a trace. In the not too distant future, it’s likely that one to two billion will leave 5 to 10 terabytes, and in those bytes will be the fine-grained details of their lives: the pictures they’ve taken, the words they’ve typed, and the people they’ve been with. This brings up a whole new set of issues. What will privacy look like? How will sovereignty be asserted on this stream of data?

The role of Microsoft Research is to get to the future first, cut our fingers on the rough edges, and figure out how to sand the future down so it’s smooth and ready for the rest of us. It’s naive to think that there are only going to be positive results.
