Marc Smith, the senior research sociologist at Microsoft Research, believes that now is a good time to practice his trade. Thanks to the Internet, there is unprecedented access to sociological data. And thanks to computers, sociologists are better able to sift through that data, find trends, and test models. At Microsoft, Smith uses public Internet data to look at the social phenomenon of online communities, and he tries to make them better for people and better for business. He recently gave a presentation regarding his work at Microsoft’s TechFest in
Marc Smith: I was trained in collective action dilemma theory, the study of how we all get along. I looked at all the problems and opportunities that occur when groups of people get together to get things done. It really appealed to me because I wondered what the Internet would do to human societies. If you look out on the Net, almost all of the interesting phenomena are the result of groups of people coming together: news groups, e-mail, blogs. People come together to answer questions and to create software and huge collections of photography. But it doesn’t always work. The question becomes, What makes some collective actions more successful than others?
TR: What sort of projects do you work on at Microsoft?
MS: It could be anything. What I’ve made it about is tool building and data analysis; the effort is largely around making better tools for seeing online communities. We’ve gathered a lot of public behavioral data from online communities, and we analyze it to find patterns that we can associate with interesting social-science phenomena and that are relevant to business.
TR: Why are online communities important to Microsoft?
MS: We care about the online communities around any number of Microsoft products, such as
Our collection of this data is called Netscan, and it’s accessible to the public. We encourage social scientists to use this tool for social-scientific purposes. We’ve been able to take this data and create a consortium of 16 universities that use our data to do research.
TR: Give me an example of a project that uses Netscan data.
MS: We have a project called Community Buzz that uses the data to help us see the network structure of the community and main ideas that are generated from the community. The idea is that Community Buzz takes data from Netscan and indexes it according to the behavior of people in the community, [based on] the patterns of message creation. There are roles people play in communities. One of the roles, which only a few people do, but it’s the most important, is called the answer person. And particularly in a technical question-and-answer environment, the answer person is carrying the community forward. They are the experts who care a lot about this product and combine this care with deep knowledge. In Community Buzz, we can segment the population into answer people, newcomers, and spammers. We identify these roles based on patterns of who talks to whom and how often, instead of looking at actual content. Then we can analyze the text and apply information-visualization tools such as tag clouds and trend lines to determine the common topics that are being discussed over time.
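To make the idea of identifying roles from "who talks to whom and how often" concrete, here is a minimal Python sketch. It is not Netscan or Community Buzz code: the function name `classify_roles`, the input shapes, the thresholds, and the rules themselves are all assumptions chosen for illustration. It looks only at reply patterns, never at message content, and it leaves out newcomers, since spotting them would need first-seen dates that this toy input does not carry.

```python
from collections import defaultdict

def classify_roles(replies, threads_started, min_targets=3, spam_min_threads=3):
    """Assign an illustrative role to each author from reply structure alone.

    replies: list of (replier, original_poster) pairs.
    threads_started: dict mapping author -> number of threads they started.
    The thresholds are hypothetical, not values used by any real system.
    """
    sent = defaultdict(int)        # replies written by each author
    received = defaultdict(int)    # replies received by each author
    targets = defaultdict(set)     # distinct people each author replied to
    for replier, poster in replies:
        sent[replier] += 1
        received[poster] += 1
        targets[replier].add(poster)

    authors = set(sent) | set(received) | set(threads_started)
    roles = {}
    for author in authors:
        started = threads_started.get(author, 0)
        # Assumed rule: replies to many different people, starts few threads.
        if len(targets[author]) >= min_targets and sent[author] > started:
            roles[author] = "answer person"
        # Assumed rule: posts repeatedly but never joins any conversation.
        elif started >= spam_min_threads and sent[author] == 0 and received[author] == 0:
            roles[author] = "possible spammer"
        else:
            roles[author] = "other"
    return roles

# Toy data: ann answers three different people; "promo" posts and is ignored.
replies = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
threads_started = {"bob": 1, "cat": 1, "dan": 1, "promo": 5}
roles = classify_roles(replies, threads_started)
```

With this toy data, `roles["ann"]` is `"answer person"` and `roles["promo"]` is `"possible spammer"`. A real system would presumably use richer network measures and longer time windows than these two hand-picked rules.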