Recently, Collective Intellect has been working with more private data, for example by processing transcripts of calls made to customer service lines or of doctor-patient consultations. “In that case the drug company wants to know what the themes of discussion are when a drug gets prescribed,” says Greenstreet. He predicts that such tools will start being used on internal data much more frequently. “There’s a lot of data in the enterprise that they don’t have a handle on,” he says.
David Steier, a director of information management with Deloitte’s consulting arm in Palo Alto, California, agrees that many businesses are unknowingly sitting on top of valuable data. His team makes it possible to automatically extract insights from collections of documents that would otherwise have to be read individually.
“Auto insurance claims, for example, have text descriptions of the accident that would usually be read and interpreted by a person processing the claim,” he says. “We can have software read that text, and other information on the form, to create a risk score for the claim automatically.” Deloitte’s system was trained on a batch of manually processed claims and learned which words in the written description signaled high-cost accidents. “Motorcycle” and “flighted” (as in flown to a hospital) proved to be particularly strong indications that a claim would be expensive.
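The article does not say how Deloitte's system works internally, so the following is only a minimal sketch of the general idea, assuming a simple smoothed log-odds weighting of words. The training claims, the function names, and the scoring rule are all hypothetical and chosen for illustration. Each word gets a positive weight if it appears more often in claims labeled high-cost and a negative weight if it appears more often in low-cost ones; a new claim's risk score is the sum of the weights of its words.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-written training data: (description, was_high_cost).
# A real system would learn from many manually processed claims.
TRAINING_CLAIMS = [
    ("Motorcycle struck at intersection, rider flighted to hospital", True),
    ("Driver flighted after highway rollover", True),
    ("Motorcycle slid on wet road, rider injured", True),
    ("Minor scratch on bumper in parking lot", False),
    ("Rear bumper dented at low speed in parking lot", False),
    ("Windshield chipped by gravel on highway", False),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train_word_weights(claims):
    """Return a dict mapping each word to a smoothed log-odds weight.

    Positive weights mark words seen more often in high-cost claims.
    """
    high, low = Counter(), Counter()
    for text, is_high_cost in claims:
        # Count each word at most once per claim.
        (high if is_high_cost else low).update(set(tokenize(text)))
    n_high = sum(1 for _, is_high_cost in claims if is_high_cost)
    n_low = len(claims) - n_high
    weights = {}
    for word in set(high) | set(low):
        # Add-one smoothing keeps the ratio finite for one-sided words.
        p_high = (high[word] + 1) / (n_high + 2)
        p_low = (low[word] + 1) / (n_low + 2)
        weights[word] = math.log(p_high / p_low)
    return weights


def risk_score(text, weights):
    """Sum the weights of the distinct known words in a description."""
    return sum(weights.get(word, 0.0) for word in set(tokenize(text)))


if __name__ == "__main__":
    weights = train_word_weights(TRAINING_CLAIMS)
    for claim in ("Motorcycle rider flighted from scene",
                  "Small scratch on bumper"):
        print(f"{risk_score(claim, weights):+.2f}  {claim}")
```

On this toy data, words such as "motorcycle" and "flighted" end up with positive weights, so a description containing them scores higher than one about a scratched bumper. A production system would presumably use far more data, the other fields on the form, and a properly validated model.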
“You can use the score we generate to automatically route complex claims to the most experienced adjustors,” says Steier. A similar approach can use the text of support requests and call transcripts to make predictions about which customers are likely to switch to a competing product, giving companies a chance to develop innovative strategies for targeting people before they defect. “Companies need to pay attention to the fact that there is a lot of high value in this unstructured data that it is easy to overlook,” says Steier.