For example, a bank or credit-card company wanting to use machine learning to build systems that make decisions based on historical transactions is unlikely to have the specialized staff and necessary infrastructure for what is a computationally intensive approach. “This API could be a way to get a capability cheaply that would cost a huge amount through a traditional route.”
Google’s new service may also be more palatable to businesses wary of handing over their data to cloud providers, says Confino. “The data can be completely obfuscated, and you can still use this service. Google doesn’t have to know if those numbers you are sending it are stock prices or housing prices.”
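As a rough illustration of the kind of obfuscation Confino describes, the sketch below strips the meaning from a small table before it would be uploaded. It is an assumption about how a customer might prepare data, not a description of anything Google provides: the `obfuscate` function, the shuffled column order, the min-max rescaling, and the `class_N` label codes are all hypothetical choices made for this example.

```python
import random

def obfuscate(rows, labels, seed=0):
    """Hide what numeric columns and labels mean while keeping their structure.

    Hypothetical helper for illustration only; not part of any Google API.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])

    # Shuffle column order so a column's position gives no hint of its meaning.
    order = list(range(n_cols))
    rng.shuffle(order)

    # Rescale every column to the range 0..1 so units (dollars, square feet) vanish.
    lows = [min(r[c] for r in rows) for c in range(n_cols)]
    highs = [max(r[c] for r in rows) for c in range(n_cols)]

    def scale(value, col):
        span = highs[col] - lows[col]
        return 0.0 if span == 0 else (value - lows[col]) / span

    scaled = [[scale(r[c], c) for c in order] for r in rows]

    # Replace each label with an opaque code; the mapping stays with the customer.
    codes = {}
    for label in labels:
        if label not in codes:
            codes[label] = f"class_{len(codes)}"
    coded = [codes[label] for label in labels]

    return scaled, coded, codes


rows = [[100.0, 3], [200.0, 5], [150.0, 4]]
labels = ["buy", "sell", "buy"]
scaled, coded, codes = obfuscate(rows, labels)
```

Only `scaled` and `coded` would leave the building; `codes` is kept locally to translate predictions back. Rescaling of this kind hides units and names but still preserves the relative ordering of values, so it is a modest disguise and not encryption.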
Google does, however, get some information that it can use to improve its machine-learning algorithms. “We don’t look at users’ data, but we do see the same metrics on prediction quality that they do, to help us improve the service,” says Green. The engineers running Prediction API will know if a particular algorithm is rarely used, or if a new one needs to be added to the mix to better process certain types of data.
Prediction API has the potential to be a leveler between established companies and smaller startups, says Pete Warden, an ex-Apple engineer now working on his own startup, OpenHeatMap.com. “That’s been a competitive advantage for large companies like Amazon, whose product recommendation is built on machine learning,” he explains. “Now you still have to have a decent set of training data, but you don’t have to have the same level of expertise.”
Warden has yet to gain access to Prediction API, but has plans to use it to improve a service he built that shows where people using a particular word or phrase on Twitter are located. “It would be really interesting to also see where they are saying positive and negative things on a subject,” says Warden. Prediction API could be trained to distinguish between positive and negative tweets to do that, he says.
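To make the idea of training on positive and negative tweets concrete, here is a minimal local stand-in. Prediction API does not disclose which algorithms it uses, so this sketch substitutes a simple naive Bayes word-count classifier; the `train` and `predict` functions and the four sample messages are invented for illustration and are not Warden's data or Google's interface.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label from (label, text) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for label, text in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the prior: how common this label was in training.
        score = math.log(label_counts[label] / total)
        # Add-one smoothing keeps unseen words from zeroing out a label.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training messages, standing in for hand-labeled tweets.
examples = [
    ("positive", "love this great song"),
    ("positive", "great show tonight"),
    ("negative", "hate this awful traffic"),
    ("negative", "awful service today"),
]
model = train(examples)
```

With a model like this, each geolocated tweet could be tagged positive or negative before being plotted. A real deployment would need far more training data than four messages, which is the point Warden makes about still needing a decent training set.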
Chris Bates, a data scientist with online music service Grooveshark and a member of the preview program, agrees that Google’s black box will enable wider use of machine learning, but he contends that the service needs to mature. “Today it is good at predicting which language text is in and also sentiment analysis, for example to pick out positive and negative reviews,” he says.
Ultimately, though, the inability to inspect the inner workings of the algorithms and fine-tune them for a specific use may limit what the service can do. “It’s good for cases that are not mission-critical, where you can afford a few false positives,” Bates says. For example, a spam filter that occasionally lets through a junk message could still be usable, but a credit-card company might be less able to accept any errors.
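The trade-off Bates describes can be put in numbers. The sketch below, which assumes a set of hand-checked predictions, measures how often a classifier raises a false alarm and how often it misses a real case, then compares both rates with whatever an application can tolerate. The function names, the thresholds, and the six sample messages are hypothetical and chosen only to show the arithmetic.

```python
def error_rates(actual, predicted, positive="spam"):
    """Return (false-positive rate, false-negative rate) for one class."""
    pairs = list(zip(actual, predicted))
    # False positive: flagged as positive when it was not.
    fp = sum(1 for a, p in pairs if p == positive and a != positive)
    # False negative: a real positive that slipped through.
    fn = sum(1 for a, p in pairs if p != positive and a == positive)
    negatives = sum(1 for a in actual if a != positive)
    positives = len(actual) - negatives
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

def acceptable(actual, predicted, max_fpr, max_fnr, positive="spam"):
    """True if both error rates are within the application's tolerance."""
    fpr, fnr = error_rates(actual, predicted, positive)
    return fpr <= max_fpr and fnr <= max_fnr


# Made-up evaluation set: two junk messages and four legitimate ones.
actual    = ["spam", "spam", "ham", "ham", "ham", "ham"]
predicted = ["spam", "ham",  "ham", "ham", "spam", "ham"]

fpr, fnr = error_rates(actual, predicted)
```

Here one of four legitimate messages is wrongly flagged and one of two junk messages gets through. A casual spam filter might accept limits such as `max_fpr=0.3, max_fnr=0.5`, while an application that can accept no errors at all would set both limits to zero and reject the same classifier.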