Magnum Photos, possibly the most famous and valuable photographer-owned collection on the planet, knew it had “lost” images from the set of American Graffiti, George Lucas’s 1973 film. It also knew that the way to find them was to get humans tagging every image in its archive. Problem is, it couldn’t trust the Amazon Mechanical Turk workers doing the crowdsourced tagging to recognize images from the set of that particular movie.
That’s where a machine with a pre-programmed (by humans) “semantic graph” comes in, says Panos Ipeirotis, a professor at NYU. Much like IBM’s Jeopardy-winning software Watson, this system had a map of the relationships between different phrases. It’s a bit like the game Six Degrees of Kevin Bacon, in which players try to connect any actor to Kevin Bacon by tracing a chain of movies in which successive pairs of actors have both appeared.
When the system intersected the tags for the actors recognized in the photographs, one title emerged: American Graffiti. It was the only movie in which all of those actors had appeared, so it sat at the center of the semantic map connecting them.
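The core idea can be sketched in a few lines. The snippet below is an illustration, not Tagasauris’s actual system: it uses a tiny hypothetical actor-to-filmography map (the graph edges) and finds the movie common to every tagged actor by set intersection.

```python
# Toy "semantic graph": each actor maps to the set of movies they appear in.
# The actor/movie data here is a small hand-picked sample for illustration.
FILMOGRAPHY = {
    "Richard Dreyfuss": {"Jaws", "American Graffiti", "Close Encounters of the Third Kind"},
    "Ron Howard": {"American Graffiti", "The Shootist"},
    "Harrison Ford": {"American Graffiti", "Star Wars", "Blade Runner"},
}

def common_movies(actors):
    """Return the movies shared by every actor tagged in a photograph."""
    filmographies = [FILMOGRAPHY[actor] for actor in actors]
    return set.intersection(*filmographies)

print(common_movies(["Richard Dreyfuss", "Ron Howard", "Harrison Ford"]))
# -> {'American Graffiti'}
```

With a full semantic graph the same intersection narrows thousands of candidate films down to the one that links every face the crowd workers tagged.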
In other words, humans plus machines accomplished something that neither could have accomplished on their own for a reasonable cost.
Many, many media companies hold large volumes of material that is inadequately tagged with metadata, if it’s tagged at all. The approach Tagasauris applied to the Magnum photo archive could probably help unlock similar treasures for any company with a sufficiently deep collection.