Too much information. In conversation, it means roughly “Stop! I didn’t want to know that.” But it’s also a cry heard more and more often from search-engine users who type in a generic query like digital cameras and get back an Everest of results (21 million at Google, 21.5 million at MSN Search, and 38.1 million at Yahoo, to be precise). For shoppers in pursuit of product reviews or retail listings, general-purpose search engines just aren’t the right tools.
But shopping portals like Shopzilla, PriceGrabber, Shopping.com, and Froogle have their own shortcomings, argues Silicon Valley entrepreneur Michael Yang. They generally assume that shoppers have already decided what they want to buy and are simply comparing prices. “But shopping is really a two-stage process,” says Yang, who is 43. “Once you decide what is the best product, then you start comparing prices. And the comparison-shopping sites are only focused on getting you the best price.”
Yang should know. Along with fellow Korean-born engineer Yeogirl Yun, he launched price comparison site mySimon in 1998. MySimon was one of the earliest and most successful online shopping aids, attracting millions of users and $30 million in venture capital in its first 18 months. In early 2000, CNET acquired mySimon for $700 million, providing a princely return to its investors and allowing its founders to walk away with fortunes in the “double-digit millions,” in Yang’s words. It was one of the rare examples of dot-com dreams come true.
But Yang and Yun weren’t sated by the experience. After spending a few years on other projects, the two are back together and taking a second run at the online-shopping market. This time they’re building a new type of search engine that focuses exclusively on product-related information. And that might be just the beginning: the engine itself, they say, uses fundamentally new search algorithms that could be adapted to any specialized, or “vertical,” category. A beta version of Yang and Yun’s site, called Become.com, debuted in February.
But can a pair of dot-com millionaires strike gold twice? (Or thrice, in Yun’s case: after mySimon, he developed a search engine called Wisenut, built a solid base of users, and sold it to Looksmart.) These days, the odds are unfavorable for any startup. Add to that the extreme caution now endemic among Silicon Valley venture capitalists, the surfeit of existing shopping sites, and the fact that Yun and Yang are taking on Google, Yahoo, and Microsoft all at once, and their plan begins to sound brash.
Yang immigrated to the United States with his parents at age 14, earned degrees in electrical engineering and computer science at the University of California, Berkeley, and Columbia, did stints as a chip designer at Xerox and a marketer at Samsung, and got a night-school MBA from Berkeley. Yun graduated from Seoul National University in 1993 with a degree in computer engineering, won a national scholarship in computer science, earned a master’s in the subject at Stanford, and went on to found an Internet software startup called One-Oh. By that time, Yang was running video-card maker Jazz Multimedia, and he and Yun first met when Jazz acquired One-Oh. When Jazz itself folded in 1998, the pair decided to start an Internet business together.
They settled on comparison shopping, a habit Yang says he acquired as a high-schooler earning a low wage at Seven-Eleven. “There were a couple of startups that were already doing that, but they weren’t doing it very well,” Yang says. (One of them, Jango, was absorbed in 1997 by the doomed search portal Excite, and the other, Junglee, was acquired by Amazon.com in 1998.) Their creation, mySimon, was built around a “virtual learning agent” designed by Yun to extract prices, product details, and other relevant data from merchants’ online catalogues, each of which was structured differently. “These days we don’t need that agent technology, because most merchants package their information in a standardized XML [extensible markup language] structure,” Yun says. “But back then no one wanted to give out that information” – so Yun’s team became expert at extracting it.
Between April 1998 and December 1999, 10 million users flocked to mySimon, and its staff grew from two to 60. Yang says he and Yun would have stayed on as the company’s leaders, but by early 2000 its investors were itching to cash out. CNET’s offer of $700 million – made just months before the dot-com crash – was impossible to refuse. (MySimon continues to operate today as part of the CNET publishing empire.)
While Yun went on to start Wisenut, Yang started a family. He says the years after mySimon gave him time to think more about what makes a search engine successful – and to watch Google rise to greatness. “I really got excited about challenging Google by developing a new search engine company that could give them a run for their money,” Yang says. In early 2003, he started reading papers about PageRank, Google’s algorithm for sorting search results, and started looking for partners who could build something better. “One day it just dawned on me – I should recruit Yeogirl.”
Yun, who had sold Wisenut and was in law school in Korea at the time, was initially cautious about taking on the search giant. “Google is a strong company,” Yun says he told Yang. “I advised Michael to think about something else.” But when Yang gave a half-hour pitch over lunch that grew into a six-hour brainstorming session, Yun was hooked. It helped, of course, that Yang was promising to put $2 million of his own savings into the venture.
Any search at Become.com will illustrate one of Yun and Yang’s first strategic decisions: to grant a high rank in search results to impartial product-review sites such as ConsumerReports.org and ConsumerSearch.com. Those sites appear in part because Become has a human element: a small team of researchers locates authoritative sites and places them into the site’s index. Machine-learning software then ranks other pages partly according to their relation to the authoritative sites. That’s a big departure from the approach taken by PageRank, which is fully automatic and sorts the pages resulting from a given query mainly according to the number of other pages that link to them. The more incoming links, the assumption goes, the more popular a page, and the more popular, the more relevant. But “with popularity-based link analysis, sites like ConsumerSearch.com would not always appear high in the results,” says Yang. “We fundamentally believe that there are certain things that humans can do better than algorithms.”
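To make the contrast concrete, here is a minimal Python sketch of the two ideas as the article describes them: ranking purely by the number of incoming links, versus ranking that also favors a hand-picked set of authoritative sites and the pages they link to. The function names, the `boost` value, and the sample pages are all hypothetical; this is a toy illustration, not Become's ranking code and not the real PageRank computation.

```python
from collections import defaultdict

def inlink_counts(links):
    """Count incoming links per page; links is a list of (source, target) pairs."""
    counts = defaultdict(int)
    for _source, target in links:
        counts[target] += 1
    return dict(counts)

def rank_by_popularity(links):
    """Order pages by incoming-link count alone (ties broken by name)."""
    counts = inlink_counts(links)
    pages = {page for edge in links for page in edge}
    return sorted(pages, key=lambda p: (-counts.get(p, 0), p))

def rank_with_seeds(links, seeds, boost=5):
    """Order pages by link count plus a bonus for human-chosen seed sites.

    Assumption: seeds get the full `boost`, and pages a seed links to get
    half of it. The article only says other pages are ranked 'partly
    according to their relation to the authoritative sites'.
    """
    counts = inlink_counts(links)
    pages = {page for edge in links for page in edge} | set(seeds)
    scores = {p: counts.get(p, 0) + (boost if p in seeds else 0) for p in pages}
    for source, target in links:
        if source in seeds and target not in seeds:
            scores[target] += boost // 2
    return sorted(pages, key=lambda p: (-scores[p], p))

# Hypothetical link graph: a widely linked page and a less-linked review site.
links = [
    ("blog1", "popular"), ("blog2", "popular"), ("blog3", "popular"),
    ("blog1", "reviews"), ("reviews", "niche"),
]
popularity_order = rank_by_popularity(links)
seeded_order = rank_with_seeds(links, seeds={"reviews"})
```

In this toy graph the widely linked page tops the popularity ordering, while the seeded ordering puts the hand-picked review site first, which is the behavior Yang describes wanting.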
That’s not to say Become lacks clever algorithms of its own. For one thing, Yun and his small team of engineers (who were hired only after passing an exhaustive 20-hour programming test) devised a new type of crawler, which is a program that scans the Web and copies pages into a search engine’s index. Become’s crawler has been trained to recognize and throw out spam pages and non-shopping-related information. Then there’s Yun’s proudest accomplishment, the site’s core algorithm for sorting search results. Called AIR, for affinity index ranking, it differs from Google’s PageRank algorithm in two ways. First, when AIR assesses the importance of a given Web page, it takes into account the topics of the pages linking to it. (PageRank considers some elements of the context surrounding an incoming link, but not the page’s overall topic.) AIR rewards pages that have on-topic incoming links. Second, AIR penalizes pages that have outgoing links to off-topic pages. (PageRank does not examine a page’s outgoing links.) AIR’s dual process of rewarding and punishing pages based on the primacy of a specific topic means that the top search results for a query like refrigerators will be those most closely related to buying a refrigerator, not necessarily those with the most incoming links, as with PageRank.
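The reward-and-penalty idea can be sketched in a few lines of Python. This is only an illustration of the two rules as stated above, assuming every page carries a single topic label and that rewards and penalties are fixed constants; the function `air_like_score`, its weights, and the sample pages are hypothetical, and the actual AIR algorithm is not public in this article.

```python
def air_like_score(page, topic, page_topics, links, reward=1.0, penalty=0.5):
    """Score a page for a topic using the two rules described in the text.

    - Each incoming link from an on-topic page adds `reward`.
    - Each outgoing link to an off-topic page subtracts `penalty`.
    `links` is a list of (source, target) pairs; `page_topics` maps
    page name -> topic label (an assumed simplification).
    """
    score = 0.0
    for source, target in links:
        if target == page and page_topics.get(source) == topic:
            score += reward   # on-topic incoming link
        if source == page and page_topics.get(target) != topic:
            score -= penalty  # off-topic outgoing link
    return score

# Hypothetical pages: a refrigerator guide with few but on-topic links,
# and a page with many incoming links from unrelated gossip sites.
page_topics = {
    "fridge-guide": "refrigerators",
    "fridge-store": "refrigerators",
    "appliance-blog": "refrigerators",
    "viral-page": "refrigerators",
    "gossip-a": "gossip",
    "gossip-b": "gossip",
    "gossip-c": "gossip",
}
links = [
    ("fridge-store", "fridge-guide"),
    ("appliance-blog", "fridge-guide"),
    ("gossip-a", "viral-page"),
    ("gossip-b", "viral-page"),
    ("gossip-c", "viral-page"),
    ("viral-page", "gossip-a"),
]

guide_score = air_like_score("fridge-guide", "refrigerators", page_topics, links)
viral_score = air_like_score("viral-page", "refrigerators", page_topics, links)
```

Here the guide outranks the viral page for the refrigerators topic even though the viral page has more incoming links overall, because none of those links come from on-topic pages and it links out to an off-topic one.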
For now, Become’s only revenues come from the keyword-based ads provided to thousands of sites by Google’s AdSense program. (Advertisers pay site owners for every click-through.) Later this year, the site will add classic price-comparison pages and charge merchants to list their wares. Become will earn a fee for every click-through to a merchant site, and a commission for any sales resulting from such click-throughs. Meanwhile, Yang says the company has enough capital – $4.5 million, including $2.5 million from one of mySimon’s initial backers, Japanese corporate investor Transcosmos – to keep its team of 24 employees running. That’s a lot less than the $30 million mySimon accumulated to support its 60 employees, but Yang says that’s deliberate. “Before, it was ‘Get big fast.’ It was land-grab mentality. Now it’s ‘Get profitable fast.’ And there is value in being self-sufficient. You don’t have to be at the whims of investors.” The company is currently trying to raise $12 million in second-round financing.
Officials at Transcosmos don’t take offense at Yang’s attitude. “Number one, we are a big fan of the comparison-shopping category,” says Shin Nakagura, vice president of business development at Transcosmos’s U.S. subsidiary. “Number two, we made a big return on mySimon, so it’s relatively easy to get the money together. Number three, we know those guys. And I think Become.com’s approach – starting with the more technically difficult point, pure search, and then adding the product comparison service – is a very good idea.”
Indeed, the longer one spends with Yang and Yun, the clearer it becomes that Become is a specialized-search company in a shopping site’s clothing. It doesn’t take much prodding to get Yun to admit that AIR could be applied not just to shopping but to any well-bounded search domain, such as health care – thereby chipping away at the audiences for general-purpose search sites. As Yun says, “I really feel that Become can become anything.” But he understands that in today’s climate, he won’t attract large investments until he proves that his first business can make money.