AI That Picks Stocks Better Than the Pros

A computer science professor uses textual analysis of articles to beat the market.

The ability to predict the stock market is, as any Wall Street quantitative trader (or quant) will tell you, a license to print money. So it should be of no small interest to anyone who likes money that a new system that works in a radically different way than previous automated trading schemes appears to be able to beat Wall Street’s best quantitative mutual funds at their own game.

It’s called the Arizona Financial Text system, or AZFinText, and it works by ingesting large quantities of financial news stories (in initial tests, from Yahoo Finance) along with minute-by-minute stock price data, and then using the former to figure out how to predict the latter. Then it buys, or shorts, every stock it believes will move more than 1% of its current price in the next 20 minutes - and it never holds a stock for longer.
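The trading rule described above can be sketched in a few lines. This is only an illustration of the stated thresholds (a predicted move of more than 1% and a 20-minute holding period), not the researchers' code; the function name `decide` and its parameters are assumptions made for the example.

```python
# Illustrative sketch of the trading rule as the article describes it.
# The names below are hypothetical; the real system's interfaces are not given.

HOLDING_MINUTES = 20    # the article says positions are never held longer
MOVE_THRESHOLD = 0.01   # act only on predicted moves of more than 1%


def decide(current_price: float, predicted_price: float,
           threshold: float = MOVE_THRESHOLD) -> str:
    """Return 'buy', 'short', or 'hold' for a forecast 20 minutes ahead."""
    if current_price <= 0:
        raise ValueError("current_price must be positive")
    # Predicted move expressed as a fraction of the current price.
    expected_move = (predicted_price - current_price) / current_price
    if expected_move > threshold:
        return "buy"
    if expected_move < -threshold:
        return "short"
    return "hold"


if __name__ == "__main__":
    print(decide(100.0, 102.0))  # predicted rise of 2%
    print(decide(100.0, 98.0))   # predicted fall of 2%
    print(decide(100.0, 100.5))  # predicted move of 0.5%, below the threshold
```

Whatever the rule returns, the position would be closed after `HOLDING_MINUTES`, since the system never holds a stock for longer.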

The system was developed by Robert P. Schumaker of Iona College in New Rochelle and Hsinchun Chen of the University of Arizona, and was first described in a paper published early this year. Both researchers continue to experiment with and enhance the system - more on that below.

Using data from five non-consecutive weeks in 2005, a period chosen for its lack of unusual stock market activity, here’s how AZFinText performed versus funds that traded in the same securities (which were all chosen from the S&P 500):

And here’s how it performed compared to the top 10 quantitative mutual funds in the world, all of which draw from a much larger basket of securities, except of course for the included S&P 500 itself:

Software that analyzes textual financial information - quarterly reports, press releases, news articles - is nothing new. Researchers have been publishing on the subject since at least the mid-1990s.

However, previous approaches were hampered by poor performance (averaging little better than chance), by requirements for unreasonable amounts of computational horsepower, or by both. Schumaker and Chen get around these issues by first radically shrinking the amount of text their system has to parse: every financial article the system ingests is boiled down to the words that fall into specific categories of information.

Interestingly, these techniques and categories derive from classification schemes described at the 7th Message Understanding Conference, held in 1997, which was a Defense Advanced Research Projects Agency project to create new and better ways to extract information and meaning from texts. (At the time, they were concentrating on terrorist activities in Latin America, airplane crashes, rocket and missile launches and other things relevant to national security.)

Schumaker and Chen’s system concentrates on Proper Nouns - people and companies - and combines information about their frequency with stock prices at the moment a news article is released. Using a machine learning algorithm on historical data, they look for correlations that can be used to predict future stock prices.
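To make the idea concrete, here is a rough sketch of how an article might be reduced to proper-noun counts and paired with the stock price at release. It is not AZFinText's method: the capitalized-word heuristic stands in for real named-entity extraction, and the names `proper_noun_counts`, `build_features`, and `vocabulary` are invented for the example. The resulting vector is the kind of input a machine learning algorithm could be trained on; the article does not say which algorithm the researchers used.

```python
import re
from collections import Counter


def proper_noun_counts(article: str) -> Counter:
    """Count capitalized words that do not start a sentence.

    This is a crude stand-in for proper-noun extraction, assumed here
    for illustration only.
    """
    counts = Counter()
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", article.strip()):
        words = re.findall(r"[A-Za-z][A-Za-z'&-]*", sentence)
        for word in words[1:]:  # skip the sentence-initial word
            if word[0].isupper():
                counts[word] += 1
    return counts


def build_features(article: str, price_at_release: float,
                   vocabulary: list) -> list:
    """Combine term frequencies with the price at the moment of release."""
    counts = proper_noun_counts(article)
    return [float(counts[term]) for term in vocabulary] + [price_at_release]


if __name__ == "__main__":
    text = "Shares of Acme rose today. Analysts said Acme and Globex may merge."
    print(build_features(text, 12.5, ["Acme", "Globex", "Initech"]))
```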

Further work with the AZFinText system has revealed oddities that may or may not remain relevant as researchers continue to apply it to other bodies of historical stock market and financial news data. For example, in a paper described on June 6 at the Computational Linguistics in a World of Social Media workshop, Schumaker went fishing for the Verbs most likely to cause a stock to move up or down in the next 20 minutes, and came up with a list of 211 terms that had some power to move stock prices. (In his work, ‘verb’ is a technical term, and does not exactly correspond with the conventional definition of the word.)

According to Schumaker:

The five verbs with highest negative impact on stock price are hereto, comparable, charge, summit and green. If the verb hereto were to appear in a financial article, AZFinText would discount the price by $0.0029. While this movement may not appear to be much, the continued usage of negative verbs is additive.

The five verbs with the highest positive impact on stock prices are planted, announcing, front, smaller and crude.
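The additive effect Schumaker describes can be illustrated with a small sketch. Only the $0.0029 discount for "hereto" comes from the quote above; the function `adjusted_price` and the weight table are assumptions for the example, and the weights of the other terms are not given in the article.

```python
# Per-term price adjustments in dollars. Only the "hereto" value is quoted
# in the article; any other entries would have to come from the paper.
VERB_WEIGHTS = {"hereto": -0.0029}


def adjusted_price(price: float, article_words: list,
                   weights: dict = VERB_WEIGHTS) -> float:
    """Add up the adjustment for every weighted term that appears.

    Repeated terms count each time, which is what makes the effect additive.
    """
    adjustment = sum(weights.get(word.lower(), 0.0) for word in article_words)
    return price + adjustment


if __name__ == "__main__":
    words = "the parties hereto agree and the terms hereto apply".split()
    # Two occurrences of "hereto" give a total discount of $0.0058.
    print(round(adjusted_price(10.0, words), 4))
```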

Schumaker did not attempt to determine why these particular terms move stock prices, but it’s interesting to note that the stock market does not appear to like the marketing buzzword “green,” but is quite happy to hear any news at all about the term “crude,” as in oil.
