AI is learning how to spot risky websites for you

Jackie Snow

February 22, 2018

Machine learning can sniff out tell-tale signs of shady URLs so you don’t get phished.

The problem: The internet is riddled with websites set up for the sole purpose of stealing a user’s information or installing malware on a victim’s machine. Antivirus companies blacklist them as fast as they can, but with new sites launched every day, it’s a Sisyphean effort to keep up.

AI to the rescue: A new system called URLNet uses neural networks that look at character-level and word-level combinations in, you guessed it, the site’s URL to estimate how risky the site is. URLs contain clues to whether a site is malicious, such as unusual length and misspelled domain names.
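To make "character-level and word-level" concrete, here is a minimal Python sketch of the two views of a URL that such a model might start from. It is not URLNet's actual preprocessing: the function names, the delimiter rule, and the example URL are all assumptions made for illustration.

```python
import re
from typing import List


def char_tokens(url: str) -> List[str]:
    """Character-level view: every character in the URL is its own token."""
    return list(url)


def word_tokens(url: str) -> List[str]:
    """Word-level view: split on anything that is not a letter or digit.

    This delimiter rule is an assumption for illustration, not the
    researchers' tokenizer.
    """
    return [w for w in re.split(r"[^A-Za-z0-9]+", url) if w]


if __name__ == "__main__":
    # Hypothetical URL with a misspelled brand name ("paypa1").
    url = "http://paypa1-login.example.com/verify"
    print(len(char_tokens(url)))  # URL length, one of the clues mentioned above
    print(word_tokens(url))
```

A neural network would typically map each token to a learned vector and look for patterns across them; this sketch only shows the tokenization step that comes before any learning.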

Results: The researchers trained URLNet on two data sets, one containing a million legit and malicious URLs and one with five million. In each case, URLNet beat other current systems at detecting suspicious sites.
