Blueprint sits on a website’s servers, reads user-generated HTML, and checks it against a whitelist of trusted code. It removes any potentially harmful scripts and decides how the content should appear in a browser. Then it reformats the content and transmits it to the browser. For example, Blueprint takes care to avoid characters and symbols that are sometimes used to trigger unauthorized scripts in a user’s browser. Harmless content should make it through the process unaffected, the researchers say.
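The whitelist check described above can be illustrated with a minimal sketch. This is not Blueprint’s code: the names `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `SAFE_URL_PREFIXES`, and `sanitize` are hypothetical, and the list of trusted tags is an assumption chosen only for illustration. It uses Python’s standard `html.parser` module to keep known-safe tags and drop everything else.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; not Blueprint's actual list.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://")


class WhitelistSanitizer(HTMLParser):
    """Rebuilds HTML from scratch, keeping only whitelisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            # Drop attributes that are not whitelisted (e.g. onclick).
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            # Drop links that do not use a plainly safe scheme (e.g. javascript:).
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it can never be read back as markup.
        if not self.skip_depth:
            self.out.append(escape(data))


def sanitize(untrusted_html):
    parser = WhitelistSanitizer()
    parser.feed(untrusted_html)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'))
```

A filter like this still depends on the server and the browser agreeing about what counts as a tag, which is the weakness the researchers describe.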
The root of the problem, explains V. N. Venkatakrishnan, an assistant professor of computer science who was involved in the project, is that browsers were originally designed to be forgiving of badly written Web-page code. “Browsers try to do the best possible rendering of any type of poorly formatted content,” he says.
Over the years, different browsers have developed their own ways of interpreting poorly formatted content. Attackers can take advantage of this by inserting HTML that will run as a script in the right browser. “This makes the problem of filtering HTML content for scripts very, very challenging,” Venkatakrishnan says. Efforts are under way to change the way browsers work, but the researchers say that another solution is needed in the meantime.
“What we want to do is to take away the ability for the browser’s parser to make any script-identification decisions on the untrusted content that is supplied by the Web application,” Venkatakrishnan says.
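One way to picture this idea is to send the untrusted content in a form that contains no markup characters at all, so the browser’s HTML parser has nothing to interpret. The sketch below is an assumption made for illustration, not Blueprint’s actual wire format: the functions `encode_for_transport` and `decode_from_transport` and the JSON tree layout are hypothetical. A trusted client-side script (not shown) would decode the payload and rebuild the page with DOM calls such as `document.createTextNode`, which treat the content strictly as text.

```python
import base64
import json


def encode_for_transport(nodes):
    """Serialize a content tree to Base64 so the payload carries no markup characters."""
    payload = json.dumps(nodes, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


def decode_from_transport(encoded):
    """Inverse of encode_for_transport; stands in for what a trusted client script would do."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


# Hypothetical content tree: a paragraph whose text happens to look like a script tag.
tree = [{"tag": "p", "children": [{"text": "Hi <script>alert(1)</script>"}]}]

encoded = encode_for_transport(tree)

# The encoded form uses only the Base64 alphabet, so it contains no angle brackets.
print("<" in encoded or ">" in encoded)
# The original structure survives the round trip unchanged.
print(decode_from_transport(encoded) == tree)
```

Because the script-like text arrives only as encoded data, deciding whether it is a script is never left to the browser’s parser.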
Robert Hansen, CEO and founder of the Internet security company SecTheory, which maintains the XSS Cheat Sheet, says that, although Blueprint protects against most major cross-site scripting threats, it doesn’t cover all possible threats. “There are other ways to get stuff rendered inside a browser, and unfortunately, this doesn’t cover any of those,” he says.
Hansen adds that the researchers’ system comes with a drawback: it protects content by wrapping it in a script that search engines can’t read. “This isn’t a panacea,” he says, “but that’s the big issue.” Hansen says that cross-site scripting is too complex a problem to be stopped without changing how the browser works.