Who Owns XML?

A small software-maker has patent rights on parts of the Web language, according to company officials who spoke with TR Executive Web Editor Wade Roush.

Wade Roush

October 29, 2005

For Web programmers, the Extensible Markup Language (XML) is not only a lingua franca – it’s the water that floats the boat, the air that holds up the plane. In other words, it’s a free resource without which the rest of the Web wouldn’t work.

Developed by the World Wide Web Consortium (W3C) between 1996 and 1998, XML has become the dominant way of describing and structuring data so that it can be shared across the Internet and displayed in any browser.

But now executives at Scientigo, a small software maker based in Charlotte, NC, say the company owns two U.S. patents (No. 5,842,213 and No. 6,393,426) that cover one of the fundamental concepts behind XML: the idea of packaging data in a self-defining format that allows it to be correctly displayed wherever it travels.

Scientigo CEO Doyal Bryant says the company plans to capitalize on the patents either by reaching licensing agreements with big corporate users of XML or by selling them to another company.

The very idea of patents on software is a contentious one, though. In July, the European Parliament threw out a bill that would have legalized software patents across all EU member states. In the United States, where the courts have recognized software patents for some time, groups such as the Electronic Frontier Foundation (EFF) have charged that many of those patents are too broad and granted without adequate review.

In EFF’s view, that makes it too easy for patent holders (sometimes labeled “patent trolls” or “patent assertion companies”) to threaten legal action against alleged infringers if they don’t pay license fees.

That’s why actions like Scientigo’s, which could affect every company that uses XML, spook the community of Internet engineers and Web developers.

Most programmers and computer scientists agree that open standards like XML are a big reason why the Internet works as well as it does and has spread as far as it has. The majority of the people who invented the protocols underlying basic Internet features – packet switching, electronic mail, the World Wide Web – never bothered to copyright or patent their contributions, or else have waived most of their intellectual property rights under various open-source licenses. As a result, anyone can build new content or software on top of existing Internet technologies without having to obtain permission or pay a fee. So if a business is thinking of building a new product or service using technologies it believes are open source, the sudden prospect of owing license fees can dampen enthusiasm.

Take the ongoing case of SCO versus IBM. After a Utah company, SCO, sued IBM in 2003 for allegedly including Unix software code owned by SCO in its version of the Linux open-source operating system, Linux adoption slowed in the business world for more than a year.

Bryant says Scientigo’s claims – which relate to XML “namespaces,” a universal system for naming data types, which was added to the XML standard by the W3C in 1999 – are not a repeat of the SCO episode. According to him, the company simply wants to find a way to earn a reasonable return on its intellectual property. And when California-based e-business software developer Commerce One auctioned off a collection of its own XML-related patents last December for $15.5 million, a way appeared.

“It was the Commerce One transaction that really got our attention,” says Bryant. “If there was no interest in this [technology], there wouldn’t have been a last-minute bidding frenzy by the major players.”

According to Bryant, the company has so far held talks with more than 40 companies about its patent claims, including Microsoft and Oracle. Bryant says that this week he is finalizing an agreement with an IP licensing firm.

On October 25, Technologyreview.com Executive Web Editor Wade Roush interviewed CEO Doyal Bryant, along with Paul Odom, Scientigo’s chief scientist and senior vice president, and Ron Laurie, managing director of Inflexion Point Strategy, an intellectual property consulting firm and member of Scientigo’s intellectual property team.

Wade Roush: Please walk me through the parts of your patents that you believe XML infringes upon.

Ron Laurie: The central concept is called “neutral form.” Paul [Odom] developed it in the database domain as a solution to the problem of transferring data between databases in a way that the receiving application would know everything it needs to know to process the data. What happened was, the Web world had essentially a very similar problem: how to publish documents on the Web, and, later, to exchange data on the Web. The Web community’s problem started to converge with the database problem, and the Web community’s solution was XML. We’re not talking about XML as it existed in its earliest instantiation – we’re talking about XML after the creation of the schema and namespace applications, which occurred after our patent filing date. A lot of people say these patents were filed after the XML working group was formed. That’s true, but at that time the XML working group didn’t do neutral form.

WR: When did you start to think that what the W3C was doing infringed on your patents?

Paul Odom: The day that I saw the namespace specification issued by W3C, I immediately knew there was an overlap. Neutral form is the holy grail of data transfer, and that’s what we set out to do with these inventions. Prior to that, people had done only neutral format – things like HTML and SGML. We were looking for neutral form, which is self-defining information – meaning you can give it to a third party and they understand not only what the data means, so it can be used, but also what its relationships are, universally, to anything that may be related to it…. When the namespace application was added to XML, there was now a universal way to identify a data item. And that’s one of the claims in the patent – how do you define the universal identity.
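
[For readers unfamiliar with the namespace mechanism Odom refers to, here is a minimal sketch of how the W3C's XML namespaces give a data item a globally unique name. It uses Python's standard xml.etree.ElementTree module. The namespace URIs, element names, and values are hypothetical, invented for this illustration; the sketch shows only the W3C mechanism and makes no claim about what Scientigo's patents do or do not cover.]

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and element names, invented for illustration.
DOC = """
<order xmlns:book="http://example.com/ns/books"
       xmlns:furn="http://example.com/ns/furniture">
  <book:table>Contents, chapters 1-12</book:table>
  <furn:table>Oak, seats six</furn:table>
</order>
"""

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace-URI}local-name",
# so the two <table> elements end up with distinct, fully qualified names.
tags = [child.tag for child in root]

# Look up one element by its fully qualified name.
furniture = root.find("{http://example.com/ns/furniture}table")
furniture_text = furniture.text
```

[Both elements share the local name "table," but a receiving program can tell them apart by the namespace URI attached to each, without any prior agreement with the sender about what the short prefixes mean.]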

WR: We all saw what happened when SCO sued IBM over proprietary code that IBM had allegedly introduced into Linux. The lawsuit alone was enough to slow down Linux adoption for more than a year. How is what you’re doing not like the SCO case?

RL: The first thing is that open source is essentially a copyright model, and that’s what the SCO case was all about: pieces of Linux being proprietary. Patents are a different model entirely. The open-source community is struggling with the relationship between the open-source model and third-party patents. But in general the fact that something is developed using an open-source model does not mean that it isn’t subject to the normal rules about patents. Open source is all about derivation. It says, “If I do something with what you did, I have to license it under the same open terms under which you licensed it to me.” There is nothing about derivation in patent law. You could never have heard of me or seen my patent, but if you do something that overlaps, that’s infringement. And it’s been that way for hundreds of years. There needs to be some common ground between the open-source community and the patent community.

WR: But how do you go about “monetizing” a patent on something as widely used as XML without spooking other companies and slowing down innovation and adoption?

RL: The answer is, by being reasonable as far as the licensing terms go. If you’re not reasonable, if you’re trying to hold the industry ransom, you’re going to be in court for the rest of your life, because people will challenge the patent any way they can. My own personal view is that the way you avoid those challenges – and nobody wants that – is to be commercially reasonable in the licensing terms and not try and hold anybody up.

WR: There is some history around that. At various points, the W3C has worked with consortium members who hold patents related to developing Web standards to ensure that the patented technology is shared either under reasonable licensing terms or entirely royalty-free. Have you met with the W3C or discussed how your claims might be licensed through the W3C process?

RL: No, we haven’t. Whether there should be some involvement now – we haven’t talked about it. It’s an interesting question. We believe the success of any monetization system has to do with reasonable licensing.

WR: The patents we’re talking about were granted in 1998 and 2002. What’s the history of Scientigo? How did you come to own these patents? And what led you to decide to try to exploit them now?

Doyal Bryant: I came in about a year and a half ago. The company [then Market Central] was mainly a call-center company making a customer relationship management play, and it had assets, including Pliant Technologies, an acquisition it made in 2003. The company was deep in debt and had a lot of litigation problems. We spent nine to 12 months cleaning it up and recapitalizing and refocusing the company. I made a decision as we sold off the other assets to devote the entire effort of this company to building upon the technology platform that Paul and his team brought to us from Pliant. We needed to see if there was an asset that we had in our patents, and what value that might have to somebody else in the world. But I really want to make it clear that we are not in any way a “patent assertion” type of company. We’re just trying to leverage what we have.

RL: Traditionally patents were viewed just as a ticket to court. “I have a patent, you’re infringing it, you pay me.” That’s not at all what we are doing…. What we’re looking to do is to monetize the patent, which is code for a whole range of opportunities, one of which is to sell it to someone for whom it has strategic value. Many companies are looking to acquire broad patents for defensive use, so when competitors come along with patent suits they have something to trade with…. There is a rapidly growing market in companies looking to acquire outside patents. That is the main focus of our strategy, to take these neutral form patents, which are very broad, reserve a license to help the company do its core business, and monetize the patents by selling them to someone for whom they have strategic value.

WR: What is your core business?

PO: We have two major product lines at this point that all fit on top of an engine that is based upon the two patents – actually four, only two of which we’ve been talking about. Those two products are intelligent document recognition, which attempts to look at structured and unstructured documents as an individual would look at them and interpret them on the fly, extract appropriate information, and classify them. The other product we have is a search product that can track context and semantics and put it all together into one algorithm to give significantly more relevant answers.

WR: How does that relate to your patents?

PO: Neutral form is the basis for the engine. Based on those patents, we’ve created a “projection” technology that projects individual data items into a repository. Atomizing the data in that form allows you to do significantly faster processing, with far fewer resources than with the traditional algorithms.
