Get Down, Get Fuzzy

Jennifer Zaino
SemanticWeb.com Contributor

Google’s got one day more to respond to a patent infringement lawsuit filed by Jarg Corp., the semantic search vendor, and Northeastern University, from which Jarg licensed the algorithm that the two parties believe Google uses to serve up every query result and earn every dollar of its ad revenue.

According to Jarg President and co-founder Michael Belanger, “We’re not out there with a stick to beat them up …but if they are using some functionality that is covered in our patents – and we have hundreds of claims in about ten patents – then we want to politely ask them to pay some reasonable royalty. The idea here is to try to get to the end game sooner, with friendly partners, rather than to be antagonists.”

Whether Google looks at the lawsuit in the same light – or would be apt to consider Jarg a friendly partner – the bigger discussion may be around what that end game is. As Belanger sees it, it’s not the semantic web as envisioned by the W3C.

“Our point of view is that a rising tide lifts all boats,” he says, with the rising tide being not only Google’s application of the technology that Jarg believes infringes on its patent, but also the growing interest in semantics that many companies are now exploiting, from Radar Networks to search engine Hakia, which recently closed on another $5 million in funding.

Belanger argues, however, that there’s something missing. In Google’s instance, its use of the algorithm that is the subject of Jarg’s suit delivers more links for users to wade through and guess at as their queries grow more complex. In the instance of the W3C semantic web standards and the applications that subscribe to them, Belanger finds fault because they are largely focused on moving information between relational databases in an interactive or interoperable fashion.

“And – these are some gray hairs from the past – one would say, how is that different from EDI?… Every time you do a project with the standards of the moment it is considered an interoperable model,” he says.

Away from rigid modeling

Getting back to that rising tide, there’s an opportunity for people who clearly understand Google and natural language processing and who have begun to understand the W3C’s semantic web standards to also “understand there is a continuum that this next generation requires, and that the fragility of what they are playing with now is not going to be the end of the game,” he says. “To get to the end game, as Tim Berners-Lee and his colleagues put it in the semantic web article in Scientific American in 2001, they have to get all the way down the spectrum to where we are, away from rigid modeling.”

To clarify, Belanger says that Jarg’s patents bring to the table something in the database world called an unlimited number of pre-computed joins.

“That means we can have a query more complex by orders of magnitude, and pack more context in there than you can push against a database using the SQL protocol, because there may be three joins of a SQL protocol and the database engine comes to its knees and stops,” he says. “You can’t have a complex query in the SPARQL world, either.”

Belanger believes that right now, most people are locked into semantic projects in which they build a model and test it with consistency checkers to make sure it behaves in a rigid way, so they can trust the information going back and forth between suppliers and users.

“That’s basically EDI, but we’re no longer using twisted copper pair wires between two companies, but the Internet. The only difference is the schema is no longer locked up in a database schema at each end,” Belanger says. “It’s now on the table where you can examine it as an ontology, but it is an ontology that defines a rigid model of trusted information going back and forth. It’s only useful in the one narrow case they build it for.”

And only for as long as the standards prevail.

“You have to get beyond the W3C standards, beyond the EDI environment, and in some way become extremely fuzzy, and get into sort of a fuzzy AI environment rather than this rigid, fragile modeling stance that they are currently still focused on. We do the fuzzy stuff,” he says.


Jarg firmly believes in the future of ontology-based computing, but with ontology defined as vocabulary, not model.

“It is a thesaurus mountain to climb rather than a model mountain to climb,” he says.

Fundamentally, Jarg is an indexing technical base that fragments representations of digital artifacts, throws the fragments of a query against all the fragments in the index, and pulls back the results.

“The source of information that is already in the index with the highest number of matched fragments has proven to be the item that has the highest contextual relevance to the query, and the next item that comes back with the same or slightly fewer fragment matches is the next closest information source with a contextual fit to the idea,” he says. “So this is a way of instantaneously, in terms of the search function, ranking everything that comes back not by link analysis or keywords but by the context of the entire idea you present as a query. So it is essentially a filtering engine that can handle an unlimited size of sources and queries of unlimited complexity. So the more context you put into the query, the better the engine works for you, which is the inverse of Google….The algorithm in their use doesn’t serve you well.”
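The ranking idea Belanger describes, ordering sources by how many query fragments they share with the index, can be illustrated with a rough Python sketch. This is not Jarg's patented method: it assumes a "fragment" is simply a lowercased word, and the names `fragments`, `build_index`, and `rank`, along with the sample sources, are hypothetical.

```python
from collections import Counter, defaultdict

def fragments(text):
    """Split text into lowercase word fragments (an assumed stand-in for real fragmenting)."""
    return {word.strip(".,;:!?\"'()").lower() for word in text.split()} - {""}

def build_index(sources):
    """Map each fragment to the set of source ids that contain it."""
    index = defaultdict(set)
    for source_id, text in sources.items():
        for fragment in fragments(text):
            index[fragment].add(source_id)
    return index

def rank(index, query):
    """Return (source_id, matched_fragment_count) pairs, most matches first."""
    matches = Counter()
    for fragment in fragments(query):
        for source_id in index.get(fragment, ()):
            matches[source_id] += 1
    # Ties are broken by source id so the ordering is deterministic.
    return sorted(matches.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample sources, for illustration only.
sources = {
    "a": "Protein folding in yeast cells",
    "b": "Yeast for baking bread",
    "c": "Stock trading desks and market data",
}
index = build_index(sources)
results = rank(index, "protein folding yeast")
print(results)
```

In this toy setup, adding more words to the query gives the ranking more fragments to discriminate on, which is the behavior Belanger contrasts with link analysis and keyword matching.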

Additionally, autonomous, ontology-supported intelligent agents that sit in the index let you stream all new information, acting like highly contextual persistent queries.
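A persistent query of this kind can be pictured as a small object that scores each new item as it arrives. The sketch below is an assumption-laden illustration, not Jarg's agent design: the `Agent` class, its `min_matches` threshold, and the sample stream are all hypothetical, and matching is reduced to plain word overlap.

```python
class Agent:
    """A standing query that scores each newly arriving item (hypothetical design)."""

    def __init__(self, query, min_matches):
        self.terms = set(query.lower().split())
        self.min_matches = min_matches  # assumed relevance threshold
        self.delivered = []

    def observe(self, item_id, text):
        """Deliver the item if it shares enough terms with the standing query."""
        overlap = len(self.terms & set(text.lower().split()))
        if overlap >= self.min_matches:
            self.delivered.append((item_id, overlap))
            return True
        return False

agent = Agent("oil futures price spike", min_matches=2)

# Hypothetical incoming items, for illustration only.
stream = [
    ("n1", "oil price falls on weak demand"),
    ("n2", "local team wins championship"),
    ("n3", "futures traders brace for oil price spike"),
]
for item_id, text in stream:
    agent.observe(item_id, text)

print(agent.delivered)
```

The query stays fixed while the data moves past it, so the user is handed only the items that clear the threshold instead of rerunning a search.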

“There is no such thing in the future as information overload. You will have an information supply custom tailored to you, and the thesaurus will find unexpected information that is not in the literal queries you run. If you have a 25-word query, think of all 25 of those words having between 10 and 100 thesaurus equivalents that identify information you wouldn’t even imagine is there, but that is contextually relevant to what you have in mind,” he says.
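The thesaurus effect Belanger describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the `THESAURUS` mapping, the function names, and the sample document are hypothetical, and a production vocabulary would hold far more equivalents per term.

```python
# Hypothetical thesaurus: each term maps to equivalents treated as the same idea.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "buy": {"purchase", "acquire"},
}

def expand(query_terms, thesaurus):
    """Map each query term to the set containing the term and its equivalents."""
    return {term: {term} | thesaurus.get(term, set()) for term in query_terms}

def matched_terms(expanded, document_words):
    """Count query terms found in the document, literally or via an equivalent."""
    return sum(1 for variants in expanded.values() if variants & document_words)

query = ["buy", "car"]
expanded = expand(query, THESAURUS)
document = set("where to purchase a used automobile".split())

literal = sum(1 for term in query if term in document)  # literal word matches only
fuzzy = matched_terms(expanded, document)               # matches after expansion
print(literal, fuzzy)
```

Here the document shares no literal words with the query, yet both query terms match once their equivalents are considered, which is the "unexpected information" case in miniature.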

Those organizations that do go beyond what Belanger believes to be the fragile standards-based model to a fuzzier model, particularly in markets such as intelligence and financial trading desks, which are time- and quality-sensitive, will have a big advantage.

“The domain will have perfect information on demand and perfect information supply automated if they set up enough of these autonomous intelligent agents. It will be a situation where people with these new technologies will have such an information advantage, such a quality-of-information advantage, that they will be able to take market share unfairly from competitors that have not been early adopters of this stuff,” he says. “This is very empowering stuff, it’s just not rolling out very fast but in an evolutionary fashion.”

Even the life sciences market, one of the premier examples often looked to as a leader in deploying semantic technologies, hasn’t moved as fast to Jarg’s end of the spectrum. Jarg spun out a life sciences venture to target that arena in 2001, but says that most players are not quite ready to invest seriously in ontology-based computing to get some advantage. “They have the most complex set of objects to understand and filter through but they haven’t gotten beyond this semantic web fragile modeling approach, they have not yet got to the fuzzy AI, where it’s vocabulary-intensive rather than modeling-intensive.”
