Semantics for Spies, Spooks and Secret Agents
Jennifer Zaino
SemanticWeb.com Contributor
When is a semantic web startup not exactly a startup? Answer: When it’s been in business for nearly a decade, but in the last couple of years switched its proprietary technology to be completely based on semantic web standards such as RDF and OWL.
The company is Israel-based MindCite, and its semantic software lets homeland security and intelligence agencies collect, integrate, research, and analyze diverse types of information, from structured sources like databases to unstructured ones such as text, to solve crimes, crack cases and get alerts. The company’s systems are installed in over a dozen countries, though not stateside yet.
“When we started out there was no OWL, no RDF,” says CTO Oren Yosifon. Switching over from its original proprietary technology, which was very much influenced by the early work on DAML (the DARPA Agent Markup Language, originally developed to facilitate the concept of the semantic web), has had immense benefits, he says, because so much of what the company once had to develop itself is now available for free.
Citer is the company’s main platform, built on several modules that harness the power of semantics. It reads the text related to web links, searches forums, blogs and so on, and tries to understand which links are most related to the user’s domain of interest. It then prioritizes those links and crawls them in order of ontological priority. “This allows customers to get to more relevant information in less time using less bandwidth,” Yosifon says.
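The idea of crawling in order of ontological priority can be illustrated with a small sketch. This is not MindCite's implementation: the term weights, the `relevance` scoring function and the example URLs are all hypothetical, and a real ontology would carry far richer structure than a flat table of weighted terms.

```python
import heapq

# Hypothetical stand-in for a user's ontology: concept terms with weights.
ONTOLOGY_WEIGHTS = {"chemical": 3.0, "weapon": 2.0, "warfare": 2.0}


def relevance(link_text, weights=ONTOLOGY_WEIGHTS):
    """Sum the weights of ontology terms found in the text around a link."""
    words = set(link_text.lower().split())
    return sum(weight for term, weight in weights.items() if term in words)


def prioritize(links):
    """Return URLs ordered from most to least relevant.

    `links` is an iterable of (url, surrounding_text) pairs.
    """
    heap = []
    for order, (url, text) in enumerate(links):
        # heapq is a min-heap, so the score is negated; `order` breaks ties
        # in favor of links that were discovered first.
        heapq.heappush(heap, (-relevance(text), order, url))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


links = [
    ("http://example.org/a", "local sports results"),
    ("http://example.org/b", "chemical weapon stockpile report"),
    ("http://example.org/c", "history of warfare"),
]
crawl_order = prioritize(links)
```

Under these assumed weights the crawler would visit the stockpile report first and the sports page last, which is the bandwidth saving Yosifon describes.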
The tool also has the capability to create textual summaries of documents, extracting sentences based on page structure and content. Since it already knows what is of interest to the user, and has knowledge of the user’s ontology, it can pick sentences that it thinks have the most meaning for the user.
“Think about an article that is 1,000 words long on chemical warfare,” he says. “If you are an army general there are several sentences to be more interesting to you than to others; you are more interested in the warfare part. But if you are a chemical biologist, probably you’re more interested in other sentences, those that have to do with chemical compounds. So the summary of the document has to be put in the context of the reader and the ontology lets us do that.”
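A rough sketch of reader-dependent summarization follows, assuming each reader's ontology can be reduced to a set of terms of interest. The profiles, the `summarize` function and the sample text are invented for illustration; the article does not describe how Citer actually scores sentences.

```python
import re

# Hypothetical reader profiles, each a crude stand-in for a user's ontology.
PROFILES = {
    "general": {"warfare", "troops", "battlefield"},
    "chemist": {"compound", "compounds", "molecule", "reaction"},
}


def summarize(text, profile, max_sentences=1):
    """Pick the sentences that share the most terms with the reader's profile."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    terms = PROFILES[profile]

    def score(sentence):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        return len(words & terms)

    # sorted() is stable, so equally scored sentences keep document order.
    top = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Emit the chosen sentences in their original order.
    return [s for s in sentences if s in top]


article = (
    "Chemical warfare changed how troops moved on the battlefield. "
    "The agents are compounds that trigger a reaction in the body."
)
for_general = summarize(article, "general")
for_chemist = summarize(article, "chemist")
```

The same two-sentence text yields a different one-sentence summary for each profile, mirroring the general-versus-chemist example in the quote.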
Cracking Crimes
Also in the company’s toolkit is a new product, Citer Link Analysis Tool, for intelligence officers and investigators that helps them visualize and gain insight from data locked in many different ontologies, to discover relationships between entities in order to help crack crimes.
Recently in Central America, Yosifon says, the technology has been used to integrate government data from 10 different databases (the civil census registry, the police database of weapons ownership, and so on) to help create leads for investigators by correlating entities.
“We have migrated or created a way of converging those databases into triples, we host that in Citer Link’s triple store called Semantic Server, and we provide link analysis tools over it,” he says.
So, for instance, imagine you come into a crime scene, find a weapon with a registration number, and meet an eyewitness who saw the first two characters of the license plate of a car fleeing the scene. The information in these disparate databases can now be pulled together using graph mining and analysis that helps create information matches. These might lead investigators to the owner of the weapon, and from there to the discovery that the owner’s mother’s second husband owns a car whose license plate begins with those two characters.
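The crime-scene scenario can be sketched as a path search over triples. Everything here is hypothetical: the identifiers, predicate names and the `find_path` helper are made up for illustration, and a plain breadth-first search stands in for whatever graph mining the product actually performs over its triple store.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples merged from several databases.
TRIPLES = [
    ("weapon:W123", "registeredTo", "person:Ana"),
    ("person:Ana", "childOf", "person:Maria"),
    ("person:Maria", "marriedTo", "person:Luis"),
    ("person:Luis", "owns", "car:AB1234"),
    ("person:Pedro", "owns", "car:ZZ9876"),
]


def find_path(triples, start, is_goal):
    """Breadth-first search that treats every triple as an undirected edge.

    Returns the shortest chain [node, predicate, node, ...] from `start` to
    the first node satisfying `is_goal`, or None if no such node is reachable.
    """
    neighbors = {}
    for subj, pred, obj in triples:
        neighbors.setdefault(subj, []).append((pred, obj))
        neighbors.setdefault(obj, []).append((pred, subj))

    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if is_goal(node):
            return path
        for pred, nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [pred, nxt]))
    return None


# The weapon found at the scene, and a plate the witness saw starting with "AB".
lead = find_path(TRIPLES, "weapon:W123", lambda n: n.startswith("car:AB"))
no_lead = find_path(TRIPLES, "weapon:W123", lambda n: n.startswith("car:QQ"))
```

Because every fact is an edge in one graph, adding another database means adding more triples rather than redesigning joins, which is the flexibility the article attributes to the semantic approach.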
Trying to do the same thing without the power of semantics among relational databases would be an incredibly elaborate exercise, and it wouldn’t provide the same flexibility semantic technologies do to integrate data in an evolutionary fashion as needed.
MindCite has just introduced a new semantic annotation tool as part of Citer Link that can help users load relevant documents from Citer and semantically annotate them.
“So in a sense you integrate the insights that come from text or open source intelligence into the big integrated data sets from the databases,” he says. “Semantic technologies help us focus on the actual integration of concepts rather than how to integrate the syntax of data, which is already solved.”
The tool, Yosifon says, is probably the only off-the-shelf semantic solution in its category. But he doesn’t think it will stay that way for very long, as he’s seeing a lot of evidence of the semantic wave rising. “We expect to have to do more to keep ourselves positioned as the leader,” he says. “A lot of people are coming to understand the power of semantic representation.”
