Semantics Help Sort Out Where Tax Dollars are Going
Jennifer Zaino
SemanticWeb.com Contributor
It’s fitting that, shortly after the world learned that an additional $18 million is being spent to redesign the Recovery.gov web site, Expert System USA is delivering a new free service that lets users explore where their tax dollars are going.
The new service is available at Semanticgovernment.org (users will need to request log-in information from Expert System). The company is delivering this service as part of its plans to evangelize semantic web technology to a larger audience by showing what it can accomplish in practice, to go along with the theory that many now grasp.
(To get an idea of how this works, see this slideshow, in video format, on YouTube.)
Semantic Site Indexing
The company, which boasts a semantic network some 350,000 entries, 2.8 million connections and 3,500 rules strong, has semantically indexed sites including Recovery.gov and Data.gov so that users can perform far richer searches than those sites’ keyword search features enable. It leverages its technology, which maps all English words, their definitions and their relationships to one another, to help users connect the dots among all the information at these sites.
Sources of Information
At Semanticgovernment.org, the corpus sources are listed on the left. At some point these may also include third-party corpus sources such as data from the Sunlight Foundation, a non-partisan non-profit dedicated to using the power of the Internet to catalyze greater government openness and transparency. Expert’s crawlers can grab new sources, then collect and build an index like any other enterprise tool that wants to acquire, index and search; the difference is that it does so semantically. The company plans to keep current with the main government corpus sources on a daily basis.
Semantic Precision
In this example, the original search on the word “amount” (see pop-up box) is refined thanks to the technology’s ability to understand that the word has a number of different definitions. Choosing to define “amount” as a quantity of money narrows the number of documents returned to the ones that specifically use the word in that context. That’s a semantic step called precision: “It lets search be more precise based on the context in which you want to review the document,” says Brooke Aker, CEO of Expert System USA. Note the “magic jewel” to the far right of the search box: it’s an “expand concepts” icon. Clicking on it in this instance would broaden the search, connecting the concept of “quantity of money” to every use where a different word (payment, for example) is tied to essentially the same concept.
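To make the difference between a keyword match, a sense-restricted match and an expanded-concept match concrete, here is a minimal Python sketch. It is not Expert System's implementation: the documents, the concept labels such as quantity_of_money, and the search function are all hypothetical, and the sketch assumes each word has already been tagged with the concept it expresses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Doc:
    title: str
    tagged: tuple  # (word, concept_id) pairs; the tagging itself is assumed done

# Hypothetical toy corpus and concept ids, invented for illustration only.
DOCS = [
    Doc("Grant award notice", (("amount", "quantity_of_money"), ("grant", "grant"))),
    Doc("Rainfall report", (("amount", "quantity_general"), ("rain", "rain"))),
    Doc("Contractor invoice", (("payment", "quantity_of_money"),)),
]

def search(docs, word, concept=None, expand=False):
    """Return titles of matching documents.

    concept=None            -> plain keyword match on the word
    concept set             -> the word must carry that sense (precision)
    concept set, expand=True -> any word carrying that sense matches
    """
    hits = []
    for doc in docs:
        for w, c in doc.tagged:
            if concept is None:
                matched = w == word
            elif expand:
                matched = c == concept
            else:
                matched = w == word and c == concept
            if matched:
                hits.append(doc.title)
                break  # one match per document is enough
    return hits

keyword_hits = search(DOCS, "amount")
precise_hits = search(DOCS, "amount", concept="quantity_of_money")
expanded_hits = search(DOCS, "amount", concept="quantity_of_money", expand=True)
```

In this toy corpus the keyword search returns both documents containing “amount”, the sense-restricted search keeps only the grant notice, and the expanded search adds the invoice that says “payment” instead.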
Triples Come To Life
The notion of semantic triples comes to life for the average user: here, every role a word plays within documents is pulled together in its place as subject, action (aka predicate) or object. We chose “announce” from a drop-down list of action words, which recast the lists of subjects and objects to those connected by that word. We then chose “Obama” as the subject and “release” as the object to come up with the semantically matching documents, sorted in real time as we moved ahead.
“There are lots of ways of navigating through this,” says Aker. “The way in which concepts are related, the roles words play in all this — it’s all a more powerful way of navigating than what exists [at these sites] currently. It’s just keyword searches there.”
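The subject, action and object navigation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples, the document ids and the helper names filter_triples and facet_values are hypothetical, and the hard part (extracting triples from text) is assumed to have happened already.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    action: str   # the predicate
    obj: str
    doc: str      # id of the document the triple was found in

# Hypothetical triples, invented for illustration only.
TRIPLES = [
    Triple("Obama", "announce", "release", "doc-1"),
    Triple("Obama", "announce", "plan", "doc-2"),
    Triple("Treasury", "announce", "release", "doc-3"),
    Triple("Obama", "sign", "bill", "doc-4"),
]

def filter_triples(triples, subject=None, action=None, obj=None):
    """Keep triples matching every role that was specified (None = any)."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (action is None or t.action == action)
        and (obj is None or t.obj == obj)
    ]

def facet_values(triples, field):
    """Distinct values left for one role, e.g. to refill a drop-down list."""
    return sorted({getattr(t, field) for t in triples})

# Step 1: pick the action; the subject list narrows to what co-occurs with it.
announced = filter_triples(TRIPLES, action="announce")
remaining_subjects = facet_values(announced, "subject")

# Step 2: pick subject and object to reach the matching documents.
matches = [t.doc for t in filter_triples(
    TRIPLES, subject="Obama", action="announce", obj="release")]
```

Each choice simply adds one more constraint, which is why the remaining subjects and objects can be recomputed after every selection.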
Search Place, Organizations, People
How’s New York State — or other locations — faring when it comes to their piece of the recovery pie? Among the options the service supports is searching places (see right of slide), as well as organizations or people (or some combination thereof) to get insight into what’s being spent where and how that might relate to spending in other parts of the country. “Think about Recovery.gov or Data.gov as what do you want to know as an investigative reporter or average citizen?” says Aker.
Synopsis View
Users can get a synopsis of any of their search results noting applicable domains (see left slide), main lemmas (see middle) represented in the document (a lemma being the citation form of a word, which in English is the singular for a noun or the base form for a verb), and the document’s relation to other documents and other entities (see right slide). It’s all about shedding light on previously very opaque areas, which is what the Obama administration has said it wants to do.
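As a rough illustration of what “main lemmas” means, the Python sketch below reduces each word to its citation form and counts the most frequent ones. The lookup table and the function names are hypothetical and deliberately tiny; a real system would use a full morphological lexicon rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical inflected-form -> lemma table, invented for illustration only.
LEMMAS = {
    "funds": "fund",
    "funded": "fund",
    "states": "state",
    "spent": "spend",
}

def lemma_of(word):
    """Lowercase the word and map it to its lemma; unknown words map to themselves."""
    w = word.lower()
    return LEMMAS.get(w, w)

def main_lemmas(text, top=3):
    """Return the `top` most frequent lemmas as (lemma, count) pairs."""
    words = [w.strip(".,;:!?\"'()") for w in text.split()]
    counts = Counter(lemma_of(w) for w in words if w)
    return counts.most_common(top)

summary = main_lemmas("States spent funds. The fund funded states.", top=2)
```

Here “funds”, “fund” and “funded” all count toward the lemma “fund”, and “States” and “states” toward “state”, so those two lemmas top the list.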
“Semantics is an important element of streamlining cost savings, of being open and transparent,” says Aker, who says he can also see semantics being applied to applications such as the declassification of documents. “You can have a small army of people who can read through all these and determine what is sensitive and what isn’t,” he says, “or employ some technology that can sort the piles between what clearly can be declassified and what needs a human being to look through it, so that the mountain becomes a molehill to be reviewed.”
