
Semantic’s Role in Making Government Data Transparent

Jennifer Zaino
SemanticWeb.com Contributor

The Obama administration has said that it’s committed to transparency and accountability when it comes to how the government is spending taxpayers’ money. As that money is dispensed to deal with, say, a crippled financial system, there’s hope that semantic tools can be leveraged to make some important links among financial data that’s translated into the RDF format from the XML-based XBRL (eXtensible Business Reporting Language) markup standard.

Vivek Kundra, the first U.S. chief information officer, is establishing a Data.gov site that will put vast amounts of government information into the public domain. The semantic web has a role to play in delivering on that vision of creating transparency, efficiency, and greater collaboration around this information. And, when it comes to providing data in an open and standardized manner so that citizens and agencies can mine it, make comparisons, and perform analytics across government sectors, the semantic web needs XML and specifically XBRL for financial data, says Diane Mueller, vice chair, XBRL International Steering Committee and VP, XBRL Development, at JustSystems. XBRL provides the layer and the foundation that lets all parties map to a shared taxonomy standard, so that disparate sources of financial data can be mapped against each other without human intervention.

“The whole possibility of all the mash-ups you can do from all this data and create interesting new knowledge is pretty phenomenal, and applications we can’t even imagine yet will be created,” Mueller says.

But she has some ideas. Mueller, who along with JustSystems’ Dave Raggett is organizing a W3C workshop on the semantic web and XBRL to be held in June, has been working on the XBRL financial data standard for ten years, and her interest in semantics and linguistics goes back to her university days. She imagines, as one example, how anyone using semantic tools that are already available today might search XBRL-tagged data converted to RDF for information in bailout baby AIG’s SEC filings (say, how much money it has allocated as bonuses) and then mash that up with other government data that might also leverage the XBRL standard in documents describing what funds have been made available to the insurer.
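As a rough illustration of the kind of mash-up Mueller describes, the sketch below turns a few XBRL-style tagged facts into RDF-like triples and merges them with a second data set about the same entity. Everything in it is an assumption made for illustration: the namespaces, the element names, the company URI, and the figures are invented, the XML is far simpler than a real XBRL instance, and plain Python tuples stand in for a real RDF store.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified XBRL-style instance. The namespace,
# element names, and figures are invented for illustration only.
FILING_XML = """
<xbrl xmlns:ex="http://example.org/taxonomy#">
  <ex:BonusesPaid contextRef="FY2008" unitRef="USD">1000</ex:BonusesPaid>
  <ex:Revenues contextRef="FY2008" unitRef="USD">5000</ex:Revenues>
</xbrl>
"""


def facts_to_triples(xml_text, entity_uri):
    """Turn each tagged fact into a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for fact in root:
        # ElementTree renders a namespaced tag as "{namespace}local".
        namespace, local = fact.tag[1:].split("}")
        triples.append((entity_uri, namespace + local, float(fact.text)))
    return triples


def describe(triples, subject):
    """Collect every statement about one subject, keyed by concept name."""
    return {
        predicate.split("#")[1]: value
        for subj, predicate, value in triples
        if subj == subject
    }


# Hypothetical company URI shared by both data sets; the shared identifier
# is what lets the two sources be linked without manual matching.
company = "http://example.org/company/acme"

filing_triples = facts_to_triples(FILING_XML, company)

# A second, independently published (and equally invented) data set.
government_triples = [
    (company, "http://example.org/gov#FundsReceived", 20000.0),
]

merged = describe(filing_triples + government_triples, company)
print(merged)
```

The point of the sketch is only that once both sources use URIs for the same entity and concepts, combining them is a matter of pooling statements rather than reconciling formats by hand.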

Think about having all that linked data available and the power of a citizen’s network scrutinizing this, and how that might contribute to what the regulatory agencies themselves are doing, Mueller notes. JustSystems, which is sponsoring the W3C workshop, includes in its product set XMetaL, an XML authoring tool, and xfy, which leverages open standards for component document authoring in a platform that enables creating semantic mashups and using multiple vocabularies, including XBRL, to create reports.

Mueller, who attended the recent eGov W3C workshop in Washington, D.C., says she’s encouraged by the energy she saw from government officials who want to realize the vision of making government data (not just budgets and funding information) transparent for interagency and citizen use. Mueller sees a parallel: the Enron and WorldCom financial scandals earlier in the decade drove the initial push for the XBRL standard as a way to deliver greater insight into companies’ financial reporting, and today’s much larger financial crisis is providing an impetus for XBRL to be incorporated into the federal government’s plans for pushing data sets out onto the Internet from its systems. In that respect the U.S. government may be following the lead of other countries, from the U.K. to Spain to China to Australia, where Mueller says there’s a deeper use of XML and XBRL across government enterprises and financial institutions, including the Shanghai Stock Exchange.


Concerns about data privacy and security may have held the U.S. back a bit longer, as has the lack of a big push from the top to move in this direction. Now that there is a government mandate to expose and share connecting pieces of data on the web at the federal level, Mueller is hopeful e-government efforts will take advantage of XML and XBRL, when appropriate, to expose that data.

“The data has always been there but there just were not good, efficient methods to make it available and make sure not too much goes out the door that might violate privacy or security,” says Mueller. “I think that’s what XML does — when you export to XML you are not just dumping raw data that people could cull through and look for random things. It gives a structured way to get exports done. And then having a vetted, well-known standard for financials like XBRL lets them ensure that data that goes out the door is high quality and can be validated, and that gives them the confidence the data they export and expose is mapped to certain fields.” There are at least a dozen tools out there for mapping disparate data to XBRL, so another advantage is that underlying systems don’t have to be changed.

The opportunities for agencies and citizens alike to mash up this data in new and interesting ways, to throw light on problems as well as perhaps find unexpected connections that could point the way to solutions, are limitless.

“Right now people are doing a lot with brute force,” says Mueller. “But once data is linked and things like URIs and RDF [take hold], and that’s happening very, very fast, then this world that we envision can actually happen.”
