Dr. Graham G. Rong, founder of IKA LLC and senior industrial liaison officer at the MIT Corporate Relations Office, where he leads collaboration between the institute and industry, has been working on a semantic web approach to social and financial analysis. It draws on digital financial data and other company-related information that can be found on the Internet. The approach first converts XBRL data from SEC reports into RDF, and then links it with the relevant social information in the company’s ecosystem to deliver more business value.

The project, which began at MIT (see our earlier story here), has advanced to the application stage, and the software is moving from a Java interface to a browser-based one. Rong says the team is also developing a web services API for the system.

“Current XBRL technology primarily collects financial data for reporting, and secondarily, as more XBRL-based financial data becomes available, it will need to effectively extract financial data for value,” says Rong. Semantic web technology shifts the focus to the latter: extracting value from the data.

By converting XBRL to RDF while still using the XBRL taxonomy and its data elements, and by adding an ontology that Rong’s team has developed to capture the meaning of all this information, the system sets the stage for more powerful and flexible queries.
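To make the conversion step concrete, here is a minimal sketch of how facts in an XBRL instance might be flattened into RDF-style triples. It is not the team’s converter: the sample filing, the `http://example.org/...` namespaces, the all-zero CIK and the `xbrl_to_triples` helper are assumptions made up for illustration, and the sketch ignores reporting periods and units, which a real mapping would have to carry.

```python
import xml.etree.ElementTree as ET

# Real XBRL instance namespace, written in ElementTree's {uri}tag notation.
XBRLI = "{http://www.xbrl.org/2003/instance}"
# Hypothetical namespace for company URIs (an assumption, not the team's ontology).
EX = "http://example.org/fin#"

# Made-up filing; the taxonomy namespace and the CIK are placeholders.
XBRL_SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:gaap="http://example.org/us-gaap">
  <context id="FY">
    <entity>
      <identifier scheme="http://www.sec.gov/CIK">0000000000</identifier>
    </entity>
  </context>
  <gaap:Revenues contextRef="FY" unitRef="USD">1000000</gaap:Revenues>
  <gaap:NetIncomeLoss contextRef="FY" unitRef="USD">150000</gaap:NetIncomeLoss>
</xbrl>"""


def xbrl_to_triples(xml_text):
    """Turn each XBRL fact into a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)

    # Map each context id to the entity identifier it declares.
    entity_by_context = {}
    for ctx in root.iter(XBRLI + "context"):
        ident = ctx.find(f"{XBRLI}entity/{XBRLI}identifier")
        entity_by_context[ctx.get("id")] = ident.text.strip()

    triples = []
    for elem in root:
        ctx_ref = elem.get("contextRef")
        if ctx_ref is None:
            continue  # contexts, units, etc. are not facts
        # elem.tag looks like "{namespace}LocalName".
        ns, _, local = elem.tag[1:].partition("}")
        subject = f"<{EX}company/{entity_by_context[ctx_ref]}>"
        predicate = f"<{ns}#{local}>"  # the taxonomy element becomes the predicate
        obj = f'"{elem.text.strip()}"'
        triples.append((subject, predicate, obj))
    return triples


if __name__ == "__main__":
    for s, p, o in xbrl_to_triples(XBRL_SAMPLE):
        print(f"{s} {p} {o} .")  # N-Triples-style lines
```

Keeping the taxonomy element name as the predicate mirrors the article’s point that the XBRL taxonomy is still in use after the move to RDF.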

“You can extract any available information as needed, avoid hard-coded requests, and more importantly, you can establish relations among data and refine results based on previous results,” Rong says. That can be put to work for purposes such as stock research, immediate response to reported data, company comparisons, and so on.

As Rong points out, not everything that fund managers want to know exists within a current SEC data filing. “We convert XBRL into RDF, then combine that data with other information from the Internet like Yahoo Finance, DBpedia, even Twitter and Google News and Finance and the NY Times,” he says. “I call it linking all the open financial data and social data.” There is a key identifier in an XBRL file that usually can lead you to the company profile in DBpedia, for instance: “Then with the feature you can see, for example, who is on the board of Company ABC, how many subsidiary companies they have, and so on – this information is not in the SEC report. So the value of the system is getting all this information in its ecosystem and mapping it with RDF data.”
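The linking idea can be sketched in a few lines. The example below assumes, purely for illustration, that the shared key is a CIK-like number stored under a hypothetical `ex:cik` predicate on both sides; the article does not say which identifier the system uses or how the DBpedia profile is matched, and every name and value here is made up.

```python
# Triples as plain (subject, predicate, object) tuples; all values are invented.
filing_triples = [
    ("sec:0000000000", "ex:cik", "0000000000"),
    ("sec:0000000000", "ex:Revenues", "1000000"),
]
profile_triples = [
    ("dbp:Company_ABC", "ex:cik", "0000000000"),
    ("dbp:Company_ABC", "ex:boardMember", "Jane Doe"),
    ("dbp:Company_ABC", "ex:subsidiary", "ABC Logistics"),
]


def link_by_identifier(filing, profiles, id_predicate="ex:cik"):
    """Link filing entities to profile entities that share an identifier."""
    profile_by_id = {o: s for s, p, o in profiles if p == id_predicate}
    return [
        (s, "owl:sameAs", profile_by_id[o])
        for s, p, o in filing
        if p == id_predicate and o in profile_by_id
    ]


def describe(subject, triples):
    """Return every (predicate, object) pair recorded for a subject."""
    return [(p, o) for s, p, o in triples if s == subject]


if __name__ == "__main__":
    for filing_entity, _, profile_entity in link_by_identifier(
        filing_triples, profile_triples
    ):
        # Facts that are not in the filing become reachable through the link.
        print(filing_entity, "->", profile_entity)
        for predicate, value in describe(profile_entity, profile_triples):
            print("   ", predicate, value)
```

Once the link exists, a question about the filing entity can be extended to board members or subsidiaries held in the external profile, which is the kind of ecosystem information Rong describes.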

For instance, imagine a company with key suppliers offshore, and now imagine political turmoil or natural disasters in the region where it sources most of its materials. That could have an impact on production and sales. “You can get to know that before they file their financial quarterly reports,” Rong says. “If I am a fund manager, I need to take action to avoid the loss. Similarly, in other opposite situations, you can take action to maximize the gain.”

While it is possible to light upon this information today, without the aid of semantic web technology, “it takes a lot of human power and resources to do it,” Rong says. “This system won’t replace an analyst’s job but it can reduce the workload.”

The tool currently offers three query levels. The first is a free-form SPARQL query mode, which makes it possible to query any information available in an SEC report but requires users to be comfortable with SPARQL and the structure of RDF. The second is a SPARQL query wizard, where the system supports users in constructing a full range of queries with feature choices, but there are some restrictions on data access. The third consists of out-of-the-box, pre-constructed SPARQL queries around topics like cash flows and income statements, where no semantic web expertise is required.
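As a rough illustration of how the second and third levels could sit on top of the first, the sketch below builds SPARQL text from a restricted set of choices and wraps one such choice as a ready-made income statement query. The `ex:` vocabulary, the whitelist and both function names are hypothetical; the article does not describe the tool’s actual schema or API.

```python
PREFIX = "PREFIX ex: <http://example.org/fin#>"

# Hypothetical whitelist standing in for the wizard's restricted feature choices.
ALLOWED_CONCEPTS = {
    "Revenues",
    "NetIncomeLoss",
    "NetCashProvidedByOperatingActivities",
}


def build_query(cik, concepts, limit=None):
    """Wizard level: assemble a SPARQL SELECT from validated choices."""
    if not cik.isdigit():
        raise ValueError("cik must contain digits only")
    if not concepts:
        raise ValueError("choose at least one concept")
    unknown = [c for c in concepts if c not in ALLOWED_CONCEPTS]
    if unknown:
        raise ValueError(f"concepts not offered by the wizard: {unknown}")

    variables = " ".join(f"?{c}" for c in concepts)
    patterns = "\n".join(f"  ?company ex:{c} ?{c} ." for c in concepts)
    query = (
        f"{PREFIX}\n"
        f"SELECT {variables}\n"
        "WHERE {\n"
        f'  ?company ex:cik "{cik}" .\n'
        f"{patterns}\n"
        "}"
    )
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


def income_statement_query(cik):
    """Pre-constructed level: the user supplies only a company identifier."""
    return build_query(cik, ["Revenues", "NetIncomeLoss"])


if __name__ == "__main__":
    print(income_statement_query("0000000000"))
```

A free-form user would write the printed query text by hand; the other two levels trade that flexibility for guardrails, which matches the restrictions on data access the article mentions for the wizard.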

Automatic event-based alerts to fund managers can be triggered – “the system follows the idea of the semantic web: Automatic, machine-readable interactions,” Rong says. “There’s no human being involvement from event to business user, for the entire process.”