Photo courtesy: Flickr/ FilterForge

The Aite Group, which provides research and consulting services to the international financial services market, spends its fair share of time exploring the data and analytics challenges the industry faces. Senior analyst Virginie O’Shea commented on many of them during a webinar this week sponsored by enterprise NoSQL vendor MarkLogic.

The picture she sketched out has many parts: dealing with multiple data feeds from a variety of systems; feeding information to hundreds of end users with different priorities about what they need to see and how they need to see it; the lack of a common internal taxonomy across the organization that would enable a single identifier for particular data items; the toll that ETL, cleansing, and reconciliation can take on agile data delivery; and limitations in cross-referencing and linking instruments and data to other data, which exact a price on data governance and quality.

Things aren’t going to get any easier, either, as both the volume of electronic data and the regulatory requirements around financial data increase in Europe and the States. “We are looking at much more data to get a handle around, and as part of shadow bank regulations we’ll see more data reporting in particular areas,” she said. “The profile of data management is being raised but it’s proving challenging.”

In addition to improving processes around data aggregation, and being better able to slice and dice that data as necessary, the basic accuracy and reliability of data are a concern. Among the issues that financial services data management executives are worrying about, she noted, is data lineage. Being able to pinpoint the source of data and its path through systems can help with audit trails that let risk managers know they are making the right decisions based on the right data, as well as help meet regulatory requirements.

In fact, financial services firms may be able to lower risk management demands if they can prove they have a better handle on their own data with more accurate risk calculations.

The MarkLogic NoSQL document database has capabilities that can help with such challenges, according to Amir Halfon, MarkLogic CTO, Financial Services, who participated in the webinar. “Those familiar with RDF and some other semantic standards, we now support those as well, so you can look at it as a cross between a document store and a semantic triple store,” he said. “In fact, this is one way that data lineage and provenance can be supported.” Triple stores help users to connect the dots of data lineage and data journeys with joins across relationships on objects – where the data comes from, who was the last person to touch it, and what was the latest transformation it went through, for example.

Halfon also noted that triples can be added easily as data is ingested. “That is one way to handle data provenance – as part of ingestion as well as when data changes through the system,” he said.
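
To make the idea concrete, here is a minimal sketch of provenance recorded as subject-predicate-object triples at ingestion time and then walked back to the source. It is a toy in-memory illustration in plain Python, not MarkLogic's API or RDF syntax; the predicate names (`cameFrom`, `lastTouchedBy`, `lastTransformation`, `derivedFrom`), the record identifiers, and the helper functions are all hypothetical.

```python
# Toy in-memory triple store: each fact is a (subject, predicate, object) tuple.
# All names below are illustrative assumptions, not a real product's API.
triples = set()


def ingest(record_id, source, user, transformation, derived_from=None):
    """Record provenance triples when a record is ingested or transformed."""
    triples.add((record_id, "cameFrom", source))
    triples.add((record_id, "lastTouchedBy", user))
    triples.add((record_id, "lastTransformation", transformation))
    if derived_from is not None:
        triples.add((record_id, "derivedFrom", derived_from))


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def lineage(record_id):
    """Follow derivedFrom links from a record back to its original version."""
    path = [record_id]
    seen = {record_id}
    while True:
        parents = objects(path[-1], "derivedFrom")
        # Stop at the original record, or if a cycle is detected.
        if not parents or parents[0] in seen:
            return path
        path.append(parents[0])
        seen.add(parents[0])


# Provenance is added as part of ingestion and again at each change.
ingest("trade:1:raw", "feed:vendorA", "etl-service", "none")
ingest("trade:1:clean", "feed:vendorA", "jsmith", "currency-normalized",
       derived_from="trade:1:raw")
ingest("trade:1:risk", "risk-engine", "risk-batch", "var-calculated",
       derived_from="trade:1:clean")

print(lineage("trade:1:risk"))
print(objects("trade:1:clean", "lastTouchedBy"))
```

Calling `lineage("trade:1:risk")` answers the "where did this come from" question by chaining relationships, while `objects(...)` answers "who touched it last" or "what was the latest transformation" for any single step. A real triple store would express the same questions as queries over stored RDF rather than Python loops.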