E-Discovery Gets Boost From Semantics
Jennifer Zaino
SemanticWeb.com Contributor
Semantic technologies have a role to play in the e-discovery space, potentially saving companies a lot of money and headaches when it comes to producing required information that’s in electronic form for the courts. Ten years ago, if information wasn’t on paper it didn’t really exist from the standpoint of evidence at trial. Things are dramatically different today, with the explosion in electronically stored data.
That leaves many companies grappling with the issues of determining how to preserve data when there is a duty to do so, how to collect it in ways that maintain its authenticity and integrity in terms of being able to introduce it into evidence at a later point, and, once it’s collected, how to identify data that is relevant to a particular matter, and produce it in the format requested by the opposing party.
E-discovery solutions provider Fios recently enhanced its e-discovery services with Content Analyst’s CAAT conceptual search and analytical software to help clients search and analyze large amounts of electronically stored information during review and in the early stages of e-discovery planning. The goal is to make the e-discovery process more efficient, enable organizations to meet deadlines for producing relevant data, and reduce their risk (for example, the risk of exposing themselves to adverse consequences by producing data they are not required to produce).
Fios first introduced concept-based searching in 2003 through its partnership with intelligent search vendor Engenium (since acquired by Kroll Ontrack), but CAAT adds capabilities around scalability and efficiency.
“In terms of scalability and performance, its speed of search results, speed of indexing, maintaining the index — those things allow us to continue to offer similar capabilities but more efficiently, robustly, with faster search results,” says Brad Harris, director, Discovery Center of Excellence, Fios Consulting. “In addition, longer-term, Content Analyst also has capabilities around concept-clustering, the way you can manipulate the information once you have a conceptual space built in order to better understand and provide better analytics around a population of documents.”
CAAT’s concept-based searching approach is based on latent semantic indexing (LSI), also known as latent semantic analysis, a natural language processing technique that discovers relationships among documents and the terms in them by producing a set of concepts related to those documents and terms. Within a group of documents, some terms may seem to be highly related to one another in context, says Harris; for example, a company might use an abbreviation and a code name to refer to the same project.
“So, one of the benefits from the discovery standpoint is, I go in and start grouping things related to one another based on how a company talks about those things. If I can get that much closer to the actual context that the documents are part of, I am that much better at finding things relevant to a particular search query,” Harris says. LSI provides a more automated and scalable way of discovering the relationships between terms based on the documents themselves, without anyone having to hand-write rules to interpret those terms in context.
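To make the abbreviation-and-code-name case concrete, here is a minimal sketch of LSI in Python. It is not CAAT’s implementation. The six-document corpus and the terms “apollo” (standing in for a code name) and “prj7” (standing in for an abbreviation) are invented for illustration, the matrix holds raw term counts with no weighting, and a small power-iteration routine stands in for a production SVD library so that the example needs only the standard library.

```python
import math

# Hypothetical toy corpus: "apollo" and "prj7" never share a document,
# but they appear alongside the same surrounding vocabulary.
DOCS = [
    "apollo valve design review",
    "prj7 valve design defect",
    "apollo valve defect report",
    "prj7 design review report",
    "lunch menu friday",
    "friday lunch order menu",
]

def term_doc_matrix(docs):
    """Return (sorted vocabulary, one row of raw counts per term)."""
    vocab = sorted({w for d in docs for w in d.split()})
    rows = [[d.split().count(t) for d in docs] for t in vocab]
    return vocab, rows

def top_eigenpairs(sym, k, iters=300):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration with deflation."""
    n = len(sym)
    m = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(n)]  # fixed start keeps runs deterministic
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: subtract the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                m[i][j] -= lam * v[i] * v[j]
    return pairs

def lsi_term_vectors(docs, k=2):
    """Map each term to its coordinates in a k-dimensional concept space."""
    vocab, a = term_doc_matrix(docs)
    n = len(docs)
    # Document Gram matrix A^T A; its eigenvectors are the right singular vectors of A.
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    # Term coordinates are A v_k, i.e. singular value times left singular vector.
    return {t: [sum(row[j] * v[j] for j in range(n)) for _, v in pairs]
            for t, row in zip(vocab, a)}

def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    return dot / (math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y)))

if __name__ == "__main__":
    vocab, rows = term_doc_matrix(DOCS)
    raw = dict(zip(vocab, rows))
    concept = lsi_term_vectors(DOCS, k=2)
    print("raw co-occurrence:", cosine(raw["apollo"], raw["prj7"]))
    print("concept space:    ", cosine(concept["apollo"], concept["prj7"]))
    print("apollo vs lunch:  ", cosine(concept["apollo"], concept["lunch"]))
```

In the raw counts the two project terms have a similarity of zero, because they never appear in the same document. In the two-dimensional concept space their similarity should come out close to 1, while the similarity between “apollo” and “lunch” stays near zero. No rule ever stated that the two names refer to the same project; the association comes from the shared context alone.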
“If we now think about how I use concept searching in e-discovery applications, I’m looking for certain things related to an allegation. So if someone comes and says I got hurt using your product and I think it is a design defect, how do you find stuff related to that claim?” Harris says. You might start looking for product names or maybe project names for the product before it was branded, or maybe a project team talked about it in a certain way. “With straight keyword search, I’d have to think about how might I have referred to that design. If I can use concept searching to go in and say, ‘find things related to this phrase or product name,’ it goes out and says, based on usage within this company, here are words or phrases closely associated with that same term.”
At the same time, concept searching makes a culling strategy more efficient than keyword searching does. Legal teams can use it earlier in discovery to negotiate with the requesting party over which evidence is relevant and must be produced and, by extension, which evidence they do not have to produce.
“The technology lets you better understand a population of documents and refine a description of what is considered relevant and what is not. This is negotiating the scope of discovery,” Harris says. Legal teams can use the technology to organize the documents in a logical fashion, so that a reviewer can say “this set of documents are all related to the same concept, so I can say that group of documents is now all responsive to this request and are in, or they are about going to lunch and that is unresponsive and can be excluded.”
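The group-then-decide workflow Harris describes can be sketched in a few lines of Python. This is an illustrative assumption, not how Fios or CAAT cluster documents: the document IDs and two-dimensional concept coordinates below are made up (in practice they would come from an LSI index), and the grouping is a simple greedy pass with a hypothetical similarity threshold of 0.9.

```python
import math

# Hypothetical concept-space coordinates; real ones would come from an LSI index.
DOC_VECTORS = {
    "email-001": [0.92, 0.05],
    "memo-014": [0.88, 0.10],
    "email-207": [0.03, 0.95],
    "report-033": [0.90, 0.02],
    "email-208": [0.08, 0.91],
}

def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    return dot / (math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y)))

def group_by_concept(vectors, threshold=0.9):
    """Greedy grouping: a document joins the first group whose seed it resembles."""
    groups = []  # each entry is (seed vector, list of document ids)
    for doc_id, vec in vectors.items():
        for seed, members in groups:
            if cosine(seed, vec) >= threshold:
                members.append(doc_id)
                break
        else:
            # No existing group was similar enough, so this document seeds a new one.
            groups.append((vec, [doc_id]))
    return [members for _, members in groups]

def tag_groups(groups, responsive_indexes):
    """Apply one reviewer decision per group to every document in that group."""
    return {doc_id: ("responsive" if i in responsive_indexes else "excluded")
            for i, members in enumerate(groups)
            for doc_id in members}

if __name__ == "__main__":
    groups = group_by_concept(DOC_VECTORS)
    for doc_id, tag in tag_groups(groups, responsive_indexes={0}).items():
        print(doc_id, tag)
```

The design point is that the reviewer makes one decision per concept group rather than one per document, which is where the culling efficiency comes from.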
