
Linguistic Technologies and Big Data

Ron Powell recently interviewed Jonathan Litchman, an SVP at SAIC, regarding the role of linguistic technologies in Big Data. During the interview Litchman commented, “Big data, as you know, is a term used for lots of different things. When I think about big data, it depends on how big you want to get. If you think about the vast amounts of data that people need to be able to handle in only one language, you have tremendous big data issues; but if you understand that the most effective use of big data is to be more inclusive and make that big data more global, then you have a situation in which your data increases exponentially with the inclusion of multiple languages within that dataset.”

Litchman continued, “Our Omnifluent products help people who want to do analytics and mining on big data be able to do so without having to confront the barriers that different languages pose. Whether it’s multilingual search, translation, summarization, or automatic alignment of a transcript with video or audio, big data has to expand beyond single language capability in order to be able to understand what’s useful within that big data. There are several features of the product that I think are special. The first is that the translation technology that underlies the Omnifluent platform is really a true hybrid machine translation capability. It’s a combination of machine translation that includes rules-based and statistical engines, each of these engines working together as one within a single decision engine.”

He went on, “An even more interesting feature of this linguistic platform that Omnifluent has is that in addition to that hybrid nature of translation, it also unifies text as well as speech on a single platform. Omnifluent provides automatic speech recognition and machine translation in a hybrid approach that fuses all of these components together, sharing linguistic resources to avoid the problem of compounding errors that result from integrating different pieces of technology. Since these sit on a single unified platform, they share all of the linguistic resources to provide the best possible output.”

Read more here.

