By Dan McCreary on June 14, 2011 11:00 AM
I observed three trends at SemTech 2011 in San Francisco last week. The first was the growing use of native XML databases in combination with RDF data stores. The second was the large number of natural-language processing tools and vendors at the conference. The third was the use of semantic annotations and standards directly in web content. I think these trends are related.
One of the keynote presentations at the SemTech 2011 conference was given by the BBC. They described their core architecture for managing web content as having two main components: a native XML database (MarkLogic) for content and an RDF triple store for “metadata.” These tools were at the core of the architecture for their web sites.
Another presentation was given by the Mayo Clinic. They also use MarkLogic for web content, together with semantic web technologies. Their diagrams show that there are many ways for these systems to interact.

By Dan McCreary on January 12, 2009 8:08 PM
Entity extraction is the process of automatically extracting document metadata from unstructured text documents. Extracting key entities such as person names, locations, dates, specialized terms and product terminology from free-form text can not only improve keyword search but also open the door to semantic search, faceted search and document repurposing. This article defines the field of entity extraction, describes some of the technical challenges involved, and shows how RDF can be used to store document annotations. It then shows how new tools such as Apache UIMA are poised to make entity extraction much more cost-effective for organizations.
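To make the idea concrete, here is a minimal sketch in Python of extracting entities from free-form text and recording them as RDF annotations in N-Triples form. It is not the approach from the full article and it does not use Apache UIMA: the regular-expression patterns are toy stand-ins for a real extractor, and the `http://example.org/` namespace, the predicate names (`hasAnnotation`, `entityType`, `text`, `start`) and both function names are assumptions made up for illustration.

```python
import re

# Hypothetical namespace and predicate names, used only for illustration.
EX = "http://example.org/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Toy patterns: real extractors rely on dictionaries, grammars or
# statistical models rather than two regular expressions.
PATTERNS = {
    "date": re.compile(r"\b(?:" + MONTHS + r") \d{1,2}, \d{4}\b"),
    "person": re.compile(r"\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+(?: [A-Z][a-z]+)?"),
}


def extract_entities(text):
    """Return (entity_type, matched_text, start, end) tuples sorted by offset."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((entity_type, match.group(0), match.start(), match.end()))
    return sorted(found, key=lambda entity: entity[2])


def to_ntriples(doc_id, entities):
    """Describe each extracted entity as an annotation node, one triple per line."""
    doc = f"<{EX}doc/{doc_id}>"
    lines = []
    for index, (entity_type, value, start, _end) in enumerate(entities):
        annotation = f"<{EX}doc/{doc_id}#ann{index}>"
        # Escape backslashes and double quotes for an N-Triples string literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f"{doc} <{EX}hasAnnotation> {annotation} .")
        lines.append(f'{annotation} <{EX}entityType> "{entity_type}" .')
        lines.append(f'{annotation} <{EX}text> "{escaped}" .')
        lines.append(f'{annotation} <{EX}start> "{start}"^^<{XSD_INTEGER}> .')
    return lines


if __name__ == "__main__":
    sample = "Dr. Jane Smith spoke in San Francisco on June 14, 2011."
    for triple in to_ntriples("post-1", extract_entities(sample)):
        print(triple)
```

Keeping the annotations as separate triples that point back at the document, rather than editing the text itself, means the same content can be annotated by several extractors and queried later from a triple store.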