
Posts Tagged ‘XML’

An Example of Simple Federated Queries with RDF

Bob DuCharme, author and speaker, has provided an excellent example of one of the benefits RDF has over XML. In his example, DuCharme shows how to perform a simple federated query with RDF across two different address books. He writes, “Once, at an XML Summer School session, I was giving a talk about semantic web technology to a group that included several presenters from other sessions. This included Henry Thompson, who I’ve known since the SGML days. He was still a bit skeptical about RDF, and said that RDF was in the same situation as XML—that if he and I stored similar information using different vocabularies, we’d still have to convert his to use the same vocabulary as mine or vice versa before we could use our data together.” Read more
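To make the idea concrete, here is a minimal sketch in plain Python. It is not DuCharme's actual example: the two address books, the `v1:` and `v2:` vocabulary prefixes, and the mapping triples are all hypothetical, and merging in-memory sets stands in for querying separate sources. It illustrates the general point that a few extra triples relating two vocabularies can let one query span both data sets without converting either.

```python
# Hypothetical data: two address books that use different vocabularies.
BOOK_A = {
    ("ex:alice", "v1:email", "alice@example.org"),
    ("ex:alice", "v1:name", "Alice"),
}
BOOK_B = {
    ("ex:bob", "v2:mbox", "bob@example.org"),
    ("ex:bob", "v2:fullName", "Bob"),
}

# Hypothetical mapping triples: each v2 property is declared a
# sub-property of the corresponding v1 property.
MAPPING = {
    ("v2:mbox", "rdfs:subPropertyOf", "v1:email"),
    ("v2:fullName", "rdfs:subPropertyOf", "v1:name"),
}


def infer_subproperties(triples):
    """Return the triples plus (s, super, o) for each (s, sub, o).

    Only one level of sub-property is followed; this sketch does not
    compute the transitive closure.
    """
    supers = {}
    for s, p, o in triples:
        if p == "rdfs:subPropertyOf":
            supers.setdefault(s, set()).add(o)
    inferred = set(triples)
    for s, p, o in triples:
        for sup in supers.get(p, ()):
            inferred.add((s, sup, o))
    return inferred


def values(triples, predicate):
    """Return sorted (subject, object) pairs for one predicate."""
    return sorted((s, o) for s, p, o in triples if p == predicate)


# Neither address book is rewritten; the mapping triples are simply added.
merged = infer_subproperties(BOOK_A | BOOK_B | MAPPING)
emails = values(merged, "v1:email")
```

Asking for `v1:email` now returns entries from both books, even though the second book never used that property name.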

SemTechBiz is Less Than 2 Weeks Away

The Semantic Tech & Business Conference (SemTechBiz) is coming to San Francisco on June 3-7! Join us for case studies, innovative panels, tutorials, and keynotes that will provide you with practical advice, hands-on guidance, and breakthrough approaches to solving business problems with semantic technology. Passes go up $200 at the door. Sign up now and save!

RDF Support in IBM’s DB2


We caught up with Bernie Spang, IBM’s Director, Strategy and Marketing, Database Software and Systems, to discuss the latest release of its enterprise data products DB2 and InfoSphere. Version 10 of both products has just been released. DB2 is used by thousands of organizations worldwide and comes in flavors ranging from a free version that maxes out at 2GB storage to systems that support large enterprises (Coca-Cola was an early adopter of DB2 version 10, and is already reporting cost savings of over $1 million).

The latest version of DB2 is the first in four years and represents a significant release, according to Spang: “This is a culmination of four years of effort by hundreds of engineers in IBM Research and Software Development Labs around the world; we also had more than 100 clients and over 200 business partners involved in the ‘early access program’ to help deliver this software, with the fundamental goal of delivering faster, easier, lower-cost data management.”

The early testing is showing positive results, with customers reporting “up to 10x faster data warehouse queries; freeing up to 90% of storage space using compression; and 98% code compatibility with Oracle Database, which makes it easier to migrate from Oracle to IBM software without changing data or retraining staff.”

For our readers, though, one of the more intriguing new features of DB2 is its built-in support for RDF. While semantics is not new to IBM — IBM Watson has gained particular fame — the appearance of RDF support in such a widely used, stable, enterprise database system is exciting.
Read more

Semantic Case Study: EPIM ReportingHub

On Tuesday the E&P Information Management Association (EPIM) launched EPIM ReportingHub (ERH), an interesting semantic technology project in the field of oil and gas. According to the project website, ERH is “a very flexible knowledgebase for receiving, validating (using NPD’s Fact Pages and PCA RDL), storing, analysing, and transmitting reports. The operators shall send XML schemas for DDR, DPR and MPR to ERH and ERH sends DDR and MPR as XML schemas to the NPD/PSA and all three reports as PDF to EPIM’s License2Share (L2S). The partners may download all three reports and/or any data from one or more reports through flexible queries. Some parts of ERH will be in operation already in November 2011 and the rest as soon as the authorities and the industry are ready for it. ERH is owned and operated by EPIM.” Read more

Wikimeta Project’s Evolution Includes Commercial Ambitions and Focus On Text-Mining, Semantic Annotation Robustness

Wikimeta, the semantic tagging and annotation architecture for incorporating semantic knowledge within documents, websites, content management systems, blogs and applications, is incorporating itself this month as a company called Wikimeta Technologies. Wikimeta, which has a heritage linked with the NLGbAse project, was made available as its own web service last year.

Dr. Eric Charton, Ph.D., MSc, at École Polytechnique de Montréal, is project leader and author of the Wikimeta code. Charton conducted the NLGbAse project at the University of Avignon as part of his Ph.D. thesis. The Semantic Web Blog recently hosted an email discussion with him to learn more about the Wikimeta architecture and its evolution.

 

The Semantic Web Blog: Tell us about the NLGbAse project and Wikimeta’s relationship to it.

Charton: NLGbAse is an ontology extracted from Wikipedia. It is used in Wikimeta as a resource for semantic disambiguation. For each Wikipedia document (aka Semantic Concept), NLGbAse provides various ways of word-writing (for example, “General Motors” can be written “GM Company”, “GM”, “General Motors Corp” and so on), used for detection.

Read more

Lessons Learned On the Road To Linked Data

What’s the path from an XML-based e-government metadata application to a linked data version? At the upcoming Semantic Tech & Business Conference in Berlin, the road taken by the Dutch government will be described by Paul Hermans, lead architect of Belgian project Erfgoedplus.be, which uses RDF/XML, OWL and SKOS to describe relationships to heritage types, concepts, objects, people, place and time.

Some 1,000 individual organizations make up the Dutch government, each with its own website. An effort a few years ago to use a search engine to spider those separate websites and provide a single point of access didn’t work as anticipated. The next step to bring some order was to assign all the documents published on those sites a common kernel of metadata fields, which led to building an XML application to enable a structured approach. Linked Data entered the picture about a year and a half ago.

Read more

Just How Big A Rock Star Is Eric Clapton?

Who has had the greatest impact on rock music? It’s a question that still isn’t answered, despite the efforts of Ronald P. Reck, principal at RRecktek LLC, and Kenneth B. Sall, principal systems engineer/XML data analyst at Ken Sall Consulting.

The team wanted to use semantic technology, along with DBpedia and MusicBrainz data sources, to try to figure out the answer. Reck and Sall recently published a paper, Determining the Impact of Eric Clapton on Music Using RDF Graphs: Selected Challenges of Semantics Across and Within Datasets, based on their experiences. Their plan was to use RDF and SPARQL to query properties and relationships among musical artists to reveal their activity, impact and “six degrees of Eric Clapton” connections to other artists.

Reck and Sall initially saw this project as a door-opener to showing relationships between pieces of data, and drawing inferences and conclusions from them, for a more serious purpose: “We were interested in music, but the real application, especially in the government, is tying the clues together, for example, around terrorists,” says Sall. It turns out that musicians and terrorists have some things in common — they tend to have specific roles in their organizations, and may cross-partner with other groups in loose relationships.

While the work didn’t result in answering the original question posed, it did reveal, as Sall puts it, “what can go wrong in doing this kind of semantic analysis.” That’s in itself useful, as it presents an opportunity to find at least some solutions around those pitfalls.

Read more

An RDF based Permissions Model

One of the primary challenges in putting together a good content management system is building a decent permissions model. Whether a particular user or process is able to perform some kind of an action upon a resource or not can be remarkably difficult to establish, especially when there are multiple constraints involved. For an XML-based CMS, this can be even more of a challenge, because the n-dimensional nature of such a constraint model is often difficult to model in hierarchical structures.

However, RDF is far better suited for this particular role. A permissions system is, at its core, a set of assertions about who can do what to what, which fits nicely with the “subject predicate object” model that RDF exemplifies. Moreover, such models are sparse (the number of assertions is likely to be very small compared to the total number of possible assertions), which suits data models where sparseness is a common characteristic (again, RDF) better than storing this information, expensively, in tabular fields as with a relational database.

I’m working on building an XML-based CMS (specifically on a MarkLogic platform, though I would like to keep it portable), and realized as I was working on it that while the user permissions system that MarkLogic employs is powerful, it’s not portable and there are facets that don’t fit nicely into that particular model. Thus, I decided to chase the RDF triples approach to see if that would work better for this. (The end product may very well be a hybrid approach to take advantage of fast queries, but that’s beyond the scope of this particular article.)
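As a rough illustration of the “who can do what to what” idea, the sketch below stores permission assertions as subject-predicate-object triples in plain Python. The user, role, and document identifiers and the `memberOf`, `canRead`, and `canEdit` predicates are hypothetical; this is not the author's model or MarkLogic's API, just one way such assertions could be checked.

```python
# Hypothetical permission assertions as (subject, predicate, object) triples.
# Only granted permissions are stored, so the set stays sparse.
TRIPLES = {
    ("user:alice", "memberOf", "role:editor"),
    ("role:editor", "canEdit", "doc:article-1"),
    ("role:editor", "canRead", "doc:article-1"),
    ("user:guest", "canRead", "doc:article-1"),
}


def is_allowed(triples, user, action, resource):
    """True if the user, or a role the user belongs to, is granted the action."""
    # Collect the user plus every role the user is a direct member of.
    principals = {user} | {
        o for s, p, o in triples if s == user and p == "memberOf"
    }
    # A permission holds if any principal has a matching assertion.
    return any((who, action, resource) in triples for who in principals)


alice_can_edit = is_allowed(TRIPLES, "user:alice", "canEdit", "doc:article-1")
guest_can_edit = is_allowed(TRIPLES, "user:guest", "canEdit", "doc:article-1")
```

Anything not asserted is simply absent, so denying the guest edit access needs no row or column at all, which is the sparseness point made above.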

Read more

Mending Media’s Tangled Relationship With the Web

The media industry has had a complicated relationship with the Web, and that’s putting it kindly. While other sectors pretty quickly realized ways to take advantage of that new thing called the Internet – to sell goods, accelerate supply chains, and build deeper customer relationships – established content providers spent years trying to figure it out. And many still are tussling with big issues, such as whether or not to charge for access to content.

Given the Web’s impact on their business model and their revenues, you can forgive publishers if they might prefer that the darn Internet just stood still for a few minutes and let them catch their breath and catch up. Since that isn’t about to happen, the thing to do is to make peace with those changes, many of them thanks to Semantic Web technologies, and figure out fast how they’re going to profit from them.

They’ll have an opportunity to do just that at the upcoming Semantic Web Media Summit in New York City, where speakers will include Michael Dunn, VP and CTO at Hearst Interactive Media, who will address why media companies should be interested in this critical part of the Web 3.0 world.

Dunn sees a number of reasons for using Semantic Web technologies as the means for structuring the wealth of content that publishers produce. There’s improving its discoverability by the world via search and social, of course, but it matters for internal operations, too. And add to that the relationship with online advertising so that content can be better monetized.

Read more

The Value of Semantic Markup to Retailers

A recent article informs online retailers that “Starting now, you’re going to need good structured markup on your X/HTML in addition to your white hat tactics. I see structured markup as being equally important to authoritative inbound links as a ranking factor when optimizing content. Why? Because search robots are designed to serve search engine users by matching their search query expectations, known as user intent. These bots are machines, and they’re trying to discern the human mind’s evaluation of information in answer to human-entered keywords.” Read more

Jeni Tennison on Web Development

Jeni Tennison recently shared her experiences working with web standards in her work at legislation.gov.uk. In particular, Tennison looks at how her organization has needed to use multiple technologies in concert to achieve various publishing goals and satisfy various types of data consumers. She begins, “One of the things that’s been niggling at the back of my mind since the schema.org announcement is how small a role search engine results plays in the wider data sharing efforts that I’m more familiar with in my work on legislation.gov.uk, and more generally how my day job experience differs from (what seem to be) more common experiences of development on the web. In this post, I’m going to talk about that experience, and about the particular problems that I see with the coexistence of microdata and RDFa as a result.” Read more
