Photo courtesy: Flickr/ glen edelson

You know Volkswagen as the “Das Auto” company. But perhaps it’s time to start thinking of it as “Das Semantic Web Company.”

William Greenly is the Volkswagen Technical Lead for the auto vendor’s Volkswagen.co.uk online platform at integrated communications agency Tribal DDB. In that capacity he is taking the partnership the companies have had for more than four decades to a new level. His role there has encompassed managing data around Volkswagen’s products, its retailer and web site content, and its interfaces with social networks and many third-party back-end systems, including those germane to the auto industry such as manufacturer consortiums.

Now, the focus is on using semantic web technology to drive a more elastic, flexible and streamlined digital world for “The Car” company.

The journey began as a strategic brief about contextual search engines serving content based on context within the site and possibly across affiliate sites, a big idea that was quite quickly bound to something more tactical: improving site search, Greenly says. “So the objectives were about site search and improving it, but in the long-run it was always the idea to contextualize content, to facet content, to promote it in different contexts.”

The decision was to leverage the Apache Solr open source enterprise search platform from the Apache Lucene project, but Greenly was looking for an alternate approach to building a Solr data schema. He wanted to steer clear of “something totally proprietary to the system and that has no ubiquity, that wasn’t something that was in a format that could be reused in different systems,” he says. “The best thing for modelling and constructing a domain that would be useful across not just our domain but others was to align our internal domain model with external ontologies.”
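To make the idea concrete, here is a minimal sketch of how search fields might be derived from ontology terms rather than invented per system. It is not Greenly’s implementation: the prefix table, the sample IRIs and the `_ss` suffix (a multi-valued string dynamic field that a Solr schema would have to define) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical prefix table; the namespaces are the published ones for
# GoodRelations and Dublin Core terms, the choice of prefixes is ours.
PREFIXES = {
    "http://purl.org/goodrelations/v1#": "gr",
    "http://purl.org/dc/terms/": "dcterms",
}


def field_name(predicate_iri):
    """Turn a predicate IRI into a search field name, e.g. dcterms_title_ss.

    The "_ss" suffix assumes the Solr schema declares a dynamic field
    for multi-valued strings under that pattern.
    """
    for namespace, prefix in PREFIXES.items():
        if predicate_iri.startswith(namespace):
            local_name = predicate_iri[len(namespace):]
            return f"{prefix}_{local_name}_ss"
    raise KeyError(f"no prefix registered for {predicate_iri}")


def to_solr_doc(subject, triples):
    """Flatten all triples about one subject into a flat document dict."""
    fields = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            fields[field_name(p)].append(o)
    return {"id": subject, **fields}


# Illustrative data only; the example.org IRI is not a real identifier.
golf = "https://example.org/id/model/golf"
triples = [
    (golf, "http://purl.org/dc/terms/title", "Golf"),
    (golf, "http://purl.org/dc/terms/subject", "hatchback"),
    (golf, "http://purl.org/dc/terms/subject", "petrol"),
]
doc = to_solr_doc(golf, triples)
```

Because the field names come from shared vocabularies, a multi-valued field such as `dcterms_subject_ss` could serve as a facet without any schema work specific to one system.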

The internal model was aligned with core ontologies such as GoodRelations and Dublin Core. An initial project phase focused on new VW cars’ product configurations and compatibilities as the primary objective, before rolling out elastic search (such as facets) across all other parts of the co.uk site. As a result, the product content management system, powered by the Volkswagen Vehicle Ontology vocabulary for describing Volkswagen-specific features of automobiles, is now one of the most powerful assets Greenly manages. “This has a really rich set of data consisting of all models and trim derivative data that Volkswagen has, going back about three to four years, along with all specific items and compatibility and availability of those specific items in relation to a derivative.”

In partnership with Martin Hepp of Hepp Research GmbH, Greenly also created the Car Options Ontology, which is built on Hepp’s GoodRelations ontology for e-commerce. “This car options ontology is relevant to any car manufacturer,” Greenly says, and he’s eager to see it embraced by the industry.

So far, he says, it has attracted the most attention from white label system suppliers that do things such as stocking the availability of cars for test drives. “In the auto industry most terms are quite standard so concepts are common. So this lets any auto manufacturer describe vehicle, model and trim derivatives with a view for anyone to come along and federate across the entire manufacturing industry.”

The idea is that, using the SPARQL 1.1 Federation Extensions, queries can be created whose answers are informed by merging data distributed across the web. So, an industry-wide view of identifiers for all models, trims and derivatives could be queried for a used car locator search, with Volkswagen’s data exposed as RDF via a SPARQL endpoint and with others, including dealers and retailers, enabled to do the same.
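As a rough illustration of what such a federated query could look like, the sketch below assembles one from a list of endpoints using the `SERVICE` keyword that SPARQL 1.1 federation defines. The endpoint URLs are placeholders, not real Volkswagen or dealer endpoints, and the choice of `gr:ProductOrServiceModel` with `rdfs:label` is an assumption about how a publisher might describe its models.

```python
PREFIXES = (
    "PREFIX gr: <http://purl.org/goodrelations/v1#>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
)


def build_federated_query(endpoints):
    """Build a SPARQL 1.1 query that asks each endpoint for its car models.

    Each endpoint gets its own SERVICE block; the blocks are joined with
    UNION so that results from all publishers land in one result set.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    blocks = []
    for url in endpoints:
        blocks.append(
            "  {\n"
            f"    SERVICE <{url}> {{\n"
            "      ?model a gr:ProductOrServiceModel ;\n"
            "             rdfs:label ?label .\n"
            "    }\n"
            # Record which endpoint each row came from.
            f"    BIND(<{url}> AS ?source)\n"
            "  }"
        )
    body = "\n  UNION\n".join(blocks)
    return f"{PREFIXES}SELECT ?source ?model ?label WHERE {{\n{body}\n}}"


# Placeholder endpoints for illustration only.
query = build_federated_query([
    "https://example.org/manufacturer-a/sparql",
    "https://example.org/dealer-b/sparql",
])
```

The resulting string would be sent to any SPARQL 1.1 processor that supports federation; that processor, not the caller, contacts the remote endpoints and merges their answers.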

“Primarily OWL (the Web Ontology Language) is there to describe our own data so it could be exposed as RDF, and then other manufacturers can do the same, and then ultimately you can federate that data,” he says. Business-critical VW data sets, ranging from product to used car data to registration lookup, test drive and retail data, either have or will have SPARQL endpoints. And the data behind them will be available on Kasabi (see this article) and as CKAN Linked Open Data sets.

“The powerful thing of SPARQL is to link data, to federate it for someone who wants to build an app,” Greenly says. The web suddenly becomes a web of data, and that works to the benefit of those confident enough in their brands to want to share their data, he thinks. “We are quite keen to share data. It can quite clearly show, when you federate data and compare ours to other brands, you can see in a few minutes that the VW brand is a lot stronger and a better value than other brands. It lets people see that in a factual and consistent manner and visualize that which you can’t do with HTML,” he says. “You can’t read every web site. You should be able to federate and get that data, and visualize it any way you like.”

Making The Link

Using the Volkswagen Vehicle Ontology vocabulary that Volkswagen has now defined and aligned, Greenly also is adding RDFa attributes to the unstructured web content that is published to the web site by a number of different systems. “We rolled that out mainly across the new car section and we are rolling that out over the entire web site by the end of the year,” he says.
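For readers unfamiliar with RDFa, the following sketch shows the general shape of the technique: ordinary HTML carries extra attributes (`prefix`, `about`, `typeof`, `property`) that tell a machine what the visible text means. The markup pattern, the example IRI and the use of Dublin Core terms for the title and description are assumptions for illustration, not a copy of Volkswagen’s pages, and the small collector class is only a demonstration aid, not a conforming RDFa parser.

```python
from html import escape
from html.parser import HTMLParser


def render_model_rdfa(iri, name, description):
    """Render a car model as HTML annotated with RDFa 1.1 attributes."""
    return (
        '<div prefix="gr: http://purl.org/goodrelations/v1# '
        'dcterms: http://purl.org/dc/terms/" '
        f'about="{escape(iri)}" typeof="gr:ProductOrServiceModel">\n'
        f'  <h2 property="dcterms:title">{escape(name)}</h2>\n'
        f'  <p property="dcterms:description">{escape(description)}</p>\n'
        "</div>"
    )


class PropertyCollector(HTMLParser):
    """Collect the text of elements that carry a `property` attribute.

    Demonstration only: it ignores nesting, datatypes and most of the
    RDFa processing rules.
    """

    def __init__(self):
        super().__init__()
        self._current = None
        self.found = {}

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("property")

    def handle_data(self, data):
        if self._current and data.strip():
            self.found[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


# Illustrative values; the example.org IRI is not a real identifier.
markup = render_model_rdfa(
    "https://example.org/id/model/golf", "Golf", "Five-door hatchback"
)
collector = PropertyCollector()
collector.feed(markup)
```

A browser shows the heading and paragraph as usual, while a consumer that understands the attributes can read the same page as data about a named resource.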

Among the advantages of doing so is that it “gives great visibility of our data to traditional search engines designed to consume HTML,” Greenly says. “Unstructured data normally has a huge constraint but now we have a lot richer search.” Also, using structWSF, it is possible to link unstructured content and structured data extracted as RDF from its content management system. “Having a rich powerful link between structured and unstructured content will be quite useful for [site] search,” he says.

It’s ultimately all about creating an elastic solution. As new content comes onto the site or as the domain model grows in one space and extracts in another, this is a powerful solution to automatically accommodate that. “It’s elastic search and you don’t have to write any publishing code. RDF is what we work with and it’s ubiquitous, so we get the benefits of ubiquity,” he notes.

As Greenly sees it, it’s a good time for enterprises in the auto industry or outside it to move to the semantic web. And it doesn’t have to be hard. “Tools are available that are very clever. They don’t require big changes in process which is really an important thing for big companies – you don’t have to change all your systems to publish RDF,” he says. “You can start with RDFa – it’s an enabler and a graceful and subtle way of getting people to publish RDF.”