Paul Miller

An Olympic Opportunity to Share Data

Sir Tim Berners-Lee and the Web, featured during the 2012 Olympics Opening Ceremony

As you’ve probably noticed, the 2012 Olympics are underway in London right now. A massive logistical exercise and a global spectacle, the Games have also given the BBC another opportunity to impress with their semantic technology skills. And impress they most certainly do. Only, I can’t help feeling this was a missed opportunity for a bolder and more modern piece of data sharing, more in keeping with both the Olympic spirit and Sir Tim Berners-Lee’s tweet during his appearance at the Opening Ceremony: ‘this is for everyone.’ Read more

Cray spin-off YarcData betting $100,000 on the power of graph data

Early in 2011, I wrote a piece here which explored the relationship between Semantic Technologies and super-computing’s venerable rock star, Cray. Then, earlier this year, Cray spun out a new division to focus upon exploring massive graph databases, something which should resonate with the semantic technology community. The new division, YarcData, differentiates itself quite clearly from Cray, leading with a data-led proposition and typically operating at quite a different price point to its eye-wateringly expensive parent.

I sat down with YarcData President Arvind Parthasarathi during the Semantic Technology & Business Conference in San Francisco, to get an update on YarcData and to hear why the company is investing $100,000 in prizes for a new ‘Big Data Graph Analytics Challenge.’ Read more

Daniel Tunkelang talks about LinkedIn’s data graph

Daniel Tunkelang, Principal Data Scientist at LinkedIn, delivered the final keynote at SemTechBiz in San Francisco this morning, exploring the way in which “semantics emerge when we apply the right analytical techniques to a sufficient quality and quantity of data.”

Daniel began by offering his key takeaways for the presentation:

  • Communication trumps knowledge representation.
  • Communication is the problem and the solution.

Read more

SemTechBiz Keynote: Jay Myers discusses Linked Data at Best Buy

Jay Myers at Best Buy has been working with semantic technologies for a number of years.

Traditional retailers like Best Buy are looking for opportunities to survive and grow, as competition increases and margins grow ever narrower. Semantic Web and Linked Data solutions are in part enabling transformation, even inside traditional offline retailers. Read more

SemTechBiz Keynote: Steve Harris discusses Semantics at Experian

Steve Harris, CTO of UK-based Garlik, opened the keynotes at SemTechBiz here in San Francisco Tuesday. Garlik was acquired by Experian in December 2011, and Garlik’s semantically-powered products are gradually being integrated into the Experian portfolio.

Steve begins by stressing that Garlik “wasn’t a semantic web company” per se, but that they always planned to “use semantic technologies” in powering their products. Semantic technologies as the means, rather than the end. Read more

SemTechBiz kicks off, with hints about 2012’s issues to watch

The eighth west coast Semantic Technology & Business Conference (SemTechBiz) got underway here in San Francisco today, with an opening session that tried to guide attendees through the coming week of sometimes massively parallel sessions.

Up on stage, Programme co-chairs Tony Shaw and Dave McComb were joined by Thematix Partners’ Elisa Kendall and our very own Eric Franzon. In the room, about 75% of the audience were here for the first time; a good sign that semantics continue to attract interest. Sitting next to me was a first-timer from Australia who had crossed the Pacific in the belief that semantic technologies could be a solution to problems his public sector clients were facing in Australia’s Northern Territory. I should track him down on Thursday to see if his expectations were met!

Tony Shaw kicked things off, discussing how the conference has changed over the past eight years, with an “incremental shift” in sessions from the aspirational and theoretical toward specific implementations and real solutions to tangible problems. Progress, for sure. “We all hoped that [shift] would happen in about half the time,” said Tony, “but it’s good that it’s [finally] happening now!” Read more

Google’s Knowledge Graph Is No Ugly Duckling

I’m a fan of the waterfowl model of semantic technology. Clever semantics — as well as ‘advanced’ search boxes, arcane query syntax, and consumer interfaces that require user training — can paddle away as frantically as they like, but only while hidden well below the waterline. SPARQL, SKOS and SQL really shouldn’t be visible to most users of a web site. Ontologies and XML are enabling technologies, not user interface features.

With this week’s unveiling of the Knowledge Graph, Google has taken another step toward realising the potential of their Metaweb acquisition. The company has also clearly demonstrated its continued enthusiasm for delivering additional user value without requiring changes in user behaviour (well, except that those of us outside the US have to remember to use the US version of the site and not our local version, if we want to try this out).

For those who don’t remember, Metaweb was one of those companies that got people excited about the potential for semantic technologies to hit the big time. Founded way back in 2005, Metaweb attracted almost $60 million in investment for their “open, shared database of the world’s knowledge” (Freebase) before disappearing inside Google in 2010.

Read more

Wikidata, and a clash of world views

Remember the days before Wikipedia had all the answers? We looked things up in libraries, referring to shelf-filling encyclopaedias. We bought CD-ROMs (remember them?) full of facts and pictures and video clips. We asked people. Sometimes, school homework actually required some work more strenuous than a cut and paste. We went about our business without remembering that New Coke briefly entered our lives on this day in 1985.

Wikipedia is far from perfect, and some of the concern around its role in a wider dumbing down of thought and argument may be justified. But, despite that, it’s a remarkable achievement and a wonderful resource. Those who argued that it would never work have clearly been proven wrong. Carefully maintained processes and the core principle of the neutral point of view mostly serve contributors well.

With Wikimedia Deutschland’s recent announcement of Wikidata, many of the early concerns about Wikipedia itself have resurfaced. Read more

The Problem With Names

New Amsterdam... or not?

Earlier this week I spent an enjoyable hour on the phone, discussing the work done by a venerable world-class museum in making data about its collections available to a new audience of developers and app-builders. Much of our conversation revolved around consideration of obstacles and barriers, and the most intractable of those proved something of a surprise.

Reluctance amongst senior managers to let potentially valuable data walk out the door? Nope. In fact, not even close; managers pushed museum staff to adopt a more permissive license for metadata (CC0) than the one (CC-BY) they had been considering.

Reluctance amongst curators to let their carefully crafted metadata be abused and modified by non-professionals? Possibly a little bit, but apparently nothing the team couldn’t handle.

A bean-counter’s obsession with measuring every click, every query, every download, such that the whole project became bogged down in working out what to count and when (and, sadly, that really is the case elsewhere!)? Again, no. “The intention was to create a possibility” by releasing data. The museum didn’t know what adoption would be like, and sees experimentation and risk-taking as part of its role. Monitoring is light, and there’s no intention to change that.

Read more

Tread Softly

Two posts here over the past few days resonated with themes to which I seem to return with increasing frequency. First, Angela Guess pointed to a GigaOM interview with fellow Semantic Link podcaster Andraž Tori, then Jennifer Zaino picked up on the Global Futures Forecast’s [PDF] enthusiasm for ‘the Semantic Web.’

Andraž is CTO of Zemanta, a company that began life in the small European country of Slovenia before spreading its wings to London and the US. Ever since I first met Andraž and became aware of Zemanta’s usefulness, it has been one of a very small number of tools that, to me, epitomise the real power and usefulness of semantic technologies. There are, of course, plenty of semantic technologies that are better at handling formal classification of data. There are plenty that cope an awful lot better at scale. There are plenty, even, that do a better job of seamlessly and flexibly knitting together facts and assertions from across the web. But Zemanta (and TripIt, my other perennial favourite) don’t make a big issue of their semantic smarts. They don’t, really, make you change your behaviour very much in order to derive benefit. They just help you get something done, quicker, easier, and better than you would have done it without them. TripIt, for example, gets travel arrangements into my calendar (where I need them), faster than I could type them in myself. But that’s just an ancillary benefit of all the other stuff that the site is doing to my travel details on my behalf.

Read more