
Automatic Tagging at the BBC

The BBC has been working on automatically tagging its audio archives with DBpedia identifiers. Yves Raimond explains how the broadcaster is handling the project. Raimond writes, “One dataset we are looking at within this project is the World Service archive. This archive is isolated from other programme data sources at the BBC, like BBC Programmes or the Genome Project, and the associated programme data within it is very sparse. It would therefore benefit a lot from being automatically interlinked with further data sources which makes it such a particularly interesting use-case. The archive is also very large: it covers many decades and consists of about two and a half years of high-quality continuous audio content.”

Raimond continues, “One way of dealing with such a large programme archive with patchy metadata but high-quality content is to use the content itself in order to find links with related data sources. For example if a programme mentions ‘London’, ‘Olympics’ and ‘1948’ a lot, then there is a high chance it is talking about the 1948 Summer Olympics. Using the structured data available in Wikipedia we can then draw a link between a recent programme on the 2012 London Olympics and that archive programme and use that link to provide further historical context. When developing such an algorithm we need to take into account a couple of desirable properties: it needs to be efficient enough to be applicable to a large archive and it needs to use an unbounded target vocabulary, as programmes within an archive can virtually be about anything.”
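The idea in Raimond's example, that frequent mentions of a few related terms point to a likely topic, can be illustrated with a small Python sketch. This is not the BBC's algorithm: the `CANDIDATE_TOPICS` table, the `rank_topics` function and the sample transcript are all hypothetical, and the hand-written table is a tiny, bounded vocabulary, whereas the quote calls for an unbounded one drawn from a source such as DBpedia.

```python
import re
from collections import Counter

# Hypothetical candidate topics and the terms associated with each one.
# This hand-written table is an assumption made for illustration; a real
# system would draw on a much larger source such as DBpedia.
CANDIDATE_TOPICS = {
    "1948_Summer_Olympics": {"london", "olympics", "1948"},
    "2012_Summer_Olympics": {"london", "olympics", "2012"},
    "Great_Fire_of_London": {"london", "fire", "1666"},
}


def rank_topics(transcript, topics=CANDIDATE_TOPICS):
    """Rank topics by how often their associated terms occur in the transcript."""
    # Lowercase the text and count every alphanumeric token.
    term_counts = Counter(re.findall(r"[a-z0-9]+", transcript.lower()))
    # A topic's score is the total number of mentions of its terms.
    scores = {
        topic: sum(term_counts[term] for term in terms)
        for topic, terms in topics.items()
    }
    # Highest-scoring topics first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up transcript text, standing in for speech-to-text output.
transcript = (
    "London hosted the Olympics in 1948. The 1948 Olympics were the first "
    "since the war, and London staged them on a tight budget."
)

for topic, score in rank_topics(transcript):
    print(topic, score)
```

With this sample text the 1948 Games rank first, because all three of their terms are mentioned repeatedly, while the other two topics match only some of their terms. Raw counts like these are a crude signal, and a practical system would also need to weight terms and disambiguate names.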

Read more here.

Image: Courtesy BBC
