Five Steps to Linked Data Integration
Last week, we covered the story of how Chris Testa, Director of Engineering at Ad.ly, Inc., brought the Semantic Web to Hollywood. Today, in Part II, Chris shares his recommended five-step process for Linked Data integration.
1. Understand what your “things” are
- Look for the high-value entities in your system: the ones that bring in revenue and give you business intelligence your competitors lack (examples: advertisers, brands, celebrities)
- Look for models that are growing quickly in your system (For us, it was Celebrities)
- Look for things that are already well annotated elsewhere, such as popular subjects in culture and technology
2. Choose a Linked Dataset:
- DBpedia and Freebase are cornerstones of the Linked Data movement
- There are tons of specialized datasets in many fields (biomedical, events, news, government, and many more!)
- Once you link up, linking to more becomes much easier!
3. Reconcile your things:
- Reconciling is matching the entities in your database with remote linked data sources
- Freebase’s matchmaker is a really useful tool for reconciling
- Make it a game, and put experts on it to ensure high-quality datasets
- Heuristic methods exist for working through queues of 100,000 or more entities
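
To make the heuristic idea concrete, here is a minimal sketch of one possible approach. It is not Freebase's matchmaker and not necessarily the method Chris used. It assumes you already hold a small set of candidate labels from the remote dataset, and it uses string similarity from Python's standard difflib module. The function names, the thresholds, and the "ex:" identifiers are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Return a score between 0.0 and 1.0 for how alike two names are."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def reconcile(local_names, candidates, auto_threshold=0.92, review_threshold=0.75):
    """Split local names into auto-matched, needs-review, and unmatched.

    candidates maps a remote identifier to its label. Both thresholds are
    illustrative assumptions and would need tuning against real data.
    """
    matched, review, unmatched = {}, [], []
    for name in local_names:
        best_id, best_score = None, 0.0
        for remote_id, label in candidates.items():
            score = similarity(name, label)
            if score > best_score:
                best_id, best_score = remote_id, score
        if best_score >= auto_threshold:
            matched[name] = best_id          # confident enough to link automatically
        elif best_score >= review_threshold:
            review.append((name, best_id))   # hand off to a human expert
        else:
            unmatched.append(name)
    return matched, review, unmatched


# Made-up sample data; the "ex:" identifiers are placeholders, not real URIs.
candidates = {"ex:kim_kardashian": "Kim Kardashian", "ex:snoop_dogg": "Snoop Dogg"}
matched, review, unmatched = reconcile(
    ["kim  kardashian", "Zzz Qqq"], candidates
)
```

Only the middle band goes to human reviewers, which is one way to keep expert time focused on the ambiguous cases in a very large queue.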
4. Build business intelligence:
- Tip: There are really simple things you can do with linked data that are cool!
- For example, display context to users around reconciled entities in your project. Context makes things easier for users.
- Index and search on reconciled properties like full name, gender, genre, profession, etc.
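
As a rough illustration of indexing and searching on reconciled properties, the sketch below builds a small in-memory inverted index. It assumes each reconciled entity is stored as a dictionary of properties pulled from the linked dataset. The entity identifiers, sample values, and function names are hypothetical, and a production system would more likely use a real search engine.

```python
from collections import defaultdict


def build_index(entities, fields=("full_name", "gender", "genre", "profession")):
    """Map (field, lowercased value) pairs to the set of entity ids that have them."""
    index = defaultdict(set)
    for entity_id, props in entities.items():
        for field in fields:
            value = props.get(field)
            if value is None:
                continue
            # Treat single values and multi-valued properties the same way.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for v in values:
                index[(field, str(v).lower())].add(entity_id)
    return index


def search(index, **criteria):
    """Return the ids of entities that match every given field=value pair."""
    result = None
    for field, value in criteria.items():
        ids = index.get((field, str(value).lower()), set())
        result = set(ids) if result is None else result & ids
    return result if result is not None else set()


# Made-up sample records standing in for reconciled entities.
entities = {
    "celeb:1": {"full_name": "Example Singer", "gender": "female",
                "profession": ["singer", "actor"]},
    "celeb:2": {"full_name": "Example Rapper", "gender": "male",
                "profession": ["rapper", "actor"], "genre": "hip hop"},
}
index = build_index(entities)
actors = search(index, profession="actor")
female_actors = search(index, profession="Actor", gender="female")
```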
5. Feedback & maintenance
- Users won’t trust the data unless it is manicured.
- Add lots of negative feedback loops (Unlike buttons!) to make sure that users are heard.
- A few minutes a day of cleanup does wonders!
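
One way to connect an "Unlike" button to a daily cleanup routine is sketched below. It assumes each piece of negative feedback names the local entity and the remote entity it was linked to. The class name, the identifiers, and the two-report threshold are hypothetical choices for illustration.

```python
from collections import Counter


class FeedbackLog:
    """Counts negative reports per link and surfaces the worst ones for review."""

    def __init__(self):
        self._reports = Counter()

    def unlike(self, local_id, remote_id):
        """Record one user's report that this link looks wrong."""
        self._reports[(local_id, remote_id)] += 1

    def cleanup_queue(self, min_reports=2):
        """Return flagged links, most-reported first, for the daily cleanup pass."""
        flagged = [(link, n) for link, n in self._reports.items() if n >= min_reports]
        return sorted(flagged, key=lambda item: (-item[1], item[0]))


# Made-up identifiers for illustration only.
log = FeedbackLog()
log.unlike("celeb:7", "ex:wrong_person")
log.unlike("celeb:7", "ex:wrong_person")
log.unlike("celeb:7", "ex:wrong_person")
log.unlike("celeb:9", "ex:other_person")
log.unlike("celeb:9", "ex:other_person")
log.unlike("celeb:3", "ex:fine_match")
queue = log.cleanup_queue()
```

Requiring more than one report before a link enters the queue is a design choice that filters out stray clicks, at the cost of reacting a little more slowly to real errors.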
See Chris’ SemTech 2011 presentation on SlideShare, “How Hollywood Learned to Love the Semantic Web”:
http://slidesha.re/mhXXOJ
Additional reporting by Jennifer Zaino, with contributions from Chris Testa, Director of Engineering, Ad.ly, Inc.