
Posts Tagged ‘Wikipedia’

Two Perspectives on Wikidata

Mark Graham recently raised some concerns regarding the Wikidata project in The Atlantic. Graham writes, “Wikidata will create a collaborative database that is both machine readable and human editable and which will underpin a lot of knowledge that is presented in all 284 language versions of Wikipedia. In other words, the encyclopaedia plans to become part of the movement from a mostly human-readable Web to a Web in which computers and software can better make sense of information… The reason that Wikidata marks such a significant moment in Wikipedia’s history is the fact that it eliminates some of the scope for culturally contingent representations of places, processes, people, and events. However, even more concerning is that fact that this sort of congealed and structured knowledge is unlikely to reflect the opinions and beliefs of traditionally marginalized groups.”

Graham continues, “It is important that different communities are able to create and reproduce different truths and worldviews. And while certain truths are universal (Tokyo is described as a capital city in every language version that includes an article about Japan), others are more messy and unclear (e.g. should the population of Israel include occupied and contested territories?).”

Read the full article here.

Denny Vrandečić, project director of Wikidata, posted a thoughtful response to Graham’s article. I have re-posted Vrandečić’s response in its entirety:

Mark,

Thank you for your well-thought criticism. When we were thinking first of adding structured data to Wikipedia, we were indeed thinking of giving every language edition its own data space. This way the Arab and the Hebrew Wikipedia community would not interfere with each other, nor would the Estonian and the Russian communities interfere with each other. Read more

SWiPE Plans to Make Search a Breeze

Eileen Brown recently reported that SWiPE hopes to make querying search engines a less frustrating experience. Brown writes, “If you struggle with RDF triples (Resource Description Framework) and SPARQL (Query language and protocol for RDF) do not despair. SWiPE (Searching WIkiPedia by Example) allows semantic and well-structured knowledge bases to be easily queried from within the pages of Wikipedia. If you want to know which cities in Florida, founded in last century have more than 50 thousand people you will be able to enter the query conditions directly into the Infobox of a Wikipedia page. Swipe activates certain fields of Wikipedia that generate equivalent SPARQL queries executed on DBpedia.” Read more
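As a rough illustration of the kind of SPARQL query the excerpt describes, here is a minimal Python sketch that builds the "cities in Florida" query as a string. The DBpedia property names (`dbo:isPartOf`, `dbo:populationTotal`, `dbo:foundingDate`), the reading of "last century" as 1900 to 1999, and the helper name `build_city_query` are all assumptions made for this example; this is not the query SWiPE itself generates.

```python
def build_city_query(state="Florida", min_population=50000,
                     founded_from=1900, founded_to=1999):
    """Build a SPARQL query string for cities in a state.

    Hypothetical helper: the vocabulary terms below are assumed to
    exist in DBpedia with these meanings; check them before relying
    on the results.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?city ?population ?founded WHERE {{
  ?city a dbo:City ;
        dbo:isPartOf dbr:{state} ;
        dbo:populationTotal ?population ;
        dbo:foundingDate ?founded .
  FILTER (?population > {min_population})
  FILTER (?founded >= "{founded_from}-01-01"^^xsd:date &&
          ?founded <= "{founded_to}-12-31"^^xsd:date)
}}
""".strip()


if __name__ == "__main__":
    # Print the query so it can be pasted into a SPARQL endpoint.
    print(build_city_query())
```

The point of a tool like SWiPE is that a user fills in Infobox fields instead of writing a query like this by hand.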

A Fundamental Linked Data Debate

There is a fierce debate going on in the world of the Semantic Web and Linked Data. The question is whether the issue is of fundamental importance to realising the benefits of the technology, or whether the participants are just dancing on the head of a pin. The core debate revolves around something with the stunningly opaque title of the httpRange-14 issue.

The debate has been rumbling on for years but was reignited over the last few days by proposals being submitted to the W3C to clarify and hopefully simplify things. I use the word ignited as that is what I was beginning to think my iPhone was about to do – it has been buzzing away like a bumblebee on speed over the last few days, announcing the arrival of yet another passionately held opinion from a member of the respected Semantic Web/Linked Data community, from Sir Tim Berners-Lee downwards. Fortunately for those of you who do not follow the W3C’s Technical Architecture Group (TAG) and Linked Open Data (public-lod) mailing lists, it may have gone unnoticed.

Let me try to explain, in as simple terms as possible, what the fuss is all about and why it may be important.  From my point of view, and there are many surrounding this, the issue is a combination of two problems.

Read more

Paper Review: “Recovering Semantics of Tables on the Web”

A paper entitled “Recovering Semantics of Tables on the Web” was presented at the 37th Conference on Very Large Databases in Seattle, WA. The paper’s authors included 6 Google engineers along with professor Petros Venetis of Stanford University and Gengxin Miao of UC Santa Barbara. The paper summarizes an approach for recovering the semantics of tables by adding annotations beyond those the table’s author has provided. The paper is of interest to developers working on the semantic web because it gives insight into how programmers can use semantic data (a database of triples) and Open Information Extraction (OIE) to enhance unstructured data on the web. In addition, the authors compare a “maximum-likelihood” model, used to assign class labels to tables, with a “database of triples” approach. The authors show that their method for labeling tables is capable of labeling “an order of magnitude more tables on the web than is possible using Wikipedia/YAGO and many more than freebase.”

Read more

The Semantic Web Has Gone Mainstream! Wanna Bet?

In 2005, I started learning about the so-called Semantic Web. It wasn’t till 2008, the same year I started my PhD, that I finally understood what the Semantic Web was really about. At the time, I made a $1000 bet with 3 college buddies that the Semantic Web would be mainstream by the time I finished my PhD. I know I’m going to win! In this post, I will argue why.

Read more

The Semantic Link with Guest, Denny Vrandecic – February, 2012

Paul Miller, Bernadette Hyland, Ivan Herman, Eric Hoffer, Andraz Tori, Peter Brown, Christine Connors, Eric Franzon

On Friday, February 10, a group of Semantic thought leaders from around the globe met with their host and colleague, Paul Miller, for the latest installment of the Semantic Link, a monthly podcast covering the world of Semantic Technologies. This episode includes a discussion about data; specifically, the recently announced “wikidata” project with special guest, Denny Vrandecic.
At the recent SemTechBiz Berlin conference, Denny presented a talk titled, “Wikidata: The Next Big Thing for Wikipedia.” As evidenced in the “Wow’s” expressed by the panelists in this month’s podcast call, this is indeed a big deal for Wikipedia and for the Semantic Web. Read more

Learning about WikiData at #SemTechBiz Berlin

Richard Wallis is reporting from SemTechBiz in Berlin this week. He recently wrote, “One of the more eagerly awaited presentations at the Semantic Tech & Business Conference in Berlin today was a late addition to the program from Denny Vrandecic.  With the prominence of Dbpedia in the Linked Open Data Cloud, anything new from Wikipedia with data in it was bound to attract attention, and we were not disappointed.” Read more

#SemTechBiz Berlin – Day 2

After a great day yesterday I was eager to discover what today’s program had to offer. Unfortunately I had to set off for the airport, where I am now writing this, before the end. However, I caught most of the day, and here are my few thoughts and recollections.

Today’s Keynote was in the form of a panel discussing Semantics in the Automotive Industry with Martin [GoodRelations] Hepp, John Kendall Streit of Tribal DDB, William Greenly of AQKA, and François-Paul Servant from Renault. They discussed their experiences in pioneering the use of Linked Data / Semantic Web technologies and approaches in the automotive domain.
Read more

The SemanticLink Podcast – Submit Your Questions

After December’s episode of the Semantic Link, we asked for your thoughts on both the topics we should cover, and the ways in which you would like to interact with the podcast. You spoke very clearly, asking for an opportunity to pose questions for the team to answer during recordings. This is that opportunity.

February’s episode of the show will be recorded this Friday, 10 February, and we’re joined by a guest with a lot to contribute during our conversation.

There is growing interest in publishing, sharing and using data on the Web. The Semantic Web’s Linked Data effort is clearly one approach to this, but there are others. At Wolfram Alpha, for example, founder Stephen Wolfram suggests that a new Top Level Domain (TLD) for data will make data easier to find on the web. And inside the Wikimedia Foundation (the home of Wikipedia), a new WikiData project is rapidly taking shape.

WikiData project director, Denny Vrandecic, joins us to share his perspectives on these and other approaches to the space.

And now over to all of you. Please use the comments facility below, to share your perspectives on the question, or to submit your comments and questions for Denny and the regular gang to consider. Then tune in the week of 13 February to hear the result!

Wikimeta Project’s Evolution Includes Commercial Ambitions and Focus On Text-Mining, Semantic Annotation Robustness

Wikimeta, the semantic tagging and annotation architecture for incorporating semantic knowledge within documents, websites, content management systems, blogs and applications, this month is incorporating itself as a company called Wikimeta Technologies.  Wikimeta, which has a heritage linked with the NLGbAse project, last year was provided as its own web service.

Dr. Eric Charton, Ph.D, MSc at École Polytechnique de Montréal, is project leader and author of the Wikimeta code. The NLGbAse project was conducted by Charton at the University of Avignon as part of his Ph.D. thesis. The Semantic Web Blog recently hosted an email discussion with him to learn more about the Wikimeta architecture and its evolution.

 

The Semantic Web Blog: Tell us about the NLGbAse project and Wikimeta’s relationship to it.

Charton: NLGbAse is an ontology extracted from Wikipedia. It is used in Wikimeta as a resource for semantic disambiguation. For each Wikipedia document (aka Semantic Concept), NLGbAse provides various ways of word-writing (for example, “General Motors” can be written “GM Company”, “GM”, “General Motors Corp” and so on), used for detection.
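To make the idea of surface forms concrete, here is a minimal Python sketch of a lookup table that maps each written variant back to its concept and uses it to detect candidate mentions in text. The data structure and the function names (`build_index`, `detect`) are hypothetical and are not taken from NLGbAse or Wikimeta; the sketch covers only detection by exact, case-insensitive matching, not the semantic disambiguation step.

```python
import re

# Concept -> surface forms, using the example given in the interview.
SURFACE_FORMS = {
    "General Motors": ["General Motors", "GM Company", "GM",
                       "General Motors Corp"],
}


def build_index(surface_forms):
    """Invert the mapping: lowercased surface form -> set of concepts."""
    index = {}
    for concept, forms in surface_forms.items():
        for form in forms:
            index.setdefault(form.lower(), set()).add(concept)
    return index


def detect(text, index):
    """Return the concepts whose surface forms appear in the text.

    Matching is case-insensitive and respects word boundaries, so
    "gm" does not match inside a longer word.
    """
    lowered = text.lower()
    found = set()
    for form, concepts in index.items():
        if re.search(r"\b" + re.escape(form) + r"\b", lowered):
            found |= concepts
    return found


if __name__ == "__main__":
    index = build_index(SURFACE_FORMS)
    print(detect("Shares of GM rose on Friday.", index))
```

A form such as "GM" can belong to several concepts in a real resource, which is why the index stores a set of candidates per form and leaves the choice among them to a later disambiguation step.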

Read more
