Posts Tagged ‘rdfa’
How will webpage data be interpreted in the next few years? The Semantic Web community has high hopes that ever-evolving semantic standards will help systems identify and extract the rich data found on the web, ultimately making it more useful. With the announcement of Schema.org support for GoodRelations in November, it seems clear that semantic progress is now being made on the e-commerce front, and at an accelerated rate. Martin Hepp, founder of GoodRelations, expects adoption of rich, structured e-commerce data to increase significantly this year.
However, Mike Tung, founder and CEO of the data parsing service DiffBot, has less faith that the standards necessary for a true Semantic Web will ever be completely and effectively implemented. In an interview with Xconomy he argues that for semantic standards to work correctly, content owners must mark up their content once for the web and a second time for the semantic standards. This requires extra work, and it affords them the opportunity to engage in content stuffing (SEO spam).
Today sees the launch of Meritora, the first commercial implementation of the universal payment standard PaySwarm (initially discussed in this blog here and here). Meritora is the creation of Digital Bazaar, the company founded and led by Manu Sporny – whose W3C credentials include founding both the Web Payments Community Group and the JSON-LD Community Group, as well as chairing the RDF Web Applications Working Group – and it is designed to ease the still surprisingly arduous task of buying and selling on the web. The service is starting with a simple asset-hosting feature that helps vendors sell digital content on WordPress-powered sites, plus support for decentralized web app stores, so that app creators can put their work on their own web sites, set a price, and let it be bought there, at a web app store, or anywhere on the web.
The name Meritora points to the service’s underlying purpose of rewarding greatness, coming from the roots ‘merit’ and ‘ora,’ the latter of which has been used across a number of cultures to express a unit of value, Sporny says (noting that it means ‘golden’ in Esperanto and was also used as a unit of currency among the Anglo-Saxons). That’s a big name to live up to, but the service hopes to do so by making web payments simple, secure, and fast, with low fees and no vendor lock-in for buyers and sellers of digital content.
There’s Linked Data to thank for what Meritora, and PaySwarm, can do, with Sporny describing the system as “the world’s first payment solution where the core of the technology is powered by Linked Data.”
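The Linked Data angle can be pictured with JSON-LD, the syntax whose W3C community group Sporny founded. The listing below is an invented sketch of how a digital asset and its price might be described; the context URL, property names, and values are all hypothetical illustrations, not actual PaySwarm vocabulary.

```json
{
  "@context": "https://example.com/contexts/payswarm",
  "@id": "https://example.com/assets/song-42",
  "@type": "Asset",
  "title": "An Example Song",
  "creator": "https://example.com/people/jane",
  "payee": "https://example.com/accounts/jane",
  "price": "0.99",
  "currency": "USD"
}
```

Because every field is addressable Linked Data, a buyer’s software could in principle follow the `@id` and `payee` links to verify who is selling the asset and where the payment should go.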
RainVac has just launched a new web site, which the company claims is the most advanced on the Internet, using fully compliant Semantic technologies and displaying properly on all devices. Vertex Worldwide, Inc. announced today that its RainVac division’s new web site has launched, incorporating the latest technologies to provide users with an ideal experience. The site makes extensive use of XHTML+RDFa, making it state-of-the-art in terms of rapidly developing semantic search capabilities.
Ivan Herman of the W3C reports, “The W3C RDFa Working Group has published a Last Call Working Draft of HTML+RDFa 1.1. This specification defines rules and guidelines for adapting the RDFa Core 1.1 and RDFa Lite 1.1 specifications for use in HTML5 and XHTML5. The rules defined in this specification not only apply to HTML5 documents in non-XML and XML mode, but also to HTML4 and XHTML documents interpreted through the HTML5 parsing rules. Comments are welcome through 28 February.”
Gregg Turner of Blue Claw Search recently discussed the impact of RDFa-formatted data and why developers should implement it. Turner writes, “Rich snippets have become a lot more prominent within the SERPs over the past couple of years, with appealing, feature-rich listings becoming more and more commonplace. Google refers to these enhanced search listings as ‘Rich Snippets’, and from a search marketing perspective they are often more appealing to users and increase click-through rates (CTR).”
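To illustrate the kind of data a rich-snippet crawler is reading, here is a minimal Python sketch that collects RDFa property/content pairs from a product listing using only the standard library. The snippet, vocabulary usage, and values are invented for illustration; real extractors (Google’s, Yandex’s) implement far more of the RDFa specification than this.

```python
# Toy scan for RDFa (property, content) pairs in a product listing.
# An explicit content="" attribute wins; otherwise the element's text
# content is used. This ignores vocab/typeof resolution, nesting, etc.
from html.parser import HTMLParser

SNIPPET = """
<div vocab="http://schema.org/" typeof="Product">
  <span property="name">Acme Widget</span>
  <span property="price" content="19.99">$19.99</span>
</div>
"""

class RDFaScan(HTMLParser):
    """Collect (property, value) pairs from start tags and their text."""
    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name awaiting text content

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "property" in a:
            if "content" in a:          # machine-readable value wins
                self.pairs.append((a["property"], a["content"]))
            else:
                self._pending = a["property"]

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None

scanner = RDFaScan()
scanner.feed(SNIPPET)
print(scanner.pairs)   # [('name', 'Acme Widget'), ('price', '19.99')]
```

Note how the display text “$19.99” is ignored in favor of the `content` attribute – exactly the trick that lets publishers show human-friendly text while feeding crawlers clean values.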
Yesterday we began our look back at the year in semantic technology here. Today we continue with more expert commentary on the year in review:
Ivan Herman, W3C Semantic Web Activity Lead:
I would mention two things (among many, of course).
- Schema.org had an important effect on semantic technologies. Of course, it is controversial (the role of one major vocabulary and its relations to others, the community discussions on the syntax, etc.), but I would rather concentrate on the positive aspects. A few years ago the topic of discussion was whether having ‘structured data,’ as it is referred to (I would simply say having RDF in some syntax or other), as part of a Web page makes sense or not. There were fairly passionate discussions about this, and many were convinced that doing so would not make any sense: there was no use case for it, authors would not use it and could not deal with it, etc. Well, this discussion is over. Structured data in Web sites is here to stay; it is important, and it has become part of the Web landscape. Schema.org’s contribution in this respect is very important; the discussions and disagreements I referred to are minor and transient compared to the success. And 2012 was the year when this issue was finally closed.
- On a very different note (and motivated by my own personal interest), I see exciting moves in the library and digital publishing worlds. Many libraries recognize the power of linked data, the value of standard cataloging techniques well adapted to linked data, and the role of metadata, in the form of linked data, adopted by journals and soon by electronic books… All of this will have a profound influence, bringing a huge amount of very valuable data onto the Web of Data and linking it to sources of accumulated human knowledge. I have witnessed different aspects of this evolution coming to the fore in 2012, and I think it will become very important in the years to come.
As we close out 2012, we’ve asked some semantic tech experts to give us their take on the year that was. Was Big Data a boon for the semantic web, or is the opportunity to capitalize on the connection still pending? Is structured data on the web not just the future but the present? What sector is taking a strong lead in the semantic web space?
We begin with Part 1, with our experts listed in alphabetical order:
John Breslin, lecturer at NUI Galway, researcher and unit leader at DERI, creator of SIOC, and co-founder of Technology Voice and StreamGlider:
I think it has been fantastic to see the schema.org initiative really gaining community support and a broader range of terms. It’s been great to see an easily understandable set of terms for describing the objects in web pages, one that leverages the experience of work like GoodRelations rather than ignoring what has gone before. It’s also been encouraging to see the growth of Drupal 7 (which produces RDFa data) in the government sector: estimates are that 24 percent of .gov CMS sites are now powered by Drupal.
Martin Böhringer, CEO & Co-Founder Hojoki:
For us it was very important to see Jena, our Semantic Web framework, become an Apache top-level project in April 2012. The project has picked up a lot of development pace recently, and we see a chance to build an open-source Semantic Web foundation that can handle cutting-edge requirements.
Still disappointing is the missing link between the Semantic Web and the “cool” technologies and buzzwords. From what we see, the Semantic Web answers some of the industry’s most challenging problems, but it still doesn’t seem to have found its place in relation to the cloud or big data (Hadoop).
Christine Connors, Chief Ontologist, Knowledgent:
One trend that I have seen is increased interest in the broader spectrum of semantic technologies in the enterprise. Graph stores, NoSQL, schema-less and more flexible systems, ontologies (& ontologists!) and integration with legacy systems. I believe the Big Data movement has had a positive impact on this field. We are hearing more and more about “Big Data Analytics” from our clients, partners and friends. The analytical power brought to bear by the semantic technology stack is sparking curiosity – what is it really? How can these models help me mitigate risk, more accurately predict outcomes, identify hidden intellectual assets, and streamline business processes? Real questions, tough questions: fun challenges!
Search engine Yandex this week added personalization capabilities to Eastern European users’ search results. It analyzes their online behavior, including their search history, clicks on search results, and language preferences, to generate its suggestions.
Kaliningrad is the name of the latest edition of Yandex’s personalized search engine. It uses that information to make suggestions and to rank search results individually for each user, showing book lovers who search for Harry Potter links related to the books, while those who prefer movies get film-oriented links.
Semantic markup didn’t play a role in the development of the technology, Yandex technical product manager and developer advocate Alexander Shubin says, but it could be applied in future enhancements, he notes. The new personalization reportedly leverages Yandex’s machine-learning-based query and search-results algorithms, “Spectrum” and “MatrixNet,” to tailor results to users’ requirements.
That said, Yandex has been diving deeper into semantic web waters. Beyond taking advantage of sites using schema.org markup to improve the display of search results, Shubin provides this update: “We enhanced our markup validator to understand all the markup (Open Graph, schema.org, RDFa, microformats). It is universal now (as Google’s or Bing’s instruments).”
Structured data makes the web go around. Search engines love it when webmasters mark up page content. Google’s rich snippets feature, for instance, leverages sites’ use of microdata (the preferred format), RDFa, or microformats, making it possible to highlight specific types of content in a few lines of a search result and give users some insight into what’s on the page and how it relates to their query – the prep time for a recipe, for instance.
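As a toy illustration of the recipe example, the sketch below pulls microdata `itemprop` values out of a well-formed snippet with Python’s standard library. The snippet itself is invented; the property names follow schema.org/Recipe, and a real crawler would need a forgiving HTML parser rather than ElementTree, which only accepts clean XML.

```python
# Extract microdata itemprop values from an XHTML-style recipe snippet,
# preferring the machine-readable datetime attribute over display text.
import xml.etree.ElementTree as ET

SNIPPET = """
<div itemscope="" itemtype="http://schema.org/Recipe">
  <h2 itemprop="name">Weeknight Chili</h2>
  <time itemprop="prepTime" datetime="PT15M">15 minutes</time>
</div>
"""

root = ET.fromstring(SNIPPET)
props = {}
for el in root.iter():
    name = el.get("itemprop")
    if name:
        # A datetime attribute (ISO 8601 duration here) beats the
        # human-readable element text.
        props[name] = el.get("datetime") or (el.text or "").strip()

print(props)  # {'name': 'Weeknight Chili', 'prepTime': 'PT15M'}
```

The crawler ends up with `PT15M` (an ISO 8601 duration, fifteen minutes) rather than the display string “15 minutes” – the kind of unambiguous value a rich snippet can render next to a recipe result.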
Plenty of web sites that are generated from structured data haven’t added that markup to their HTML, though, so they aren’t getting the benefits that come when search engines understand the information on their pages.
Maybe that will change now that Google has introduced Data Highlighter, an easy way to tell its search engine about the structured data behind a site’s web pages. A video posted by Google product management director Jack Menzel gives the snapshot: “Data Highlighter is a point-and-click tool that allows any webmaster to show Google the patterns of structured data on their pages without modifying the pages themselves,” he says.