Posts Tagged ‘microdata’

Web Components: Even Better With Semantic Markup

The W3C’s Web Components model is positioned to solve many of the problems that beset web developers today. “Developers are longing for the ability to have reusable, declarative, expressive components,” says Brian Sletten, a specialist in semantic web and next-generation technologies, software architecture, API design, software development and security, and data science, and president of software consultancy Bosatsu Consulting, Inc.

Web Components should fulfill that longing: the Templates, Custom Elements, Shadow DOM, and Imports specifications (all drafts, and thus still subject to change) give developers a way to create their web applications and elements as a set of reusable components. While most browsers don’t yet support these specifications, Web Components projects like Polymer let developers who want these capabilities right away build web objects and applications atop the specs today.

“With this kind of structure in place, now there is a market for people to create components that can be reused across any HTML-based application or document,” Sletten says. “There will be an explosion of people building reusable components so that you and I can use those elements and don’t have to write a ton of obnoxious JavaScript to do certain things.”

That in itself is exciting, Sletten says, but even more exciting is the connection he has made: semantic markup can be added to any web component.

Read more

Semantic Markup Pays Off But For Whom?

Many eyes are turning to research by SEO vendor Searchmetrics about the virtues of semantic markup. Exploring the enrichment of search results through microdata integration, the company says it has analyzed “tens of thousands of representative keywords, and rankings for over half a million domains from our comprehensive database, for the effect of the use of schema.org markup in terms of dissemination and integration type.”

Its study is still underway but so far its initial findings include good news – that is, that semantic markup succeeds:

  • Larger domains are more likely to embrace structured data markup, and the most popular markups relate to movies, offers, and reviews.  That said, overall, domains aren’t flocking to integrate Schema HTML tags.

Read more

The Web Is 25 — And The Semantic Web Has Been An Important Part Of It

NOTE: This post was updated at 5:40pm ET.

Today the Web celebrates its 25th birthday, and we celebrate the Semantic Web’s role in that milestone. And what a milestone it is: As of this month, the Indexed Web contains at least 2.31 billion pages, according to WorldWideWebSize.  

The Semantic Web Blog reached out to the World Wide Web Consortium’s current and former semantic leads to get their perspective on the roads The Semantic Web has traveled and the value it has so far brought to the Web’s table: Phil Archer, W3C Data Activity Lead coordinating work on the Semantic Web and related technologies; Ivan Herman, who last year transitioned roles at the W3C from Semantic Activity Lead to Digital Publishing Activity Lead; and Eric Miller, co-founder and president of Zepheira and the leader of the Semantic Web Initiative at the W3C until 2007.

While the Semantic Web came to the attention of the wider public in 2001, with the publication in Scientific American of “The Semantic Web” by Tim Berners-Lee, James Hendler and Ora Lassila, Archer points out that “one could argue that the Semantic Web is 25 years old,” too. He cites Berners-Lee’s March 1989 paper, Information Management: A Proposal, which includes a diagram showing relationships that are immediately recognizable as triples. “That’s how Tim envisaged it from Day 1,” Archer says.

Read more

Start Your Semantic Engines: TrueCar Looks To Foster Transition Of Vehicle Data From Flat To Structured And Enhanced

Back when he was VP and CTO at Hearst Interactive Media, Mike Dunn advocated the use of semantic technologies for media organizations to rocket-boost their control over content, both for internal operations and for presenting a better face to users out there on the web. (See our story with his insights on that here.) Dunn recently made the move to TrueCar, an eight-year-young start-up focused on improving the car-buying process. As CTO, his mission is to modernize its data stack.

How do the two worlds of media and automotive connect? “There’s definitely a connection if you think about content as data,” Dunn told The Semantic Web Blog during a few free moments at the recent Semantic Technology and Business Conference. And, TrueCar gets “the importance of data, even though you don’t always have to throw the semantic web [phrase] in there. But things like sentiment-enhancing and context – those are useful words that don’t confuse people.”

Today, says Dunn, much of the data around vehicles, sales processes, and how cars are customized or configured tends to be fairly flat – that is, unstructured, proprietary, or both. But doors open up when that data gains meaning: when it becomes structured, enhanced, and openly known and leveraged from an industry perspective. “That transition, which we believe we’ll be able to foster, will allow the creation of additional enhancing services to consumers and the industry at large,” he says.

Read more

Yandex’ New Interactive Snippets: Now Users Can Book, Buy And Pay Bills Right From Its Search Page

Rich snippets – yep, they were a nice start, but Russian search engine Yandex thinks it’s time for something more powerful. Something it’s calling interactive snippets and a feature it’s branding as Islands for its search results pages.

Yandex says the new feature evolves from rich snippets, which CTO Ilya Segalovich refers to in the press release as “mere decoration.” Interactive snippets, in contrast, are actionable, letting users do things like book movie tickets, make reservations or pay bills right from the search page. Webmasters can choose to add this functionality to their web sites if they want to. While it may win them customers who want to make their transactions as seamless as possible, especially those using smartphones and tablets, it does mean those users won’t be making the journey to the business’ own web site.

Read more

The Future of E-Commerce Data Interpretation: Semantic Markup, or Computer Vision?

How will webpage data be interpreted in the next few years? The Semantic Web community has high hopes that ever-evolving semantic standards will help systems identify and extract rich data found on the web, ultimately making it more useful. With the announcement of Schema.org support for GoodRelations in November, it seems clear semantic progress is now being made on the e-commerce front, and at an accelerated rate. Martin Hepp, founder of GoodRelations, expects the rate of adoption of rich, structured e-commerce data to increase significantly this year.

However, Mike Tung, founder and CEO of a data parsing service called DiffBot, has less faith that the standards necessary for a true Semantic Web will ever be completely and effectively implemented. In an interview on Xconomy he states that for semantic standards to work correctly, content owners must mark up the content once for the web and a second time for the semantic standards. This requires extra work, and affords them the opportunity to perform content stuffing (SEO spam).

Read more

Latest Version of RDFLib Released

Ivan Herman reports, “This has been in the works for a while, but it is done now: the latest (3.4.0 version) of the python RDFLib library has just been released, and it includes an RDFa 1.1, microdata, and turtle-in-HTML parser. In other words, the user can add structured data to an HTML file, and that will be parsed into RDF and added to an RDFLib Graph structure. This is a significant step, and thanks to Gunnar Aastrand Grimnes, who helped me adding those parsers into the main distribution.”

He goes on, “I have written a blog last summer on some of the technical details of those parsers; although there has been updates since then, essentially following the minor changes that the RDFa Working Group has defined for RDFa, as well as changes/updates on the microdata->RDF algorithm, the general approach described in that blog remains valid, and it is not necessary to repeat it here.”

Read more

Good-Bye to 2012: A Look Back At The Year In Semantic Tech, Part 1

Courtesy: Flickr/zoetnet

As we close out 2012, we’ve asked some semantic tech experts to give us their take on the year that was. Was Big Data a boon for the semantic web, or is the opportunity to capitalize on the connection still pending? Is structured data on the web not just the future but the present? What sector is taking a strong lead in the semantic web space?

We begin with Part 1, with our experts listed in alphabetical order:

John Breslin, lecturer at NUI Galway, researcher and unit leader at DERI, creator of SIOC, and co-founder of Technology Voice and StreamGlider:

I think the schema.org initiative really gaining community support and a broader range of terms has been fantastic. It’s been great to see an easily understandable set of terms for describing the objects in web pages, but also leveraging the experience of work like GoodRelations rather than ignoring what has gone before. It’s also been encouraging to see the growth of Drupal 7 (which produces RDFa data) in the government sector: Estimates are that 24 percent of .gov CMS sites are now powered by Drupal.

Martin Böhringer, CEO & Co-Founder, Hojoki:

For us it was very important to see Jena, our Semantic Web framework, becoming an Apache top-level project in April 2012. We see a lot of development pace in this project recently and see a chance to build an open source Semantic Web foundation which can handle cutting-edge requirements.

Still disappointing is the missing link between Semantic Web and the “cool” technologies and buzzwords. From what we see Semantic Web gives answers to some of the industry’s most challenging problems, but it still doesn’t seem to really find its place in relation to the cloud or big data (Hadoop).

Christine Connors, Chief Ontologist, Knowledgent:

One trend that I have seen is increased interest in the broader spectrum of semantic technologies in the enterprise. Graph stores, NoSQL, schema-less and more flexible systems, ontologies (& ontologists!) and integration with legacy systems. I believe the Big Data movement has had a positive impact on this field. We are hearing more and more about “Big Data Analytics” from our clients, partners and friends. The analytical power brought to bear by the semantic technology stack is sparking curiosity – what is it really? How can these models help me mitigate risk, more accurately predict outcomes, identify hidden intellectual assets, and streamline business processes? Real questions, tough questions: fun challenges!

Read more

Search Engine Yandex Gets More Personal, And More Semantic, Too

Image courtesy of Pixomar / FreeDigitalPhotos.net

Search engine Yandex this week added personalization capabilities for Eastern European users’ search results. It analyses their online behavior including their search history, clicks on search results, and language preferences for its suggestions.

Kaliningrad is the name of the latest edition of Yandex’ personalized search engine. It uses that information to make suggestions and rank search results individually tailored for each user: book lovers who search for Harry Potter see links related to the books, while those who prefer movies get film-oriented link fare.

Semantic markup didn’t play a role in the development of the technology, Yandex technical product manager and developer advocate Alexander Shubin says. But it can be applied for future enhancements, he notes. The new personalization reportedly leverages Yandex’ machine-learning-based query and search results algorithms “Spectrum” and “MatrixNet” to train the results to users’ requirements.

That said, Yandex has been diving deeper into semantic web waters. Beyond taking advantage of sites using schema.org markup to improve the display of search results, Shubin provides this update: “We enhanced our markup validator to understand all the markup (Open Graph, schema.org, RDFa, microformats). It is universal now (as Google’s or Bing’s instruments).”

Read more

Google Debuts Data Highlighter: An Easy Way Into Structured Data

Structured data makes the Web go around. Search engines love it when webmasters mark up page content. Google’s rich snippets, for instance, leverage sites’ use of microdata (the preferred format), RDFa or microformats: The markup makes it possible to highlight specific types of content in a few lines of the search results, giving users some insight into what’s on the page and its relationship to their queries – prep time for a recipe, for instance.
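To make the recipe example concrete, here is a minimal sketch of how a program might read microdata out of a page, using only Python's standard library. The class name `ItemPropCollector`, the helper `extract_properties`, and the sample recipe markup are all hypothetical, invented for illustration. This is not how Google or any other search engine actually processes markup, and it ignores most of the microdata specification (nested items, `itemref`, URL-valued properties).

```python
from html.parser import HTMLParser


class ItemPropCollector(HTMLParser):
    """Collect itemprop name/value pairs from microdata-annotated HTML.

    Simplifying assumptions: nested items are flattened into one dictionary,
    and a value comes from the `content` or `datetime` attribute if present,
    otherwise from the element's text.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._open_prop = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        # Machine-readable values live in attributes such as `datetime`.
        value = attrs.get("content") or attrs.get("datetime")
        if value is not None:
            self.properties[prop] = value
        else:
            self._open_prop = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._open_prop is not None:
            self.properties[self._open_prop] += data

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the current property.
        if self._open_prop is not None:
            text = self.properties[self._open_prop].strip()
            self.properties[self._open_prop] = text
            self._open_prop = None


def extract_properties(html):
    """Return a dict of itemprop values found in the given HTML string."""
    parser = ItemPropCollector()
    parser.feed(html)
    parser.close()
    return parser.properties


# Hypothetical recipe page fragment marked up with schema.org microdata.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Banana Bread</h1>
  <time itemprop="prepTime" datetime="PT15M">15 minutes</time>
</div>
"""

recipe = extract_properties(SAMPLE_HTML)
print(recipe)
```

The sketch prefers the `datetime` attribute over the visible text because the attribute carries the machine-readable duration, which is the kind of value a snippet generator could format consistently.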

Plenty of web sites generated from structured data haven’t added HTML markup to their pages, though, so they aren’t getting the benefits that come with search engines understanding the information on those web pages.

Maybe that will change, now that Google has introduced Data Highlighter, an easy way for webmasters to tell its search engine about the structured data behind their web pages. A video posted by Google product management director Jack Menzel gives the snapshot: “Data Highlighter is a point-and-click tool that allows any webmaster to show Google the patterns of structured data on their pages without modifying the pages themselves,” he says.

Read more
