Q&A Session for "Callimachus: Semantic Web Apps Made Easy"

Q: Was Callimachus arguing against a "single inheritance" hierarchy or arguing against a universal "Thing" which is the superclass of all things?

A: Callimachus, the Greek poet, could easily justify placing many of his scrolls (of short poems) in multiple bins. He argued that some things have multiple categories (or types) and could not be arranged in a single hierarchy.

Q: Can I use html 5 with this?

A: Callimachus serves the constructed pages using an HTML serialization with no doctype and a content type of text/html. Because of the missing doctype and this serialization, Callimachus cannot serve standard HTML5 documents.

Q: Is Callimachus able to draw on the ontology, for example, to create a pull-down menu based on an enumerated class?

A: Callimachus will read the ontology into the RDF store for querying when it is placed in the webapps directory. However, Callimachus cannot query rdf:List (used for OWL enumerations) for drop-downs. If, however, these enumerations have a unique rdf:type relationship, they can be populated in a drop-down.

Q: I am wondering if there is a way for the form which creates a new resource to accept dynamic content from the user instead of having predefined entities in the xml file? Also, where are the new resources stored?

A: Using client side JavaScript, the RDF in the create/copy page can take on many dynamic forms. However, only triples that are about the target resource and match a basic graph pattern in the RDFa template can be created using the copy operation. New resources are stored in the embedded RDF store by default.

Q: Is it possible to have essentially a wild card for Delete you don't have to specify all the properties?

A: Not in the current version of Callimachus, but this may change in subsequent versions.

Q: Any thoughts of an IDE-style interface for creating Callimachus templates rather than crafting raw XML? (such as picking a property from a dropdown and entering the specific parameter)

A: There is no UI builder in the current version of Callimachus, although such a feature is desirable in future versions.

Q: So, my understanding is that each of the templates has to be hand-written, is this correct?

A: The templates must either be hand-written or generated from an XSLT. You can, for example, use XSLT to generate RDFa templates from a particular RDF/XML-encoded ontology.

Q: Is it possible to share code between the templates? for example to have a footer or header on every page?

A: Callimachus provides a common header, footer, and navigation menu for templates that use the XSL stylesheet "/layout/template.xsl". This is a fairly simple XSLT file that can be found in the layout.war file. RDFa templates are free to use alternate XSLT files to include common sections. The XSLT is applied to the RDFa template before the RDFa is parsed.

Q: I thought a strength of RDF was to be able to add schema and data at runtime. Does the templating in Callimachus force the schema to become static? (i.e. if schema changes, Callimachus can't take advantage of it)

A: The schema is defined in the HTML. When the HTML changes, Callimachus will start to use the new version right away.

Q: How complex can the RDFa schemas get and still be useable by the simple(?) query language that operates over them?

A: The RDFa template syntax is not as expressive as SPARQL. There is no way to express the type of join, and there are no filters or limits. For complex queries, SPARQL is still better. Callimachus can serve named SPARQL queries that are defined in a Turtle file within the webapps directory. Open the layout.war and take a look at menu.ttl for a simple example and lookup.ttl for an example using SPARQL+XSLT.
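As an illustration of the kind of query that goes beyond what RDFa templates can express, here is a minimal sketch with a FILTER and a LIMIT; the FOAF data it assumes is hypothetical:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    # Find up to ten people whose names start with "A" (hypothetical data)
    SELECT ?person ?name
    WHERE {
      ?person a foaf:Person ;
              foaf:name ?name .
      FILTER(regex(?name, "^A"))
    }
    LIMIT 10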

Q: What size RDF files can Callimachus handle?

A: On startup Callimachus will crawl the webapps directory and upload any .rdf and .ttl files into the RDF store. Too many or too large .rdf and .ttl files can slow this down. Instead, bulk RDF can be loaded into Callimachus using an authenticated PUT request. See the page on RDF Import for details.

Q: What exactly are the .war files?

A: The .war files are a compressed (zipped) directory with the .zip extension renamed to .war. These files should not be confused with J2EE webapps.

Q: Is Callimachus a runtime environment, a development environment, or both?

A: Callimachus watches the webapps directory and "uploads" any changes to an internal data store. Callimachus will queue and lock the data store to allow uninterrupted service while changes are being made to the webapps directory. Watch the log files for parse errors that occur during the upload.

Q: Can we use this with Java?

A: Callimachus uses OpenRDF AliBaba for RDF-to-object mapping and as its object server. These libraries provide an interface for triggers written in Java and ways to intercept HTTP requests. Post any further questions on this to the discussion group.

Q: Can Callimachus be used to create applications outside of the Callimachus container (i.e. Spring Framework)? If not out of the box, does it have an API which can be used to accomplish this?

A: Callimachus does not publish an API for external use, although that could change in the future.

Q: How big can the repository be?

A: This, of course, depends on the hardware. Using commodity hardware with the default configuration, things may slow down at around 100 million triples.

Q: Where is the repository being queried specified?

A: Callimachus uses an embedded RDF store by default that is specified in the Java resource META-INF/templates/callimachus-config.ttl.

Q: Is there any inference engine available, other than what is available through SPARQL?

A: Callimachus uses the Sesame API and can use any inference engine or store available through a Sesame interface.

Q: What is the SPARQL edit /submission tool you're using?

A: The browser plugin used in the demo is available in Firefox and Chrome as "HTTP Response Browser".


Q&A Session for “The RDFa initiative in Drupal 7, and how it will impact the Semantic Web”

Q: How are the colleagues generated in the example? Manually nominated or picked up automatically by some social networking function?

A: The example you are referring to is a typical website built with Drupal and Fields (formerly known as CCK), where users have created pages (of type person, in the example), filled in information such as name and picture, and chosen their list of colleagues from the other people described on the site. The colleagues haven’t been picked from a social network. This example only illustrates the generation of social network data with Drupal and its export in RDF. The other way around (importing) is trickier; it is still at a prototype stage and should be available for Drupal in the future.

Q: How do we use http://opengraphprotocol.org/ of facebook as a source in Drupal?

A: Yes, you can use the Open Graph protocol module for Drupal 7, which I created a few days ago.

Q: Will you be persisting RDF data and if so by what mechanism?

A: Drupal core does not store RDF data as triples, but instead generates the RDFa markup on the fly. Because the RDF mappings are defined ahead of time and are not really RDFa specific, the data contained in a Drupal site can also be expressed in RDF triples and stored in an RDF store. In fact, the RDF module which you can download for Drupal 7 already allows you to get RDF/XML for each page of the site. There will shortly be a module to federate all this RDF data into a local store and expose it via a SPARQL endpoint.

Q: Please advise the whitepaper and example references you mentioned.

A: All the relevant links have been added to the webinar announcement.

Q: How big does the vocabulary in RDF & RDFa need to be before the general user population finds it useful?

A: I’m not sure I understand the question 100%, but I’ll assume you’re talking about the amount of RDF triples on a given page. A few triples can make quite a difference for machines to understand what information they are looking at. To take a more practical example, you only need four elements in RDFa to have your page reusable by Facebook. Similarly, Yahoo! and Google only require a few triples to start making your data available as enriched search results. Typically fewer than 10 triples are enough to get a decent-looking search result.

Q: So if we are not quite ready to try Drupal 7 in alpha, what do we need to install on Drupal 6 to play with RDFa?

A: I would recommend waiting until you can move to Drupal 7. RDF support in Drupal 6 is quite difficult to set up, and there are features like RDFa which are not really supported. Drupal 7 core has been designed so that it’s possible to output RDFa, and it works better with RDF in general. Aside from that, Drupal 7 also has a lot of new features in terms of usability, testing, and APIs.

Q: Do you have any query interface built for querying RDF data?

A: That is more of a generic RDF question, one that applies not only to Drupal but to any application producing RDF data. See Freebase’s Parallax and Sparallax developed by DERI.

Q: Can you speak more about mapping and make some recommendations?

A: Drupal 7 core ships with some default RDF mappings for each built-in content type, such as blog posts, articles, and forums. You can change them or specify mappings for the new content types you might create on your site with the RDF module.

Q: Are there important RDFa standards to follow when setting up content types and fields so that they are represented consistently by search engines, etc?

A: Yahoo! and Google are still aligning their vocabularies. The best place to get the latest specifications is their webmaster documentations: Yahoo! SearchMonkey: Getting Started and Google Rich Snippets.

Q: When will a browser be able to follow RDF links?

A: Regular browsers like Firefox are already able to browse HTML pages which contain RDFa, except they are not yet able to understand what type of links they are traversing. For an RDF specific browser, see Tabulator.

Q: Also, what happens if/when there are changes made to RDF information types – how do they propagate and "push" the notification of changes to the sites displaying the data?

A: The syndication mechanism can either pull the information every so often, as is the case with regular RSS, or use a more sophisticated method such as a PubSubHubbub setup in which subscribers are notified every time there is a change. Work on integrating RDF and PubSubHubbub is currently ongoing; see sparqlPuSH for an example.

Q: Thanks for inviting me in this great seminar. I have an ontology which was implemented in OWL. May I import it into Drupal? What kind of modules I need to import using the specific ontology in Drupal, instead of generating RDF inside the Drupal?

A: The RDF mapping API in Drupal has been designed so it can deal with any ontology. The RDF external vocabulary importer in Drupal 6 was the first prototype of a module allowing you to import any ontology into Drupal. It is currently being ported to Drupal 7, and you will be able to import your custom ontology to map your Drupal site data structure to your ontology.

Q: How ready is Drupal 7? I am an experienced developer but no Drupal experience

A: Drupal 7 is definitely usable: you can install it, create your site structure and create pages. It is not recommended to go to production with it yet since there are still some known issues, but a good strategy is to start getting used to its new user interface and many other improvements, and to start planning how you want to build your site with Drupal. Drupal 7 will be stable when all bugs are fixed, so if you are a developer you can help fix these remaining bugs. If you are not a developer there are many other ways you can contribute. New users are encouraged to download Drupal 7 and try it out.

Q: In your example you have the term ‘bear’ defined as a SKOS concept. I didn’t see any namespace in the markup. Would there be one or are all bears equal?

A: The namespaces in Drupal 7 are located at the very top of the HTML document, in the html tag. Drupal 7 includes the namespaces of Dublin Core, FOAF, SIOC, SKOS, etc. Developers can add their own custom namespaces via the RDF mapping API.

Q: Will Drupal 7 support discovery of vocabularies and help the administrator choose the most appropriate vocabulary?

A: Yes, it will. The Produce and Consume Linked Data with Drupal! paper (PDF) goes into detail about the approach we are planning to use in Drupal 7.

Q: How easy will it be to tag a person’s name in an article?

A: Drupal core does not deal with unstructured data within, say, the body of an article. People interested in this should look at modules like Calais or Zemanta which will detect these entities automatically for you.

Q&A Session for “Introduction to Linked Data” webcast

Q: Can you go back to more clearly describe RDF?

A: You might have a look at this short introduction to RDF (note the date!) written by my business partner, Uche Ogbuji:
http://www.ibm.com/developerworks/library/w-rdf/

A longer and more complete introduction is available here:
http://research.talis.com/2005/rdf-intro/

Q: Are Apple apps already using SPARQL and LOD at consumer level, or are they using different method?

A: Apple apps use many of the W3C standards, but as far as I know they are not using SPARQL or LOD techniques. The big announcements around RDF usage by major consumer-oriented companies in the past year have been Google and Yahoo’s support for parsing RDFa from Web pages and Best Buy’s use of RDFa to increase their page ranking on those search engines.

Q: How does SPARQL fit with RDF?

A: SPARQL is a query language for distributed RDF data in the same way that SQL is a query language for relational databases.

If you want to see how to create SPARQL queries for real, try these:
http://www.cambridgesemantics.com/2008/09/sparql-by-example/
http://www.slideshare.net/ldodds/sparql-tutorial
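
To give a flavor of the language, here is a minimal sketch of a SPARQL query over FOAF data; the data it assumes is hypothetical:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    # Names and email addresses of the people described in the data
    SELECT ?name ?email
    WHERE {
      ?person a foaf:Person ;
              foaf:name ?name ;
              foaf:mbox ?email .
    }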

Q: Can you also elaborate more on how LOD can overcome scalability issues?

A: Linked Data approaches consist of standards, formats, tools and techniques to query, resolve and analyze distributed data on the World Wide Web. Linked Data is based completely on the standards of the Web. The Web is the largest and most complex information system ever fielded because of scalability principles built into those standards. Roy Fielding, in his doctoral dissertation, captured and analyzed the properties that make the Web scalable. He called the collection of those properties Representational State Transfer (REST). Linked Data is built on REST. Roy’s dissertation is quite readable for a thesis and may be found at:
Fielding, R.T. (2000). Architectural Styles and the Design of Network-based Software Architectures. Doctoral dissertation, University of California, Irvine. http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

Q: When do you have to use absolute URI’s in RDF?

A: There are several different ways to store RDF information. There are at least five commonly-used document formats (RDF/XML, OWL/XML, Turtle, N3, RDFa or raw triples – I prefer Turtle), the SPARQL Query Results XML Format and the SPARQL query language itself. Most RDF systems exchange information using one or more of those formats. The syntax of the particular format dictates whether URIs must be absolute or whether they may be simplified. Most (e.g. RDF/XML, OWL/XML, RDFa, Turtle, N3, SPARQL result set and query language) allow some mechanism to shorten URIs (variously called "namespaces" or "compact URIs").
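
To illustrate, the same term can be written as an absolute URI or shortened with a prefix. This sketch uses SPARQL syntax with the standard FOAF namespace:

    # Absolute URI form
    SELECT ?name WHERE { ?person <http://xmlns.com/foaf/0.1/name> ?name . }

    # Equivalent shortened form using a prefix
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name WHERE { ?person foaf:name ?name . }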

Q: Data inflation – great slide!

A: Thanks!

Q: What is your understanding of the relationship between the semantic web and the human mind (e.g., what practices promote learning, development of expert-like knowledge architecture, and generative thinking)?

A: There may be a good reason to use an associative model to describe information: Human memories have been claimed to be associative in nature [Collins 1975], and recent functional magnetic resonance imaging studies lend credence to that view [Mitchell 2008]. See the following for details if you don’t already know them:

Collins, A.M. and Loftus, E.F. (1975, November). A Spreading-Activation Theory of Semantic Processing. Psychological Review, 82, pp. 407-428.

Mitchell, T.M., Shinkareva, S.V., Carlson, A., Chang,K.-M., Malave, V.L., Mason, R.A. and Just, M.A. (2008, May 30). Predicting Human Brain Activity Associated with the Meanings of Nouns, Science, 320 (5880), pp. 1191-1195.

Q: I’m intrigued by the enabling of discovery; can you point to an application demonstrating it in a LOD project?

A: Sure. Try these:
http://www.zotero.org/
http://simile.mit.edu/wiki/Piggy_Bank

For the more technically minded, see the W3C’s exposé on The Self-Describing Web: http://www.w3.org/2001/tag/doc/selfDescribingDocuments

Q: It seems like most of the projects on Linked Data are academic/non-profit projects. Why are there not more commercial projects on Linked Data?

A: There are many commercial projects using Linked Data; they just tend to be more circumspect. Some notable exceptions are Google and Yahoo’s support for parsing RDFa from Web pages and Best Buy’s use of RDFa to increase their page ranking on those search engines. The New York Times was a welcome addition. The BBC both provides data and uses it internally. The forthcoming book I mentioned from Springer (Linking Enterprise Data) will have more.

It is worth noting that Linked Open Data (with a focus on "open") does not appeal to most businesses. That doesn’t mean that many businesses aren’t exploring or actively using Linked Data techniques.

Q: Just a comment – how we understand these relationships inside our heads is referred to as structural knowledge. This is also the underlying idea behind concept maps.

A: Right. The same psychological research is behind RDF.

Q: What new security concerns do you see appearing as the web supports more semantic data/queries?

A: There are several significant challenges for information security. Some of them are:

  • Changing DNS domain holders. If you query some RDF at a given DNS domain for a long period of time, how do you know whether the DNS domain changes hands? You might one day be querying a different resource controlled by someone else.
  • International Resource Identifiers (IRIs). Intended as the internationalized replacement for URIs (so, e.g., Chinese people could have Web addresses in Chinese), IRIs are a boon to black hats (which has slowed their adoption). Consider clicking a link that reads walmart.com, but might go elsewhere because the character set of the address *looks* like US ASCII, but is really in another alphabet.
  • URI curation. Systems like PURLs (see http://purlz.org) allow management of the URI resolution process. This can be a benefit for security, or a detriment, depending on who does the curation.
  • Lack of technical trust mechanisms. We solve almost every issue of trust on the Web today socially. If Linked Data clients follow links (semi)automatically, how will they know when they are beyond trust boundaries?

Q: recommendations for best way to start?

A: I recently surveyed a good number of big companies who successfully fielded Linked Data solutions into production (I notice you are from a big company). Successful organizations had at least three things in common:

  • They all had at least one Semantic Web expert on staff.
  • They all worked on at least one real business problem (not a prototype or proof-of-concept).
  • They all leveraged existing investments, especially those with implied semantics (such as wikis or email).

Q: Are there performance optimizations available when working with RDF data? Can a PB of RDF data be queried in real time?

A: Absolutely yes. Querying an RDF database is going to be much faster than querying a bunch of RDF documents in XML format. Some Open Source RDF databases to look at include:
http://mulgara.org/
http://www.openrdf.org/

If you want to pay, try these:
http://www.talis.com/platform/
http://topquadrant.com/products/TB_Suite.html
http://www.openlinksw.com/

Q: The mixing to which you just referred seems to imply "trusted" sources. Could you discuss?

A: Sure. Just like with anything on the Web, the best kind of trust is socially-developed trust. We think we can trust Google to deliver objective search results. We think we can trust Amazon to give us real used-book prices. Similarly, we think we can trust major stores, publishers and governments to describe their own data well. We may be less sure of sites we don’t know. Some of the sites on the Linked Open Data cloud are very trustworthy (such as the scientific ones or the New York Times) and others less, perhaps due to their underlying data sets (such as DBPedia’s scraping of Wikipedia). You can trust Wikipedia for some things (such as "hard" information like the periodic table or the location of Massawa) but not so much in regards to contentious subjects like climate change or political figures.

When you write SPARQL queries, you have to name your data sources. You therefore get to choose who you trust and for what.

Q: Anything on how to reuse vocabularies? The alphabet soup makes finding the right schema or OWL ontology just as bad as finding a webpage was in the early days of the web…

A: Yes, it does. There have been several attempts to make sense of the soup by allowing people to look up terms and vocabularies, but none have become dominant yet. A summary of the state of play (a bit dated) is at:
http://www.semanticfocus.com/blog/entry/title/semantic-web-search-engine-roundup/

Some to look at are:
http://www.sindice.com/
http://swoogle.umbc.edu/

Q: If we use the web as a database, are there any tools that map the schema, attributes, and attribute properties of the linked data?

A: Good question! Schemas on the Semantic Web are composed of the predicates (the URI-addressable terms linking two things) and additional information describing those predicates. When one creates a SPARQL query, one explicitly lists the Web addresses of the data sources to query (because you couldn’t practically query the entire Web…). Putting those two statements together, it is possible to query your identified data sources for just the predicates they contain, and then for the information about those predicates. That would give you the schema, attributes and attribute properties for those data sources. So, the tool you need is simply a SPARQL endpoint that will accept the SPARQL query you need to write.
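
A minimal sketch of such a query follows; it assumes the data source also holds labels describing its predicates, which is not always the case:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    # List every predicate used by the data source, with any label describing it
    SELECT DISTINCT ?predicate ?label
    WHERE {
      ?s ?predicate ?o .
      OPTIONAL { ?predicate rdfs:label ?label }
    }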

Q: Relate at what level of granularity? Page to page or idea unit to idea unit?

A: Both. Neither. It depends :) I suggested during the webinar that one not try to solve all problems from a top-down perspective. Instead, publishing just the data (and just the relationships) that one needs to solve a particular problem seems to work best, especially in larger teams of people (building top-down consensus can take a long time!).

In your particular case with Hylighter (if that was your question), you might consider objects like people, comments, documents and times so you could perform queries like "show me comments made by Peter on document x between 2:00 and 4:00". Capturing subjects or topics would be harder in your free-form environment, but some people use server-side entity extractors to try things like that. They sometimes work.

Q: Would you recommend a specific RDF, etc. authoring tool (WYSIWYG or otherwise) or is a good old text editor (along with heavy dose of "copy and paste" from existing RDF docs) still the best way to go?

A: I used to joke that programmers in my company could use any IDE they chose: vi or emacs. Text editors work just fine. You may have heard Eric Franzon say that he used Dreamweaver to add RDFa to the Web site he developed. I’ve seen demos of TopQuadrant’s TopBraid Composer (http://topquadrant.com/products/TB_Suite.html), which seems nice if you like a graphical environment. For ontology development, some people prefer Protege (http://protege.stanford.edu/), but I like SWOOP (http://code.google.com/p/swoop/) for its better debugging capabilities. The Eclipse IDE and the Oxygen XML editor also have some support. It really depends on which of the many possible jobs you are trying to accomplish and the kind of environment you feel most comfortable in.

NOTE from Eric Franzon: Yes, I did use DreamWeaver, a text editor, and some heavy use of Copy/Paste.

Q: Reusing terms/names is all very well, but it’s important to understand the MEANING, e.g. is one ‘customer’ term the same as another?

A: Absolutely! Choosing terms to use on the Semantic Web is equivalent to choosing terms in any other information processing system. You do need to be careful to say what you mean. Fortunately, RDF terms are resolvable on the Web itself (by following terms’ URIs). Each RDF term should provide the ability for a user to read exactly what the author of the term meant it to mean. That situation is better than the short and ambiguous meanings of terms generally associated with IT systems (such as relational database schemas or spreadsheet column names).

Q: Are there query clients for the semantic web?

A: Sure, although at this point most programmers are making their own. There is no de facto standard tool in use by a dominant number of people. You might have a look at these for enterprise use:
http://www.talis.com/platform/
http://topquadrant.com/products/TB_Suite.html
http://www.openlinksw.com/

If you just want to try a few queries for yourself, try these:
http://demo.openlinksw.com/sparql
http://www.sparql.org/query.html
http://hyperdata.org/sparql/demo/
http://data.semanticweb.org/snorql/
http://dbpedia.org/sparql

To get some data to play with, try here: http://esw.w3.org/SparqlEndpoints

Q&A Session for "The National Information Exchange Model and Semantic-Driven Development"

Following the webcast, Dan McCreary answered questions from the live audience. Here is a transcript of that discussion.

Q: My understanding is that NIEM is not considered a Federal Information Exchange Model and does incorporate Federal, State, Local, Tribal and Private Sector partners.

A:[Dan McCreary] I am not sure of the meaning of the question. NIEM is a US federal standard and is being used by federal, state, local, tribal and private sector partners. Perhaps the word "not" was a typo? Please see the NIEM web site for more details.

Q: Does NIEM and/or DATA.gov contain vocabularies expressed in RDF/OWL?

A: Not exactly, but you can easily convert the NIEM Core and other XML Schema files into an OWL file. The code is only about 20 lines of XQuery. Please send me an e-mail at dan@danmccreary.com if you would like me to send you a copy of the source. I have published it using an open source Apache 2.0 license.

Q: When would one use SKOS in place the ISO/IEX Metadata repository standard?

A: SKOS is typically used at the beginning stages of the creation of a metadata registry. You start by capturing preferred business terms and their definitions using a simple SKOS editing tool (also available as an open source XRX application for eXist). Next you can start to group related terms using the "broader" tag. After that you can start to link terms together in a taxonomy and classify terms into subsets using "SKOS Schemas" for ISO classifiers. You can then start to see your full ontology forming from the business terms that are grouped together. From there you can mark each SKOS "concept" as being a potential "Conceptual Data Element" and migrate it into your ISO/IEC registry. From there you can create OWL reports.

Q: Sorry, that meant to ask, when would one use SKOS in place of the ISO/IEC Metadata Repository standard?

A: See above

Q: What is a URI?

A: A universal resource identifier. It is like a URL, but it might only point to a "concept", not necessarily a real web page. It is really a site-specific way of identifying resources so that they can be merged with other resources on the internet.

Q: What is your view regarding acceptance of aliases for element names?

A: Aliases are VERY important for aiding search, but should always be marked as such. Aliases are important for findability of the "official" or "preferred" term. But tools should prevent people from ever putting aliases in wantlists or subschemas, since that would make it harder to merge graphs from different systems.

Q&A Session for Webinar II: SPARQL for Business Rules and Constraint Checking: Introducing SPIN*

The following questions were generated during the original webcast on April 1, 2009.


Q: How easy is it to debug an application?

A: (Dean Allemang) As a user of this system (not its designer) I can respond to this from experience. It is surprisingly simple. Unlike Rete-based rule engines (for example), rule firing uses a very simple execution model.


Q: Do you do constraint checking before updating the data like in relational DB? How is the performance if data is frequently updated?

A (Holger Knublauch): This has been answered during the webcast.


Q: So if the rectangle has only height set what happens with the constraint.

A (HK): This has been answered during the webcast.


 Q: Can I use SPIN for OWL (not just RDF)?

A (HK): Yes, absolutely. First, OWL models are expressed in RDF and therefore you can query them with SPARQL. Second, you can execute SPARQL on top of models that employ other inference engines, including OWL engines. Or just run an OWL engine first and then operate on the resulting model. Third, you can express most of the OWL semantics using SPIN itself, see my blog entry:

http://composing-the-semantic-web.blogspot.com/2009/01/owl-2-rl-in-sparql-using-spin.html


Q: How hard would it be, compared to SQL, to write more complex aggregate queries, like the average number of female astronauts on missions?

A (DA): I don’t believe we will have time to show these types of queries during the webinar. The ARQ extension to SPARQL has aggregate operations. They are loosely modeled on what you are familiar with in SQL.
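
As a sketch, such a query might look like the following; the space vocabulary is hypothetical, and the aggregate syntax shown is the one that later became standard in SPARQL 1.1 (ARQ's original extension syntax differs slightly):

    PREFIX space: <http://example.org/space#>   # hypothetical vocabulary

    # Average number of female astronauts per mission
    SELECT (AVG(?count) AS ?avgFemaleAstronauts)
    WHERE {
      {
        SELECT ?mission (COUNT(?astronaut) AS ?count)
        WHERE {
          ?mission space:crewMember ?astronaut .
          ?astronaut space:gender "female" .
        }
        GROUP BY ?mission
      }
    }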


Q: Do the rules propagate to the inferred concepts?

A (DA): By default, the SPIN engine in TopBraid iterates over all rules and will therefore also consider intermediate results for the next iteration. This can be switched off to improve performance, if we know in advance that there are no dependencies among the rules.


Q: Is SPIN constrained to rules or are other aspects also used?


A (DA): Can you clarify this question? SPIN is using SPARQL Constructs in the same context where e.g., SWRL uses a Rule. Some other aspects of SPARQL are used in other contexts (e.g., ASK for constraints, and SELECT for function definitions).

Does that address your question? Or did you have something else in mind?

Q: What variety of functions? Yes, you addressed the earlier question – thanks, Dean.

Additional answer (HK): The ideas of SPIN can be used in many other contexts as well. For example you could define a new property that links a property with a SPARQL SELECT query, and then whenever someone asks for values of that property, the system could execute those SELECT queries. There are infinite ways of using SPARQL here, and SPIN RDF syntax makes it easy to link concepts with "behavior".


Q: Shouldn’t the template, when generalized, be called something like: Invalidcharacterinstring?

A (HK): Yes, the default label is derived from the initial comment and can then be modified later.


Q: Do you do inference in the process of constraint checking? For example, there is a constraint "every student has an ID". Jane is a GraduateStudent, which is a subclass of Student. Will you do this ID checking for Jane?

A (HK): Handled during the call. No inferencing is done by default when constraint checking runs; however, the system will already have the inferences up to date if incremental inferencing has been activated.


Q: Is notation .n3 similar to .n2 that you describe in your book?

A (DA): If the book calls it N2, then it is a misprint – the book uses N3.


Q: Will these examples be available for download so we can examine them more closely?

A (HK): My blog contains further details on the computer game, the units example and the spinsquare example. Only the latter is currently available for download. Please contact me if you want to get a copy of the computer game.


Q: The current demo answered my question

A (DA): Thanks for that comment – it has been a challenge for some of us to figure out when the game demo (cool as it is) addresses real concerns of prospective practitioners; your comment is very helpful.


Q: What is a LET clause?

A (HK): This is currently an extension of the Jena SPARQL engine only, but (a variation of it) will become part of the upcoming next SPARQL standard. It binds a variable to the result of evaluating the expression on the right. For example, LET (?area := (?width * ?height)) computes the product of width and height and assigns it to ?area, so that the subsequent parts of the query can use the ?area value.
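
In context, a rule body using LET might look like the following sketch; this is the Jena ARQ extension syntax described above (in the later SPARQL 1.1 standard the equivalent construct is spelled BIND), and the ex: vocabulary is hypothetical:

    PREFIX ex: <http://example.org/shapes#>   # hypothetical vocabulary

    # Compute the area of a rectangle and attach it as a new triple
    CONSTRUCT {
      ?this ex:area ?area .
    }
    WHERE {
      ?this ex:width ?width ;
            ex:height ?height .
      LET (?area := (?width * ?height))
    }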


Q: Constraint errors are inserted as triples?

A (DA): A bit more than that, but basically, yes.

A (HK): Simple ASK queries do not create triples, but you can have CONSTRUCT queries as constraints, which construct an instance of spin:ConstraintViolation with additional information (see spec for details). You can also make spin:constraint a sub-property of spin:rule so that constraint checking will be done as part of the inferencing process.
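
A sketch of such a constraint (the kind attached to a class via spin:constraint) might look like this; the ex: vocabulary is hypothetical, and the exact violation properties should be checked against the SPIN specification:

    PREFIX spin: <http://spinrdf.org/spin#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX ex:   <http://example.org/shapes#>   # hypothetical vocabulary

    # Flag rectangles that have a width but no height
    CONSTRUCT {
      _:violation a spin:ConstraintViolation ;
                  spin:violationRoot ?this ;
                  rdfs:label "Rectangle has a width but no height" .
    }
    WHERE {
      ?this ex:width ?width .
      OPTIONAL { ?this ex:height ?height }
      FILTER (!bound(?height))
    }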


Q: Hi, you said that the rules are continuously executed. How does this work when using the SPIN API?

A (HK): You need to implement this yourself – ideally just create a GraphListener that collects all changes (that belong together) and then check which resources have been mentioned in the change. Then re-run those inferences. I can provide details, but please ask this on the TBC mailing list.


Q: For Java/Jena semantic applications, what is required to start using SPIN?

A (HK): We provide the TopBraid SPIN API (open source), working on Jena.


Q: I am sorry it was N3 only and is a different semantic way than tagging?

A (HK): I don’t understand this question – please ask on the TopBraid Composer mailing list.


Q: In the game: is the run-loop implemented in SPIN as well? Keyboard "events" create triples? Just curious this was implemented specifically for the game in TopBraid?

A (HK): The run-loop (very small piece of code) is in Java, because I needed fine grained control over each step. The rules create triples that instruct the engine what to do next, e.g. to replace a field next to the current field. Yes, the keyboard is mapped to (temporary) triples that are available in each loop. Yes, the game was just made as a demo for TBC, but it was a really simple exercise.


Q: How scalable is SPIN? How many instances can you handle for a reasonable number of SPIN rules? 100? 1,000? 10,000? 100,000?

A (HK): As many as you like, assuming that the SPARQL queries are efficient. I made a small demo based on wordnet (2 million triples) which was very fast. In SPIN you can leave the data just where it is, while many other inference engines require you to copy all data over into some other data structure first.


Q: Is the game available for download?

A (HK): Yes, the game engine is part of TopBraid Composer. The file to drive the game is available on request from me only (shoot me an email).


Q: Does this depend on binding to any particular reasoning engines? is it just SPARQL?

A (HK): SPIN operates on any SPARQL engine. The TopBraid SPIN engine operates on any Jena-compliant RDF database, including Oracle RDF, Sesame, AllegroGraph.


Q: In the description of the unit example, there was some "magic" involved – is that unit conversion system implemented entirely in SPIN? Or is there a Jena property function-type module at the core?

A (HK): The "magic" was in fact another SPARQL function that was defined using SPIN only – no Java coding or other magic was needed. If you review the video you can see where I navigate into the conversion function and show its embedded SPARQL query. If unclear, I can send you some background.


Q: With SPIN you can write executable semantics. Is there any experience if SPIN is a better approach for domain experts to encode domain logic? Are there numbers of how much lets say Java code could be avoided?

A (HK): SPIN is very new and so we have no empirical evidence except for our own evaluations. I strongly believe that you can use SPIN as a model-driven approach that saves tons of Java code.


Q: enjoyed it

A (HK): Me too! Thanks for attending.

Author Guidelines

 

Semantic Universe Author Guidelines

SemanticUniverse.com is an online journal focused on the application of semantic technologies in a variety of consumer, business and academic settings. We have subscribers in every part of the world, and from most industries. Our audience spans a wide range of job functions, from enterprise application managers to venture capitalists, software engineers, entrepreneurs, researchers and web developers.

Because of this diverse audience, we welcome submissions from a wide community of practitioners. Please read the following material and review our style sheet before submitting an article for publication consideration.

Permission to Publish

All authors must agree to convey a limited copyright to SemanticUniverse.com prior to publication (you can cut and paste the text below into an email). This copyright is as follows:

I hereby grant SemanticUniverse the exclusive right to publish my article or work entitled “NAMEOFARTICLE” (the Article) without restriction, within 60 days of my submission to SemanticUniverse. I agree that the Article may be:

  – edited and/or re-titled as seen fit by Semantic Universe editors

  – re-purposed, excerpted or republished for promotional or other purposes

As author, I retain the rights to my own originally authored material. However, if the Article is published by Semantic Universe, I agree not to publish or re-publish the Article substantially as written in any other publication without the prior permission of SemanticUniverse, for a period of at least 6 months after it is initially published by Semantic Universe.

Submissions: Send proposed Articles and submissions to the editor, Tony Shaw (tony AT semanticuniverse.com)

Submission Guidelines – VERY IMPORTANT

Please make your submission in the following format:

* Single-spaced manuscript in a Word, text or HTML file. Please include the full name, job title, organizational affiliation, and e-mail address of each author.

* Brief summary abstract (100 words or fewer), suitable for use as a “teaser” for the article when published.

* Biographical summary (150 words or fewer for each author)

* A high-quality JPEG photo for each author, sent as a separate attachment. Send each photo as a separate file, appropriately titled. Do not send author pictures embedded in Word, PowerPoint or PDF documents.

* Any additional files (e.g., tables, figures, clips, images, etc.) should be sent as separate files, appropriately titled. Do not send them embedded in Word, PowerPoint or PDF documents.

Semantic Universe Feeds

SPARQL by Example – Part I • Q & A with Lee Feigenbaum

Thanks to everyone who attended the SPARQL By Example Web cast or who has watched the archived recording of it. There was a tremendous level of enthusiasm during the one-hour presentation, and as a result we did not have the chance to answer all of the excellent questions that participants submitted. Below, I’ve tried to summarize most of the unanswered (and some of the answered) questions and provide some explanations and pointers to further information. Also, please note that due to the popularity of the first session, we’ll be holding a continuation Web cast on Thursday, January 22, at 1:00 PM EST / 10:00 AM PST. During that Web cast, we’ll continue our example-driven look at some of the more advanced features of SPARQL. I hope you can join us then!

Lee Feigenbaum

About SPARQL Endpoints

We had several questions about SPARQL endpoints. A SPARQL endpoint is any URL on the Web that implements the SPARQL protocol. Generally speaking, this means that if the URL http://example.com/sparql is a SPARQL endpoint, then we can send queries to it by issuing requests to a URL that looks like: http://example.com/sparql?query=SELECT+%3Fname+WHERE.... Note that the query itself is passed to the endpoint as a URL-encoded string.

The SPARQL protocol is defined as an abstract interface that can be implemented over HTTP GET, HTTP POST, or SOAP. (The above example would work for a SPARQL endpoint that implements the protocol over HTTP GET.) An endpoint will normally return the results of a SPARQL query using the SPARQL Query Results XML Format, a simple XML format for returning a table of variables and their values that satisfy a query. Many SPARQL endpoints also support other return formats via content negotiation, such as a JSON result format or various RDF serializations.

In the tutorial, we ran our queries by going to a Web page and pasting the queries into a form. Those Web forms are not themselves SPARQL endpoints, but when we submit the forms the queries are being submitted to SPARQL endpoints. Many public SPARQL endpoints provide this type of human-friendly form for designing, developing, and debugging SPARQL queries.

In the tutorial, we also saw two types of SPARQL endpoints in action. When we ran queries against Tim Berners-Lee’s FOAF file, we used a generic SPARQL endpoint. This type of endpoint sits somewhere on the Web and goes out to retrieve RDF data from elsewhere on the Web to run a query. Because a generic SPARQL endpoint will query against arbitrary RDF data, we must specify the URL of the graph (or graphs) to run the query against. We do this either using the input boxes provided on the human-friendly forms, or using the SPARQL FROM clause. We also saw specific SPARQL endpoints such as DBPedia and DBTune. These endpoints are hardwired to query against a fixed dataset. Because a specific SPARQL endpoint will always query against the same data, we do not need to use the FROM clause when writing queries for these endpoints.
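
For example, a query sent to a generic endpoint can name the graph to fetch with a FROM clause, as in this sketch that retrieves names from Tim Berners-Lee’s public FOAF file:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    SELECT ?name
    FROM <http://www.w3.org/People/Berners-Lee/card>
    WHERE {
      ?person foaf:name ?name .
    }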

SPARQL and Reasoning

A few participants asked questions about the interaction between SPARQL and reasoning. In other words, for example, when I write a SPARQL query to search for all mammals, will I receive results for human beings that are not also explicitly typed as mammals? The short answer is that while some SPARQL implementations do inform their results via RDFS or OWL reasoning, many do not. The SPARQL standard does not require that query results take any reasoning into account.

For a more detailed answer, please see these two answers in the SPARQL FAQ.

Learning About an RDF Dataset

An insightful question cropped up a few times during the Web cast: How do we know what type of data lurks behind a SPARQL endpoint? How do we know what predicates (relationships) exist to be queried for? How do we know what types (classes, the objects of an rdf:type predicate) exist?

In many cases, we know via an out-of-band source. Perhaps a SPARQL endpoint also publishes documentation of their dataset, along the lines of the music ontology used by the DBTune.org dataset we looked at. Other datasets build on well-known vocabularies, such as the core RDF and RDFS terms, or the common FOAF and Dublin Core vocabularies. And still other times we find ourselves writing SPARQL queries to access datasets that we (or our software applications) have created ourselves, and therefore we simply know what we want to query for with SPARQL. These out-of-band scenarios are really no different from how we know what databases, tables, and columns to query for when constructing an SQL query.

On the other hand, a significant part of the appeal of the Semantic Web in general, and of SPARQL in particular, is the ability to start with nothing but a SPARQL endpoint and to dive in and learn about the data lurking behind the endpoint. The basic mechanism by which we can do this is by writing queries that use variables to find all of the predicates and all of the types that exist in a dataset, and then to pick out interesting predicates and types and use open-ended queries to explore the structure of the data. Dean Allemang has written a blog post on this exact subject, so I’ll gladly reference his writing on using SPARQL to explore an unknown dataset.
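
A pair of exploratory queries along those lines might look like this sketch (each is run separately):

    # What predicates (relationships) does this dataset use?
    SELECT DISTINCT ?predicate
    WHERE { ?s ?predicate ?o . }

    # What types (classes) does it contain?
    SELECT DISTINCT ?type
    WHERE { ?s a ?type . }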

SPARQL Language / Features

A few quick hits here to address some lingering questions:

  • SPARQL queries do not have a JOIN construct the way SQL queries do. This is because the graph model over which SPARQL queries operate naturally joins data together. That is, what would be a SQL inner join is expressed implicitly in SPARQL simply by including two triple patterns that reference a common variable (such as ?known in one of our early examples). In fact, the ease with which joins are written in SPARQL is one reason that SPARQL is particularly well-suited to writing queries that bring together data from multiple sources.
  • SPARQL contains the UNION keyword for "OR"ing together multiple triple patterns. The presentation includes an example of this in action.
  • The SPARQL OPTIONAL keyword is the equivalent of a SQL outer join.
  • One of the built-in SPARQL filter functions performs regular expression matching. We could use that to limit results to just those with w3.org email addresses by adding FILTER(regex(?email, "@w3\\.org")) to our query (see the combined sketch after this list).
  • The a keyword in SPARQL is an abbreviation for the common predicate rdf:type that relates a resource to its semantic type/class.
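
As promised above, here is a sketch that combines several of these features in a single query; the FOAF data it assumes is hypothetical:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    # People that someone knows (an implicit join on ?known), their names,
    # optionally their homepages, limited to w3.org email addresses
    SELECT ?name ?homepage
    WHERE {
      ?person foaf:knows ?known .
      ?known a foaf:Person ;           # "a" abbreviates rdf:type
             foaf:name ?name ;
             foaf:mbox ?email .
      OPTIONAL { ?known foaf:homepage ?homepage }
      FILTER(regex(str(?email), "@w3\\.org"))
    }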

I’m sure there are other questions that I have not managed to address here. Please drop me a line with any other questions. You can also check out the SPARQL FAQ that I maintain. Thanks!

Contact Us

We are always happy to hear from you.

For advertising opportunities, contact advertising@semanticuniverse.com.

If you have suggestions, comments, or questions, or, would like to contribute content, contact editor@semanticuniverse.com.

For technical issues, contact admin@semanticuniverse.com.

   Semantic Universe   
   300 Corporate Pointe 
   Suite 515
   Culver City, CA 90230 
   USA
   Telephone 310-337-2616 x102
   Email: admin@semanticuniverse.com
