Ruby and the Semantic Web

Posted by gkellogg Wed, 07 Dec 2011 06:21:00 GMT

This evening, I gave a talk on using Ruby RDF.rb and assorted gems at the Lotico San Francisco Semantic Meetup. I’ve uploaded slides to SlideShare.

I also showed a simple demo using the GitHub API to create FOAF and DOAP records for accounts and repositories, and to do some simple navigation. The demo is running at http://greggkellogg.net/github-lod, and source is (of course) available on GitHub.

The demo is not intended to be a complete application, but it shows some basic capabilities of the Ruby LinkedData gem (http://rubygems.org/gems/linkeddata) for generating RDF in a variety of formats from Active Record models (which cache the GitHub API calls). The web pages are, of course, marked up with RDFa, and you can use content negotiation, or append an appropriate extension to the URLs, to retrieve the data in alternative RDF formats.
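
To illustrate the idea of choosing a format from either a URL extension or the Accept header, here is a minimal sketch in plain Ruby. It is not the demo's code: the FORMATS table, the negotiate helper, and the example paths are all assumptions made for illustration, and q-values in the Accept header are ignored.

```ruby
# Hypothetical table of URL extensions and the media types they select.
FORMATS = {
  "ttl"  => "text/turtle",
  "nt"   => "application/n-triples",
  "rdf"  => "application/rdf+xml",
  "html" => "text/html"
}.freeze

# Pick a media type for a request. An explicit extension wins; otherwise
# the first supported type listed in the Accept header is used; otherwise
# fall back to HTML (which carries the RDFa markup).
def negotiate(path, accept_header = nil)
  ext = File.extname(path).delete_prefix(".")
  return FORMATS[ext] if FORMATS.key?(ext)

  accepted = accept_header.to_s.split(",").map { |t| t.split(";").first.to_s.strip }
  accepted.find { |type| FORMATS.value?(type) } || "text/html"
end

negotiate("/users/gkellogg.ttl")                    # => "text/turtle"
negotiate("/users/gkellogg", "application/rdf+xml") # => "application/rdf+xml"
negotiate("/users/gkellogg")                        # => "text/html"
```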

SPARQL Algebra

Posted by gkellogg Fri, 04 Mar 2011 12:17:00 GMT

For those intrepid enough, I've pushed version 0.0.2 of sparql-algebra. It relies on unreleased changes to RDF.rb and sxp-ruby, so you need to use bundler with the included Gemfile.

SPARQL Algebra implements the s-expression-based SPARQL algebra described in SPARQL 1.1 and Jena. Work remains on the _describe_ operator and on query optimizations. This is the base for translation from SPARQL Grammar [4], which requires just a bit more work to be fully compliant. Both of these, along with support for an HTTP endpoint and solution serializer, will be sufficient to implement a complete SPARQL solution in pure Ruby.

SPARQL Algebra passes all but four W3C DAWG tests (data-r2), with those four not being worth implementing, in my opinion. As an example of an SSE based on the SPARQL grammar, consider the following:

PREFIX  foaf:  <http://xmlns.com/foaf/0.1/>

SELECT ?mbox ?name
 {
   ?x foaf:mbox ?mbox .
   OPTIONAL { ?x foaf:name  ?name } .
 }

which is equivalent to the following SSE:

(prefix ((foaf: <http://xmlns.com/foaf/0.1/>))
   (project (?mbox ?name)
     (leftjoin
       (bgp (triple ?x foaf:mbox ?mbox))
       (bgp (triple ?x foaf:name ?name)))))
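
To make the nesting of the SSE concrete, here is a minimal sketch of an s-expression reader in plain Ruby. It is not how sxp-ruby or sparql-algebra work internally; the tokenize, read_form and parse_sse helpers are hypothetical, and atoms are kept as plain strings rather than being turned into RDF terms.

```ruby
# Split SSE text into tokens: parentheses, <IRIs>, and bare atoms.
def tokenize(sse)
  sse.scan(/[()]|<[^>]*>|[^\s()]+/)
end

# Recursively build nested arrays from the token stream.
def read_form(tokens)
  token = tokens.shift
  raise ArgumentError, "unexpected end of input" if token.nil?
  raise ArgumentError, "unexpected )" if token == ")"
  return token unless token == "("

  list = []
  until tokens.first == ")"
    raise ArgumentError, "missing )" if tokens.empty?
    list << read_form(tokens)
  end
  tokens.shift # discard the closing ")"
  list
end

def parse_sse(sse)
  read_form(tokenize(sse))
end

sse = <<~TEXT
  (prefix ((foaf: <http://xmlns.com/foaf/0.1/>))
    (project (?mbox ?name)
      (leftjoin
        (bgp (triple ?x foaf:mbox ?mbox))
        (bgp (triple ?x foaf:name ?name)))))
TEXT

tree = parse_sse(sse)
tree[0]       # => "prefix"
tree[2][1]    # => ["?mbox", "?name"]
tree[2][2][0] # => "leftjoin"
```

The OPTIONAL clause of the query shows up as the second operand of leftjoin, which is easy to see once the expression is a nested array.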

There are outstanding pull requests to RDF.rb and sxp-ruby that are required to release it to RubyGems, but you're encouraged to play with it and send feedback!

Thanks to Arto and Ben for the initial work they did on this, and other enabling projects, as well as creating an excellent executable test suite!

Update

SPARQL::Grammar is now complete and generates SPARQL::Algebra classes, allowing a complete end-to-end SPARQL solution for Ruby.

RDF::RDFa, RDF::RDFXML, and RDF::N3 0.3.0 releases

Posted by gkellogg Tue, 28 Dec 2010 00:25:00 GMT

This is a new release of the Nokogiri-based reader suite for the RDF.rb environment. This version offers substantial performance gains, due to general improvements in RDF.rb as well as a number of improvements in the readers:

General improvements

  • Readers save prefix definitions in :prefixes. Writers use :prefixes, or :standard_prefixes to generate QNames.
  • Readers support :canonicalize and :validate options

RDF::N3

  • New parser based on Tim-BL's Predictive Parser supports quoted graphs and variables.
  • Stream-based reader can process an indefinite length input file, vs. the older Treetop-based reader that was a two-pass parser.
  • Substantial performance improvement over previous version, running at about x statements/second on an iMac.
  • From History:
    • New Predictive-Parser based N3 Reader, substantially faster than previous Treetop-based parser
    • RDF.rb 0.3.0 compatibility updates
      • Remove literal_normalization and qname_hacks, add back uri_hacks (until 0.3.0)
      • Use nil for default namespace
      • In Writer
        • Use only :prefixes for creating QNames.
        • Add :standard_prefixes and :default_namespace options.
        • Use """ for multi-line quotes, or anything including escaped characters
      • In Reader
        • URI canonicalization and validation.
        • Added :canonicalize, and :intern options.
        • Added #prefixes method returning a hash of prefix definitions.
        • Change :strict option to :validate.
        • Add check to ensure that predicates are not literals, which is not legal in any RDF variant.
    • RSpec 2 compatibility

RDF::RDFXML

    • RDF.rb 0.3.0 compatibility updates
      • Remove literal_normalization and qname_hacks, add back uri_hacks (until 0.3.0)
      • Use nil for default namespace
    • In Writer
      • Use only :prefixes for creating QNames.
      • Add :standard_prefixes and :default_namespace options.
      • Improve Writer#to_qname.
      • Don’t try to translate rdf:_1 to rdf:li due to complex corner cases.
      • Fix problems with XMLLiteral, rdf:type and rdf:nodeID serialization.
    • In Reader
      • URI canonicalization and validation.
      • Added :canonicalize, and :intern options.
      • Change :strict option to :validate.
      • Don’t create unnecessary namespaces.
      • Don’t use regexp to substitute base URI in URI serialization.
      • Collect prefixes when extracting mappings.
    • Literal::XML
      • Add all in-scope namespaces, not just those that seem to be used.
    • RSpec 2 compatibility

RDF::RDFa

    • RDF.rb 0.3.0 compatibility updates
      • Remove literal_normalization and qname_hacks, add back uri_hacks (until 0.3.0)
      • Use nil for default namespace
    • In Writer
      • Use only :prefixes for creating QNames.
      • Add :standard_prefixes and :default_namespace options.
      • Improve Writer#to_qname.
      • Don’t try to translate rdf:_1 to rdf:li due to complex corner cases.
      • Fix problems with XMLLiteral, rdf:type and rdf:nodeID serialization.
    • In Reader
      • URI canonicalization and validation.
      • Added :canonicalize, and :intern options.
      • Change :strict option to :validate.
      • Don’t create unnecessary namespaces.
      • Don’t use regexp to substitute base URI in URI serialization.
      • Collect prefixes when extracting mappings.
    • Literal::XML
      • Add all in-scope namespaces, not just those that seem to be used.
    • RSpec 2 compatibility

RdfContext version 0.5.4 with provisional RDFa 1.1 support

Posted by gkellogg Fri, 07 May 2010 07:25:00 GMT

I just released version 0.5.4 of RdfContext to GitHub and Gemcutter. This version is notable for including support for RDFa 1.1 parsing. This is still based on a Working Draft, so it will likely change in the future.

RDFa 1.1 includes support for profiles, vocabularies and terms, and supports using URIs, CURIEs or terms anywhere that's legal within an HTML document. Right now, only the XHTML+RDFa profile is supported.

Default term URI using @vocab.

RDFa 1.1 allows URIs to be expressed using an NCName, called a term. By using the @vocab attribute, an author can define a URI that is prepended to a bare word to turn it into a URI. Take, for example, the following:

<div vocab="http://xmlns.com/foaf/0.1/">
   <p about="#me" typeof="Person" property="name">Gregg Kellogg</p>
</div>

This will generate the following triples:

<#me> a foaf:Person;
  foaf:name "Gregg Kellogg" .
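
The term expansion above can be sketched in a few lines of plain Ruby. This is only an illustration of the rule, not RdfContext's parser: the NCNAME pattern is a simplified, ASCII-only approximation, and expand_term is a hypothetical helper.

```ruby
# Simplified, ASCII-only approximation of an NCName (no colon allowed).
NCNAME = /\A[A-Za-z_][A-Za-z0-9_.\-]*\z/

# Resolve a term (a bare word) against the in-scope @vocab URI.
# Returns nil when no vocabulary is set or the value is not a simple term.
def expand_term(value, vocab)
  return nil if vocab.nil? || value !~ NCNAME
  vocab + value
end

vocab = "http://xmlns.com/foaf/0.1/"
expand_term("Person", vocab)    # => "http://xmlns.com/foaf/0.1/Person"
expand_term("name", vocab)      # => "http://xmlns.com/foaf/0.1/name"
expand_term("foaf:name", vocab) # => nil (a CURIE, not a term)
```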

Profile documents for defining prefixes and terms

A Profile document allows the specification of a set of URI mappings and term mappings in a single document. These documents are RDF formatted, and may or may not be RDFa. The following shows an example Profile document:

@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
[ rdfa:prefix "foaf"; rdfa:uri "http://xmlns.com/foaf/0.1/"] .
[ rdfa:prefix "dc"; rdfa:uri "http://purl.org/dc/terms/"] .
[ rdfa:term "name"; rdfa:uri "http://xmlns.com/foaf/0.1/name"] .
[ rdfa:term "created"; rdfa:uri "http://purl.org/dc/terms/created"] .

This profile results in namespace mappings and bare terms. Multiple vocabularies may be used together to create a namespace composed of terms from several vocabularies, without needing to describe them explicitly. These may then be used in a document as follows:

<div profile="http://example.com/my_vocab">
  <p about="#me">
    <span property="name">Gregg Kellogg</span>
    is the author of
    <a rel="created"
        resource="http://github.com/gkellogg/rdf_context">
      RdfContext
    </a>
  </p>
</div>

Namespace definitions

RDFa 1.1 deprecates the use of @xmlns for defining namespace prefixes. The @prefix attribute defines one or more mappings between prefixes and URIs. For example:

<div prefix="foaf: http://xmlns.com/foaf/0.1/ dc: http://purl.org/dc/terms/">
  <p about="#me">
    <span property="foaf:name">Gregg Kellogg</span>
    is the author of
    <a rel="dc:created"
        resource="http://github.com/gkellogg/rdf_context">
      RdfContext
    </a>
  </p>
</div>

This defines and uses two different namespace mappings.
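
As a rough sketch of how a @prefix value turns into mappings and how a CURIE is then expanded, consider the following plain Ruby. The parse_prefixes and expand_curie helpers are hypothetical and much simpler than a real RDFa processor; in particular, a value with an unknown prefix is simply returned unchanged.

```ruby
# Parse a @prefix attribute value ("name: uri name: uri ...")
# into a Hash of prefix => URI.
def parse_prefixes(attr)
  attr.split.each_slice(2).each_with_object({}) do |(name, uri), map|
    next unless name.end_with?(":") && uri
    map[name.chomp(":")] = uri
  end
end

# Expand a CURIE such as "foaf:name" using the prefix mappings.
# Values whose prefix is not mapped are returned unchanged.
def expand_curie(curie, prefixes)
  prefix, reference = curie.split(":", 2)
  base = prefixes[prefix]
  base && reference ? base + reference : curie
end

prefixes = parse_prefixes("foaf: http://xmlns.com/foaf/0.1/ dc: http://purl.org/dc/terms/")
expand_curie("foaf:name", prefixes)  # => "http://xmlns.com/foaf/0.1/name"
expand_curie("dc:created", prefixes) # => "http://purl.org/dc/terms/created"
```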

URIs Everywhere

In RDFa 1.0, certain attributes took a URI, others a CURIE, and still others either a URI or a Safe CURIE. This is confusing, and RDFa 1.1 now allows either URIs, CURIEs, or SafeCURIEs to be used most anywhere (SafeCURIEs are maintained for backwards compatibility). For example:

<div>
  <p about="#me">
    <span property="http://xmlns.com/foaf/0.1/name">
      Gregg Kellogg
    </span>
    is the author of
    <a rel="http://purl.org/dc/terms/created"
        resource="http://github.com/gkellogg/rdf_context">
      RdfContext
    </a>
  </p>
</div>

Change History

The following is the change log for this version of RdfContext. Note that one change may potentially break existing code: URIRef#namespace no longer throws an exception if a mapping is not found. Other changes are noted here:

  • RDFa 1.1 parsing supported (based on RDFa Core 1.1 W3C Working Draft 22 April 2010)
  • Fix URIRef#short_name (and consequently #base and #namespace) to not extract a non-hierarchical path as a short_name
  • Namespace no longer uses URIRef, but just acts on strings.
  • Namespace#new does not take an optional _fragment_ argument any longer.
  • Added Namespace#to_s to output "prefix: uri" format
  • Graph#qname first tries generating using bound namespaces, then adds well-known namespaces.
  • URIRef#to_qname and #to_namespace no longer generate an exception. Each takes either a Hash or an Array of namespaces and tries them from longest to shortest.
  • Improved Turtle and XML serializers in use of namespaces.
  • Generate pending messages if RDFa tests skipped due to lack of Redland installation.
  • Change dcterms: prefix to dc: (fully compatible with previous /elements/ definitions)
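
The "longest to shortest" matching mentioned for URIRef#to_qname can be sketched as follows. This is a simplified stand-in, not the RdfContext implementation: it only accepts a Hash of prefix => URI strings, and the example prefixes are made up.

```ruby
# Try namespaces from the longest URI to the shortest, so that the most
# specific match wins. Returns "prefix:local", or nil (rather than
# raising) when no namespace matches.
def to_qname(uri, namespaces)
  namespaces.sort_by { |_prefix, ns| -ns.length }.each do |prefix, ns|
    next unless uri.start_with?(ns) && uri.length > ns.length
    return "#{prefix}:#{uri[ns.length..-1]}"
  end
  nil
end

namespaces = {
  "ex"  => "http://example.org/",
  "voc" => "http://example.org/vocab/"
}
to_qname("http://example.org/vocab/name", namespaces) # => "voc:name"
to_qname("http://example.org/thing", namespaces)      # => "ex:thing"
to_qname("http://other.example/thing", namespaces)    # => nil
```

Trying the longest namespace first avoids a result like ex:vocab/name, which would not be a valid QName.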

RdfContext version 0.5.1 brings Turtle and enhanced RDF/XML serializers

Posted by gkellogg Sat, 03 Apr 2010 07:57:00 GMT

Just pushed version 0.5.1 of RdfContext to GitHub and Gemcutter. This version includes a Serializer framework, including an AbstractSerializer, a RecursiveSerializer, and Turtle and RDF/XML serializers based on these. The RDF/XML serializer is a big improvement over the previous version, including typed element names and RDF container folding using parseType="Collection".

RdfContext includes native Ruby parsers for RDF/XML, RDFa and N3-rdf, which includes Turtle and N-Triples. All parsers pass W3C tests (included in specs). It also includes a context-aware quad store, with in-memory and SQLite3 storage models.

RdfContext gem version 0.4.5 pushed

Posted by gkellogg Sat, 09 Jan 2010 00:11:00 GMT

Bug fixes and minor API changes:

  • Order includes to remove platform dependencies on load requirements.
  • Fix XML Comparison matcher
  • Add --store option to bin/rdf_context. Let Parser#detect_format intuit file type.
  • Add Graph#contents as alias to store
  • Add ConjunctiveGraph#triples to return all triples from store, not just default context
  • Add Graph#nsbinding to retrieve Store#nsbinding
  • Fix numerous SQLite3Store and AbstractSQLStore bugs. Add round-trip tests to Graph spec.

RdfContext gem released

Posted by gkellogg Mon, 04 Jan 2010 01:40:00 GMT

I've released version 0.4.4 of the RdfContext gem. As the name implies, RdfContext supports contextual data-stores bound to graphs, along with a ConjunctiveGraph providing the union of contexts within a given data-store.

  • Parses RDF/XML, RDFa and N3. RDF/XML and RDFa both pass all relevant W3C test cases (may be run through specs).
  • Graph and ConjunctiveGraph with pluggable data-stores. MemoryStore and SQLite3Store both support contexts as well as quoted-graphs and formulae, although no appropriate graph classes yet exist.
  • Graphs serialize to N-Triples and RDF/XML.
  • An RDF distiller runs on this site to test out different parsers. It is also useful for running the automated RDFa Test Harness.
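
To give a feel for what a contextual data-store with a conjunctive view means, here is a toy sketch in plain Ruby. It is not RdfContext's MemoryStore or ConjunctiveGraph: the TinyQuadStore class and its methods are hypothetical, and triples are plain arrays of strings.

```ruby
# A toy in-memory quad store: triples are grouped by context (graph name).
class TinyQuadStore
  def initialize
    @contexts = Hash.new { |hash, name| hash[name] = [] }
  end

  # Add a triple to a named context, ignoring exact duplicates.
  def add(context, subject, predicate, object)
    triple = [subject, predicate, object]
    @contexts[context] << triple unless @contexts[context].include?(triple)
    self
  end

  # Triples in one context only (what a single graph would see).
  def triples(context)
    @contexts.fetch(context, []).dup
  end

  # Union of all contexts (what a conjunctive graph would see).
  def all_triples
    @contexts.values.flatten(1).uniq
  end
end

store = TinyQuadStore.new
store.add("g1", "#me", "foaf:name", "Gregg Kellogg")
store.add("g2", "#me", "foaf:name", "Gregg Kellogg")
store.add("g2", "#me", "dc:created", "rdf_context")

store.triples("g1").length # => 1
store.all_triples.length   # => 2
```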

RdfContext is based, in part, on Tom Morris' Reddy gem. See the readme on GitHub for more information. MemoryStore, SQLite3Store and ConjunctiveGraph are largely ports of Python RDFLib.

rdfa_parser gem released

Posted by gkellogg Sun, 18 Oct 2009 22:11:00 GMT

I just released version 0.1.0 of the rdfa_parser gem. This parser is written in pure Ruby and uses Nokogiri for XML parsing. It passes all XHTML1 test cases and most of the existing test cases for HTML4 and HTML5.

The gem is based on previous work done by Ben Adida and some libraries borrowed from the Reddy Gem.

The project is hosted on GitHub; feel free to clone it. You can try out the parser through a distiller.