RDFa in Drupal 7 core is on its way! It is now time to start implementing it. We need to agree on a roadmap so that we all head in the same direction before rushing to our keyboards and posting patches on drupal.org. I met up with John and Alexandre from the SIOC project and we came up with the following approach.
What is RDFa?
RDFa provides a set of attributes to annotate XHTML documents with machine-readable semantics. It is then possible to extract RDF data from these pages. Read more on RDFa and check its primer for examples.
RDF namespace registry
To use Compact URIs (CURIEs) when referring to RDF vocabulary terms, prefixes should be defined in the <html> tag using the XML Namespace mechanism. A simple registry can collect the namespaces defined by modules and serialize them in the header of the generated XHTML output. This allows greater flexibility than hardcoding them in page.tpl.php: contributed modules can define extra namespaces which might not come with Drupal core, and it is also less work for theme maintainers.
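As a rough sketch of what such a registry could look like, here is a standalone example. The function names rdf_namespace_register() and rdf_namespace_attributes() are made up for illustration and are not existing Drupal APIs; the three vocabularies are only sample registrations.

```php
<?php
// Hypothetical registry: none of these function names exist in Drupal core.

/**
 * Registers a prefix => namespace URI pair and returns the whole registry.
 * Called without arguments, it only returns what has been collected so far.
 */
function rdf_namespace_register($prefix = NULL, $uri = NULL) {
  static $namespaces = array();
  if ($prefix !== NULL && $uri !== NULL) {
    $namespaces[$prefix] = $uri;
  }
  return $namespaces;
}

/**
 * Serializes the registry as xmlns:* attributes for the <html> tag.
 */
function rdf_namespace_attributes() {
  $namespaces = rdf_namespace_register();
  // Sort by prefix so the output is stable whatever the registration order.
  ksort($namespaces);
  $output = '';
  foreach ($namespaces as $prefix => $uri) {
    $output .= ' xmlns:' . $prefix . '="' . htmlspecialchars($uri, ENT_QUOTES, 'UTF-8') . '"';
  }
  return $output;
}

// Core and contributed modules each declare the vocabularies they use.
rdf_namespace_register('sioc', 'http://rdfs.org/sioc/ns#');
rdf_namespace_register('dc', 'http://purl.org/dc/elements/1.1/');
rdf_namespace_register('foaf', 'http://xmlns.com/foaf/0.1/');

// In page.tpl.php this string would be printed inside the opening <html> tag.
$html_namespaces = rdf_namespace_attributes();
```

The static variable keeps the sketch self-contained; a real implementation would more likely gather the namespaces through a hook so that modules do not depend on call order.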
The following changes concern the page.tpl.php file.
The doctype would need to be changed to
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
The <html> tag would contain a list of the namespaces used in the document, serialized from the namespace registry, such as:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr"
The template files should be updated to include a link to a GRDDL transformation:
At present the tag <h1 id="site-name"> is hardcoded in the default page.tpl.php file. While it would be possible to hardcode property="dc:title" directly in the template, a better approach would be to provide the template with a variable, similar to the existing $title, which would contain the correct RDFa property. Other variables might need to be created in the same fashion.
<div id="content"> is most likely the best place to insert the about attribute defining the URI of the current entity (product, person, book...) represented by the page. A default value can be #self or #it:
<div id="content" about="http://example.com/node/123#self">
To ensure the URI is unique, it is important to choose a fragment which does not exist in the DOM tree of the page.
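To illustrate the uniqueness check, here is a minimal sketch that picks the first candidate fragment not already used as an id in the page markup. The helper rdf_about_uri() is hypothetical, and for simplicity it only looks at double-quoted id attributes with a regular expression rather than parsing the DOM.

```php
<?php
// Hypothetical helper, not part of Drupal: picks a fragment for the about URI
// that does not collide with any id attribute found in the page markup.
function rdf_about_uri($page_url, $markup, array $candidates = array('self', 'it')) {
  // Collect every double-quoted id attribute value present in the markup.
  preg_match_all('/\sid="([^"]*)"/', $markup, $matches);
  $used_ids = $matches[1];
  foreach ($candidates as $fragment) {
    if (!in_array($fragment, $used_ids, TRUE)) {
      return $page_url . '#' . $fragment;
    }
  }
  // Every candidate is already used as an id: the caller must pick another.
  return NULL;
}

$markup = '<div id="header"><h1 id="self">Site</h1></div><div id="content">...</div>';
$about = rdf_about_uri('http://example.com/node/123', $markup);
// "self" is taken by the <h1>, so the helper falls back to "it".
```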
RDFa and the theme layer
RDFa requires XHTML, which Drupal already generates.
Each module should be able to tag its data with RDFa. Since RDFa operates on the XHTML level, modules can specify their RDFa attributes via the theme functions along with the XHTML code.
Using the helper function l() we can write a link to Bob's homepage:
l('Bob','http://example.com/bob', array('attributes' => array('property' => 'foaf:name', 'rel' => 'foaf:homepage')));
which will output:
<a href="http://example.com/bob" property="foaf:name" rel="foaf:homepage">Bob</a>
theme_item_list() would need to be modified to allow the title <h3> to be tagged via the $attributes argument, following the same model as above with l(). Otherwise it is possible to embed RDFa in the list elements <li> via the $attributes argument as follows:
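$items = array();
// Each item is an array: 'data' holds the text, any other key becomes an attribute of the <li>.
$items[] = array('data' => 'Gift of Silence', 'property' => 'dc:title');
$items[] = array('data' => 'Joe', 'property' => 'dc:creator');
$items[] = array('data' => '2006-10-01', 'property' => 'dc:date');
// The last argument sets attributes on the <ul> itself.
theme('item_list', $items, 'the book', 'ul', array('about' => 'http://example.com/book#gift_of_silence'));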
which will output the following list items:
<li property="dc:title" class="first">Gift of Silence</li>
<li property="dc:creator">Joe</li>
<li property="dc:date" class="last">2006-10-01</li>
RDFaification of basic Drupal pages
While the above is being implemented, the XHTML output of the basic Drupal page types (blog, book, forum, user profile...), including the blocks, could be tagged with the most relevant RDF terms. This in itself could require some slight vocabulary alignments, which will be possible at least in the case of SIOC. For instance, an OnlineBook class was recently added to the SIOC Types module to fit the needs of a particular Drupal-enabled project, and a class such as 'ProfilePage' might be added to the same module if needed.
The Drupal RDF Schema I posted earlier this year will be one of the things we are planning to work on at the next VoCamp in Galway (Nov 25th - 26th). VoCamp is a series of informal events where people can spend some dedicated time creating lightweight vocabularies/ontologies for the Semantic Web/Web of Data. One of the goals will be to update the Drupal core schema with the most suitable terms for describing Drupal data and tag its HTML output. The event is free and there are still a few places left.
RDFa in content types and fields
Since fields will be present in Drupal 7 along with content types, it would be a good idea to give site architects basic control over the RDF terms used to describe the content of their site. A simplified version of the RDF CCK module I presented in Szeged would suffice, where the RDF terms can be specified in a textfield or chosen from a short list of predefined terms.
- Each content type is described by an RDF class
- Each field is described by an RDF property
Modules implementing their content type and fields could predefine these terms, but site administrators could change these default terms to better match their application.
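A minimal sketch of how module defaults and administrator overrides could fit together is shown below. The array layout and the function names rdf_mapping_defaults() and rdf_mapping_resolve() are assumptions made for illustration, as are the content type, field names, and chosen terms.

```php
<?php
// Hypothetical data structure and function names, for illustration only.

// Defaults a module could ship for its content type and fields.
function rdf_mapping_defaults() {
  return array(
    'blog' => array(
      // The content type is described by an RDF class.
      'class' => 'sioc:Post',
      // Each field is described by an RDF property.
      'fields' => array(
        'title' => 'dc:title',
        'created' => 'dc:date',
      ),
    ),
  );
}

// Applies administrator overrides on top of the module defaults.
function rdf_mapping_resolve(array $defaults, array $overrides) {
  foreach ($overrides as $type => $mapping) {
    if (isset($mapping['class'])) {
      $defaults[$type]['class'] = $mapping['class'];
    }
    if (isset($mapping['fields'])) {
      foreach ($mapping['fields'] as $field => $property) {
        $defaults[$type]['fields'][$field] = $property;
      }
    }
  }
  return $defaults;
}

// A site administrator prefers a different property for the creation date.
$overrides = array('blog' => array('fields' => array('created' => 'dcterms:created')));
$mapping = rdf_mapping_resolve(rdf_mapping_defaults(), $overrides);
```

Only the overridden term changes; everything the administrator leaves alone keeps the default supplied by the module.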
There will be some need to document RDF/RDFa best practices in order to educate both module maintainers and site architects.
- Should we force RDFa or should there be an option to turn it off?
- In some rare situations, site owners might not want to expose RDF data, about their users for example. In the past, some sites had to turn off their FOAF exports after complaints. A mechanism should be implemented to let site administrators opt out. It could be done on the permissions level, the content type/field level, or the node level (similarly to the way comments work). That would be easy to do in the l() function. For the theme functions, another mechanism should be put into place.
- RDFa in the node body
- This would be more the role of a WYSIWYG editor, or could be typed by hand. This is a separate issue, and we should focus first on implementing RDFa in the code Drupal outputs as described above.
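Coming back to the opt-out question, here is a sketch of the kind of filter that l() could apply to its attributes when RDFa output is switched off. The helper rdf_filter_attributes() and its $rdfa_enabled flag are hypothetical; where that flag would come from (permission, content type, or node setting) is left open. Treating any rel or rev value containing a colon as a CURIE is a simplification.

```php
<?php
// Hypothetical helper: Drupal's l() has no such switch today.
// Removes RDFa markup from an attributes array when RDFa output is off.
function rdf_filter_attributes(array $attributes, $rdfa_enabled) {
  if ($rdfa_enabled) {
    return $attributes;
  }
  // Attributes that only exist for RDFa on ordinary elements.
  $rdfa_only = array('about', 'property', 'resource', 'datatype', 'typeof');
  $attributes = array_diff_key($attributes, array_flip($rdfa_only));
  // rel and rev are plain XHTML too, so only drop them when the value
  // looks like a CURIE such as "foaf:homepage" (simplified check).
  foreach (array('rel', 'rev') as $name) {
    if (isset($attributes[$name]) && strpos($attributes[$name], ':') !== FALSE) {
      unset($attributes[$name]);
    }
  }
  return $attributes;
}

$attributes = array('property' => 'foaf:name', 'rel' => 'foaf:homepage', 'class' => 'profile');
$public = rdf_filter_attributes($attributes, TRUE);
$private = rdf_filter_attributes($attributes, FALSE);
// $private keeps only the non-RDFa attribute: array('class' => 'profile').
```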