Best practice for getting a single 'thing' URI instead of multiple language-specific page URIs?

jneubert's picture

I'm trying to set up a site about projects (http://zbw.eu/labs), described by the DOAP ontology. Some of the metadata items for these projects (e.g., the descriptions) are provided in English and German. The custom node type for the projects is configured with entity/field translation. Currently, I use URL path prefixes for language detection. So far, a pretty straightforward setup.

However, I want to publish ONE URI for the project itself (besides the different URIs for language-specific content). What the rdf module currently gives me, with a URL alias in place, is

  about="/labs/en/project/zbwlabs" (for the English page), OR 
  about="/labs/de/project/zbwlabs" (for the German page)

whereas "/labs/project/zbwlabs" would be more appropriate to talk about the project itself.

Is there a standard way, or are there best practices, to deal with this? I'm not sure if the solution I found (see below) is a good fit.

The described behavior can be traced to the following code in rdf.module:

<?php
/**
 * Implements MODULE_preprocess_HOOK().
 */
function rdf_preprocess_node(&$variables) {
  // Adds RDFa markup to the node container. The about attribute specifies the
  // URI of the resource described within the HTML element, while the @typeof
  // attribute indicates its RDF type (e.g., foaf:Document, sioc:Person, and so
  // on.)
  $variables['attributes_array']['about'] = empty($variables['node_url']) ? NULL : $variables['node_url'];

  // ...
  // lots of other stuff
}
?>

I was able to override the default behavior by a process hook implementation (preprocess did not work):

<?php
/**
 * Implements MODULE_process_HOOK().
 * Runs a node type specific process function, if it exists.
 */
function labs_lod_process_node(&$variables) {
  $function = __FUNCTION__ . '_' . $variables['type'];
  if (function_exists($function)) {
    $function($variables);
  }
}

/**
 * lproject-specific implementation of MODULE_process_HOOK().
 */
function labs_lod_process_node_lproject(&$variables) {
  // add a language agnostic URI for the node
  // overrides the result of rdf_preprocess_node()
  global $base_url, $base_root, $language;
  if (empty($variables['node_url'])) {
    $url = NULL;
  } else {
    $url = $variables['node_url'];
    $base_path = str_replace($base_root, '', $base_url);
    $url = str_replace($base_path . '/' . $language->language, $base_path, $url);
  }
  $variables['attributes_array']['about'] = $url;
  $variables['attributes'] = empty($variables['attributes_array']) ? '' : drupal_attributes($variables['attributes_array']);
}
?>

It feels a bit like a hack, however.
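
Part of what makes it feel hacky is that str_replace() replaces every occurrence of the prefix string, wherever it appears in the URL. A minimal sketch of a stricter variant follows, in plain PHP so it runs without Drupal. The function name strip_language_prefix() and its parameters are hypothetical and not part of Drupal or the rdf module; it only assumes the path-prefix URL layout described above.

```php
<?php
/**
 * Removes a language prefix that directly follows the base path.
 *
 * Hypothetical helper, not part of Drupal or the rdf module.
 *
 * @param string $url
 *   Site-relative URL, e.g. '/labs/en/project/zbwlabs'.
 * @param string $base_path
 *   Base path without trailing slash, e.g. '/labs'.
 * @param string $langcode
 *   Language prefix to remove, e.g. 'en'.
 *
 * @return string
 *   The URL without the prefix, or the unchanged URL if the prefix is not
 *   found at the expected position.
 */
function strip_language_prefix($url, $base_path, $langcode) {
  $prefix = $base_path . '/' . $langcode;
  // Match only at the start of the URL and only a complete path segment,
  // so '/labs/en' does not match '/labs/entry'.
  $pattern = '#^' . preg_quote($prefix, '#') . '(?=/|$)#';
  return preg_replace($pattern, $base_path, $url, 1);
}

// Prints "/labs/project/zbwlabs".
echo strip_language_prefix('/labs/en/project/zbwlabs', '/labs', 'en'), "\n";
// Prints "/labs/entry/en" (unchanged, no language prefix at the start).
echo strip_language_prefix('/labs/entry/en', '/labs', 'en'), "\n";
```

In the process hook above, such a helper would take the place of the second str_replace() call, with $variables['node_url'], $base_path and $language->language as arguments.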

I wonder if a pluggable/overridable function for building the RDFa "about" attribute could be a better solution. It could also be used in other places, for example in entity.module. In its template_preprocess_entity() function, this module seems to provide a slightly different implementation:

$uri = entity_uri($entity_type, $entity);
// ...
$variables['attributes_array']['about'] = empty($uri['path']) ? NULL: url($uri['path']);

Other people may need to drop language-specific domain names. Others again may look for a way to implement an httpRange-14 compliant solution. So such a function could perhaps make different people happy. What do you think?

Cheers, Joachim

Comments

In Drupal 7, each translation

scor's picture

In Drupal 7, each translation is a node; as a consequence, the rdf module will output whatever node URL you are on. In Drupal 8 things will be different: the translations will be at the field level, so you will have a unique node to deal with.

I know in Drupal 7 there is a way to redirect a user to a specific translation based on their browser settings (not sure if that's core locale or some contrib module). In your setup, do you have a way to generate a language-agnostic URL and rely on this redirect mechanism to serve the appropriate page? You could use this language-agnostic URI in your RDF output.

Sorry, I did not point that

jneubert's picture

Sorry, I did not point that out thoroughly: The field translation model can be used in Drupal 7, too, and I use this model (not node translation). So I am already dealing with a unique node, which has different fields for body, title (through the title module), and so on. Language detection and selection, content negotiation and redirection are handled by the locale and entity_translation modules (perhaps with the aid of others). Besides URL prefixes as used in the example above, they offer language-specific domain names (e.g., en.zbw.eu or de.zbw.eu) as another option. The aliases for these nodes can be defined in one place, omitting the language-specific part.

Content negotiation works in this setting, delivering - for the language-agnostic URI - English content by default, and German, when French or German were requested:

# curl -IL http://localhost/labs/project/zbwlabs
HTTP/1.1 200 OK
Date: Wed, 08 Aug 2012 19:06:40 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.3.3
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Wed, 08 Aug 2012 19:06:40 +0000
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
ETag: "1344452800"
Content-Language: en
Link: </labs/en/node/5>; rel="shortlink",</labs/en/project/zbwlabs>; rel="canonical"
X-Generator: Drupal 7 (http://drupal.org)
Connection: close
Content-Type: text/html; charset=utf-8

# curl -ILH "Accept-Language: fr, de" http://localhost/labs/project/zbwlabs
HTTP/1.1 200 OK
Date: Wed, 08 Aug 2012 19:11:05 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.3.3
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Wed, 08 Aug 2012 19:11:05 +0000
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
ETag: "1344453065"
Content-Language: de
Link: </labs/de/node/5>; rel="shortlink",</labs/de/project/zbwlabs>; rel="canonical"
X-Generator: Drupal 7 (http://drupal.org)
Connection: close
Content-Type: text/html; charset=utf-8

So most of the hard pieces should be in place already. What is missing (to my knowledge) is an API function to access the language-agnostic URI of the node, and a way to reference this language-agnostic URI in the "about" RDFa attribute. For this purpose I wrote the code above, and I would be happy if there were a better way to do it - perhaps in Drupal 8 :)

Thanks for looking into this - Joachim

ok, thanks for giving more

scor's picture

OK, thanks for giving more details, I understand now. Have you tried entity_uri()? I think it would make more sense to work on this at the entity API level, since it would benefit all the other modules leveraging the entity API. Note that it might already be available in entity API. I'd recommend filing a support request in the entity API issue queue to find out. Also check the documentation of the translation modules you're using. If you want to narrow it down to a specific module, maybe try disabling the translation modules one at a time to see which one adds the language to the URL.

Well, the language in the URL

jneubert's picture

Well, the language in the URL is a feature (provided by the locale module), which I enabled deliberately, because it allows users to switch languages at will, and also enables links to a language-specific page.

Thank you for the hint about entity_uri(). It returns 'node/5' as the 'path' array element. This is indeed language-agnostic, but it includes a serial number specific to the particular Drupal installation, which does not make it exactly a cool URI. The "canonical" URI (in the example above) is built in node_page_view() by calling url() on the 'path' element returned by entity_uri().

If the url() function is called with the language 'undefined' ('und') as an option, it returns the language-agnostic URL. And if a language-agnostic alias exists, this alias is used. Now my code looks much cleaner:

<?php
/**
 * lproject specific implementation of MODULE_process_HOOK().
 */
function labs_lod_process_node_lproject(&$variables) {
  // set a language agnostic "about" URI for the node
  // overrides the result of rdf_preprocess_node()
  $url = language_agnostic_url($variables['node']);
  $variables['attributes_array']['about'] = $url;
  $variables['attributes'] = drupal_attributes($variables['attributes_array']);
}

/**
 * Get the canonical URL without the language specific parts.
 */
function language_agnostic_url($node) {
  // get the internal url (node/nnn)
  $entity_uri = entity_uri('node', $node);

  // if exists, return the alias for undefined language
  // (otherwise, the internal url is returned)
  $language = (object) array('language' => 'und');
  $options = array('language' => $language);
  $url = url($entity_uri['path'], $options);
  return $url;
}
?>

I'm quite happy with this state of affairs.

However, for the RDF modules in Drupal 8 I'd suggest

  1. providing a hook or a pluggable function for setting the "about" URI
  2. providing a helper function for language-agnostic URIs (based on the standard Drupal functionality as described above)
  3. providing a configuration option to activate the use of language-agnostic "about" URIs for multilingual sites (if some preconditions are matched - see below)

Opening up the construction of "about" URIs brings a lot more flexibility (especially for people who want to use different URI schemes, e.g. hash URIs).
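
To make suggestion 1 a bit more concrete, here is a rough sketch of a pluggable "about" builder, written in plain PHP so that it runs without Drupal. All names (about_uri_builders(), build_about_uri(), the builder keys, and the '#this' fragment) are hypothetical and not an existing rdf module API; within Drupal, the same idea would more likely be expressed as a hook or an alter hook.

```php
<?php
/**
 * Returns the available "about" URI builders, keyed by name.
 *
 * Hypothetical registry. Each builder takes the language-agnostic path of
 * an entity (e.g. '/labs/project/zbwlabs') and returns the "about" URI.
 */
function about_uri_builders() {
  return array(
    // Use the language-agnostic page URL as is.
    'plain' => function ($path) {
      return $path;
    },
    // Hash URI: identify the thing by a fragment of its page URL.
    // The fragment name 'this' is an arbitrary choice for illustration.
    'hash' => function ($path) {
      return $path . '#this';
    },
  );
}

/**
 * Builds the "about" URI with the named builder.
 */
function build_about_uri($path, $builder = 'plain') {
  $builders = about_uri_builders();
  if (!isset($builders[$builder])) {
    // Unknown builder name: fall back to the plain URL.
    $builder = 'plain';
  }
  return call_user_func($builders[$builder], $path);
}

// Prints "/labs/project/zbwlabs".
echo build_about_uri('/labs/project/zbwlabs'), "\n";
// Prints "/labs/project/zbwlabs#this".
echo build_about_uri('/labs/project/zbwlabs', 'hash'), "\n";
```

The builder name could then be the configuration option from suggestion 3, and the path passed in would come from a helper like language_agnostic_url() above.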

In the Drupal 8 Multilingual Initiative, Drupal is trying to unify and enhance its multilingual support.
As multilingual content is a field where LOD/Semantic Web technologies shine, it would be great if the RDF modules would support this "out of the box".

This would apply to multilingual sites using field translation and language detection from URL (LOCALE_LANGUAGE_NEGOTIATION_URL_PREFIX or LOCALE_LANGUAGE_NEGOTIATION_URL_DOMAIN).

As far as I can see, the approach described above should be quite robust: Firstly, against changes in the multilingual functionalities (because all the parts used are quite simple and here to stay). Secondly, against different approaches of redirection and SEO, as provided by redirect/global_redirect modules (because it simply takes the output of the url() function).

Thus, it should be doable - what do you think?

Cheers, Joachim

Semantic Web
