Proposal for RDF in Contrib, storing and loading, after first day of sprint


Posted by mlncn on May 11, 2009 at 6:29pm
Last updated by mlncn on Wed, 2009-05-13 09:17

To become an update for the issue RDFa: Add semantics from the ground up:

UPDATE ON THE UPDATE: This is the approach we think could be taken in contrib for RDF, and the original title ~~Proposal for RDFa in Core, storing and loading, after first day of sprint~~ has been changed. You can see the current approach over here.

Our CRUD functions:

<?php
rdf_mapping_create($bundle, $attribute, $rdf_term);
rdf_mapping_read($bundle, $attribute = NULL);
rdf_mapping_update($bundle, $attribute, $rdf_term);
rdf_mapping_delete($bundle, $attribute);
?>
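
A minimal sketch of how these four functions might behave, assuming mappings are kept in a plain in-memory array keyed by bundle and then by attribute. The helper _rdf_mapping_storage() and the return values are assumptions made for illustration; the proposal itself does not say how mappings are stored.

```php
<?php
// Hypothetical in-memory store; a real module would use a database table.
function &_rdf_mapping_storage() {
  static $mappings = array();
  return $mappings;
}

// Record that $attribute of $bundle maps to $rdf_term.
function rdf_mapping_create($bundle, $attribute, $rdf_term) {
  $mappings = &_rdf_mapping_storage();
  $mappings[$bundle][$attribute] = $rdf_term;
}

// Return one term, or the whole mapping of a bundle when $attribute is NULL.
function rdf_mapping_read($bundle, $attribute = NULL) {
  $mappings = &_rdf_mapping_storage();
  if (!isset($mappings[$bundle])) {
    return $attribute === NULL ? array() : NULL;
  }
  if ($attribute === NULL) {
    return $mappings[$bundle];
  }
  return isset($mappings[$bundle][$attribute]) ? $mappings[$bundle][$attribute] : NULL;
}

// In this sketch an update simply overwrites the stored term.
function rdf_mapping_update($bundle, $attribute, $rdf_term) {
  rdf_mapping_create($bundle, $attribute, $rdf_term);
}

// Remove the mapping for a single attribute of a bundle.
function rdf_mapping_delete($bundle, $attribute) {
  $mappings = &_rdf_mapping_storage();
  unset($mappings[$bundle][$attribute]);
}
```

With this sketch, rdf_mapping_read('page') returns every mapping of the page bundle, while rdf_mapping_read('page', 'title') returns a single term.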

You never load a bundle: you load an object, and then you know which bundle is attached to it, and you get the fields attached to that object. So node.module, user.module, etc. need to call our CRUD functions.

node_type_save() would have to handle title, type, and revision.

An example hook_node_type($op) implementation:

<?php
// Register default mappings whenever a node type is saved.
function node_rdf_node_type($op, $node) {
  if ($op == 'save') {
    rdf_mapping_create($node->type, 'title', 'dc:title');
    rdf_mapping_create($node->type, 'uid', 'dc:creator');
  }
}
?>

We will want a way for contrib modules to modify the mapping.

So while field_rdf.module's implementation of hook_field_create() calls rdf_mapping_create(), both rdf_mapping_create() and rdf_mapping_update() invoke all modules that implement hook_rdf_mapping_alter($bundle, $attribute, )
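
The alter step could be sketched as follows. This is only an illustration under stated assumptions: the third hook argument is not given above, so the sketch assumes it is the RDF term passed by reference; example_rdf_alter_modules() and example_rdf_mapping_create() are hypothetical stand-ins rather than Drupal's own hook system or the final rdf_mapping_create().

```php
<?php
// Hypothetical list of modules that implement the alter hook.
function example_rdf_alter_modules() {
  return array('flag');
}

// Assumption: the third argument is the RDF term, passed by reference
// so that an implementing module can change it.
function flag_rdf_mapping_alter($bundle, $attribute, &$rdf_term) {
  if ($attribute == 'flag_bad') {
    $rdf_term = 'judgement:bad';
  }
}

// Stand-in for rdf_mapping_create(): lets every implementing module
// alter the term, then returns the row that would be stored.
function example_rdf_mapping_create($bundle, $attribute, $rdf_term) {
  foreach (example_rdf_alter_modules() as $module) {
    $function = $module . '_rdf_mapping_alter';
    if (function_exists($function)) {
      $function($bundle, $attribute, $rdf_term);
    }
  }
  return array(
    'bundle' => $bundle,
    'attribute' => $attribute,
    'rdf_term' => $rdf_term,
  );
}
```

Here a mapping created for 'flag_bad' ends up with whatever term the flag implementation sets, while other attributes pass through unchanged.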

How would we do this for a contrib module, such as the Flag module? It can add itself to the rdf array that is attached to fieldable objects (objects that implement bundles, even if they don't have fields), just like anything else:

<?php
$node = array(
  'type' => 'page',
  'title' => 'My page title',
  'flag_bad' => TRUE,
  'rdf' => array(
    'title' => 'dc:title',
    'flag_bad' => 'judgement:bad',
  ),
);
?>

For separate mappings for one field (such as latitude and longitude) we would have sub-arrays.
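
As an illustration of such sub-arrays, a node with a single location field might look like the sketch below. The field name 'location', the bundle name 'place', and the sample coordinates are hypothetical; geo:lat and geo:long are the properties of the W3C Basic Geo vocabulary, used here only as plausible targets.

```php
<?php
$node = array(
  'type' => 'place',
  'title' => 'My place',
  // One field holding two values.
  'location' => array(
    'latitude' => 52.52,
    'longitude' => 13.405,
  ),
  'rdf' => array(
    'title' => 'dc:title',
    // The sub-array mirrors the field's components, one term each.
    'location' => array(
      'latitude' => 'geo:lat',
      'longitude' => 'geo:long',
    ),
  ),
);
```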

Cache statically and in cache tables by bundle.
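
A rough sketch of the two cache layers, keyed by bundle. All function names here are hypothetical, and the array returned by example_cache_table() merely stands in for a cache table; a real implementation would presumably go through Drupal's cache_get() and cache_set().

```php
<?php
// Stand-in for a persistent cache table.
function &example_cache_table() {
  static $table = array();
  return $table;
}

// Stand-in for the expensive lookup; counts how often it runs.
function example_rdf_mapping_load_from_storage($bundle) {
  if (!isset($GLOBALS['example_storage_hits'])) {
    $GLOBALS['example_storage_hits'] = 0;
  }
  $GLOBALS['example_storage_hits']++;
  $stored = array(
    'page' => array('title' => 'dc:title', 'uid' => 'dc:creator'),
  );
  return isset($stored[$bundle]) ? $stored[$bundle] : array();
}

// Check the static cache first, then the cache table, then storage.
function example_rdf_mapping_load($bundle) {
  static $static_cache = array();
  if (isset($static_cache[$bundle])) {
    return $static_cache[$bundle];
  }
  $table = &example_cache_table();
  $cid = 'rdf_mapping:' . $bundle;
  if (!isset($table[$cid])) {
    $table[$cid] = example_rdf_mapping_load_from_storage($bundle);
  }
  $static_cache[$bundle] = $table[$cid];
  return $static_cache[$bundle];
}
```

Loading the same bundle twice in one request touches storage only once; the second call is answered from the static cache.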

Some implications:

Comments would have to become bundles.

There is even talk of making the site itself a single instance of a fieldable entity (a special bundle type).

Attachment	Size
DSC_0322.JPG	95.95 KB
DSC_0321.JPG	112.96 KB

Comments

namespace

Posted by scor on May 12, 2009 at 8:07am

I'm not sure if this covers taxonomy mappings. Won't there be a conflict if, say, a taxonomy has the same bundle name as another bundle? Are bundle names meant to be unique in a given site? Alternatively, we could add a column with the name of the module responsible for each bundle and that way create a namespace for each module.

restriction to bundles?

Posted by fago on May 12, 2009 at 1:25pm

I wonder why you are restricting this to bundles, and so to the Field API?

There is even talk of making the site itself a single instance of a fieldable entity (a special bundle type).

This sounds to me like a homebrewed problem. I don't think it makes much sense to make a "site entity" that would have only one instance. Instead just don't restrict it to bundles, and generate the RDF out of the site settings in place. So the implementation should ask all modules, which RDF triples they provide for an entity, then modules can give them back. So the field module would be just a special module adding quite a bunch of RDF triples and you don't need specific support for that built into node module + $node->rdf - one way does it.

So what about the "Tokens & RDF" idea I described there: http://groups.drupal.org/node/19786 ?

I think the RDF implementation could look similar to this token patch: http://drupal.org/node/113614#comment-1147714

But instead of just having callbacks to get the actual data, I think it would make more sense to let modules also provide some schema information. So what about a hook instead of $node->rdf, e.g. hook_rdf_schema(), which allows modules to pass in schema information plus a way to actually retrieve the data?

properties vs. schema + properties

Posted by fago on May 12, 2009 at 2:53pm

Thinking a bit more about it, I see another point which needs clarification. Should we just provide a mapping to existing "properties", or a schema and optionally also a mapping?

If we only do a mapping to known properties like dc:creator, we can't handle user-created fields in a good way: as we just know the name and the type of a field, we cannot determine a good mapping automatically. If we provide a schema, we could let the field types provide schema information and so generate new properties out of the box. Modules should be able to change the properties used, so a module can provide a UI that lets the user choose some other existing properties (like rdf cck). For this it's important to have the schema information available, so one can present the user with a list of "possible properties". Apart from that, the schema information would be useful in a lot of other cases, e.g. as I wrote above, we could build the token implementation on top of it.

tokens only solves one half

Posted by scor on May 12, 2009 at 4:11pm

Tokens only solve one half of the problem. They are useful for injecting RDFa markup into HTML, but they do not resolve the mapping definitions, which have to be handled upstream. We slightly changed our approach for mapping definitions, and the CRUD functions above will belong to contrib. Core will have a simpler interface. We will allow non-bundle objects to be mappable, and we realized that the name we choose should be more generic than bundle. Nodes and users are bundles, and comments and vocabularies/terms are likely to become bundles too, but we are not implementing anything restricted to bundles. Re the schema, we don't declare a schema per se, but rather define the mappings against a preconceived schema, which is known upfront for the core modules. For contrib it will be different, and the same way you can get information about CCK fields in D6, you will be able to get the same for bundles in D7 (and whatever you want to create mappings for).

@tokens: I think it's the

Posted by fago on May 12, 2009 at 6:00pm

@tokens: I think it's the other way round: the basic API could be built similarly, but for adding RDFa to the markup, some RDFa integration in drupal_render would probably be a good idea.

Do you plan to let modules provide RDF properties independently from fields? If so, we would need the schema to be able to build a UI for the mapping process.

@cck:
So what are the default mappings then when one adds a new field? None?

yes, we plan to build the

Posted by scor on May 13, 2009 at 8:50am

Yes, we plan to build the RDFa injection at the drupal_render level. We will let modules provide RDF mappings independently from fields! For the UI, it could reuse the existing fields the same way it's done in D6. We even thought of allowing menu paths to be mappable, and that way go far beyond the node/bundle limitation, but we haven't implemented anything for that yet.

No, there would be no default mapping for a new field; however, there will be a default mapping for node (title, body, author), which is inherited by the node types and can be overridden.

default mapping

Posted by fago on May 13, 2009 at 11:16am

I think a default mapping would be of great value. Consider a site about artists and their albums. If you create content types and fields for both, a default mapping would expose that data as RDFa per se and would thus generate a new schema. Yep, it's not yet reusing known properties/schemas, but you have invented your own.

Furthermore, once you have "schema information" mapped to field types, one could use the reverse mapping and build a module that takes an existing vocabulary and auto-generates content types + fields for it. Imo a really nifty module.

We did have this discussion,

Posted by mlncn on May 13, 2009 at 10:38am

We did have this discussion, "Define a schema of what is mappable? Have all 'mappable' containers provide what is mappable about them?"

Ultimately we didn't see the need for a schema separate from the mapping.

Each bundle would consist of whatever is mappable about it "hard coded", plus any and all fields attached to it.

For instance, content types (node bundles) would have a title, uid, created date, modified date, and comments as well as all fields.
Users (the single user bundle) would have uid and created date as well as any and all fields.

While in theory the UI could add extra information once it is known what is mappable, I argued against, saying "You will need to write a way to output it anyway."

Six minutes later, I'm contradicting myself, and suggesting that we get a schema for free by declaring non-mapped attributes NULL so we know they exist, and Florian reminds me that exposing this in a UI does nothing because one has to write the output implementation. So modules are responsible for both the initial mapping and RDFa-ized output of content they deal with based on the mapping passed through our mapping functions. Other modules can modify this mapping. They can also add onto it, provided they take responsibility for a way to output RDFa for new attributes.

benjamin, Agaric Design Collective