Canonical Entity representation

Crell's picture

One place that the Web Services and Context Core Initiative (WSCCI) and Configuration Management Initiative (CMI) overlap is the need for a standard, canonical format to represent nodes and other entities in a non-PHP, non-SQL form. There are a number of places where that is useful:

  1. Including entities in exported configuration, or in configuration files.
  2. Taking a content snapshot in some form other than an SQL dump file (which, you know, kinda sucks for most uses).
  3. Transferring a node from one site to another for content sharing purposes.
  4. Aggregating content from many sites together for improved searching and cataloging.
  5. Exposing Drupal content to other non-Drupal systems. This is made easier by using non-Drupal-specific formats.

These are all problem spaces that exist in Drupal 7 now, and did back in Drupal 6, too. Various one-off solutions exist. For Drupal 8 we should have a better universal answer to this question, and be able to build common tools to support it. Those can, and should, also influence our API design to help improve external integration.

Existing solutions

There are four general approaches that I am aware of.

  1. Serialized PHP: The simplest of course is to simply dump a node to PHP code using var_export(), or just use PHP's serialize() function on it. While that does result in a string representation of a node (or other entity) that can be saved to disk or sent to another site, it is a generally poor format. It is PHP-specific and Drupal-specific, serialize() output is easily corrupted, and it does not do anything to help with IDs that differ between sites or references to other entities. In short, it's not worth our consideration.
  2. drupal_execute() arrays: This is the approach taken by the Deploy module in Drupal 6. The basic idea is that nodes in Drupal 6 rely too heavily on the Form API for, well, everything, so saving a node and not going through a form save operation would lose half the useful data. That is fortunately not the case in Drupal 7 anymore thanks to improvements in the Field API, and of course FAPI arrays are one of the least portable formats we could come up with so it fails goals 1, 2, and 5.
  3. json_encode(): Deploy in Drupal 7 drops drupal_execute() in favor of running json_encode() on a node object to send between sites via Services.module. The receiving end then simply runs json_decode() and node_save() (well, a custom entity_save() routine since core has no entity_save()). That is much cleaner and more portable. JSON is a well-known standard format that is dead-simple to parse in PHP, it's understood by a wide variety of systems, and can be included in either JSON-based or XML-based configuration files. (With some escaping it's just CDATA.) However, blindly dumping a node object to JSON without thinking about its structure is not useful for external integration, because the structure is too unpredictable for anything but custom parsing.
  4. Atom/XML: On a previous client project in Drupal 6, I worked on a team that produced the Views Atom and Feeds Atom modules. The basic idea was to serialize nodes to a custom XML format, and then use the Atom format (RFC 4287) to wrap them for transport between sites. Atom turned out to be an excellent choice, as Atom supports multiple payload formats, including non-XML; it supports encryption (although we did not use it); it supports UUIDs for synchronizing objects to avoid content duplication; and it supports PubSubHubbub, an Atom extension that makes push-based updates possible. (And yes, there's a module for that.) It worked well, and at Palantir we're now starting work on a Drupal 7 project based on the same tools. (Expect Drupal 7 versions of that full suite soon.)

I spoke with Deploy maintainer Dick Olsson (dixon_) earlier today, and we both agreed that we really ought to standardize on one format that Deploy can use now in D7, that we can use for clients like the one I'm working with now, and for a Drupal 8 standard. There are plenty of good reasons for it and no good reasons not to standardize, and we can standardize now, even without Drupal 8 being anywhere close to a release.

We also agreed that Atom was probably the best wrapper format, since it contains a number of features (as above) that are useful when needed and can be skipped when not. It's also a well-recognized standard, which in most cases is superior to some Drupal-proprietary format.

So, let's do. And let's do while I have a client that can pay for at least some of the work to help build a common library for it. :-)

Requirements

A canonical serialized entity representation should:

  1. Be at least somewhat human readable, at least once the whitespace is pretty-printed.
  2. Be reasonably straightforward to parse in PHP.
  3. Be parsable by non-Drupal, non-PHP systems as well.
  4. Have a consistent, regular, predictable structure.
  5. Be supportable by any entity automatically by virtue of being an entity.
  6. Not try to handle everything that an entity might have on it, only those things that are fully supported. That is, entities right now have basic properties that are defined by the entity type, and they have fields. It's been common for modules to also throw any random stuff they want onto the bare object structure at various times. Those are very specifically not supported, as that makes the structure too unpredictable.
  7. The following workflow must work, and result in no change to an entity (this is not an API example):

    <?php
    $entity = entity_load($type, $id);
    $string = entity_serialize($entity);
    $entity = entity_deserialize($string);
    entity_save($entity); // Once this API call exists.
    ?>

  8. Be revision-aware.
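
To make the round-trip requirement (item 7) concrete, here is a minimal sketch. The function names and array layout are assumptions made up for illustration, not an existing or proposed Drupal API; a plain array stands in for an entity, and anything outside the supported keys is dropped on purpose (item 6).

```php
<?php
// Hypothetical sketch: a plain array stands in for an entity object.
// Only type, uuid, revision, properties and fields are treated as supported;
// anything else a module bolted onto the entity is deliberately left out.
function sketch_entity_serialize(array $entity): string {
  $supported = [
    'type' => $entity['type'],
    'uuid' => $entity['uuid'],
    'revision' => $entity['revision'],
    'properties' => $entity['properties'],
    'fields' => $entity['fields'],
  ];
  return json_encode($supported, JSON_PRETTY_PRINT);
}

function sketch_entity_deserialize(string $string): array {
  return json_decode($string, TRUE);
}

$entity = [
  'type' => 'node',
  'uuid' => '0b7d5c2e-6a1f-4c3b-9e2d-3f8a1c4b5d6e',
  'revision' => 3,
  'properties' => ['title' => 'Hello', 'status' => 1],
  'fields' => ['body' => [['value' => 'Some text', 'format' => 'filtered_html']]],
  'random_module_junk' => 'not exported',
];

// Serialize, deserialize, and serialize again: the two strings should match.
$string = sketch_entity_serialize($entity);
$restored = sketch_entity_deserialize($string);
$again = sketch_entity_serialize($restored);
```

If the format is regular, `$string` and `$again` are identical, which is the "no change to an entity" property the workflow above asks for.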

Two options immediately spring to mind. One is to reuse the XML format from the Views Atom module. (Note: The sample there is namespaced, which makes it uglier to read, but the actual tags are fairly simple; please pardon the namespacing.) That has the advantage of already existing, and we can rip parsing logic out of that module into a standalone library. We could also tweak it as needed before making it a canonical format.

The other is to use JSON, but something more robust than just throwing an object into json_encode(). For one thing, in Drupal 8 entities are classed objects and will have non-public properties, so that won't even work in the first place as those non-public properties would get lost. For another, we want a more regular and non-Drupal-specific structure than that would give us.
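
A quick demonstration of the non-public property problem. The class is purely illustrative and is not how Drupal 8 entities will actually look; the only point is how json_encode() treats visibility.

```php
<?php
// Illustrative class only; not Drupal's actual entity class.
class SketchEntity {
  public $title = 'Hello';
  protected $uuid = '0b7d5c2e-6a1f-4c3b-9e2d-3f8a1c4b5d6e';
  private $revisionId = 3;
}

// json_encode() only sees public properties, so the UUID and revision ID
// are silently dropped from the output.
$json = json_encode(new SketchEntity());
// $json is now: {"title":"Hello"}
```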

(Yes, the XML vs. JSON wars have already been fought. That CMI is going XML at this point is a mark in XML's favor. Please do not simply repeat anything already said in that thread. Please.)

I will also offer that there is no intrinsic reason we cannot provide both an XML and JSON canonical form, as long as they are reasonably related. It does not have to be either/or.

References and Dependencies

Here of course comes the ugly part. Drupal entities routinely contain references to other entities. Nodereference, Userreference, Entity Reference, File fields, OG group membership, the author property of nodes... the list goes on. Of course, those references are generally entity IDs, which are totally and utterly useless when a node is serialized and used anywhere except right back on the same site. We need some alternate way to represent them.

I encourage everyone to read these two articles on REST before commenting on this section. They contain very valid points regarding how resources should reference each other in a REST/hypermedia form. There's some discussion of them in this earlier thread, too. Remember, the receiving system may not be a Drupal site!

Just throwing UUIDs on everything is only a partial answer. Having some sort of /entity/$type/$uuid path that we can always rely on could be a part of the solution, but perhaps not. I'm not sure here yet.
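
As a rough sketch of what the /entity/$type/$uuid idea could look like when exporting a reference: the function names and the "type:id" lookup array below are assumptions for illustration, standing in for whatever UUID lookup a site would actually provide.

```php
<?php
// Build the hypothetical /entity/$type/$uuid URI discussed above.
function sketch_reference_uri(string $base_url, string $type, string $uuid): string {
  return rtrim($base_url, '/') . '/entity/' . rawurlencode($type) . '/' . rawurlencode($uuid);
}

// Replace a site-local ID with a portable reference. $uuid_map is a stand-in
// for a real UUID lookup and is keyed by "type:id".
function sketch_export_reference(array $uuid_map, string $base_url, string $type, int $id): array {
  $key = $type . ':' . $id;
  if (!isset($uuid_map[$key])) {
    throw new InvalidArgumentException("No UUID known for $key");
  }
  $uuid = $uuid_map[$key];
  // Note that the local numeric ID is intentionally not part of the export.
  return [
    'type' => $type,
    'uuid' => $uuid,
    'uri' => sketch_reference_uri($base_url, $type, $uuid),
  ];
}

$uuid_map = ['user:1' => '6f1c2a9e-3b4d-4e5f-8a7b-9c0d1e2f3a4b'];
$author = sketch_export_reference($uuid_map, 'http://example.com/', 'user', 1);
```

A non-Drupal consumer can follow `uri` without knowing anything about UUIDs, while a Drupal consumer can match on `uuid` to avoid duplicates.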

The other question is files. Not only do we need to translate fids into something useful, using a Drupal stream wrapper URL may not be useful. Sometimes it will be; actually in the client project I have right now we do want to send over Drupal stream wrapper URLs, because we have a common file server. However, that will not always be the case. So what do we want to do here?

Wrappers and control

For both an XML format and a JSON format, the Atom spec actually provides a very nice envelope. It's a widely understood format, extensible, supports both push and pull based updates, has an extension that can push deletion notifications, and scales well once you introduce an external PuSH hub server.

Naturally not every use case will need a wrapper; if we're just saving out a serialized entity to disk, then Atom doesn't have any real purpose. For a web service wrapper, though, we could do far worse.
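
For illustration, a minimal sketch of wrapping a JSON payload in a standalone Atom entry using PHP's DOM extension. The function name and sample values are made up. One detail worth knowing: per RFC 4287, atom:content with a type that is neither XML nor text/* has to be Base64-encoded, so the sketch does that for application/json.

```php
<?php
// Hypothetical helper: wrap a serialized entity in an Atom entry document.
function sketch_atom_entry(string $uuid, string $title, string $author, string $updated, string $json): string {
  $ns = 'http://www.w3.org/2005/Atom';
  $doc = new DOMDocument('1.0', 'UTF-8');
  $doc->formatOutput = TRUE;
  $entry = $doc->appendChild($doc->createElementNS($ns, 'entry'));

  // Append a child element; text goes through createTextNode() so that
  // characters such as "&" are escaped correctly.
  $add = function (DOMNode $parent, string $name, string $text = '') use ($doc, $ns) {
    $element = $doc->createElementNS($ns, $name);
    if ($text !== '') {
      $element->appendChild($doc->createTextNode($text));
    }
    $parent->appendChild($element);
    return $element;
  };

  // id, title, updated and an author are required for a standalone entry.
  $add($entry, 'id', 'urn:uuid:' . $uuid);
  $add($entry, 'title', $title);
  $add($entry, 'updated', $updated);
  $add($add($entry, 'author'), 'name', $author);

  // Non-XML, non-text payloads must be Base64-encoded inside atom:content.
  $content = $add($entry, 'content', base64_encode($json));
  $content->setAttribute('type', 'application/json');

  return $doc->saveXML();
}

$json = '{"title":"Hello & welcome"}';
$xml = sketch_atom_entry(
  '0b7d5c2e-6a1f-4c3b-9e2d-3f8a1c4b5d6e',
  'Hello & welcome',
  'admin',
  '2011-12-20T08:00:00Z',
  $json
);
```

An XML payload would instead be embedded as a child element of atom:content with an XML media type, with no Base64 step.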

Discuss. :-)


Another issue : text input

yched - Tue, 2011-12-20 08:53

Another issue : text input formats. What to export ?
- [raw input + text format name] means nothing outside Drupal land (or when exported to a Drupal site with different text formats)
- exporting the check_format()'ted string means no Drupal re-importability.


In case it's useful, there is

xtfer - Tue, 2011-12-20 09:12

In case it's useful, there is a Drupal 6 UUID URI resolver, though I notice it's currently without a release... http://drupal.org/project/uuid_resolver


restws project

yched - Tue, 2011-12-20 09:46

Another module trying to address exactly that in D7 : RESTful Web Services (relies on Entity API module). We should probably ping fago and klausi.


RESTWS

klausi - Tue, 2011-12-20 16:20

RESTWS does not use any special format. It was designed to support many different formats that are driven by FormatControllers. However, it heavily relies on the Entity property API for retrieving and naming properties. We took special care of properties that are references to other entities by providing the ID, a fully qualified URL and the resource type, to comply with the REST principles, but that's it. So I think the actual format does not really matter; how the properties are named and how they are retrieved is important.

The suggested XML Atom format from above looks fine, I just can see one inconsistency: "title" should be listed inside the "properties".


restws module and my thoughts

fago - Tue, 2011-12-20 16:23

RestWS already handles entity URIs in a format-specific manner, so proper RESTful references are constructed. It generates /$entity_type/$id URLs for that, whereas mapping to different default URIs like taxonomy/term/$id is not yet solved (redirects maybe).

For more information about the restws module, please see this post. The post also shows the used representation formats.

In short, it works from the entity property info of the Entity API module. Based on that information we know about entity references regardless of the storage back-end and can take them into account. Generally, I think basing the representation only on known properties is a good idea, so that stuff added ad hoc by modules is kept out.

See http://drupal.org/node/1346220 for the d8 entity property info issue.

For both an XML format and a JSON format, the Atom spec actually provides a very nice envelope. It's a widely understood format, extensible, supports both push and pull based updates, has an extension that can push deletion notifications, and scales well once you introduce an external PuSH hub server.

I fully agree. Having Atom would be very nice, in particular in conjunction with push. I'd prefer a more lightweight implementation that does not additionally rely on Views, though. Implementing Atom means we should adopt the Atom vocabulary, though, which might be unwanted in a straightforward conversion where e.g. the comment subject remains a subject and is not converted to Atom's title.
So maybe there should be both, a straight-forward xml/json conversion and an atom xml/json conversion?

@json/xml
I think that we should provide both, JSON and XML representations. So developers can pick what they are most comfortable with.

- [raw input + text format name] means nothing outside Drupal land (or when exported to a Drupal site with different text formats)
- exporting the check_format()'ted string means no Drupal re-importability.

That one is tough. Actually, the desired export depends on your use-case (external access vs deploy to another drupal instance).
Well, the raw input variant would have to include a reference to the text format used. Still, that's probably not a big help for anyone. Generally, I think it's a good idea to apply the text format but not sanitize the result (web services do not sanitize data). As that might not be desirable in some cases (content staging, entity unserialize/serialize), we need to be able to opt out of it, though. Maybe via separate output formats?


Envelope not required

Crell - Wed, 2011-12-21 01:01

The Atom envelope is not required for all use cases. We absolutely should be able to get just the XML/JSON version of an entity as a string and do what we want with it.

I'm more suggesting that if we need any sort of flow control, syndication tracking, etc. (for deploy, some services calls, site to site syndication, external feeds, content notification/push, etc.) that we standardize on Atom as our go-to wrapper.

I also agree entirely that if we go this route we want to have stand-alone libraries for encoding/decoding an entity, and separately for building a basic Atom feed out of them. Whether core exposes an actual feed URI or we leave that to contrib/views I don't know, but that logic should be kept as stand-alone as possible.

In fact, if we can find and adopt an Atom parser/generator library rather than writing our own, so much the better.


But atom isn't just an

fago - Wed, 2011-12-21 09:33

But Atom isn't just an envelope, is it? Once you are using it, you have to make use of its required elements, e.g. atom:title. Also, once we are using Atom we should make use of its optional vocabulary too.

I'm more suggesting that if we need any sort of flow control, syndication tracking, etc. (for deploy, some services calls, site to site syndication, external feeds, content notification/push, etc.) that we standardize on Atom as our go-to wrapper.

Sounds good! It would be great to see us also implementing the Atom Publishing Protocol, as it'd standardize any restful interface.


Machine names for formats?

pwolanin - Tue, 2011-12-20 20:47

For Drupal 8 at least, I hope formats might have a machine name (or UUID) instead of just an integer ID?

Perhaps formats themselves should be entities which are exportable? Of course, they would still have to reference code libraries or some other bigger logic.

Would it make sense to have both the raw and rendered content in the export? For example, having the rendered content means the exported entity could be used to populate a search index.
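
A small sketch of what exporting both could look like for a single text value. The function name, the array keys and the 'sketch_emphasis' format name are all invented for illustration, and the callback merely stands in for a real text format's processing.

```php
<?php
// Hypothetical export of one text value carrying the raw input, the machine
// name of its text format, and the processed result side by side.
function sketch_export_text(string $raw, string $format_name, callable $process): array {
  return [
    // Useful for re-import into a Drupal site that has the same format.
    'value' => $raw,
    'format' => $format_name,
    // Useful for non-Drupal consumers, e.g. to populate a search index.
    'processed' => $process($raw),
  ];
}

// Stand-in for a real text format: turns *word* into <em>word</em>.
$emphasis = function (string $raw): string {
  return preg_replace('/\*(.+?)\*/', '<em>$1</em>', $raw);
};

$export = sketch_export_text('Hello *world*', 'sketch_emphasis', $emphasis);
```

A Drupal importer would read `value` and `format`; everything else could ignore those and use `processed`.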


Output formats cannot be

DjebbZ - Wed, 2011-12-21 09:07

Output formats cannot be exported, since they're not data but algorithms...

About Atom parser/generator: SimplePie can parse Atom and RSS feeds (it's in the Feeds module), but it is being rewritten and needs help porting to full PHP 5 OOP. The PHP Universal Feed Generator, found in the top Google results and in several answers on Stack Overflow, seems a good tool for generating valid Atom feeds. I've read the code quickly; it's OOP, simple and straightforward.


About URLs

DjebbZ - Wed, 2011-12-21 09:12

The REST resource URLs could be constructed based on the HTML output URL + '/$format_name', e.g. node/[nid]/json or node/[nid]/xml, taxonomy/term/[tid]/xml, etc., so that service discovery is made even easier.


Well, then we're well out of

Hugo Wetterberg - Wed, 2011-12-21 09:51

Well, then we're well out of the bounds of the REST philosophy. The preferred way is to send Accept headers or, as a fallback, to use file name extensions.
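
A simplified sketch of that order of preference: Accept header first, file name extension as the fallback. The function name and the list of supported media types are assumptions, and wildcards such as */* are deliberately not handled here.

```php
<?php
// Hypothetical negotiation helper: pick an output format from the Accept
// header, falling back to a file name extension such as node/1.json.
function sketch_negotiate_format(string $accept, string $path, string $default = 'html'): string {
  $supported = [
    'application/json' => 'json',
    'application/atom+xml' => 'atom',
    'application/xml' => 'xml',
    'text/html' => 'html',
  ];

  $best = NULL;
  $best_q = 0.0;
  foreach (explode(',', $accept) as $part) {
    $params = array_map('trim', explode(';', $part));
    $type = strtolower(array_shift($params));
    // A missing q parameter means a quality of 1.0.
    $q = 1.0;
    foreach ($params as $param) {
      if (strpos($param, 'q=') === 0) {
        $q = (float) substr($param, 2);
      }
    }
    if (isset($supported[$type]) && $q > $best_q) {
      $best = $supported[$type];
      $best_q = $q;
    }
  }
  if ($best !== NULL) {
    return $best;
  }

  // Fallback: use the file name extension if it names a known format.
  $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
  return in_array($extension, $supported, TRUE) ? $extension : $default;
}
```

For example, an Accept header of "text/html;q=0.5, application/atom+xml" selects the Atom format, while an empty header with the path node/1.xml falls back to XML.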


Got you Hugo. Another thing

DjebbZ - Wed, 2011-12-21 14:03

Got you Hugo.

Another thing we said yesterday during the WSCCI meeting in IRC is that our API for generating such canonical representations of entities should work with any entities, not only core entities, so that custom entities can be represented without any additional code or implementation by the custom entity's creator. If additional code is needed, it's never gonna be done.


Account for language

stevector - Thu, 2011-12-22 16:28

The Views Atom format used in Drupal 6 likely could not be used verbatim as it does not account for the changes in field-level language handling made in Drupal 7. I don't know if these or any other i18n changes coming in D8 make Atom, JSON or any other format preferable.


Have we looked at the Open Data Protocol?

cpliakas - Thu, 2011-12-22 17:15

The Open Data Protocol seems to be relevant here. The spec supports two formats, the XML-based AtomPub format and the JSON format, which seems to be in line with the OP. In addition, there are existing PHP and JavaScript libraries available for download. I haven't worked with the libraries nor do I know what license they fall under, but their mere existence suggests we should at least take a good look at it. Ironically the PHP library is sponsored by Microsoft, which does raise an eyebrow. In addition, there seems to be a heavy MS bias throughout the site, but there doesn't seem to be an MS bias in the protocol itself.

Anyways, just wanted to throw it out there.
~Chris


Nice!

Crell - Sat, 2012-01-07 19:08

I spent a few hours yesterday reading the OData documentation, and so far I really like what I'm seeing.

tl;dr version for people who haven't read the site (although I suggest you do so): It's really just a formalization and slight extension of the Atom spec, which is already pretty darned good, and a mechanism for defining the payload tags you use. There's a default standard for simple data, but that probably won't work for us. Something inspired by it may. Also, it has essentially a JSON alternate version of the same thing although I've not looked into that as much. And then to that it adds an index format so that a site can list what Atom feeds it has available.

Another advantage here is that we already have Atom-generating code available in Drupal (views_atom and the atom module, which I suspect will be merging in the next few months), and Atom consuming code (the OData module, which I just discovered yesterday).

The site also has freely available libraries that we could download and use, in PHP and various other languages. There's just one problem: They're Apache 2 licensed. That means they're only compatible with GPLv3, not with GPLv2, which Drupal uses. Since I think it's unlikely that we'd switch to GPLv3 for Drupal 8, that means we could not bundle their code. :-( (I'm also not entirely sure about their library API yet; I've only just glanced at it.)

Thoughts? Do we want to build OData support into Drupal as our first-class export/serialization mechanism?


OData looks indeed very interesting

dixon_ - Sat, 2012-01-07 21:37

OData looks indeed very interesting, it seems like something we definitely should look into.

I will have a deeper look at the protocol, in context of this discussion as well as being the maintainer for the Deploy module, and come back with some feedback, as soon as possible (tm).

// Dick Olsson


While OData looks fine, in

xtfer - Sun, 2012-01-08 00:47

While OData looks fine, in principle, I can see two problems with it. Firstly, it is essentially a Microsoft project, and as such is covered by the Microsoft Open Specification Promise, which has its own quirks and limitations. Secondly, as a data format, it is not widely used as yet, and was basically built as a serialisation mechanism for Microsoft products like Dynamics and SharePoint. For those two reasons alone, and despite its obvious positive qualities, I would not want to make it the basis of our export mechanism.


Somewhat thinking out loud here

andremolnar - Sat, 2012-01-07 16:24

Further reflection re: the format wars. I agree with the statement:

I will also offer that there is no intrinsic reason we cannot provide both an XML and JSON canonical form, as long as they are reasonably related. It does not have to be either/or.

It's almost implicit, but we should be explicit. If the contract is done right it should be easily transformed from any to any as long as what is represented doesn't change. Put another way: the data (and maybe the Interface) is canon, not the format. There is no reason export or import tools or configuration storage or object creation or whatever you can imagine couldn't all be pluggable systems - each potentially consuming/producing a different format if that's what the developer's heart desires.

Arguably, the fewer the better, but the sky is the limit.

As for references, I think you've hit the nail on the head. References need to be URIs to the canonical information.
If you're exporting you have a choice to make - either bundle up what is returned by the reference URI OR just include the URI and let the consumer decide what to do with it. But that's a good choice to have. In one case you have the right data as of right now; in the other you have the right data for as long as it lives. Either way, you have the right data.
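
The bundle-or-link choice could be sketched roughly like this. The function name and the $resolve callback are hypothetical; the callback stands in for whatever fetches the representation behind a reference URI.

```php
<?php
// Hypothetical export of a reference field: always include the URI, and
// optionally bundle a snapshot of what the URI currently returns.
function sketch_export_references(array $uris, bool $embed, callable $resolve): array {
  $items = [];
  foreach ($uris as $uri) {
    $item = ['uri' => $uri];
    if ($embed) {
      // Snapshot: the right data as of export time.
      $item['embedded'] = $resolve($uri);
    }
    $items[] = $item;
  }
  return $items;
}

// Stand-in resolver; a real one would load or fetch the referenced entity.
$resolve = function (string $uri): array {
  return ['title' => 'Resolved from ' . $uri];
};

$uris = ['http://example.com/entity/taxonomy_term/abc'];
$linked = sketch_export_references($uris, FALSE, $resolve);
$bundled = sketch_export_references($uris, TRUE, $resolve);
```

Because the URI is present in both variants, a consumer that receives a bundled snapshot can still go back to the canonical source later.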


CMIS

dixon_ - Sun, 2012-01-08 11:33

Me, heyrocker and skwashd briefly talked about this on IRC today, and the CMIS came up as a potential candidate. None of us has fully wrapped our heads around the format or specification yet. It might not even make sense. But I'm just adding it in here, as a note:

Specification: http://docs.oasis-open.org/cmis/CMIS/v1.0/os/cmis-spec-v1.0.html
Existing Drupal module: http://drupal.org/project/cmis

// Dick Olsson


My research

dixon_ - Wed, 2012-01-11 15:02

Based on my completely (un)biased research that I conducted myself, I would recommend the OData specification as the way we should represent canonical entities in Drupal 8.

In my research I compared OData and CMIS. Those are the two biggest specifications, it seems, related to CMS and ECM systems.

Continue to read for my conclusions and what action points I will take to try this in Drupal 7.


CMIS

Specification: http://docs.oasis-open.org/cmis/CMIS/v1.0/os/cmis-spec-v1.0.html
Example of an entity: Atom format

CMIS is a very common specification, but mostly related to ECM systems, like Alfresco and SharePoint, that (more or less) assume documents in a hierarchical folder structure. The specification does not cover "programming interface objects" or other "administrative entities" like user profiles(!).

Here are some outstanding quotes from the specification:

CMIS provides an interface for an application to access a Repository. [...] In accordance with the CMIS objectives, this data model does not cover all the concepts that a full-function ECM repository typically supports. Specifically, transient entities (such as programming interface objects), administrative entities (such as user profiles) [...] are not included.
(from http://docs.oasis-open.org/cmis/CMIS/v1.0/os/cmis-spec-v1.0.html#_Toc243...)

There are four base types of objects: Document Objects, Folder Objects, Relationship Objects, and Policy Objects.
(from http://docs.oasis-open.org/cmis/CMIS/v1.0/os/cmis-spec-v1.0.html#_Toc234...)

Pros with CMIS

Cons with CMIS

  • Very extensive specification difficult to wrap one's head around
  • Doesn't map well to what we are looking for here, it focuses on documents and folders, not generic data entities/objects
  • Doesn't support the JSON format (although it might be on the way)


OData

Specification: http://www.odata.org/developers/protocols
Example of an entity: JSON format, and Atom format

OData is a common specification (but not as common as CMIS) that defines ways to represent abstract data models, or Entities that may be (but are not assumed to be) part of Collections or feeds. OData also specifies some URI conventions for querying resources, which are in line with REST principles (although I didn't look closer at them in this research).

Here are some outstanding quotes from the specification:

[...] enables the creation of HTTP-based data services, which allow resources identified using Uniform Resource Identifiers (URIs) and defined in an abstract data model, [..]
(from http://www.odata.org/developers/protocols/overview#Introduction)

OData supports two formats for representing the resources (Collections, Entries, Links, etc) it exposes: the XML-based AtomPub format and the JSON format.
(from http://www.odata.org/developers/protocols/json-format)

Pros with OData

  • Quite well supported specification
  • Lightweight specification and easy to understand
  • Supports both Atom and JSON
  • Maps well to what we are looking for here and Drupal's data model (i.e. Entities, Entity Types etc.)
  • The specification seems to emphasize RESTful principles more (i.e. how references are handled etc.)
  • Has some ready-to-use SDK's (http://www.odata.org/developers/odata-sdk)

Cons with OData

  • It smells a bit Microsoft (it's published under Microsoft Open Specifications)
  • Its SDK is not GPL compatible (but for canonical entity representation it's not useful anyhow, also read Action points below)


Short conclusion

CMIS is very complex, doesn't do exactly what we want and only supports Atom. OData is lightweight, does what we want and supports both Atom and JSON, with more emphasis on RESTful principles.


Action points

As the maintainer of UUID and Deploy, I will:

  1. Add a Services resource type to the UUID module that represents entities according to the OData specification.
  2. References in that resource will be made as URIs according to the OData specification (http://example.com/[services api path]/entity/[uuid] and the entity type is specified in the payload)
  3. Implement support for this OData resource in Deploy module for its content deployments

I won't give a fixed timeline for these action points, but I'm on a project that will benefit and thus be able to provide me time to work on most (hopefully all) of this stuff.

// Dick Olsson


OData SDK is Apache 2.0 Licensed

skwashd - Wed, 2012-01-11 22:31

Just to clarify, the OData SDK is licensed under the terms of the Apache 2.0 License, which is GPLv3 compatible, but not GPLv2 compatible which means it can't ship with Drupal code.

Here is the Software Freedom Law Center's review of the Microsoft Open Specification Promise.


Blargh

Crell - Wed, 2012-01-18 20:36

Money quote from the writeup: "The OSP cannot be relied upon by GPL developers for their implementations not because its provisions conflict with GPL, but because it does not provide the freedom that the GPL requires"

So an implementation of an OSP-covered spec in GPL PHP would be fine, but could get people downstream in trouble. Blargh.

That said, OData is 90% straight up Atom, which is not MS-proprietary in the first place, so I'm not sure to what degree it applies in practice.

Have I mentioned that I hate non-Free code/specs/companies?


thanks for the writeup! I

fago - Fri, 2012-01-20 20:44

thanks for the writeup!

I agree with you that CMIS looks more complex. OData seems to be a good fit; technically I really like it. In particular I like that the specification includes an optional place for service metadata (Service Metadata Document) too. I'm not so sure about OData's legal / Microsoft concerns though. :/

References in that resource will be made as URIs according to the OData specification (http://example.com/[services api path]/entity/[uuid] and the entity type is specified in the payload)

For a real RESTful design I don't think there should be an "api path" involved. According to the REST principles each resource should have a unique URL/URI. Thus, the URL a user uses to view the resource in HTML shouldn't differ from the "api-url". That's important for references to work regardless of the "api endpoint" used.
However additionally, it's nice to have a simple uniform URL pattern, like http://site/entity_type/id. That doesn't fly with some existing URLs like taxonomy/term/id, but still we could redirect http://site/taxonomy_term/id style URLs to the right ones.
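
A rough sketch of that redirect idea, with a uniform $entity_type/$id pattern mapped onto existing paths. The function names and the mapping table are assumptions; only the taxonomy/term case comes from this discussion.

```php
<?php
// Hypothetical table of entity types whose real paths differ from the
// uniform $entity_type/$id pattern.
function sketch_canonical_path(string $entity_type, string $id): string {
  $prefixes = ['taxonomy_term' => 'taxonomy/term'];
  $prefix = $prefixes[$entity_type] ?? $entity_type;
  return $prefix . '/' . rawurlencode($id);
}

// Returns the path to redirect to, or NULL when the uniform path is
// already the canonical one and no redirect is needed.
function sketch_redirect_target(string $entity_type, string $id): ?string {
  $uniform = $entity_type . '/' . rawurlencode($id);
  $canonical = sketch_canonical_path($entity_type, $id);
  return $canonical === $uniform ? NULL : $canonical;
}
```

With this, a request for taxonomy_term/12 would be redirected to taxonomy/term/12, while node/5 needs no redirect at all.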


This seems like a great place

sethviebrock - Sat, 2012-01-14 07:41

This seems like a great place to start focusing -- the intersection of these two initiatives.

Re: input/text format exporting, it would seem that exportable formatted textual data (to be consumed by another Drupal instance) might be required to conform to one of a set of default-Drupal-core-provided formats so that the formatting could be referenced between instances by machine name / ID, which could be ignored by everything non-Drupal that would consume this data. Any algorithmic deviation from default-Drupal-core formats could render the entity's textual formatting as "unexportable" (akin to the "overridden" state in the Features module), which a corresponding UI adaptation would have to handle. This is re: "Not try to handle everything that an entity might have on it, only those things that are fully supported." So, in this instance, if users really want to go beyond core defaults, they can, but it won't be exported by core (but someone will surely develop a little workaround module like http://drupal.org/project/input_formats, which is fine, but that shouldn't be in core.) Seems like this logic could apply in similar quandaries, if any arise.

sethviebrock.com


We seem to be talking about

xtfer - Mon, 2012-01-16 12:22

We seem to be talking about two things in this discussion that might need to be teased apart:

  1. The way an Entity is represented when exported (its Format)
  2. How an Entity is moved from one place to another (its Transport)

Transport and Format should be somewhat independent of each other.
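
One way to picture that independence is as two separate interfaces, where any Format can be handed to any Transport. All names below are hypothetical and nothing here is an existing Drupal API.

```php
<?php
// Format: how an entity is represented as a string.
interface SketchFormat {
  public function mimeType(): string;
  public function encode(array $entity): string;
  public function decode(string $payload): array;
}

// Transport: how an already-encoded payload gets from one place to another.
interface SketchTransport {
  public function send(string $payload, string $mime_type): void;
}

class SketchJsonFormat implements SketchFormat {
  public function mimeType(): string {
    return 'application/json';
  }
  public function encode(array $entity): string {
    return json_encode($entity);
  }
  public function decode(string $payload): array {
    return json_decode($payload, TRUE);
  }
}

// Stand-in transport that only records what it was asked to send. A real one
// might write to disk, POST to another site, or publish an Atom entry.
class SketchMemoryTransport implements SketchTransport {
  public $sent = [];
  public function send(string $payload, string $mime_type): void {
    $this->sent[] = ['mime' => $mime_type, 'payload' => $payload];
  }
}

$format = new SketchJsonFormat();
$transport = new SketchMemoryTransport();
$entity = ['type' => 'node', 'properties' => ['title' => 'Hello']];
$transport->send($format->encode($entity), $format->mimeType());
```

The transport never looks inside the payload, so swapping JSON for XML or an in-memory transport for an Atom feed should not require touching the other side.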

On OData...

I've done some more digging into OData, and it seems a poor fit for formatting canonical entities, primarily because OData is not specifically "a way to represent data", but "a Web protocol for querying and updating data", with a tacked-on entity model. As such, it has a lot of HTTP-related stuff which would get in the way of other stuff we are already looking at. It is a simple form of web service (no WSDL), and falls into the Transport bucket.

Additionally, to do it properly, and not just cherry pick its representation of entities, requires describing your data for OData services, which means implementing a restful web service using OData, which would likely conflict with anything implemented in the rest of WSCCI.

We'd also want to be sure that exporting an Entity in OData format could be done without the corresponding OData web service, and understand what the implications of that might be. Can it be consumed, for example?

Also, looking through the OData SDK for PHP, it's very Microsoft-centric, and has some possible reported issues when not running on Windows.


Yes

Crell - Wed, 2012-01-18 20:49

I agree, the format and transport are/should be separate questions. OData happens to try and address both of them.

The OData SDK is a non-starter anyway, due to it being Apache 2 licensed. I don't see Drupal 8 moving to GPLv3. Drupal 9 maybe. :-)

At this point I do think that Atom is the right transport format; it doesn't impose a payload format, however. OData defines a mechanism for defining a payload format, but I haven't checked yet to see if it's flexible enough to handle Drupal entities. It appears from my read through that the SDK would do some kind of decoding, but the spec itself is just the wire format.

How do other folks feel about the OData legal concerns? Too hot to touch?

At this point I'm confident that Atom is the right transport format. I'm just not sure of the right payload format. Worst case we just use the format defined by views_atom and call it a day.


Being web-servicey, OData's

xtfer - Wed, 2012-01-18 22:03

Being web-servicey, OData's format must be described to OData for every entity type. That could probably be automated, but it's still fiddly. I'd prefer Atom from that perspective, for transport.

Are there any schemas associated with the custom format? I notice in the example that it's defined in RDF, but has no associated schema defined. Do we need something like...

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:drupal="http://drupal.org/ns/entity/">

Taxonomy Import/Export by XML has a similar problem, in that it uses drupal.org as the base URI for vocabularies exported as SKOS. There's the possibility of namespace collision there too (though there is also prior art, at least for taxonomies).