Proposing an alternative to application/vnd.drupal.ld+json

linclark's picture

During the WSCCI Web Services Format Sprint last summer, JSON-LD was chosen as the front-runner for primary format. After that, we looked at what features of JSON-LD we would use and what use cases we wanted to support (summarized here).

As we fleshed out the use cases, it became clear that we were trying to accomplish at least two incompatible goals using one format, and that we would need to define a custom profile or media type for one of them. So we came up with a plan to use application/vnd.drupal.ld+json for the site specific API.

However, to support this, we needed to introduce new code and concepts. For example, we introduce a site schema URI for each field and (eventually) each field property. Since this vocabulary is simply expressing the site’s custom domain model, which a site builder cobbles together on a site-by-site basis using field instances, the properties generated would not be reused across sites.

In the use cases addressed by the other version of JSON-LD (application/ld+json media type), sitebuilders would align to external domain models such as Schema.org, so the ability to use RDF vocabularies makes sense... but it doesn’t for the use case we were targeting with application/vnd.drupal.ld+json.

Changing focus

Meanwhile, as the work on the Serialization module progressed, it became clear that focusing on perfecting a “primary format” to ship with Drupal core is unnecessary. Symfony’s Serializer component makes it easy to add support for new formats and toggle between them when using content negotiation... and I hope that some of the work we plan to complete on top of Serializer will make it even easier to add advanced features with a trivial amount of code. Examples include the issues “Handle entity references on import” and “Support serialization with same media type, different contracts”.
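
To illustrate the general idea of toggling between formats through content negotiation, here is a minimal sketch in Python. It is not Symfony's Serializer API. The names `ENCODERS`, `negotiate`, and `serialize` are hypothetical, and the Accept header handling is deliberately simplified (it ignores q-values and wildcards).

```python
import json
from xml.etree import ElementTree as ET


def encode_json(data):
    """Encode a flat dict as JSON."""
    return json.dumps(data, sort_keys=True)


def encode_xml(data):
    """Encode a flat dict as a simple XML document (hypothetical layout)."""
    root = ET.Element("response")
    for key, value in data.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Registry of encoders keyed by media type. Supporting a new format
# means registering one more entry here.
ENCODERS = {
    "application/json": encode_json,
    "application/xml": encode_xml,
}


def negotiate(accept_header, default="application/json"):
    """Return the first media type in the Accept header that we support.

    Simplification: q-values and wildcards are ignored.
    """
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in ENCODERS:
            return media_type
    return default


def serialize(data, accept_header):
    """Pick an encoder based on the Accept header and encode the data."""
    media_type = negotiate(accept_header)
    return media_type, ENCODERS[media_type](data)
```

In this sketch, a module adding another format would only need to register an encoder, for example `ENCODERS["application/hal+json"] = encode_json`, and clients would select it through the Accept header.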

So in large part, my focus has shifted to enhancing these fundamental parts of our serialization system. However, we still want to ship D8 using a basic hypermedia format in the content deployment / site-specific API use case.

Choosing a hypermedia format for the content deployment / site-specific API case

First off, let’s be clear here: whatever format we choose is unlikely to be a magic bullet or panacea. The linking-in-JSON field is still wide open with no clear front-runner, and new formats are being proposed as we speak.

This is why I suggest we go with the simplest format that has the hypermedia support we want. Since we know that whatever we pick is unlikely to be the long-term winner, we might as well pick something that is trivial to maintain in addition to whatever we add next.

So the recommendation is HAL. It simply adds two reserved keywords, “_links” (a structure also used by GitHub’s Pull Request API) and “_embedded”, which contain the link relations and the embedded resources, respectively. Additionally, HAL can be encoded in both JSON and XML, which means that we sidestep another religious debate.
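
As a rough illustration of what the two reserved keywords look like in HAL's JSON variant, here is a minimal sketch in Python. The function `to_hal`, the URL paths, and the field names `title` and `name` are made up for the example. This is not the output Drupal would necessarily produce.

```python
import json


def to_hal(node_id, title, author_id, author_name):
    """Build a HAL-style dict for a hypothetical article resource."""
    author_href = f"/user/{author_id}"
    return {
        # "_links" maps link relation names to link objects.
        "_links": {
            "self": {"href": f"/node/{node_id}"},
            "author": {"href": author_href},
        },
        # Ordinary properties sit alongside the reserved keywords.
        "title": title,
        # "_embedded" maps link relation names to embedded resources,
        # each of which is itself a HAL resource.
        "_embedded": {
            "author": {
                "_links": {"self": {"href": author_href}},
                "name": author_name,
            }
        },
    }


doc = to_hal(1, "Hello HAL", 7, "example_user")
print(json.dumps(doc, indent=2))
```

Everything outside `_links` and `_embedded` is plain JSON, which is why the format is cheap to maintain next to other formats.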

I know there are one or two supporters of JSON Schema in the crowd, so I want to make sure it’s clear that JSON Schema can layer on top of HAL’s JSON variant (or really, pretty much any JSON based media type since it is recommended that an instance and its JSON Schema be correlated via profile media type parameter or describedBy link relation, not a separate media type). In fact, the Halidator has a JSON Schema option for validating HAL.
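
To make the two correlation mechanisms concrete, here is a hedged sketch in Python. The helper `correlate_with_schema` and the schema URL are hypothetical. Whether a given media type formally accepts a `profile` parameter is an assumption here, and the link relation is written in its registered lowercase form, `describedby`.

```python
def correlate_with_schema(hal_doc, schema_url):
    """Return a copy of a HAL document linked to its JSON Schema.

    Shows both options side by side:
    1. a "describedby" link relation inside "_links";
    2. a "profile" parameter on the Content-Type header value
       (assumes the media type permits this parameter).
    """
    doc = dict(hal_doc)
    links = dict(doc.get("_links", {}))
    links["describedby"] = {"href": schema_url}
    doc["_links"] = links
    content_type = f'application/hal+json; profile="{schema_url}"'
    return doc, content_type


# Hypothetical schema URL, for illustration only.
article = {"_links": {"self": {"href": "/node/1"}}, "title": "Hello HAL"}
linked, header = correlate_with_schema(article, "http://example.com/schema/article")
```

Either way, the media type itself stays the same, so a client that ignores JSON Schema can still process the response.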

Or, if you don’t care about hypermedia, you can just use the plain vanilla JSON or XML support that we’ve already added.

To finish... please, no format wars

We aren’t going to get it right... because there simply isn’t a One True Path at this point. What we can do is make it easy to add new formats from contrib. Then, implementers of core and contrib format modules can work together on the generalizable parts of serialization such as entity reference importing. This makes us competitive now by adding a simple hypermedia-capable format to core, but also competitive in the future as the landscape changes.

For discussion

Based on how limited the understanding of complex formats is in the community, the limited benefit of complexity in this particular use case, and the limited resources we have, I propose we move forward with this simpler format as the hypermedia format we ship with core. If there are any concerns on this point, please voice them below. Bonus points for including practical reasons why something is an issue in real-life use cases.

We want to get moving so that we can figure out what we need to solve before API freeze. For this reason, I’d like to timebox this discussion to February 26.

Comments

rszrama's picture

Sounds like a great switch. I gave Mike Kelly a ping on Twitter to see if he has anything to add. I'm guessing he'd say go for it. : )

mikekelly85's picture

Hey,

This is great news! At this stage I don't really have anything to add other than to agree with the rationale (no surprise there, I guess) and let everyone know that if they have any questions/suggestions/etc. about HAL, they should either get in touch with me or hit the hal-discuss mailing list.

bendiy's picture

Glad to hear you're working in support for switching formats.

Locking this into one format is not the best solution. Having the flexibility will pay off in the end as the format war settles down in the next few years (or heats up). I'm using JSON Schema on a project that will be communicating with Drupal. Being able to have them communicate in the same format will be very helpful.

Never a single format

Crell's picture

The plan was never to only support one format. Rather, we need to have a "primary" format that is assumed for Drupal-to-Drupal communication, but easily allow for other formats to be supported by just writing a few classes.

The proposal here is to just switch the "primary format" from JSON-LD to HAL (with support for both JSON and XML variants, hopefully). You should still be able to support JSON-LD, Atom, or whatever else you feel like.

Grayside's picture

I've been using the general HAL structure for a while, having decided the project was not ready to tackle semantics at the same time.

It works well. The primary issues are proper use of _embedded (there are subtle situations that slip by), and as always with Drupal API building, coming up with a payload structure that makes sense if you don't already know Drupal data structures.

This makes sense to me as a way to build hypermedia into core while allowing more sophisticated formats to evolve in contrib, alongside the tools that make them powerful.