DrupalCon: Discuss what we want from JSON-LD and whether it is supported

linclark's picture

As I explained in What features of JSON-LD would we use in Drupal 8?, there are some issues that are not yet stable in the spec. Two issues which I raised on the spec were resolved, which made me feel comfortable moving forward with JSON-LD.

Both of those issues have since been reopened, and removal from the spec is possible. Specifically, the issues we had with Identifying properties with URIs and Language version handling are open again.

We should consider why we want to use JSON-LD... what it gets us. The parts that I think we want to use seem pretty unstable right now.

Some options:

  • We move ahead with JSON-LD, but assume we have to write our own processing of the data for the content staging use case.
  • We create our own JSON serialization which is inspired by JSON-LD, but can stray from the spec.
  • We choose a different standardized serialization for the content staging use case.

I think we should have this conversation while we're here at DrupalCon. I believe the BoF rooms are booked, but we could stake out a space.

Comments

For reference, the two issues

linclark's picture

For reference, the two issues on the JSON-LD spec are:

Different serializations depending on the consumer?

scor's picture

Is it bad practice to offer two different serializations in the same format (say JSON) based on the need of the consumer? Let me explain. I argue that the two use cases we're trying to cover (content sharing on the Web and content staging) both require their own serializations (possibly both in JSON and/or JSON-LD). 3 reasons at least:

1) I've argued before that different kinds of IDs should be used depending on the use case: UUIDs are a good fit for content staging, while Linked Data URIs are more appropriate for content sharing in general.

2) In my post about entity RDF and the JSON-LD serializer, I've explained how the default field data structure doesn't necessarily make sense for a generic consumer, while it could be something that is required for the deployment scenario. For generic content sharing, the body rendered in HTML is all that's needed (much like in RSS):

"body": "<p>DrupalCon is an international event...</p>\n",
},

For the content staging scenario, this rendered HTML value is pretty much useless; what is useful is both the raw body value as entered by the site administrator and the text format:
"body": {
    "value": "DrupalCon is an international event...[node:some-token]\n",
    "format": "filtered_html"
},

Note that you're most likely working in a controlled environment when you expect the text format to be something recognized by the sites you're sharing your content with. Typically, text format settings aren't something you'd manage the same way as content; rather, they're configuration settings you'd keep in your VCS.
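
To make the contrast in point 2 concrete, here is a minimal sketch in Python (used purely for illustration; it is not Drupal code). The function names and the stand-in filter are assumptions, and a real text format would also replace tokens such as [node:some-token], which this sketch does not attempt.

```python
import html

def render_filtered_html(value):
    # Hypothetical stand-in for a text format filter: escape the raw text
    # and wrap it in a paragraph. Token replacement is deliberately skipped.
    return "<p>" + html.escape(value.strip()) + "</p>\n"

def serialize_for_sharing(field):
    # Generic consumers only get the rendered HTML string.
    return {"body": render_filtered_html(field["value"])}

def serialize_for_staging(field):
    # Staging consumers get the raw value plus the text format name.
    return {"body": {"value": field["value"], "format": field["format"]}}

field = {
    "value": "DrupalCon is an international event...[node:some-token]\n",
    "format": "filtered_html",
}

print(serialize_for_sharing(field))
print(serialize_for_staging(field))
```

The same stored field produces two differently shaped "body" entries, which is why a single output would have to carry both.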

3) Bandwidth: if we wanted to support both scenarios in the same serialization output, we'd have to include two versions of value for each text field (the rendered HTML and the raw value). This would double the size of the JSON-LD output in the case of large posts. It sucks bandwidth and also adds some extra processing and memory usage on the consumer end.

In conclusion maybe we need to consider providing a serialization for content staging and another one for regular content sharing. Consumers would choose by sending a specific HTTP header. Having two separate serializations would reduce the need to have terms expanding to multiple URIs.
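
As a rough sketch of what "choosing by HTTP header" could look like, assume the consumer sends an Accept header with a made-up profile parameter. The parameter name and the "staging"/"sharing" labels are hypothetical; nothing in this thread settles which header or value would actually be used.

```python
def pick_serialization(headers):
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    accept = normalized.get("accept", "")
    # Hypothetical convention: "profile=staging" selects the staging output.
    for part in accept.split(","):
        params = [p.strip().lower() for p in part.split(";")[1:]]
        if "profile=staging" in params:
            return "staging"
    # Default to the safer, rendered output for arbitrary consumers.
    return "sharing"

print(pick_serialization({"Accept": "application/ld+json; profile=staging"}))
print(pick_serialization({"Accept": "application/ld+json"}))
```

Defaulting to the sharing output means a consumer has to ask explicitly for raw values.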

Yes

Crell's picture

Finally getting back to this line of thinking...

Yes, we discussed in Paris as well that we need to be thinking about 2 separate use cases, if only for security reasons. The format we use for deployment and staging will contain unfiltered markup, potentially secure fields, etc. That's very different than what is needed to syndicate content to arbitrary 3rd parties, where we will need formatted HTML, certain fields removed (because certain user roles do not have access), etc. At this point, I think both should be in JSON-LD for simplicity but we do need to handle both use cases separately.

Having entirely separate unique IDs for each version is interesting. I'll have to consider that further, but it has potential.

Mis-communication

msporny's picture

Hi Lin,

Just want to touch base on this particular phrase:

"reopened, with removal from the spec possible"

I don't agree that either one of them should have been re-opened without the approval of the group (especially since the group had a resolution on record). Markus is fairly new to how the W3C Working Groups operate, normally he wouldn't have been allowed to re-open the issue without new data (and the chair of the group would've had to approve the re-opening of the issue). We're in a bit of a weird part of the W3C Process since the JSON-LD CG is working on behalf of the RDF WG. We'll discuss this on the call tomorrow and get an answer to you quickly.

Here is what should be your take-away from this post: We (the JSON-LD CG) are committed to supporting Drupal 8. We took quite a bit of time to get you the features that you needed, and we will continue to try and make sure those features are there for Drupal. Not everyone in the group will agree at times, the process may not be clear to certain members of the group at all times, but know that the CG knows that those two features are important to Drupal and we'll try our hardest to make sure that your use cases are supported.

Let's have a quick chat to sync up after the telecon tomorrow. Ping me on Skype if you can: msporny.

The only reason for

lanthaler's picture

The only reason for re-opening the issues was that the API spec wasn't updated yet. That's how we have been operating for 2 years now; it's how we make sure that we don't forget anything.

The issues are still marked as "resolved" and nothing else changed. They are just not fully implemented yet. That's all. I think we also clarified that in today's telecon. Sorry for the misunderstandings I caused.

Markus Lanthaler
@markuslanthaler

Thanks for clarifying, it was

colette's picture

Thanks for clarifying, it was the comment suggesting that the feature be revisited (in the property generator issue) which had me concerned. From the IRC log it looks like a lot of the algorithm issues which you were concerned about were discussed in today's telecon, which is encouraging.

Thanks for clarifying, it was

linclark's picture

Thanks for clarifying, it was the comment suggesting that the feature be revisited (in the property generator issue) which had me concerned. From the IRC log it looks like a lot of the algorithm issues which you were concerned about were discussed in today's telecon, which is encouraging.

EDIT: sorry for the repost, didn't realize that my GSOC student was logged in.

I'm definitely interested in

gdd's picture

I'm definitely interested in joining this discussion, should we just set a date and time and do it? The BoF slots are all full already, but we can just gather somewhere. My schedule is pretty full but I have some free time this afternoon, and Thursday afternoon around 1.

So it seems like most people

linclark's picture

So it seems like most people are good for Wednesday after the last session, so the plan is to meet in the coder lounge.

Would you also be working on

mradcliffe's picture

Would you also be working on steps in Entity Serialization API for web services (e.g. content staging)? I have been looking into creating a unit test class for EntitySerialization (basing work off of fago's sandbox).

Drupal Use a Potential Influence on JSON-LD?

ethanw's picture

I'm returning to this conversation after a bit away. One quick thought is that adoption of JSON-LD by Drupal that relied on some of these features might be a factor in the status of some items in the spec, as they demonstrate significant use cases. I wonder if others closer to the JSON-LD project might have additional perspective on that.

Alternatively, we could hold off on forking until the spec diverges from what we need.

Look forward to tomorrow.

A quick update to this

linclark's picture

A quick update to this discussion:

The JSON-LD group still had some questions about how we would use the language maps and multiple property URIs, so Manu asked for a discussion to clarify the use case further, which both Stéphane and I agreed to.

It turns out that there are still some issues that need to be resolved in the JSON-LD spec before the spec fully supports our use case. The problems center on:

  • Ensuring that we can round trip (go from compact JSON-LD to expanded JSON-LD and then back again) without mushing up our data in a weird way when using multiple property URIs
  • Ensuring that we preserve the language handling on entity references when we go to expanded form. As I explained in my summary of the JSON-LD features we will use, RDF can handle language for string field values but not for entity reference field values.
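
To illustrate the round-trip concern in the first bullet, here is a toy sketch in Python. It is not the JSON-LD expansion or compaction algorithm; the functions, the context layout (one term mapped to a list of property URIs), and the choice of URIs are all assumptions made for illustration.

```python
def expand(doc, context):
    # Copy each term's value under every property URI the term maps to.
    expanded = {}
    for term, value in doc.items():
        for iri in context[term]:
            expanded.setdefault(iri, []).append(value)
    return expanded

def compact(expanded, context):
    # A term is recovered only if all of its URIs are present with equal values.
    compacted = {}
    for term, iris in context.items():
        if all(iri in expanded for iri in iris):
            values = [expanded[iri] for iri in iris]
            if all(v == values[0] for v in values):
                compacted[term] = values[0][0] if len(values[0]) == 1 else values[0]
    return compacted

context = {"title": ["http://schema.org/name", "http://purl.org/dc/terms/title"]}
doc = {"title": "DrupalCon"}
print(compact(expand(doc, context), context) == doc)

# A second term sharing one of the URIs makes the reverse mapping ambiguous.
ambiguous = dict(context, label=["http://schema.org/name"])
print(compact(expand(doc, ambiguous), ambiguous))
```

With the first context the document survives the round trip; with the second, compaction also emits a "label" entry that was never in the original, which is the kind of data mangling the bullet describes.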

Manu is confident that they can solve these issues and it seems to be a high priority, so I don't think this changes our decision, though others should feel free to weigh in.

The issues for tracking are:

Once the issues are resolved, I will add a new version of the serialization to JSON-LD serialization: DrupalCon discussion.

Web Services and Context Core Initiative
