WSCCI Serialization Format Preliminary Report-Back

ethanw's picture

Summary

After an initial review of each of the serialization formats proposed by Crell for consideration as the canonical representation format for the WSCCI projects' work, I recommend we focus on CMIS, HAL and JSON-LD for further in-depth evaluation. These projects are sufficiently mature and active, and have (or are actively developing) well-supported open source libraries in PHP and JavaScript with engaged community participation. Other formats reviewed included OData, JSOP, WRML and Collection+JSON; however, each of these was marked by issues with available PHP/JS libraries, lack of an active development community, immaturity, inactivity, and/or insufficient features.

[edit 5/8 to make the main questions clearer and fix linebreaks/typos]

Next Steps

For next steps, I propose the creation of a few wiki pages in this group: one overview page with general comparison notes, evaluation guidelines, etc., one for each of the "short list" evaluation candidates, and potentially one more with detailed evaluation process instructions. Once we have individual pages for each format, we should invite the maintainers of current Drupal modules relating to each of these formats to contribute to the evaluation. There are Drupalists with experience in pretty much every one of the shortlist formats.

I have begun stubbing out pages and preliminarily evaluating most of the short list, but wanted to make sure to run this by the group before committing a bunch of new docs.

The main questions for this post are regarding that plan:

  • Does anyone see any major issues with this approach?
  • Does anyone feel strongly that any of the non-short-listed formats should be included further?

Additional Thoughts

For what it's worth, my personal recommendation/intuition is that one of the JSON-based formats (most likely JSON-LD) would best serve as the canonical/core serialization format, while the system used to generate and parse serialized data structures should also be made with CMIS in mind, using that format as the "contrib format of reference". Using this approach, the core system will provide all that's needed for the technology most frequently integrated with Drupal, the browser, while also supporting another well-supported and powerful enterprise standard and making sure additional formats can be supported via a standardized interface (as in API). This would also imply a slight (perhaps just formal/semantic) change to the proposed task/architecture diagram that would add a "Serialization Engine" somewhere in the workflow for the project (probably as part of or just before the Standard Format Implementation). This engine would also likely need to go somewhere within the Kernel's request handling flow, but since that currently deals mainly with routing requests and not routing I'm less clear how that would look. In any case, these thoughts are just my preliminary ones.
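To make the "standardized interface" idea a little more concrete, here is a minimal sketch of a pluggable serialization engine. It is written in Python purely for illustration; the names `SerializationEngine`, `register` and `serialize` are hypothetical and are not part of Drupal or of the WSCCI proposal.

```python
import json
from typing import Callable, Dict

# Hypothetical type: an encoder turns a plain data structure into a string.
Encoder = Callable[[dict], str]


class SerializationEngine:
    """Toy engine: core ships one format, contrib can register others."""

    def __init__(self) -> None:
        self._encoders: Dict[str, Encoder] = {}

    def register(self, mime_type: str, encoder: Encoder) -> None:
        """Make an encoder available under the given MIME type."""
        self._encoders[mime_type] = encoder

    def serialize(self, data: dict, mime_type: str) -> str:
        """Encode data with the encoder registered for mime_type."""
        try:
            encoder = self._encoders[mime_type]
        except KeyError:
            raise ValueError(f"No serializer registered for {mime_type}")
        return encoder(data)


engine = SerializationEngine()
# The "core" format in this sketch is plain JSON with sorted keys.
engine.register("application/json", lambda d: json.dumps(d, sort_keys=True))

node = {"title": "Hello", "nid": 1}
print(engine.serialize(node, "application/json"))
```

A contrib format such as CMIS would, in this sketch, simply be another `register` call with its own encoder.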

Further Reading

For those interested in further reading on these formats and more general API architecture and design information, I've posted all online articles and sites I've reviewed as part of the work to date on Diigo as a WSCCI Serialization Format List. I would particularly recommend the following:

Format-specific Resources

These will be included in the wiki pages for each format, but I'll put a quick set of links here for reference:

JSON-LD

CMIS

HAL

OData

JSOP

Other related/inspirational technologies

Comments

I think server-2-server CMIS

lsmith77's picture

I think server-2-server CMIS is the best protocol as it really includes all the details for federation and is actually used for exactly this in all sorts of other CMS solutions.

Server-2-client JSON-LD seems the most attractive assuming that clients usually will not require everything that a CMIS requires and in many cases the message just needs to be parseable, its structure doesn't need to be "discoverable" let alone "spec'ed" out in a generic fashion.

Twice the work?

Crell's picture

While true to an extent, my concern is that we end up needing to support not one but two very different and highly complex formats in core. (OK, JSON-LD is not as complex as CMIS, but you know as well as I that CMIS is a beast.) That's not something I'd relish. It may not be twice the work, but it's at least 1.5x the work.

I like CMIS, in particular

fago's picture

I like CMIS, in particular the RESTful AtomPub binding, as it's very complete. It'd be odd if there is no JSON version yet though.

JSON-LD looks nice, however I'm not convinced that we need the semantic mapping in our main serialization format as it's usually not needed and makes common usages like creating/updating an entity much harder. With JSON-LD you'd have to figure out the right predicates compared to going with plain property names...

re: finding the right predicates

msporny's picture

Actually, with JSON-LD you don't have to find the right predicates for everything. There are two approaches to respond to this question, so I'll answer in both ways:

1) You don't need a semantic mapping for everything in a JSON-LD document - you can mix regular JSON with JSON-LD and that would constitute a perfectly valid JSON-LD document. With JSON-LD, you only add the semantics that you need and you can leave the rest of the JSON as-is.

2) There is an argument that if you're creating new key-value pairs at will and adding them to the system, while the system is flexible and ad-hoc, it's not very well designed. What we have found is that when we are forced to associate JSON keys with URLs at some point in the process, our designs become much cleaner because we start to see cruft building up in the system. We see duplicates whereas before, we wouldn't. Things like whether or not "label" and "name" should be two separate terms and not just one term tend to be discussed - which is good for keeping a system simple. Simpler systems are easier to maintain and thus it's good to have the design discussion at some point.

So - you don't need to assign a semantic mapping for every term in the system with JSON-LD. However, it's usually a good idea to at least have the discussion on whether or not you're going to assign a semantic mapping for a particular term. Often, you will find that there is a better way to represent the data.
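A small sketch of the "add semantics only where you need them" idea. This is not the JSON-LD expansion algorithm, just a toy that separates keys that have a mapping in a context from keys that are still plain JSON. The context, the document and the helper name `split_by_mapping` are all made up for illustration.

```python
import json

# Hypothetical context: only "name" has been given a semantic mapping so far.
context = {"name": "http://xmlns.com/foaf/0.1/name"}

document = {
    "name": "Jane Doe",        # mapped to an IRI by the context
    "favoriteColor": "green",  # plain JSON, no mapping (yet)
}


def split_by_mapping(doc: dict, ctx: dict) -> tuple:
    """Return (mapped, unmapped); mapped keys are replaced by their IRIs."""
    mapped = {ctx[key]: value for key, value in doc.items() if key in ctx}
    unmapped = {key: value for key, value in doc.items() if key not in ctx}
    return mapped, unmapped


mapped, unmapped = split_by_mapping(document, context)
print(json.dumps(mapped, indent=2))
print(json.dumps(unmapped, indent=2))
```

Listing the unmapped keys like this is one cheap way to prompt the design discussion described above: each entry is a term nobody has yet decided to give a meaning.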

re: server-to-server

msporny's picture

I don't think I understand why you think CMIS is superior to JSON-LD as a server-to-server messaging mechanism. That is, JSON-LD was created for the Web Payments work - an initiative to create a decentralized financial system on top of the Web's infrastructure. At its core, the Web Payments work uses JSON-LD to do server-to-server messaging. All data with a URL is naturally "discoverable" because it's an extension of the Web and is built on Linked Data principles.

Also keep in mind that there are other projects around JSON-LD, such as the Web Keys specification

http://payswarm.com/specs/source/web-keys/

... that enables digitally signed JSON-LD messages, a Web-based public key infrastructure, message diffing, etc. So, there has been quite a bit of thought and implementation put into JSON-LD as a secure, trusted, and verifiable server-to-server messaging mechanism.

JSON-LD

sun's picture

I'm not familiar with CMIS. Grepping its spec for "JSON" yields no results. Does it have a JSON representation at all?

I've looked into the others and JSON-LD looks very clean and attractive.

Daniel F. Kudwien
netzstrategen

CMIS "Browser Binding" Project

ethanw's picture

I have found a few bits related to the CMIS "Browser Binding" Project, which appears to be a JSON implementation.

Seems fairly young at this point, and I'm not sure how we want to figure it in/count on it in this eval or any planning.

http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cmis-browser and http://www.oasis-open.org/committees/download.php/41895/cmis-spec-v0.5-b... provide some info.

CMIS 1.1 will provide a JSON

lsmith77's picture

CMIS 1.1 will provide a JSON binding AFAIK

Browser Binding/JSON in CMIS v1.1 Working Draft

ethanw's picture

The latest working draft appears to be here: http://www.oasis-open.org/committees/document.php?document_id=45517&wg_a...

It looks like the general architecture is standard HTTP/HTML form MIME encoded requests w/ JSON returned, which seems unique but does have the advantage of allowing standard browser form submission as input.
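Roughly, "form-encoded request in, JSON out" could look like the sketch below. The endpoint URL and the field names (`action`, `name`) are placeholders invented for illustration, not taken from the CMIS draft, and nothing is actually sent over the network. The sketch also assumes URL-encoded form data; a multipart form would be built differently.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and field names, for illustration only.
endpoint = "http://example.org/cmis/browser/repo1/root"
form_fields = {"action": "createDocument", "name": "report.txt"}

# Encode the fields the same way a browser encodes a simple HTML form.
body = urlencode(form_fields).encode("ascii")

request = Request(
    endpoint,
    data=body,
    headers={
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",  # ask for a JSON response
    },
    method="POST",
)

print(request.get_method())
print(request.data)
```

Because the body is ordinary form data, the same request could come from a plain HTML form with no JavaScript involved, which is the advantage noted above.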

Serialization Shootout?

dago.aceves's picture

After doing some research it seems like all of the serialization formats have their own merits. Would anybody be interested in seeing a Shootout put together? Or is this something that would rather be debated in the comments?

As far as my serialization format of choice, I really think JSON-LD is the front runner. It's simple to pick up and also very powerful because of its simplicity. Though, I think the bigger win is that JSON is already a familiar format to a lot of folks. Learning a new set of tokens in a familiar syntax would help sustain a higher level of developer following. Lowering this bar, if you will, would mean we also sustain a higher level of comprehension.

The other JSON-based format, Collection+JSON, had less concise documentation, which is why I threw in my hat for JSON-LD. This is also why I raise the thought of having a Shootout: it may shake out some gotchas/facts.

Shootout?

Crell's picture

What exactly is a shootout? :-) You mean "implement both and see what happens"? I'd prefer to not do that until we get the list narrowed down better. Ethan had 3 recommendations to look at: CMIS, JSON-LD, and HAL.

HAL has both JSON and XML forms.

CMIS is XML-based, not entirely from-scratch. So it's no less familiar in its underlying syntax than JSON is. (It has a more complex model on top of it, though.)

"Evaluation" as "Shootout"

ethanw's picture

Now, before we go shooting stuff...

First step in a shootout is probably dueling evaluations. I've set up the framework pages for the comparison (albeit belatedly) here:

http://groups.drupal.org/node/230828 (overview page, can be updated as evaluations progress)
http://groups.drupal.org/node/230833 (evaluation stub, to be used as the basis for each individual format evaluation)

I think all of the shortlist candidates will rank fairly high on all of these notes, but the first question is whether they qualify to enter the "let's implement" ring.

Shootout as Comparison/Evaluation

dago.aceves's picture

ethanw got it. I thought "shootout" was the accepted vernacular with all the cool kids. :) It seems he's also taken steps in that direction as well. Good stuff.

re: JSON-LD to XML conversion

msporny's picture

You can convert JSON to XML using this online tool:

http://jsontoxml.utilities-online.info/

Were you guys trying to get something more sophisticated accomplished? If so, we could probably see if somebody in the JSON-LD community wanted to work on a tighter JSON-LD to XML and back converter. Keep in mind that working in XML would only make things harder for most of the use cases we're working on. In fact, we started with XML and abandoned it many years ago because it was just easier to map these concepts to JSON.

There is a reason it's called JSON-LD and not XML-LD. :P

That said, I know that certain people feel very strongly about JSON-only or XML-only stacks... so the JSON-LD stuff can certainly accommodate some level of XML round-tripping.

re: developer familiarity

msporny's picture

Hi Dago,

Yes, the reasons that you cite above are the exact reasons we made the design decision to express Linked Data in JSON. Web developers are very familiar with it. We were very careful to not break the development model that many Web developers use. For example, at one point it was proposed that we use TURTLE instead of creating JSON-LD. This was a non-starter for most Web Developers because they'd have to use tools that they've never used before, learn a new language, etc. Rather than that, we just added a few flourishes to regular JSON and bang! - you've got Linked Data in JSON.

So your reading on JSON-LD is accurate, and it's nice to see this as one of the primary editors on the document because you got it without us having to elaborate upon it. :)

Implementing even partial

lsmith77's picture

Implementing even partial CMIS will be a big undertaking, but might be a good step if indeed eventually Drupal adopts PHPCR. That being said I would try to pull in Alfresco if you want to go that route. I used to work with Jeff Potts (http://ecmarchitect.com/) and could try to get him into the loop if you guys are interested. They are the people behind the current CMIS integration into Drupal.

Drupal Community Expertise

ethanw's picture

Thanks @lsmith77, I think it is key to get the various domain experts involved at this point. I've just set up the stub for individual format evaluations, and having their input on that format would be helpful. Then having support from those with a lot of experience as we actually fill those out will be very helpful.

re: Technology requirements form

msporny's picture

I think I filled out the right content for JSON-LD - let me know if I missed anything.

Good Resource

ethanw's picture

There are some very helpful slides in there. I especially noted slide 7, the list of challenges, the multi-format deserializer architecture in slide 9 and slide 27 (similar to the representer pattern I was thinking in this post) and the summary slide 35.

One challenge discussed in these slides which we haven't worked into our considerations yet is versioning. This will be key in order to identify version conflicts when the same resource is edited simultaneously by multiple users via the API, etc. I'll add these to the stub right now, and add that resource to the overview page.

JSON-LD versioning/conflict resolution

msporny's picture

You can use any JSON facility that does time-stamping, versioning, patching, conflict resolution to resolve this. Things like JSON Patch:

http://tools.ietf.org/html/draft-pbryan-json-patch-04
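To give a feel for what a JSON Patch document does, here is a deliberately tiny sketch. It assumes the `op`/`path`/`value` shape that was later standardized as RFC 6902 (the draft linked above may differ in detail), handles only `add`, `replace` and `remove`, and only works on nested objects, not arrays. The helper names are made up.

```python
import copy


def _resolve_parent(doc, path):
    """Walk a JSON Pointer and return (parent container, final key)."""
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    parent = doc
    for token in tokens[:-1]:
        parent = parent[token]
    return parent, tokens[-1]


def apply_patch(doc, operations):
    """Apply add/replace/remove operations to nested dicts; returns a new doc."""
    result = copy.deepcopy(doc)  # never modify the caller's document
    for op in operations:
        parent, key = _resolve_parent(result, op["path"])
        if op["op"] == "add":
            parent[key] = op["value"]
        elif op["op"] == "replace":
            if key not in parent:
                raise KeyError(f"cannot replace missing member {key!r}")
            parent[key] = op["value"]
        elif op["op"] == "remove":
            del parent[key]
        else:
            raise ValueError(f"unsupported op {op['op']!r}")
    return result


original = {"title": "Draft", "body": {"value": "Hello"}, "sticky": True}
patch = [
    {"op": "replace", "path": "/title", "value": "Final"},
    {"op": "add", "path": "/body/format", "value": "plain"},
    {"op": "remove", "path": "/sticky"},
]
patched = apply_patch(original, patch)
print(patched)
```

A `replace` that targets a member which no longer exists fails loudly here, which is the kind of signal a conflict check can hook into.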

If you want to go a bit deeper down the Linked Data rabbit hole, you can use what are called Named Graphs. Named Graphs allow you to talk about information, not just express it. So, a named graph says "This is the information that I know about the URL http://example.org/foo/blah.txt". You can time-stamp when you retrieved information from that URL; you can version it by normalizing the data and digitally signing it (JSON-LD supports both graph normalization and digital signatures); and you can do conflict resolution in a variety of ways, since JSON-LD supports information diff-ing (which allows you to tell, deterministically, what new data was added and what data was removed from a particular URL). More about this feature here:

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#named-graphs

and here:

http://json-ld.org/spec/latest/rdf-graph-normalization/

Bottom line: These features allow you to do things like delayed conflict resolution and JSON object/change-set merging. It's not simple... but delayed conflict resolution rarely is simple. What JSON-LD provides is the core language functionality of normalization to make this a possibility. The problem of doing graph diff-ing absolutely should not be underestimated - it took us 18 months to create an algorithm that worked.

CMIS is not going to cut it

chx's picture

It's a colossal piece of work to get CMIS working and yet you do not get i18n support. See https://wiki.oasis-open.org/cmis/Candidate%20v2%20topics (internationalization is listed as a v2 candidate).

Added i18n to Eval Stub

ethanw's picture

Should have been there from the start, but at least it's there now. Thanks for pointing out.

CMIS uses XML, which includes xml:lang attribute

effulgentsia's picture

So, at that level it's on par with JSON-LD's @lang. I think what i18n as a v2 feature is referring to is the API portion of CMIS, for which there's no JSON-LD equivalent.

re: JSON-LD API

msporny's picture

A bit hard to parse what you're saying - it could be "JSON-LD doesn't have an API" or "JSON-LD doesn't support I18N in the way that we need it".

JSON-LD does have an API:

http://json-ld.org/spec/latest/json-ld-api/

The JSON-LD API supports the language attribute:

http://json-ld.org/spec/ED/json-ld-api/20120524/#attributes-4

JSON-LD supports a ton of I18N use cases - which one specifically do you think it doesn't support?

JSON-LD Background

msporny's picture

Hi folks,

Stéphane Corlosquet (scor) pointed me to this discussion and asked if I had some time to answer some JSON-LD questions. To which I thought: It's the Drupal community - of course we have time for the Drupal folks!

Just a quick bit of background before going into a high-level on JSON-LD and then replying to concerns. My name is Manu Sporny - I wrote the first version of the JSON-LD language and continue to be one of the primary editors of the document. I'm the current Chair of the RDFa/RDF Web Applications Working Group at W3C as well as the acting chair of the Web Payments and Linked Data in JSON Community Groups at W3C. I am also an editor in the Microformats Community, Web Payments / PaySwarm CG (Web Keys, Web Payments, Payment Intents), RDFa WG (RDFa 1.1 Lite, RDFa 1.1 Primer), HTML Working Group (HTML5+RDFa), RDF WG (JSON-LD Syntax, JSON-LD API), and member of the Semantic Web coordination group, RDF Working Group, and a variety of other groups operating at W3C. So, I dabble in Linked Data, the semantic web, and payment standards for the Web.

We created JSON-LD because we wanted a way to express Linked Data in JSON that didn't have any of the nasty RDF cruft that had built up over the years. Even though I participate in a variety of RDF/Semantic Web related activities, I've always hated how complicated it was to write code for the Semantic Web and Linked Data. So, JSON-LD was a back-to-basics approach to Linked Data - RDF was put on the back burner and the primary focus was on making Linked Data easy to use for Web Developers.

Specifically, we needed a way to pass JSON objects and messages for the Web Payments work, ensuring that the messages could express Linked Data (IRIs, I18N, etc.), but without changing the workflow for Web Developers that already use JSON. Yes, there is a loss-less mapping from JSON-LD to RDF and back, but that's very much in the background - you don't have to work in triples, or SPARQL, or RDF, or any of the other technologies that separate you from the data. Just work in JSON for the most part. We've even built an API for JSON-LD (called the JSON-LD API) that allows you to transform JSON-LD into a variety of different layouts that make programming with the language easier. For example - we provide a feature called "framing()" that allows you to take input data and re-structure it so that it aligns nicely with the algorithms in your application (JavaScript, Python, Ruby, PHP, etc.)

So, JSON-LD seems to align really nicely with the WSCCI stuff that you're working on because... this is exactly what we designed the language and API to do. You can view the latest spec for the Syntax here:

http://json-ld.org/spec/latest/json-ld-syntax/

and the latest spec for the API here:

http://json-ld.org/spec/latest/json-ld-api/

If you'd like, you can play around with a live JSON-LD editor here:

http://json-ld.org/playground/

Click the buttons for the examples at the top to see the JSON-LD and then the output. In the next post, I'll try to respond to how JSON-LD meets the requirements of the project. Hope this is helpful. :)

JSON-LD Deep Dive

msporny's picture

Hi all,

Here are all of the requirements that you listed for this project and how I think that JSON-LD addresses each criteria:

Format Expressiveness/Hypermedia Linking

JSON-LD was created to express Linked Data in JSON without requiring folks to change the way they publish and consume JSON (unless they wanted to use some of the more advanced features of JSON-LD). This means that you can add meaning to any JSON document today by just adding an HTTP header:

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#referencing-contexts...

Or by adding key-values to the data (one key-value if you want your keys to expand to mean something in Linked Data, two key-values if you want to give your data IDs that are IRIs and make it true Linked Data):

{ 
  "@context": "http://json-ld.org/contexts/person",
  "@id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "birthday": "10-09",
  "member": "http://dbpedia.org/resource/The_Beatles"
}

Example Hypermedia node content

You use IRIs to link to content - so any IRI will do. Here are some examples

{
  "@context": "http://example.org/drupal-wscci",
   "localFile": "file:///tmp/foo.txt",
   "localResource": "http://localhost/foo.txt",
   "remoteResource": "http://www.youtube.com/watch?v=RYlCVwxoL_g&feature=g-vrec",
   "remoteFile": "http://example.com/music/mysong.ogg"
}

Support for ad-hoc/configurable resource definitions

All JSON keys and values can be given "meaning" by modifying the JSON-LD Context:

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#the-context

Not every value in the JSON needs to have a mapping in JSON-LD. That is, you can mix plain old JSON with JSON-LD and that's a perfectly valid way to operate in JSON-LD. This allows you to add fields first and give them meaning later - allowing flexibility during the design/development process. If a particular key/value becomes used widely, you can give it "meaning" by defining things like a URL to identify the term, a datatype, a default language, etc.

Discoverability is done just like on the Web. URLs and IRIs are first-class citizens in JSON-LD, which means all you have to do is "follow your nose" and you may find /more/ data at that URL. For instance, if you see this URL in JSON-LD: "http://example.org/foo/bar" - you can, via HTTP negotiation, ask for a JSON-LD representation of that IRI and you may get back a document with more JSON-LD data in it. So, you can effectively crawl the web to find more data... this is one of the powerful concepts that Linked Data uses to make data more useful and less tightly coupled with the system that is publishing it.
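The "follow your nose" step is ordinary HTTP content negotiation. A minimal sketch, assuming the `application/ld+json` media type (the one later registered for JSON-LD; a server of this era might expect something else). The request is only constructed here, not sent.

```python
from urllib.request import Request

# The IRI comes from the example in the text; nothing is fetched here.
iri = "http://example.org/foo/bar"

# Ask the server for a JSON-LD representation of the resource.
request = Request(iri, headers={"Accept": "application/ld+json"})

print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
# Passing `request` to urllib.request.urlopen() would perform the actual fetch.
```

Any IRIs found in the returned document can be requested the same way, which is all that "crawling" means in this context.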

Formatter Implementation/Handling

I don't really understand this requirement, but I'll take a shot at it anyway:

You can associate a slew of meta-data with a URL in JSON-LD. This means you can give that URL a human readable name like "Really funny cat video", or a creation date like "2012-05-12T21:48:22Z", or even things like saying what sort of editor should be used to modify the URL (like a Web-based video editor that is started via a Web Intent): "contentEditor": "http://www.aviary.com/web". This approach is incredibly flexible and allows you to describe as much as you want to about a particular piece of data...

Internationalization

JSON-LD supports full UTF-8 Internationalization, and even allows you to tag any string with a language value (which allows you to do things like specify text labels for a particular piece of editable content in multiple languages):

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#string-international...
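As an illustration of language-tagged strings, the sketch below stores one label in three languages and picks the best match for a reader. It assumes the `@value` and `@language` keyword spellings from the JSON-LD 1.0 Recommendation (the draft linked above may spell them differently); the label data and the `pick_label` helper are made up.

```python
# Hypothetical multilingual label, written as JSON-LD style value objects.
label = [
    {"@value": "Title", "@language": "en"},
    {"@value": "Titel", "@language": "de"},
    {"@value": "Titre", "@language": "fr"},
]


def pick_label(values, preferred, fallback="en"):
    """Return the string tagged with `preferred`, else `fallback`, else None."""
    by_language = {v.get("@language"): v["@value"] for v in values}
    return by_language.get(preferred, by_language.get(fallback))


print(pick_label(label, "de"))
print(pick_label(label, "es"))
```

The second call falls back to English because no Spanish label exists.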

Support for Collections

There are two types of supported collections in JSON-LD: sets and lists.

Sets are unordered collections (the concept of a mathematical set).
Lists are ordered collections (the concept of an array).

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#sets-and-lists
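The practical difference between the two shows up when comparing data. A sketch, using the `@list` and `@set` keywords with made-up property names and a made-up `same_collection` helper (it only handles simple, sortable scalar values):

```python
# "@list" and "@set" are JSON-LD keywords; the property names are hypothetical.
doc_a = {
    "steps": {"@list": ["draft", "review", "publish"]},
    "tags": {"@set": ["drupal", "wscci"]},
}
doc_b = {
    "steps": {"@list": ["draft", "review", "publish"]},
    "tags": {"@set": ["wscci", "drupal"]},
}


def same_collection(a: dict, b: dict) -> bool:
    """Compare two collection objects: lists by order, sets ignoring order."""
    if "@list" in a and "@list" in b:
        return a["@list"] == b["@list"]
    if "@set" in a and "@set" in b:
        return sorted(a["@set"]) == sorted(b["@set"])
    return False


print(same_collection(doc_a["steps"], doc_b["steps"]))
print(same_collection(doc_a["tags"], doc_b["tags"]))
```

The tags compare equal even though they are written in a different order, while reordering the workflow steps would make the lists differ.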

UUID

UUIDs are supported in JSON-LD. There are three major ways of expressing UUIDs. The first is through a simple key-value:

"uuid": "550e8400-e29b-41d4-a716-446655440000"

The second is by representing the UUID as an IRI:

"@id": "uuid:550e8400-e29b-41d4-a716-446655440000"

The third is by translating a URL for the data to a version 3 UUID ( http://en.wikipedia.org/wiki/Universally_unique_identifier#Version_3_.28... ):

"uuid": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

But the best way to give your data an ID is to generate a unique URL for it that resides on the Drupal system:

"@id": "http://groups.drupal.org/comment/reply/229318#data-389247982"

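For the "translate a URL to a version 3 UUID" option, a short sketch with Python's standard `uuid` module. The URL is a made-up example. Version 3 UUIDs are name-based, so the same URL always produces the same identifier.

```python
import uuid

# Hypothetical URL for a piece of data on a Drupal site.
url = "http://example.org/node/1"

# Version 3 UUIDs are name-based: an MD5 hash over a namespace UUID plus the name.
derived = uuid.uuid3(uuid.NAMESPACE_URL, url)

print(derived.version)  # 3
print(derived.urn)      # the urn:uuid: form, usable as an "@id" value

# Deriving the UUID again from the same URL gives the same result.
repeat = uuid.uuid3(uuid.NAMESPACE_URL, url)
print(derived == repeat)
```

A real version 3 UUID has a 3 as the first digit of its third group. The literal quoted for the third option above is the RFC 4122 DNS namespace identifier, which has a version 1 layout, so treat it as a placeholder rather than as the output of such a derivation.
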
Versioning/Conflict Resolution/Locking

You can use any JSON facility that does time-stamping, versioning, conflict resolution to resolve this. If you want to go a bit deeper down the Linked Data rabbit hole, you can use what are called "Named Graphs". Named Graphs allow you to talk /about/ information, not just express it. So, a named graph says "This is the information that I know about the URL http://example.org/foo/blah.txt". You can time-stamp when you retrieved information from that URL; you can version it by normalizing the data and digitally signing it (JSON-LD supports both graph normalization and digital signatures); and you can do conflict resolution in a variety of ways, since JSON-LD supports information diff-ing (which allows you to tell, deterministically, what new data was added and what data was removed from a particular URL). More about this feature here:

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#named-graphs

and here:

http://json-ld.org/spec/latest/rdf-graph-normalization/

Bottom line: These features allow you to do things like delayed conflict resolution and JSON object/change-set merging.
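To show what "what was added and what was removed" means in practice, here is a toy sketch. It is not the JSON-LD normalization algorithm: it only flattens a single node with scalar values into (subject, property, value) triples and compares two versions as sets. Real graph diff-ing also has to cope with nested and unnamed nodes, which this ignores. The data and helper names are made up.

```python
def to_triples(node: dict) -> set:
    """Flatten one node into (subject, property, value) triples; scalars only."""
    subject = node["@id"]
    triples = set()
    for key, value in node.items():
        if key == "@id":
            continue
        values = value if isinstance(value, list) else [value]
        triples.update((subject, key, v) for v in values)
    return triples


def diff(old: dict, new: dict) -> tuple:
    """Return (added, removed) triples between two versions of a node."""
    before, after = to_triples(old), to_triples(new)
    return after - before, before - after


old = {"@id": "http://example.org/foo/blah.txt", "title": "Blah", "tag": ["a", "b"]}
new = {"@id": "http://example.org/foo/blah.txt", "title": "Blah!", "tag": ["b", "c"]}

added, removed = diff(old, new)
print(sorted(added))
print(sorted(removed))
```

Because both versions are reduced to order-independent sets first, the result does not depend on how the JSON happened to be laid out.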

PHP Libraries

There is a reference implementation in PHP:

https://github.com/digitalbazaar/php-json-ld

It's the same one we use for our commercial implementation - it's always up-to-date (and will be kept up-to-date) because our business (Digital Bazaar) depends on it.

JavaScript Libraries

There is a reference implementation in JavaScript:

https://github.com/digitalbazaar/jsonld.js

It's the same one we use for our commercial implementation - it's always up-to-date (and will be kept up-to-date) because our business (Digital Bazaar) depends on it. You can see this library in use at:

http://json-ld.org/playground/

Current Drupal Projects and Groups

Unfortunately, I don't know enough about active Drupal Projects and Groups to answer this question. I know scor (Stephane Corlosquet) has a JSON-LD module for Drupal.

Community Experts

There are a number of people that are always on the #json-ld IRC channel on freenode.net. We're on there 24/7, so if you have a question, just drop it in the channel and you will most likely get a response within an hour (unless all of us are sleeping).

Other Drupal Resources

JSON-LD sprang out of the requirement for a light-weight Linked Data format for Web Developers. We were working on the Web Payments work and RDFa when we needed something lightweight to do a good Linked Data REST API:

http://payswarm.com/

Anticipated "Lift" for Core Implementation

I think that we've implemented most of the language/API stuff that you'd need. There are up-to-date libraries in PHP and JavaScript that have commercial support via Digital Bazaar. The last remaining bit that we need to settle once and for all is the Framing code for the JSON-LD API... but that's progressing nicely.

Also keep in mind that the W3C just picked up this spec for standardization... so, it's on track to become an official standard at some point in the next year:

http://www.w3.org/2011/rdf-wg/meeting/2012-05-30#line0235

Marketshare

Look at slide #15 for the systems that are currently integrating or have integrated JSON-LD:

http://www.slideshare.net/lanthaler/jsonld-for-restful-services

JSON/XML Flexibility

Since there is a path from JSON-LD to RDF... you can render in RDF/XML - which would be an awful thing to do. Since JSON-LD exposes a tree-based structure, you could also easily convert the key-value pairs into a custom XML-based Linked Data format. I can go into more depth on this if required as it's a big post in and of itself.

Semantics

JSON-LD is all about semantics... after all, it's a Linked Data format in JSON. More about this here:

http://json-ld.org/spec/ED/json-ld-syntax/20120522/#introduction

Support for Semantic Querying

There are two primary ways that JSON-LD can be queried. The first is via the JSON-LD API. You can query by example using the Framing feature of JSON-LD:

http://json-ld.org/spec/ED/json-ld-api/20120524/#framing

To view a framing example in the JSON-LD Playground, go here: http://bit.ly/KLvuTO
Click on the "Framed" tab at the bottom. You will notice that the input data is a flat representation of libraries, chapters and books. The framed output organizes the data into a more hierarchical form using something called a "frame" that places all chapters into books and all books into libraries. This is a mechanism called "query by example". You give the JSON-LD API a JSON object that you'd like to see in the output and it queries the data to find "things" that look like that object.
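"Query by example" can be pictured with a toy matcher like the one below. It is not the JSON-LD framing algorithm (real framing also embeds matching nodes inside each other to build the hierarchy); it only selects the nodes that contain every key/value pair given in the example object. The data and function names are made up.

```python
def matches(node: dict, frame: dict) -> bool:
    """A node matches when every key/value pair in the frame is present in it."""
    return all(node.get(key) == value for key, value in frame.items())


def query_by_example(nodes: list, frame: dict) -> list:
    """Return the nodes that look like the example object."""
    return [node for node in nodes if matches(node, frame)]


# Hypothetical flat data, loosely modelled on the library/book/chapter example.
nodes = [
    {"@id": "http://example.org/library", "@type": "Library"},
    {"@id": "http://example.org/book/1", "@type": "Book", "title": "The Republic"},
    {"@id": "http://example.org/chapter/1", "@type": "Chapter", "title": "The Introduction"},
]

books = query_by_example(nodes, {"@type": "Book"})
print([book["@id"] for book in books])
```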

You can also translate JSON-LD to RDF and put that into a triple/quad store and use SPARQL to query the triple/quad-store.

Semantic Libraries/Tools Using

Aside from the libraries themselves, you can see other projects that are integrating JSON-LD here:

http://www.slideshare.net/lanthaler/jsonld-for-restful-services

Expect to see many more when the spec is finalized in the coming months and released as an official W3C standard.

Hope this helps. :)

wow

sun's picture

Thank you, @msporny, this totally helps! :)

  • re: "Formatter Implementation/Handling"

    I'm not 100% sure on this either, but I can only guess: A typical problem of raw data formats is the question of how to represent raw input vs. formatted output, potentially usable within HTML.

    As a hypothetical example, let's pretend that http://groups.drupal.org/user/283.json is supposed to return my user account in JSON. Let's also pretend that g.d.o user accounts would allow users to enter their biography as a long, formatted text (through a textarea), to which we potentially apply output filters in order to turn double-newlines into paragraphs, single newlines into BRs, and automatically convert URLs into anchors/links if they aren't already -- on output.

    Thus, another site or web service that just wants to consume my biography as HTML to embed it into its output shouldn't have to re-implement and redo all of that input formatting on its own. However, in other cases, e.g., two Drupal sites communicating with each other, the consuming site would instead want the raw, unfiltered input along with the filters to apply (defined and retrieved separately). In turn, we have two representations of the same { biography: '...' }.

  • re: PHP library

    Any plans for migrating/converting the jsonld library to PHP 5.3 + PSR-0? :)

Daniel F. Kudwien
netzstrategen

re: wow

msporny's picture

  • re: "Formatter Implementation/Handling"

Oh, okay. Then yes, JSON-LD can do that... one fairly straight-forward approach would be this:

{
  "@context": "http://drupal.org/contexts/accounts",
  "rawBiography": "This is the raw, unedited data...",
  "formattedBiography": "<p>This is <em>formatted</em> bio data..."
}

  • re: PHP library

I don't see why not... I would imagine we could do it in a day or two... if there is a requirement by Drupal that this needs to be done, we can put some engineers behind it to make sure it happens.
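Going back to the raw vs. formatted biography: a minimal sketch of how a publishing site might fill in both fields. The filter here is a toy stand-in (blank lines become paragraphs, single newlines become line breaks, everything else is escaped) and is not Drupal's filter system; the function name and sample text are made up.

```python
import html
import json


def format_biography(raw: str) -> str:
    """Toy output filter: blank lines become paragraphs, single newlines <br />."""
    paragraphs = [p for p in raw.strip().split("\n\n") if p]
    rendered = []
    for paragraph in paragraphs:
        escaped = html.escape(paragraph)  # neutralize any markup in the input
        rendered.append("<p>" + escaped.replace("\n", "<br />") + "</p>")
    return "".join(rendered)


raw_bio = "I work on Drupal.\nMostly core.\n\nI also like <cats>."

account = {
    "@context": "http://drupal.org/contexts/accounts",
    "rawBiography": raw_bio,
    "formattedBiography": format_biography(raw_bio),
}
print(json.dumps(account, indent=2))
```

A consumer that just wants HTML reads `formattedBiography`; another Drupal site that wants to apply its own filters reads `rawBiography`.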

Welcome!

Crell's picture

Thanks, Manu! This is all very helpful! At the moment it looks like we are leaning toward JSON-LD. The main drawback is that it doesn't include the sort of standard related object, listing, and subscription mechanisms that Atom/AtomPub do. If we're wrong about that and that sort of logic is in there, please let us know because that would be sweet. :-)

As far as a library, Drupal has adopted PSR-0 for its own code and strongly prefers it for 3rd party code. An all-OO PSR-0 library is just much easier to work with from an API consumer perspective. I've not looked into your company's library vs. Markus' in detail, but all else equal we'd favor a PSR-0 library for simplicity. (That doesn't mean you have to change yours if Markus' is sufficient for our needs. We can just use that. I'm not entirely sure why y'all have two reference libraries, though... :-) )

Atom/AtomPub vs. JSON-LD

lanthaler's picture

Atom is basically a container holding a list of entries that is targeted at content syndication (therefore all entries have titles, authors, etc.), whereas JSON-LD is more generic in the sense that it was built to allow the serialization of arbitrary Linked Data graphs. That being said, there's nothing that prevents you from using JSON-LD in a similar fashion to Atom. You can also just as easily subscribe to a JSON-LD "feed" if you want.
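
To give an idea of what a feed-like JSON-LD document could look like, here is a small Python sketch. It is only one possible shape, not something the JSON-LD spec prescribes: the `make_feed` helper, the example.org URLs, and the choice to map the short names to Dublin Core terms are all assumptions made for illustration.

```python
import json

# Assumed context: maps Atom-like short names to Dublin Core term IRIs.
CONTEXT = {
    "title": "http://purl.org/dc/terms/title",
    "author": "http://purl.org/dc/terms/creator",
    "updated": "http://purl.org/dc/terms/modified",
}


def make_feed(entries):
    """Wrap entry dicts in a single JSON-LD document using @graph (hypothetical helper)."""
    return {"@context": CONTEXT, "@graph": list(entries)}


feed = make_feed([
    {"@id": "http://example.org/node/1", "title": "First post",
     "author": "alice", "updated": "2012-05-29T10:00:00Z"},
    {"@id": "http://example.org/node/2", "title": "Second post",
     "author": "bob", "updated": "2012-05-30T09:30:00Z"},
])

# A consumer polling this "feed" only needs a JSON parser to list the entries.
parsed = json.loads(json.dumps(feed))
titles = [entry["title"] for entry in parsed["@graph"]]
```

Unlike Atom, nothing here forces every entry to carry a title or an author; the entries could just as well be arbitrary nodes of a graph.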

Regarding AtomPub you are right. JSON-LD doesn't define/prescribe an interface for manipulating collections and their entries like AtomPub does... therefore there's also no tooling support available for that (yet). Of course you can implement a RESTful API and use JSON-LD as the data format to achieve the same thing. Having worked on a number of RESTful APIs I can say that Atom+AtomPub is really useful and provides a lot out of the box, but often it is quite difficult or even impossible to shoehorn a specific application into that container+entries model. Often Atom is just not flexible enough or too cumbersome.

The reason why we have several libraries is that we wanted to make sure that we create an interoperable solution and specify it well enough for other implementers. Furthermore, I wanted to have a library I could experiment with in the process of creating JSON-LD itself. That's why my processor currently behaves slightly differently in framing than Digital Bazaar's, for example (framing is still a work in progress).

Markus Lanthaler
@markuslanthaler

Yep

Crell's picture

I've used Atom (but not AtomPub) on a few projects before, mainly for the PushHub/Tombstone support. While it doesn't map to Drupal perfectly, it is not that far off. We used a custom XML format for serializing entities, but we're trying to avoid custom formats as much as possible. At the moment our thinking is that if you need that level of flow control and subscription, you can toss JSON-LD strings into an Atom feed. That is a bit clunky (since then you need both an Atom and a JSON-LD serializer/parser), but it's not the end of the world, and we couldn't think of a reason that it couldn't work. We'll likely need to develop our own set of REST rules for manipulating entities, for which we'll skew toward AtomPub's approach wherever possible.
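
A rough Python sketch of the "JSON-LD inside an Atom feed" idea, showing why both an XML and a JSON serializer/parser end up being involved. The helper names, the example.org id, and the `application/ld+json` media type label are assumptions. The payload is Base64-encoded because, as I read RFC 4287, `atom:content` with a media type that is neither text nor XML has to be Base64. The entry is a fragment and omits elements such as the author.

```python
import base64
import json
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)


def atom_entry_with_jsonld(entry_id, title, updated, doc):
    """Serialize a JSON-LD dict into the content of an Atom entry (hypothetical helper)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content")
    content.set("type", "application/ld+json")  # assumed media type label
    payload = json.dumps(doc).encode("utf-8")
    content.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(entry, encoding="unicode")


def jsonld_from_atom_entry(xml_text):
    """Parse the Atom entry, then decode and parse the embedded JSON-LD."""
    entry = ET.fromstring(xml_text)
    content = entry.find(f"{{{ATOM_NS}}}content")
    return json.loads(base64.b64decode(content.text))


doc = {
    "@context": "http://drupal.org/contexts/accounts",
    "rawBiography": "This is the raw, unedited data...",
}
xml_text = atom_entry_with_jsonld(
    "http://example.org/user/1", "User 1", "2012-05-29T10:00:00Z", doc
)
round_tripped = jsonld_from_atom_entry(xml_text)
```

The round trip works, but every consumer has to go through two parsers and a Base64 step, which is the clunkiness mentioned above.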

Gotcha on the dual libraries. I don't know that we'll be doing anything advanced enough to worry about framing differences at the moment, but we'll figure that out in implementation. :-)

JsonLD: PHP 5.3 + PSR-0

lanthaler's picture

Hi Daniel,

I'm also an editor/author of the JSON-LD spec. You might be interested in my JSON-LD processor which was built for PHP 5.3, has PSR-0 support, and can be installed via Composer.

Please note that it isn't as complete as Digital Bazaar's implementation yet - to/from RDF is still missing. Also, there are currently some differences in framing, which is still being discussed (everything is documented in the README). All the rest works and is fully tested - it passes the whole official JSON-LD test suite :-)

I've also set up a playground so that you can test it without having to install anything.

Hope this helps,
Markus

Markus Lanthaler
@markuslanthaler

Double-Wow

ethanw's picture

Since you've already completed the bulk of the work on this one, feel free to post it as a wiki page, which we can then link to from the main overview page. I'll be doing the same for CMIS and HAL.

This is really an awesome start, and it's great to have someone from the JSON-LD community so engaged with the effort.

Available for phone discussion, if necessary

msporny's picture

I forgot to mention that the JSON-LD Community is open to the public. We have weekly teleconferences, which are also open to the public, are audio-recorded and minuted... example here:

http://json-ld.org/minutes/2012-05-29/

We would be delighted if a couple of you wanted to join us on a call to discuss Drupal's needs. Our calls are on Tuesday at 10am EST; details on joining are here:

http://json-ld.org/minutes/

If that doesn't work, we can set up a call at any other time that works for a group of you here. We will host the call and can record it (and minute it) so that folks in the Drupal community can listen to the call at a later point or read the minutes. Let me know if this would be helpful and we can arrange a telecon to specifically discuss Drupal issues.

There is also the mailing list where you can drop in questions and will usually get an answer pretty quickly:

http://lists.w3.org/Archives/Public/public-linked-json/