Anatomy of a complex path


Posted by sdboyer on September 20, 2012 at 4:14pm
Last updated by sdboyer on Mon, 2012-10-15 05:35

There's a lot of discussion about routing going around, and to help clarify it, I'm putting together this example of a complex-but-reasonable case for the set of determinations a Drupal 8 site may need to make in order to fully resolve to a single route. I'm hoping that such an example will help bound the discussion and allow us to tease apart the appropriate separation of concerns for the remaining challenges we face.

Currently, we have agreement that we should permit multiple routes per path, and that the routing system should be able to effectively select between them based on varying additional contextual information. We've further determined that that contextual information may be native HTTP protocol information as represented on the Request object, or that it may be secondary/derived information that is not directly present in the Request, but has been deduced from it.

The main routing patch, as of this writing, is pioneering a NestedMatcher approach, which essentially does the following:

1. Find all registered routes (a RouteCollection) that match the path in the Request.
2. Iterate over a pluggable-through-DIC-compilation series of PartialMatchers, each of which contains logic to disqualify routes based on some criterion. Disqualified routes should be removed from the RouteCollection.
3. Pass whatever remains in the RouteCollection to a FinalMatcher, which makes the final determination about which route to select based on its own logic. The default is just to grab the first route from the list.
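
To make the flow concrete, here's a rough sketch of those three steps. It's written in Python as a language-neutral illustration, not as the actual PHP patch; Route, nested_match, method_matcher, and first_route are all hypothetical names, and path matching is reduced to comparing literal patterns.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Route:
    # Hypothetical stand-in for a registered route.
    name: str
    pattern: str
    requirements: Dict[str, str] = field(default_factory=dict)

Request = Dict[str, str]
PartialMatcher = Callable[[List[Route], Request], List[Route]]

def match_path(routes: List[Route], request: Request) -> List[Route]:
    # Step 1: keep only routes registered for the requested path pattern.
    return [r for r in routes if r.pattern == request["pattern"]]

def method_matcher(routes: List[Route], request: Request) -> List[Route]:
    # A partial matcher: disqualify routes that require a different HTTP method.
    # Routes that declare no method are treated as accepting any method.
    wanted = request["method"]
    return [r for r in routes if r.requirements.get("method", wanted) == wanted]

def first_route(routes: List[Route], request: Request) -> Route:
    # Default final matcher: just grab the first surviving route.
    if not routes:
        raise LookupError("no route matches the request")
    return routes[0]

def nested_match(routes, request, partial_matchers, final_matcher=first_route):
    candidates = match_path(routes, request)            # step 1
    for matcher in partial_matchers:                    # step 2
        candidates = matcher(candidates, request)
    return final_matcher(candidates, request)           # step 3

routes = [
    Route("node_edit_submit", "node/{node}", {"method": "POST"}),
    Route("node_view", "node/{node}"),
    Route("user_view", "user/{user}"),
]
request = {"pattern": "node/{node}", "method": "GET"}
selected = nested_match(routes, request, [method_matcher])
print(selected.name)
```

Note that with the default final matcher, the result depends on registration order whenever more than one route survives the partial matchers.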

Some concerns about the NestedMatcher approach have been raised. They're part of the motivation for this discussion - by laying out the big picture, we can better figure out whether the approach is, indeed, problematically non-deterministic.

node/{node} (simplified)

If we're looking for a complex case, node/{node} is the obvious choice. So let's look at that, and let's start by pretending that each route could only conceivably serve a single MIME type.

I've emphasized where these cases are likely to be created, as that influences the manner in which we can construct logic to satisfy the case.

text/html - most of what Drupal does, and the type where all the Scotch/Panels-y stuff will do its thing.
- Catchall - has no further conditionality beyond wanting text/html and rendering a node. This'll be the standard route core provides to handle the rendering of nodes.
- For nodes of type 'Product' - say that a contrib module provides its own bundle called Product, and wants it to ship with some funky special route in order to provide a different set of blocks for the page. More on that later.
- For nodes created in the last week - this is more of a site-builder or distro-builder, business logic-type conditional. Thus, it would probably not be manually coded. This may not merit its own route.
application/ld+json - with all the discussion about using JSON-LD, this only makes sense to include.
- Catchall - core-provided, this would map to a controller with the default behavior for representing a node in JSON-LD.
- For nodes of type 'Article' - probably contrib-provided for some special use case, IF it exists at all. Which I think it probably shouldn't - per-node-type dynamism should be achieved via logic contained within the controller, and has no reason to live at the route level.

(Please feel free to edit & add more)

The key point here is why the text/html variants merit routes, whereas the JSON-LD behavioral variation typically ought to be performed beneath the controller: in order to make Scotch/Panelsy controllers work, we need something to hang configuration (block placement & styling, caching, etc.) on. Maybe more importantly, we need to know the sort of Request that will cause that configuration to be used. That's an essential prerequisite as it's the only way for us to be able to perform cross-"panel" operations - e.g., making the decision at admin/structure/blocks that we want to add the 'Navigation' block to all right sidebars. Since we aren't injecting such blocks at runtime (doing so would undermine theming consistency, destroy the guarantee of accurate caching, and potentially have access control problems), the decisions made at that global level have to be injected into the configuration for each individual "panel." And the best way to determine which of the "panels" need configuration injected is to use the routing/matching system itself: mock a Request object and see which routes come back that match its criteria. So, if we were to move the conditionalized/variant selection logic anywhere below the route level, we would be less able, if able at all, to reuse the existing routing system to make those determinations.
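
The "mock a Request and see which routes come back" idea might look something like the sketch below. This is Python used as a neutral illustration, not Drupal code; PanelRoute, routes_matching, and add_block_everywhere are hypothetical names, and a mock request is assumed to be a plain dict of criteria.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PanelRoute:
    # Hypothetical route carrying its own "panel" configuration.
    name: str
    pattern: str
    mime_type: str
    blocks: Dict[str, List[str]] = field(default_factory=dict)  # region -> block names

def routes_matching(routes: List[PanelRoute], mock_request: Dict[str, str]) -> List[PanelRoute]:
    # A route matches when every criterion in the mock request equals
    # the corresponding attribute declared on the route.
    return [r for r in routes
            if all(getattr(r, key) == value for key, value in mock_request.items())]

def add_block_everywhere(routes, mock_request, region, block) -> List[str]:
    # Inject a globally made decision into each matching route's stored
    # configuration, instead of injecting the block at runtime.
    touched = []
    for route in routes_matching(routes, mock_request):
        region_blocks = route.blocks.setdefault(region, [])
        if block not in region_blocks:
            region_blocks.append(block)
        touched.append(route.name)
    return touched

panel_routes = [
    PanelRoute("node_html", "node/{node}", "text/html"),
    PanelRoute("node_product_html", "node/{node}", "text/html"),
    PanelRoute("node_jsonld", "node/{node}", "application/ld+json"),
]
touched = add_block_everywhere(
    panel_routes, {"mime_type": "text/html"}, "right_sidebar", "Navigation")
print(touched)
```

Only the text/html routes receive the block here; the JSON-LD route has no "panel" configuration to touch.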

Accept, Content-Type, and mod_negotiation

Unfortunately, content negotiation is hardly so simple as the above list suggests. Browsers tend to send Accept headers that look more like this:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

Sooo, yeah, that's a whole stack of stuff to match against. With respect to doing the resolving, I agree with what Larry's said elsewhere already: it'd be great to follow on the heels of Apache mod_negotiation for this, perhaps with a library like BadFaith...though it seems that progress on that has stalled. In any case, we still have to figure out what the best practices will be for designating the MIME type of the Response a given route is capable of producing. The Symfony\Component\HttpFoundation\Response class has some basic logic in it already: if the controller fails to explicitly set the Content-Type header, then it first attempts to set Content-Type based on the Accept header of the Request, and if that fails, then it simply defaults to text/html. That's fine, but will hopefully rarely happen, as I'm hoping the route building system will ensure some value is always given for Content-Type on routes declared via hook_route_info(). That'll ensure we can filter them out effectively during routing.
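
For a sense of what the resolving involves, here's a minimal sketch of picking a response type from an Accept header. It is a simplified Python illustration, not mod_negotiation's algorithm and not BadFaith's API: it only honors q-values and prefers the most specific matching media range, and every function name in it is hypothetical.

```python
from typing import List, Optional, Tuple

def parse_accept(header: str) -> List[Tuple[str, float]]:
    # Split "type/subtype;q=0.9, ..." into (media_range, q) pairs; q defaults to 1.0.
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0]
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q as "not acceptable"
        ranges.append((media_range, q))
    return ranges

def specificity(media_range: str) -> int:
    # "*/*" is least specific, "type/*" is in between, "type/subtype" is most specific.
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2

def range_matches(media_range: str, mime_type: str) -> bool:
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return mime_type.split("/")[0] == media_range[:-2]
    return media_range == mime_type

def quality_for(mime_type: str, ranges: List[Tuple[str, float]]) -> float:
    # Use the q-value of the most specific range that matches; 0.0 if none does.
    best = None
    for media_range, q in ranges:
        if range_matches(media_range, mime_type):
            rank = specificity(media_range)
            if best is None or rank > best[0]:
                best = (rank, q)
    return best[1] if best else 0.0

def best_type(header: str, available: List[str]) -> Optional[str]:
    # Pick the available type with the highest q; ties go to the earlier entry.
    ranges = parse_accept(header)
    scored = [(quality_for(t, ranges), t) for t in available]
    scored = [(q, t) for q, t in scored if q > 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]

browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(best_type(browser, ["application/ld+json", "text/html"]))
```

With a browser-style header, a JSON-LD route is still acceptable through the wildcard range, just at a lower quality than text/html. That's why routes need to declare the types they produce before this kind of filtering can work.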

There's a question hanging in the air, though: does it make more sense to have multiple routes differentiated by Content-Type, or should there be a single route with a controller that is capable of reading the Request's Accept header and responding accordingly? I don't think there's one answer; we need to take it case-by-case. That said, I do think that the Panelsy controller for text/html would be ill-served by trying to do double-duty, and should really just stick to the one MIME type.

HTTP methods and other headers

mod_negotiation takes care of the Accept, Accept-Language, Accept-Charset and Accept-Encoding headers, but there are still others. Most likely to act as differentiating dimensions in the routing system are probably User-Agent, Referer, and maybe TE or Authorization. The various HTTP methods are also ripe candidates for differentiation.

These dimensions are all orthogonal to one another, which can make the resolving process somewhat complex. However, they also have externally defined ranges that are not especially subject to change. So while they may be multidimensional, they are at least non-arbitrary in the data they look at, which means we can (and should) define strategies around matching, fallbacks, and some form of wildcarding for each. This does seem to be a good case for the NestedMatcher approach, as we can stack up PartialMatchers for each of the dimensions in whatever order we determine to make the most sense (and maybe also only compile those matchers into the container if something actually needs them), and they can operate in their domains across a set of known ranges without stepping on each other's toes.
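
One possible shape for per-dimension matching with a wildcard fallback is sketched below. This is Python used as a neutral illustration; DimRoute, dimension_matcher, and apply_matchers are hypothetical names, and the "device" dimension is an assumed example of something derived from User-Agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

@dataclass
class DimRoute:
    # Hypothetical route: each dimension maps to the set of values it accepts.
    # A dimension that is not listed acts as a wildcard.
    name: str
    requires: Dict[str, Set[str]] = field(default_factory=dict)

Request = Dict[str, str]
Matcher = Callable[[List[DimRoute], Request], List[DimRoute]]

def dimension_matcher(dimension: str) -> Matcher:
    # Build a partial matcher that only ever looks at one dimension.
    def matcher(routes: List[DimRoute], request: Request) -> List[DimRoute]:
        value = request.get(dimension)
        exact, wildcard = [], []
        for route in routes:
            allowed = route.requires.get(dimension)
            if allowed is None:
                wildcard.append(route)   # no opinion on this dimension
            elif value in allowed:
                exact.append(route)      # explicitly accepts this value
            # otherwise the route is disqualified
        # Fallback strategy: explicit matches win; wildcards are the fallback.
        return exact or wildcard
    return matcher

def apply_matchers(routes, request, matchers):
    # Stack the matchers in the chosen order, narrowing the list each time.
    for matcher in matchers:
        routes = matcher(routes, request)
    return routes

dim_routes = [
    DimRoute("node_view", {"method": {"GET", "HEAD"}}),
    DimRoute("node_update", {"method": {"PUT"}}),
    DimRoute("node_view_mobile", {"method": {"GET"}, "device": {"mobile"}}),
]
stack = [dimension_matcher("method"), dimension_matcher("device")]

desktop = apply_matchers(dim_routes, {"method": "GET", "device": "desktop"}, stack)
mobile = apply_matchers(dim_routes, {"method": "GET", "device": "mobile"}, stack)
print([r.name for r in desktop], [r.name for r in mobile])
```

Each matcher only knows about its own dimension, which is what lets them be stacked in any order, or left out of the stack entirely when no route declares that dimension.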

Controller resolution

Work on controllers, and the practical implications of this, is happening at http://drupal.org/node/1812720.

Comments

As you point out, JSON-LD

Posted by linclark on October 6, 2012 at 8:37pm

As you point out, JSON-LD ought to be handled beneath the controller. We don't want to have to register a controller which then replicates the logic of the controller we're circumventing. A serialization module such as JSON-LD should only be in charge of one thing... taking an entity and serializing it to a JSON-LD string, and vice versa.

There is currently an open issue to implement an entity render controller. As Crell has pointed out for other controllers, this should be renamed to something other than controller to avoid confusion, so I will call it the Renderer.

This renderer allows each entity to register which class should be used to render it. Renderers implement EntityRenderControllerInterface, which declares three functions:

buildContent
view
viewMultiple

The renderer is registered in hook_entity_info using something like 'render controller class' => 'Drupal\node\NodeRenderController'. If this was instead registered as an array keyed by mime-type, then we could use the Accept header to determine which class should be used to render the entity. And then serialization modules would just have to add their Renderer using hook_entity_info_alter.
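
As a rough sketch of that idea (in Python for illustration only; the real hooks are PHP, and the class names, the jsonld_entity_info_alter function, and the renderer_for helper are all hypothetical):

```python
import json

class NodeHtmlRenderer:
    # Hypothetical renderer for text/html.
    def view(self, entity):
        return "<article>%s</article>" % entity["title"]

class NodeJsonLdRenderer:
    # Hypothetical renderer contributed by a serialization module.
    def view(self, entity):
        return json.dumps({"@id": "node/%d" % entity["nid"], "title": entity["title"]})

def entity_info():
    # Stand-in for hook_entity_info: the render controller class is an
    # array keyed by MIME type rather than a single class.
    return {"node": {"render controller class": {"text/html": NodeHtmlRenderer}}}

def jsonld_entity_info_alter(info):
    # Stand-in for hook_entity_info_alter: the serialization module adds its renderer.
    info["node"]["render controller class"]["application/ld+json"] = NodeJsonLdRenderer

def renderer_for(info, entity_type, mime_type):
    # Choose the renderer class from the negotiated MIME type.
    classes = info[entity_type]["render controller class"]
    if mime_type not in classes:
        raise LookupError("no renderer for %s as %s" % (entity_type, mime_type))
    return classes[mime_type]()

info = entity_info()
jsonld_entity_info_alter(info)
node = {"nid": 1, "title": "Hello"}
print(renderer_for(info, "node", "application/ld+json").view(node))
```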

Then, any entity route (i.e. node, node/%) would be required to provide one controller that only loaded the pertinent entities and used the functions from EntityRenderControllerInterface. If the request was for any content type that is not html (or a content type which the entity has a renderer for), then it would default to this controller. A second route that was specific to HTML could be provided, which would use node_page_view as its controller.

This way, Panels could still completely take over for html routes and define a totally different controller... but serializations wouldn't have to define their own controllers, just renderers. And if a serialization module DID want to totally override the controller for a route based on Accept header, it still could.

poked around at a couple of

Posted by sdboyer on October 14, 2012 at 3:05am

poked around at a couple of the other issues and thought about this some more, and i think i can provide a more cogent response now.

i think the idea of having the methods on EntityRenderControllerInterface be the points of ingress for doing non-html rendering could work. the obvious problem is that we need to allow other modules to add to the set of MIME types a given route is capable of accepting, but can't directly add methods to the base class. a bit of delegator pattern, and that's taken care of. and with that in place, it's quite easy to add a listener on the controller event or at the end of the request event stack that, as you've said, puts the appropriate class in place IFF the serialization route was selected by the matching process.

what's somewhat trickier is making sure that we're able to compose a sane Content-Type property for the route out of the various dynamically registered serialization formats. this DEFINITELY can't be done at runtime; we need to figure it out at compile time and store that in the db. shouldn't be too hard; the biggest challenge, i think, is figuring out which declared routes this should all be attached to. that's taken care of if the entity system is generating its own routes, though.
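
a rough sketch of the delegator idea plus the compile-time step, in Python purely for illustration (not the actual PHP; DelegatingRenderer, compile_route_content_types, and the "content_types" key are hypothetical names):

```python
from typing import Callable, Dict, List

class DelegatingRenderer:
    # Modules register a delegate per MIME type instead of adding
    # methods to the base class.
    def __init__(self) -> None:
        self._delegates: Dict[str, Callable[[dict], str]] = {}

    def register(self, mime_type: str, delegate: Callable[[dict], str]) -> None:
        self._delegates[mime_type] = delegate

    def supported_types(self) -> List[str]:
        return sorted(self._delegates)

    def view(self, entity: dict, mime_type: str) -> str:
        if mime_type not in self._delegates:
            raise LookupError("no delegate registered for %s" % mime_type)
        return self._delegates[mime_type](entity)

def compile_route_content_types(route: dict, renderer: DelegatingRenderer) -> dict:
    # Run once when routes are rebuilt: freeze the set of registered
    # formats onto a copy of the route so it can be stored, rather than
    # working it out on every request.
    compiled = dict(route)
    compiled["content_types"] = renderer.supported_types()
    return compiled

renderer = DelegatingRenderer()
renderer.register("application/xml", lambda e: "<node><title>%s</title></node>" % e["title"])
renderer.register("application/ld+json", lambda e: '{"title": "%s"}' % e["title"])

compiled = compile_route_content_types({"pattern": "node/{node}"}, renderer)
print(compiled["content_types"])
```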

only other note would be that these controllers should, IMO, only be for the serialization formats. there's a ton of stuff we need to do with controllers for the blocks/layouts approach, and we have an entirely separate set of classes with which we do it. so yeah, i'm good with the entity system generating its own routes...so long as it backs off of html :)

handling html different?

Posted by fago on October 17, 2012 at 8:29am

Interesting idea.

The HTML case is different though, not only because it probably won't use any existing serialization interfaces, but also because it goes via the theme system. Also, we won't have unserialization support ;) It should be possible to implement only the serialization part of the interface though.

What's probably also different are hooks. We have a lot of hooks for entity viewing which we probably don't want to fire, or which should at least be available per media type?

So, we could streamline that into a single API if we want to. But do we want to have entity_render($entity, $view_mode, $mime_type = 'html') and something like $mime_type in hooks?

Since I posted the comment

Posted by linclark on October 17, 2012 at 2:48pm

Since I posted the comment about entity render controllers, I started working with Symfony Serializer component more. I now believe that our entity system won't need to worry about mime type.

I posted a diagram in Enable JSON-LD entity serialization which helps show how the correct class is chosen for serialization.