March 20, 2012 Drupalcon BOF, Semantics: What is a 'page'?


Posted by smira on March 20, 2012 at 10:59pm
Last updated by westwesterson on Sun, 2012-03-25 14:46

In this BoF we tackled the job of clarifying for ourselves the meanings of 2 words that are commonly used in discussions around Core Context UX. The words were 'page' (what is a web page?) and 'context' (how many ways is this word used?).

To do this we split into 2 groups of 15 or so people and recorded our findings. Here are our notes for the 'page' discussion. The 'context' discussion is documented at http://groups.drupal.org/node/218904

(MK: Thanks go to Miro for volunteering to capture everyone's thoughts as we went around the table.)

Notes:
This post summarizes the discussion in which each person gave their own definition of a page (each person used themselves as the "role").
Miro - what loads on the other end of a url
Mike - where the content goes. Html5, everything that goes in the content
Jake - subject, summary, author, meta and url. All the data and elements that get passed
Steve - data response that gets passed from an html request
Jen - everything inside an html tag
Lisa - single display of content made up of various sources for different purposes
Pete - page is the content of a browser window in between page reloads (includes ajaxified content)
Michael - location on a website with content which is identified by a url
James - A set of pieces or blocks that are organized and delivered to a user
John - there are different types of pages (apps vs content pages). Content pages are a collection of main content and supporting content arranged in a certain way. Application pages are like the drupal backend or other interfaces (e.g. the Mailchimp admin interface)
Ryan - a page holds content and has waypoints to let the user know where they are and links to allow the user to navigate
Chuck - a page is a group of messages or content within html that is directed to a user. A page has different groups or types of messages geared towards that user
Stan - as a content editor a page is a grouping of arranged content delivered to the browser in an embeddable widget (ie phone etc...)

We agreed that a website is a collection of pages, some live in the back end and others are public.
The bottom line is that a website is a set of urls which describe pages.

Michael proposed the idea that they are locations in an 'information architecture space'.

James pointed out that this paradigm is only partially correct since it is possible to display partial content to not-authenticated users vs the entirety of it to authenticated users.

Does the page include the tools that allow you to create or edit the page (e.g. a WYSIWYG toolbar)?

We need different definitions of what a page is depending on role.

Differences depending on role:
The layout
The information
The navigation

Similarities regardless of role:
The url

A site editor's goal is to serve a message or experience to the user.
For example a page that is only available to registered users

A page might describe a relationship between the business owner, the content, and the end user that is consuming the content.

As a site builder would you have a conversation with the business owner about the purpose of this page? If so the page exists conceptually long before actual content has been defined.
In these terms you could map out an entire website (IA) as a collection of urls that define pages.

Michael proposes that, instead of using the term 'page', we re-appropriate the term SPACE, defined very simply as a url. This is also why we differentiate between the terms url and uri.

The panels page manager can be redefined in core as a way to organize the spaces defined by url on the site.

The site builder has to work with the client to define the pages of the website. There needs to be a simple transition between this business oriented conversation and the process of website building.

Another consideration is where a site has a large number of apparent pages, such as Youtube or Amazon, there needs to be a "wild card" that determines the specific page, space, pattern. (User Advocate: There may be only a few page archetypes and a shallow hierarchy. Everything is driven by contextual variations related to tagged content.)
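As a rough illustration of the "wild card" idea, the sketch below matches many concrete urls against one page pattern. The `%name` placeholder style follows the `node/%node` notation used later in the comments; the function name `match_route` and the `watch/%video` pattern are hypothetical, not taken from any Drupal API.

```python
from typing import Optional


def match_route(pattern: str, path: str) -> Optional[dict]:
    """Return the wildcard values if path fits pattern, else None."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return None
    values = {}
    for expected, actual in zip(pattern_parts, path_parts):
        if expected.startswith("%"):
            values[expected[1:]] = actual  # wildcard segment captures the value
        elif expected != actual:
            return None  # literal segments must match exactly
    return values


print(match_route("watch/%video", "watch/abc123"))  # {'video': 'abc123'}
print(match_route("watch/%video", "user/42"))       # None
```

One pattern then stands for an entire family of apparent pages, which is how a site with millions of urls can still have only a few page archetypes.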

The more complex the site and the more data you have, the more the hierarchy of the site goes away. Search result pages, for example, are a non-hierarchical way of organizing content.

The synthesis of a page at a given IA Space location can be defined as a behaviour. (User Advocate: this relates strongly to the concept of 'variants' in Panels or 'responses' in WSCCI terms)

Who are you + where are you = page (these are all contextual factors)

A site builder needs a systematic way of controlling layouts.

Tomorrow afternoon core discussion at 3:45

Concerns:
If there are too many pages or urls this could be intimidating

Comments

I think the key concept that

Posted by karschsp on March 21, 2012 at 3:49am

I think the key concept that came out of this conversation for me was a URL represents a resource, or space, whose contents could change depending on context. So resource (space) + context = page. Or in the "everything is a block" vein of things, resource(s) + context(s) = page.

Re: I think the key concept that

Posted by eclipsegc on March 25, 2012 at 12:02am

So, just to piggy back on this a little further, I think it's more precisely:

(url + context) determines the layout, which defines the components, which assembles into a page.
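A minimal sketch of that chain, assuming a made-up layout registry and a single contextual factor (the user's role). All names here (`Context`, `LAYOUTS`, `choose_layout`, `assemble_page`) are hypothetical and only illustrate the order of the steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    role: str  # hypothetical contextual factor: who is asking


# Hypothetical registry: each layout defines its components per region.
LAYOUTS = {
    "two_column": {"main": ["node_body"], "sidebar": ["recent_comments"]},
    "one_column": {"main": ["node_body"]},
}


def choose_layout(url: str, context: Context) -> str:
    """Step 1: (url + context) determines the layout."""
    if url.startswith("node/") and context.role == "authenticated":
        return "two_column"
    return "one_column"


def assemble_page(url: str, context: Context) -> dict:
    """Steps 2 and 3: the layout defines the components, which form the page."""
    layout = LAYOUTS[choose_layout(url, context)]
    return {
        region: [f"<{name} for {url}>" for name in components]
        for region, components in layout.items()
    }


print(assemble_page("node/5", Context(role="anonymous")))
print(assemble_page("node/5", Context(role="authenticated")))
```

The same url yields two different pages here because only the context changed.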

Hopefully that's not complicating the discussion, but having read the results of this session, it seems everyone is sort of on the same page, so I wanted to take the next logical step.

Eclipse

Not sure if you want to get

Posted by xtfer on March 25, 2012 at 6:00am

Not sure if you want to get into too much detail here, but the execution logic is slightly different, because layout is actually two concepts: the layout itself (which is arbitrary), and the configuration of the regions which make up that layout (or build mode), which is more specific.

Region configuration occurs at build time, while layout occurs at render time.

- (url + context) determines the possible components, including layout AND region configuration
- region configuration is run through layout to produce a page

We tend to conceptualise a 1:1 relationship between layout and region configuration, but this is an assumption, because the region configuration is tied to the (url+context), not its specific layout. It has to be this way to support returning multiple formats for the same url (e.g. JSON is a specific region configuration with only one region), and even something simple like switching layouts for a build mode.
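The build/render split described here can be sketched as two separate steps. This is an illustration only, with hypothetical function names and data; it is not how Display Suite, Panels or core actually implement it.

```python
import json


def build(url: str, context: dict) -> dict:
    """Build phase: collect region configuration for (url + context)."""
    regions = {"content": {"title": "Hello", "body": "World"}}
    if context.get("role") == "editor":
        regions["tools"] = {"edit_link": f"{url}/edit"}
    return regions


def render_html(regions: dict, layout_order: list) -> str:
    """Render phase: run the region configuration through an arbitrary layout."""
    parts = []
    for name in layout_order:
        if name in regions:
            inner = "".join(f"<p>{value}</p>" for value in regions[name].values())
            parts.append(f'<div class="{name}">{inner}</div>')
    return "".join(parts)


def render_json(regions: dict) -> str:
    """Render phase for a format that has no notion of layout."""
    return json.dumps(regions, sort_keys=True)


regions = build("node/1", {"role": "editor"})
print(render_html(regions, ["content", "tools"]))
print(render_html(regions, ["tools", "content"]))  # layout switched, same configuration
print(render_json(regions))
```

Switching the layout or the output format touches only the render step; the region configuration built for the (url+context) stays the same.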

Perfect.

Posted by AmyStephen on March 25, 2012 at 6:16am

Perfect.

~~ Amy Stephen ~~
http://OpenSourceCommunity.org

To me a key aspect of this

Posted by amorsent on March 25, 2012 at 7:37pm

To me a key aspect of this initiative is to flip the logic from being push based to pull based.
In other words, right now we currently build a whole bunch of stuff and pass that off to the theme / layout and say, "here, hope this is what you need".

I think what we want to do is define the layout and let the layout determine what gets built.

Considering that we're talking about everything basically becoming a block, and every block as being a full fledged resource with its own url, I think it's somewhat useful to me to consider a 'page' as basically just a list of sub urls inside a layout framework.

Context certainly affects decisions about access control, layout, the urls included, and what arguments are passed into those urls. This is true recursively all the way down for each sub url.

The only thing special about a 'page' is that it's the top-level request, and it gets wrapped and packaged as a fully formed html document.
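A small sketch of the pull-based idea, assuming each block is a resource addressable by a url and the layout lists the sub urls it wants. The block urls and the `render_page` helper are made up for illustration.

```python
# Hypothetical block resources, each addressable by its own url.
BLOCKS = {
    "block/site-name": lambda ctx: "My Site",
    "block/user-menu": lambda ctx: f"Hello, {ctx['user']}",
    "block/never-used": lambda ctx: "expensive content",
}

# The layout lists the sub urls it wants per region; nothing else gets built.
LAYOUT = {"header": ["block/site-name"], "sidebar": ["block/user-menu"]}


def render_page(layout: dict, context: dict, built: list) -> str:
    """Pull each sub url the layout asks for, then wrap it as an html document."""
    sections = []
    for region, urls in layout.items():
        outputs = []
        for url in urls:
            built.append(url)  # record what was actually built
            outputs.append(BLOCKS[url](context))
        content = " ".join(outputs)
        sections.append(f'<div id="{region}">{content}</div>')
    return "<html><body>" + "".join(sections) + "</body></html>"


built = []
print(render_page(LAYOUT, {"user": "amy"}, built))
print(built)  # ['block/site-name', 'block/user-menu']
```

Because the layout drives the requests, `block/never-used` is never built, which is the difference from the push-based approach.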

Imagining a resource as a list of urls, seems fairly conceptually similar to Nate, Jen and Carl's template proposal: http://denver2012.drupal.org/program/sessions/token-templates-new-templa...

The only real difference is that one uses tokens and the other uses urls. The important thing in both cases is that the logic is pull based, not push based.

You are largely correct about

Posted by xtfer on March 25, 2012 at 8:04pm

You are largely correct about the request structure, however the separation between configuration and layout remains the same. A block is merely a region (or a field). Depending on the type of resource requested, it will have to pass that information to different rendering methods, some of which will support "layout" (e.g. HTML) and some of which may not (e.g XML, JSON).

Theoretically, for example, you may wish most of your regions to return HTML, but for one region to return CDATA, i.e. JavaScript. That's a legitimate use, however the region returning CDATA wouldn't have any layout, per se.

I think we're complicating

Posted by eclipsegc on March 25, 2012 at 8:40pm

I think we're complicating some aspects here and confusing some others, so let me try to get us back on track.

The html layout is interested in one thing and one thing only... responding to requests whose method is GET and whose Accept/Content-Type header is text/html (or equivalent). Requests which do not conform to these specifications will not be our problem (but the web services initiative's problem instead).

That being said:

A Page is a resource which can be reached from a particular url; the output of that page is determined by the method and Accept/Content-Type headers of the request. The layout is one of a potential N number of layouts, where the layout delivered is determined by a series of conditions (you may want to deliver a different layout for page nodes than you do for article nodes; these reside at the same router item, node/%node, but the layout varies by node bundle). Once the proper layout response for this url+context is chosen, then the layout knows all blocks meant to display within it. Regions are defined by layouts; contents of the regions are defined by the layout's configuration. Individual blocks within that layout can further be access controlled via condition mechanisms.
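As a sketch of that selection step: several layout variants sit on one router item, and the first variant whose condition matches the context wins. The `Variant` type, the layout names and the block lists are hypothetical.

```python
from typing import Callable, NamedTuple, Optional


class Variant(NamedTuple):
    layout: str
    applies: Callable[[dict], bool]  # condition evaluated against the context


# Hypothetical variants for one router item, distinguished by node bundle.
VARIANTS = {
    "node/%node": [
        Variant("article_layout", lambda ctx: ctx.get("bundle") == "article"),
        Variant("page_layout", lambda ctx: ctx.get("bundle") == "page"),
    ],
}

# Hypothetical layout configuration: regions and the blocks placed in them.
BLOCKS_BY_LAYOUT = {
    "article_layout": {"content": ["node_body", "comments"], "sidebar": ["author_bio"]},
    "page_layout": {"content": ["node_body"]},
}


def select_layout(router_item: str, context: dict) -> Optional[str]:
    """Return the first layout whose condition matches, or None."""
    for variant in VARIANTS.get(router_item, []):
        if variant.applies(context):
            return variant.layout
    return None


layout = select_layout("node/%node", {"bundle": "article"})
print(layout, BLOCKS_BY_LAYOUT[layout])
```

Per-block access control would be a further filter over the chosen layout's block lists; it is left out here to keep the sketch short.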

This conversation seems to be blending text/html responses with json/application/other responses and that's simply not how it will end up working. The block system responds to text/html. A different system will likely be required to render various drupal data to json, xml, etc.

Eclipse

Well, this is an inherently

Posted by xtfer on March 25, 2012 at 10:25pm

Well, this is an inherently complex problem...

"Once the proper layout response for this url+context is chosen, then layout knows all blocks meant to display within it. Regions are defined by layouts, contents of the regions are defined by the layout's configuration."

As I said previously, this conflates a configuration region with a layout region. A layout does not know "all the blocks within it", the region configuration knows this. The two are not the same, and if you tie them together, you will run into problems down the track, because the relationship is arbitrary. Specifically, you might find yourself tied to specific response formats and specific cases. This happened to both Display Suite and Panels.

There is also the problem of render order. If I configure a region in a layout, then override the layout template, I have changed only the layout, but not the configuration, and in addition Drupal will not even be aware the layout has changed – what remains constant is the build mode/context+url.

Panels, Display Suite and Drupal core all work this way: collect configuration information in the build stage, then pass through a layout in the render stage. Without having seen any code yet, what you seem to be proposing merges configuration required for the render phase into the build phase.

"The html layout is interested in one thing and one thing only... responding to requests whose method is GET and whose Accept/Content-Type header is text/html (or equivalent). Requests which do not conform to these specifications will not be our problem (but the web services initiative's problem instead)... This conversation seems to be blending text/html responses with json/application/other responses and that's simply not how it will end up working. The block system responds to text/html. A different system will likely be required to render various drupal data to json, xml, etc."

A block should be able to return JSON, for example, if that's what's requested - isn't that the entire point of the WSCCI project? If I load a page using text/html, then come back later wanting the data content of a block from that page in application/json (which may well be a block containing layout information), then the same set of information must be retrieved as for the HTML response; only the format returned is different. Separate the build and render workflows and you won't have a problem there, because you can simply pass all the data into a JSON formatter instead.

If all fields are blocks, then any half-decent layout tool will be responsible for field configuration as well as layout (more or less), so any rendering plugin which supports (at minimum) field ordering or label configuration (etc) should be in scope.

EDIT: It's quite possible you understand all of this, but I'm just relating my experience trying to solve a very similar problem with Display Suite.

I DO understand all of these,

Posted by eclipsegc on March 26, 2012 at 1:39am

I DO understand all of these, but I think that perhaps some communication has not been made that should have been.

There is no magic webservice stuff planned as far as I know. Blocks react to text/html, that is all. They do not also magically react to other content-type requests because you cannot know what you might want to deliver to someone when requesting a particular url. That is for site builders to define. So when you visit node/%node and that node is a page bundle, I may want to deliver a particular layout in text/html, but in json, I may want to only deliver the title and body of the node. When visiting an article I may want to render via text/html some different layout from pages, while in json I want to render the title, author, body and number of comments... I am being arbitrary here, but that is purposeful because we cannot simply expect blocks (of which there will be multiple) to be responsible for a json response to a particular page. Likewise, webservices must also decide whether PUT methods are allowed against node/%node for articles so that some sort of 3rd party application can be utilized to update that node. Hopefully you get where I'm going here. In short this is not a block's problem, the web services initiative will probably need some sort of custom plugin with a pluggable response. Also Symfony is smart here and has the notion of unsupported method and accept headers, and we will be acting on those sorts of exceptions.
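To make the point concrete, here is a sketch in which the site builder explicitly defines what each bundle returns per media type, and anything not defined is rejected. The tables and the `respond` function are hypothetical; 405 (Method Not Allowed) and 406 (Not Acceptable) are the standard HTTP status codes for those two cases, and Symfony's actual exception classes are not shown.

```python
import json

# Hypothetical site-builder configuration: which methods each bundle accepts.
ALLOWED_METHODS = {"page": {"GET"}, "article": {"GET", "PUT"}}

# Hypothetical per-bundle, per-media-type GET responses.
GET_RESPONSES = {
    ("page", "text/html"): lambda n: f"<h1>{n['title']}</h1><div>{n['body']}</div>",
    ("page", "application/json"): lambda n: json.dumps(
        {"title": n["title"], "body": n["body"]}
    ),
    ("article", "application/json"): lambda n: json.dumps(
        {"title": n["title"], "author": n["author"],
         "body": n["body"], "comments": n["comments"]}
    ),
}


def respond(node: dict, method: str, accept: str) -> tuple:
    """Return (status, body) for a request against node/%node."""
    if method not in ALLOWED_METHODS.get(node["bundle"], set()):
        return 405, ""  # method not allowed for this bundle
    if method != "GET":
        return 204, ""  # placeholder: an update would be handled here
    handler = GET_RESPONSES.get((node["bundle"], accept))
    if handler is None:
        return 406, ""  # no response defined for this media type
    return 200, handler(node)


page = {"bundle": "page", "title": "About", "body": "Hi", "author": "amy", "comments": 0}
print(respond(page, "GET", "application/json"))
print(respond(page, "PUT", "application/json"))
```

Nothing in the html blocks is consulted to produce the JSON; each response is a separate, explicit decision.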

Hopefully this makes sense. It was discussed at length in Boston, and this was the general consensus.

Eclipse

Okay, thanks for your

Posted by xtfer on March 26, 2012 at 2:26am

Okay, thanks for your response, Eclipse. That's a good clarification.

We're falling a bit too deep

Posted by neclimdul on March 26, 2012 at 2:32am

We're falling a bit too deep into implementation details here. If it wasn't clear from the notes, Michael asked us to stay as generic as possible. While technically we know we're reversing the logic of block rendering and providing clearly injected context and all these technical things, these sorts of architectural decisions will deliver the needs of the interface being designed, not lead it. I'd be careful dwelling on this and the related technical discussions here and leave that to issues and other threads.

I've stopped dwelling.

Posted by xtfer on March 26, 2012 at 2:45am

I've stopped dwelling.

I think we should try not to

Posted by pwolanin on March 26, 2012 at 9:11pm

I think we should try not to support multiple resources at the same URL - that breaks many assumptions of the web.

I know Larry wants to separate things based on accept headers, but I'd honestly be happier if we e.g. used a standard path convention like .json or .xml
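A path convention like this could be sketched as follows. The extension table and the `split_format` helper are assumptions for illustration; the media type names are the standard ones.

```python
import posixpath

# Hypothetical mapping from path extension to media type.
EXTENSIONS = {".json": "application/json", ".xml": "application/xml"}


def split_format(path: str, default: str = "text/html") -> tuple:
    """Split 'node/1.json' into the base path and the requested media type."""
    base, ext = posixpath.splitext(path)
    if ext in EXTENSIONS:
        return base, EXTENSIONS[ext]
    return path, default  # unknown or missing extension: keep the path as-is


print(split_format("node/1.json"))  # ('node/1', 'application/json')
print(split_format("node/1"))       # ('node/1', 'text/html')
```

With this convention each format has its own url, so nothing about the response depends on request headers.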

So a resource doesn't seem

Posted by neclimdul on March 26, 2012 at 10:26pm

So a resource doesn't seem well defined. Is a resource the concept of what a page is about as a whole, or what is used to build the page? Even a node page can fall into the multiple resources used to build the page pretty quickly if we look at something like the tried and true album/track URL. A page might be about a track but it's also about an album and an artist, and all those resources are on the page either by explicit relations or by url arguments or whatever. You might even call something as amorphous as Solr suggestions or a view of other tracks a resource if you broke it down.

This seems like maybe that's something that would be worth breaking out and discussing?

As far as different formats for a "resource" I think this is a bit of a different discussion and while I don't think the average site builder would use the granularity of different accept headers, someone building a URL specifically for services might which is why Larry is concerned about it.

I think a resource is

Posted by karschsp on March 26, 2012 at 11:05pm

I think a resource is something that has its own URL/URI and may or may not contain other resources. Even in the non-Drupal web, an HTML document can contain images, for example. Of course, in Drupal 8 this translates as a page which is accessed by a path that contains many other resources (blocks) which also have paths.

Too simple?

What's a new resource vs. a purely mechanical difference?

Posted by effulgentsia on March 27, 2012 at 12:11am

http://blogs.msdn.com/b/dotnetinterop/archive/2008/03/28/content-type-ne... discusses this a bit and mentions this quote by Roy Fielding, the guy who coined the term REST:

"We encourage resource owners to only use true content negotiation (without redirects) when the only difference between formats is mechanical in nature."

The article claims that a JSON vs. XML difference is purely mechanical in nature, but for example a JPEG version would be a different resource deserving its own URI. Based on this, I would also expect HTML to be a different resource than JSON, not merely a purely mechanical format change.

http://www.w3.org/QA/2006/02/content_negotiation.html is an old article from the W3 that also claims that GIF and PNG at the same URL is ok, but that language negotiation should redirect to a unique URL per language.

The above was based on a very quick Google search. If anyone has up to date standards recommendations, let's use those, but my current understanding is that we should have unique URLs for each "sufficiently different" resource, but also support generic URLs that can content negotiate based on headers and return redirects to the specific URL, so in spirit, I agree with pwolanin, but as to whether the specific example of XML or JSON constitutes a "sufficiently different" resource, I don't know.

It's a new resource if it's different information

Posted by Rj-dupe-1 on March 27, 2012 at 2:38am

I just want to expand on some of the comparisons that are being made because I don't think the detail is being documented here.

The idea is that the encoding of the information doesn't matter, the resource is just the underlying data, which the server should ideally be able to present to you in whichever encoding that you require.

GIF & PNG containers can be the same data, with just the encoding being different and if they are the same, they should use the same URL. If they are not (eg PNG with 8-bit alpha) then they should use different URLs because they are different resources. Since a JPG version of the same image will contain different data, it should be available on a different URL. Similarly, two photos that are taken at different times or with different cameras would be different resources, even if they looked the same, no matter which encoding was used.

JSON, XML & HTML are just different containers. They can all contain exactly the same information, including semantics. If they do, they should all be at the same URL and allow the client to specify which format should be delivered in the request headers. Of course, if one format is going to include less, more or different information then that would be a different resource and should properly be on a separate URL.

Different language translations of the same concept will warrant different URLs because it's not a direct encoding issue; although the definition of equivalent words in two languages might be the same, they will often imply other information due to custom or other issues and so a perfect translation likely does not exist. Thus, they are different resources.

In summary, if it's possible to re-encode the same data freely back and forth between two mechanical formats with no data loss, then it is a single resource and should properly be made available on a single URL.
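This rule of thumb can be written as a round-trip check. The sketch below assumes a flat record of string fields and uses only the Python standard library; the helper names and the lossy "title only" format are made up to show the contrast.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(data: dict) -> str:
    root = ET.Element("resource")
    for key, value in data.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> dict:
    return {child.tag: child.text or "" for child in ET.fromstring(text)}


def round_trips(data: dict, encode, decode) -> bool:
    """True if encoding and decoding returns exactly the original data."""
    return decode(encode(data)) == data


node = {"title": "Hello", "body": "World"}

print(round_trips(node, json.dumps, json.loads))  # True
print(round_trips(node, to_xml, from_xml))        # True

# A hypothetical lossy format that keeps only the title.
title_only = lambda data: json.dumps({"title": data["title"]})
print(round_trips(node, title_only, json.loads))  # False
```

By the guideline above, the first two encodings could share one URL, while the title-only version carries different information and would be a different resource.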

Pushing on that

Posted by effulgentsia on March 27, 2012 at 3:52am

Thanks, Rj. To a large extent, I like that as a guideline. In practice, for a Drupal site, HTML will almost always contain different information than XML/JSON (the HTML, for example, will have blocks surrounding the main content). But the problem with this strict definition is that the HTML is potentially different for every user. For example, one user might not have access to view a certain field. Drupal currently does not provide a separate URL for node/1 for every possible permutation of what can be visible on that page, depending on your permissions. And I sure hope that REST doesn't require such a thing, cause that seems almost impossible to implement. So I think we're still left with a concept of a URL/resource as needing to be pretty specific, but still allow some variation based on the request/context, which I think circles us back to the original thread question.

Cross-link to WSCCI routing thread

Posted by effulgentsia on March 27, 2012 at 4:14am

I also posed this question on http://groups.drupal.org/node/220269#comment-721844.

To add another example, RDF

Posted by xtfer on March 27, 2012 at 8:09am

To add another example, RDF resources identified by a URI must always use the same URL, regardless of format. RDF/XML, Turtle, whatever, it must be the same URL. However, the data returned does not have to be identical. Ideally it should be, but that's not a constraint. It must, however, represent the same class or individual.

For example, a node in HTML might return all sorts of other related content, however the same node in RDF would only return an RDF description of the node.

So providing different resource formats on the same URL is a given, it has to be supported, but I don't think we should get too hung up on whether a Route returns identical content. It merely has to return the same resource, if the type requested is different. If you have another Route which defines additional parts to explicitly return a different type (for example, node/{node-id}/json, in addition to node/{node-id}), that's okay too.

Link to separate thread on this

Posted by effulgentsia on March 28, 2012 at 12:55am

Thanks. This helps and I think is consistent with http://groups.drupal.org/node/220519. Please provide feedback there if I misunderstood something, got something wrong, or wasn't clear.

Work on the question

Posted by AndrzejG on March 25, 2012 at 10:39pm

Having so many answers we should search for the "most useful answer".

Useful for what?

So, what is the purpose or motivation of the question "What is 'a page'?"

What is the problem behind this question?

What solution has "a hole" or "a gap" to be filled with the answer?

I just hammered out a

Posted by andremolnar on March 27, 2012 at 4:37am

I just hammered out a response, but it got gobbled in the process. Let's try again.

The gist of it was that this is my understanding too. With IA spaces being one of the primary considerations to how a 'site visitor' thinks about a page.

Consider the following http://example.com/Air/KellyWatchTheStars, http://example.com/Air/MoonSafari/KellyWatchTheStars, http://example.com/Air/GreatestHits/KellyWatchTheStars

A site visitor would surely think of each of those as its own page. Each is in their own IA space (assuming in this example that paths represent IA hierarchy in some way). But each are the same resource (in the sense that they might all be node/5).
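The example can be sketched as a simple alias table, assuming (as above) that the path represents the IA hierarchy and that all three paths resolve to node/5. The `ALIASES` table and `describe` helper are hypothetical.

```python
# Hypothetical alias table: three IA locations, one underlying resource.
ALIASES = {
    "Air/KellyWatchTheStars": "node/5",
    "Air/MoonSafari/KellyWatchTheStars": "node/5",
    "Air/GreatestHits/KellyWatchTheStars": "node/5",
}


def describe(path: str) -> dict:
    """Separate where the visitor is (IA space) from what is shown (resource)."""
    space, _, _ = path.rpartition("/")
    return {"ia_space": space, "resource": ALIASES[path]}


for path in ALIASES:
    print(path, describe(path))
```

Each path reports a different IA space but the same resource, which matches the visitor's sense that these are three pages even though one node backs them all.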

It's also worth noting (though completely obvious) that just because some elements on the page are different (logged in user name, personalized recommendations etc) based on some contextual information, they aren't new 'pages'. (Though they may be a number of different layout configurations.)

All of this has also got me thinking about what the 'page creator's' expectation is when they go to create a page. Do they expect that 'creating a node of some type' = creating a page [Drupal may have conditioned them to think so]? Do they expect that creating a layout = creating a page? Do they expect creating a layout and THEN creating a node = creating a page? Do they expect creating a layout and then creating complex rules for displaying related information, then creating a node = creating a page? And don't forget views.

Do they expect to do some or all of these things at once in-place and in-context = creating a page?

Like it or not, it's likely the last item on the list that is true.

Also need a definition of 'layout'

Posted by user advocate on March 30, 2012 at 6:00pm

I'm just formulating another response to help us sort out possible UX workflows that may be involved with all this. But first I need some clarification on the term 'layout'.

@EclipseGc, when you speak of 'layouts' are you referring to a container for blocks or an arrangement of blocks? In other words is a 'layout' a reusable object that is referenced (somehow) by a page? I'm assuming that blocks are essentially objects so I didn't ask that question. I'm also assuming that 'regions' are simply subdivisions that allow spatial groupings of blocks. Correct me if I'm missing something.

This is a bit more of an implementation question but I want to be sure I'm in alignment with your architectural strategy and that we all mean the same thing when we say 'layout'.

Michael Keara
User Interface Systems Architect,
The User Advocate Group
