Architectural Plan

Posted by Crell on April 22, 2010 at 11:01pm
Last updated by silverwing on Thu, 2011-07-14 23:56

The following is an overall architectural plan and quasi-roadmap based on discussions surrounding Blocks TNG and Context at DrupalCon SF 2010. Special thanks to co-editors and reviewers Jeff Eaton, Earl Miles, and Dmitri Gaskin for their feedback to date.

High-level goals

Centralize all "context" into a single object rather than scattering it across various globals, collector functions, and statics.
Allow portions of core to be segmented into easily testable parts.
Allow for centralized, lazy-loaded, pluggable access to "rich context" that is not currently available in any standard fashion. (Current OG, etc.)
Convert core's limited routing system (hook_menu) into a flexible, pluggable dispatch system.
Provide an in-core page building mechanism built around enhanced, context-sensitive blocks modeled on the Panels module.
Leverage the centralized context system to streamline and simplify "abnormal state" conditions, such as the installer, update script, Drush, and database failures.
Provide an in-core mechanism for non-web-page responses to HTTP requests, including AHAH fragments, JSON, ReST, SOAP, etc. This effectively turns into "services in core".

Implementation plan

Implementation will proceed in multiple phases. Each phase is designed to be a discrete atomic upgrade. Although the overall goal is to implement all phases eventually, Drupal will be stable and functional should development stop or be delayed between any two phases, while still being a marked improvement over prior phases.

It is possible that not all phases will happen during the Drupal 8 development cycle. That is OK. As long as we are not in mid-phase when Drupal 8 goes into code freeze we will be left with a stable and greatly improved system.

Phase 1: Context Object

Step 1:

During bootstrap, we create a context object, an instance of a class with a defined interface. The context object will be responsible for ensuring sane defaults are returned in unusual cases. It will initially contain information derived from the HTTP request itself:

GET parameters
POST parameters if any
Cookie values
Session information
$_SERVER information
Additional information available directly from the HTTP request or equivalent

It will also include the following derived information that other code may assume is always present. The context object will be responsible for ensuring sane defaults are returned in unusual cases.

Current language
Requested path (both raw and de-aliased)

The context object will also offer centralized access to the following derived contextual information that MAY be lazy-loaded on request.

Current user (Because this is explicitly driven by session/cookie in the Request)
Level-1 loaded entities (i.e., what is currently loaded by %node and similar menu callbacks)
Pager state
Others that may be found during implementation

The context object will not be alterable. Once a given element of context has been determined, it will remain constant for the remainder of that context object's lifetime. The context object will, however, provide a mechanism to wrap itself in a mock context object that allows for overriding of selected context elements at creation time only. This mechanism is modeled on the mocking mechanism present in many testing frameworks. The wrapped context object may then be passed on to other routines which will then not know that those elements have been mocked.

Step 2:

A global accessor mechanism, most likely a drupal_get_context() function, will be provided. This accessor will do nothing except return the current context object so that modules and core APIs may act upon the context as appropriate.

The drupal_get_context() function will be marked deprecated immediately upon its addition. It will be documented as a transitional mechanism only.

Where relevant and appropriate, various systems and routines will be built to accept a context object as a function parameter or constructor parameter rather than accessing the global object. That will allow those systems to be successfully mocked, which makes them inherently more unit testable and maintainable. This pattern will also be used by the Blocks TNG effort (Phase 5, below).
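As a rough illustration of why passing the context in as a parameter helps testing, here is a minimal sketch. ContextInterface, ArrayContext, and greeting_for_current_user() are hypothetical names invented for this example; none of them is a proposed core API.

```php
<?php
// Hypothetical sketch: ContextInterface, ArrayContext, and
// greeting_for_current_user() are illustrative names, not core APIs.
interface ContextInterface {
  public function get($key);
}

// A trivial array-backed context, standing in for the real object.
class ArrayContext implements ContextInterface {
  protected $values;

  public function __construct(array $values) {
    $this->values = $values;
  }

  public function get($key) {
    // Fall back to NULL as the "sane default" for unknown keys.
    return isset($this->values[$key]) ? $this->values[$key] : NULL;
  }
}

// The routine receives its context explicitly instead of reading globals.
function greeting_for_current_user(ContextInterface $context) {
  $name = $context->get('user_name');
  return 'Hello, ' . ($name !== NULL ? $name : 'anonymous') . '!';
}

// A test can hand in whatever context it needs.
$greeting = greeting_for_current_user(new ArrayContext(array('user_name' => 'crell')));
```

Because the function never touches a global, a unit test only has to construct a small context object; no bootstrap or session is required.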

Step 3:

All existing uses of one-off context in core will be removed in favor of the global context object. These include, but are not limited to, arg(), drupal_get_normal_path(), the global $user and $language variables, the $pager_* globals, all references to PHP super globals ($_GET, $_POST, etc.), and numerous other instances of scattered context. This functionality will be available in a more robust and abstracted form in the context object. Once all uses of such scattered context are expunged, the existing mechanisms will be removed. (That is, arg() will be removed from core in this step. There will then be a party that includes ponies.)

Step 4:

A backport of the context object will be maintained for Drupal 7 such that forward-looking modules may leverage it instead of the existing disparate mechanisms, allowing for cleaner code and a smoother transition to Drupal 8 when the time comes.

The Drupal 7 implementation of the context object will likely be a simple wrapper around the existing functions and globals in many cases. However, as long as the Interface for the object remains consistent between the Drupal 8 core and Drupal 7 contrib implementations that is not a problem.

Phase 2: Extended context

A registration system will be provided to allow modules to register components with the context object that will answer additional contextual requests on demand, such as:

Current Organic Group(s)
Current user-blog
Book module navigation location
Forum hierarchy
Any others that a module chooses to register itself for.

The exact mechanism for such registration has not yet been defined. The context object will be responsible for ensuring sane defaults are returned in unusual cases or if a given additional context is not registered.
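Since the registration mechanism is undefined, the following is only one possible shape for it, sketched under the assumption that modules register a callable per context key. ExtensibleContext, register(), and the 'organic_groups' key are all hypothetical.

```php
<?php
// Hypothetical sketch: ExtensibleContext, register(), and the
// 'organic_groups' key are assumptions made for illustration only.
class ExtensibleContext {
  // Handlers keyed by context name; each runs at most once.
  protected $handlers = array();
  // Values that have already been resolved.
  protected $resolved = array();

  public function register($key, $handler, $default = NULL) {
    $this->handlers[$key] = array('callback' => $handler, 'default' => $default);
  }

  public function get($key, $default = NULL) {
    if (array_key_exists($key, $this->resolved)) {
      return $this->resolved[$key];
    }
    if (!isset($this->handlers[$key])) {
      // Nothing registered: return the caller's default.
      return $default;
    }
    $value = call_user_func($this->handlers[$key]['callback'], $this);
    if ($value === NULL) {
      $value = $this->handlers[$key]['default'];
    }
    // Cache so the element stays constant for this object's lifetime.
    return $this->resolved[$key] = $value;
  }
}

$context = new ExtensibleContext();
$calls = 0;
$context->register('organic_groups', function ($context) use (&$calls) {
  $calls++;
  return array(12, 34);
}, array());

$groups = $context->get('organic_groups');
$groups_again = $context->get('organic_groups');
$forum = $context->get('forum_hierarchy', array());
```

The handler runs only on the first request for its key, which matches the "lazy-loaded, then constant" behaviour described for the context object in Phase 1.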

Phase 3: Display controllers

The third phase involves introducing a display controller mechanism that controls the handling of request responses. It is an evolution of the delivery callback present in Drupal 7, providing a much more robust set of options and operating far earlier in the page request process. It involves splitting what is currently hook_menu and menu_execute_active_handler() into separate, pluggable Dispatcher and Display Controller. This means that we can completely swap out hook_menu for something else.

The general mechanism is as follows:

Drupal builds the context object as above.
The context object is passed to a Dispatcher routine that, based on the context, selects the appropriate Display Controller object and configuration for the Display Controller. The Dispatcher may leverage a variety of information made available by the context object as necessary, including GET parameters, URL fragments, HTTP headers, and the ReST standard.
Drupal will call the appropriate method on the Display Controller object to generate a string, which will be returned to the requestor. This may be a full HTML page, HTML fragment, JSON object, SOAP response, ReST response, or any other response type. The Display Controller may also specify additional HTTP headers to be included in the response via another appropriate method.

Each Display Controller implementation is responsible for generating the appropriate response based on the supplied context and configuration.
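The general mechanism above could be sketched roughly as follows. This is a simplification under stated assumptions: the context is reduced to a plain array, the dispatcher looks only at the Accept header, and every interface, class, and function name here is hypothetical.

```php
<?php
// Hypothetical sketch: the interface, class, and function names below are
// assumptions, not a settled API. The context is a plain array for brevity.
interface DisplayControllerInterface {
  public function headers();
  public function render(array $context);
}

class HtmlPageController implements DisplayControllerInterface {
  public function headers() {
    return array('Content-Type' => 'text/html; charset=utf-8');
  }

  public function render(array $context) {
    return '<html><body>' . htmlspecialchars($context['path']) . '</body></html>';
  }
}

class JsonController implements DisplayControllerInterface {
  public function headers() {
    return array('Content-Type' => 'application/json');
  }

  public function render(array $context) {
    return json_encode(array('path' => $context['path']));
  }
}

// The dispatcher inspects the context and picks a controller.
// Here the choice is driven only by the Accept header.
function dispatch(array $context) {
  $accept = isset($context['accept']) ? $context['accept'] : '';
  if (strpos($accept, 'application/json') !== FALSE) {
    return new JsonController();
  }
  return new HtmlPageController();
}

$context = array('path' => 'node/5', 'accept' => 'application/json');
$controller = dispatch($context);
$headers = $controller->headers();
$body = $controller->render($context);
```

Each controller returns a string plus its own headers, so the calling code never needs to know which response type it is delivering.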

Initially we will need a display controller that returns a complete HTML page based on hook_menu information. It is essentially the current system replicated in an object.

As a side effect of the dispatcher, the bulk of the services module becomes supported directly in core. The services module should serve as inspiration for how to implement the dispatcher properly. Use of ReSTful standards here is recommended.

Core should support the following additional display controllers that may be developed in parallel once the system is in place:

Form submission handling based on POST data.
JSON object response. (Necessary for auto-complete functionality.)
RSS/ATOM
RDF

The exact interaction and separation between the Dispatcher and Display Controller requires further investigation.

Providing clean support for tools like AtomPub, PubSubhubbub, CSV output, and so on becomes an exercise for contrib, or for Tim O'Reilly.

Phase 4: Clean up hacky systems

Convert install.php, update.php, authorize.php and similar alternate environments to use a custom display controller and context object. This will provide a substantial simplification of all of the above systems.

Potentially a "detect broken state" check could also route off to a special maintenance mode display controller.

Phase 5: Blocks TNG

The fifth phase involves implementing a new block-centric display controller using the system developed in Phase 3. This system is intended to replace the hook_menu-based implementation. It will also by nature provide support for block-specific AHAH responses.

This will involve conversion of blocks into classed objects with configuration objects, inspired by Panels Content Panes. It may be advisable to rename blocks to "components" or some other name in order to provide more designer-friendly terminology. The net result of this phase includes considerably more power in core as well as a layout mechanism that is closer to common designer workflow.

Phase 6: Hooks and the Rabbit Hole

Add to the context object a method named ->invokeHook($hook_name, array $args = array(), $context = NULL).

Hooks will then be modified to accept a context object. If the invokeHook() method is called with a context object (mocked or otherwise), it will be passed to the specified hooks. If not, the context object will pass itself to the hook.
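A minimal sketch of how invokeHook() might behave, assuming hook implementations follow the usual modulename_hookname() convention and receive the context as their last argument. HookContext, the 'example' module, and the argument order are assumptions; only the method signature comes from the text above.

```php
<?php
// Hypothetical sketch: HookContext and the 'example' module are invented
// for illustration; real hook discovery would come from the module system.
class HookContext {
  protected $modules;

  public function __construct(array $modules) {
    $this->modules = $modules;
  }

  public function invokeHook($hook_name, array $args = array(), $context = NULL) {
    // Pass ourselves along unless the caller supplied a (possibly mocked) context.
    if ($context === NULL) {
      $context = $this;
    }
    $results = array();
    foreach ($this->modules as $module) {
      $function = $module . '_' . $hook_name;
      if (function_exists($function)) {
        // By assumption, the context is appended as the last argument.
        $call_args = array_merge(array_values($args), array($context));
        $result = call_user_func_array($function, $call_args);
        if ($result !== NULL) {
          $results[] = $result;
        }
      }
    }
    return $results;
  }
}

// An example hook implementation that receives the context last.
function example_greet($name, $context) {
  return 'hello ' . $name . ' from ' . get_class($context);
}

$context = new HookContext(array('example', 'missing'));
$results = $context->invokeHook('greet', array('world'));
```

Modules without an implementation of the hook (here, 'missing') are skipped silently.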

As hooks are converted to accept a context object, reliance on the (already deprecated) drupal_get_context() function will shrink. Eventually it will be removed.

module_invoke_all() will eventually be deprecated and removed.

invokeHook() will also support mocking as appropriate.

Phase 7: Functionality access

Selected core systems will be given accessor methods on the context object, which will become the primary access to those systems. Examples of possible mechanisms include:

$context->db('slave')->query("...");

$context->theme($theme_key, $args);

Phase 8: Profit

We have the biggest party in the history of open source development. With Ewoks dancing.

Comments

Convert install.php,

Posted by merlinofchaos on April 23, 2010 at 2:56am

Convert install.php, update.php, authorize.php and similar alternate environments to use a custom display controller and context object. This will provide a substantial simplification of all of the above systems.

In my mind, the Dispatcher performs the task of menu_execute_active_handler() (which determines what the active handler is) and this is the piece that needs to be plugged in to provide an alternate databaseless environment for install.php and update.php.

Agreed

Posted by sdboyer on April 23, 2010 at 3:23am

Agreed

Yep

Posted by dmitrig01 on April 23, 2010 at 4:54am

I was talking to Crell and that's what it means, but you're welcome to change it.

Thoughts on the Dispatcher

Posted by sdboyer on April 23, 2010 at 8:56am

A little concern over what I feel like this text implies about the nature of the Dispatcher logic:

It involves splitting what is currently hook_menu and menu_execute_active_handler() into separate, pluggable Dispatcher and Display Controller. This means that we can completely swap out hook_menu for something else.

The context object is passed to a Dispatcher routine that, based on the context, selects the appropriate Display Controller object and configuration for the Display Controller. The Dispatcher may leverage a variety of information made available by the context object as necessary, including GET parameters, URL fragments, HTTP headers, and the ReST standard.

I think this points towards a conceptualization of the Dispatcher logic that ultimately may be insufficient. I'm almost certainly reading too far into this here, but I think it's worth noting this in the early architectural phase. Here's my thinking:

If the Dispatcher is defined as a single interface that we expect to implement for various different routing/dispatching cases, then that suggests something like the D6/7 menu router would be remodeled into a class implementing that interface. If that's the case, and there's just the one Dispatcher object that does dispatching, then achieving the "process context until we arrive at a display controller" concept could get icky. Consider the basic situation Panels is in right now - the menu routing system does basic $_GET['q'] processing with its loaders, then hands it over to Page Manager, which does its secondary round of context object loading, applies its selection criteria to determine the final display controller/configuration, then passes off to the render pipeline. Those are two discrete bits of routing behavior, both of which are prime examples of what we'd want to be moving up into this system. But these two bits of routing logic are also (in principle at least) loosely coupled. So if there's just the one Dispatcher object, then it would have to munge both of these in-principle-independent routing behaviors together into a single class. aka, tight coupling. So it could be difficult to swap out, f.e., just the $_GET['q'] processing portion in a subclass. Lots of code duplication, runs us into PHP's inability to modify existing class definitions, blah blah blah.

I can think of a few sorts of solutions. If we think the router is something that really won't be re-plugged very often/that we can write approaches that handle each request-ish type (basically, the HTTP vs. JSON vs. ReST list given above), then maybe tight coupling & big fat classes is OK. If small pieces loosely joined is preferable, then big classes won't work, the Dispatcher interface would need to be substantially different, and there'd need to be some core logic that decides how to wrap the pieces together.

In fact, even if we do go with tightly coupled, big fat classes, then there STILL needs to be some wrapper logic. In the Phase 3 description of the "general mechanism," there's no talk about how one of the Dispatcher plugins gets picked (not picking on the writing here so much, just wanting to draw attention to the fact that this area has not at all been considered yet); when should the default one be used, versus something defined by (say) services? Where is the configuration for that dispatcher-selecting logic going to live? Is that something that will be set by...modules? Something in settings.php? Really it's the philosophical question that matters - what's appropriately in the domain of "picking a Dispatcher" vs. the routing work the Dispatcher itself does?

So, there's no phase specifically for tackling the dispatcher issues, which I think is a problem. The Display Controllers phase doesn't technically rely on this, but being that it's a step higher in the process, I think it should be addressed first so that we don't go pushing logic into our display controllers that ultimately belongs in the Dispatcher. So unless there are objections, I'll add a (brief) phase for the Dispatcher preceding the DCs, and put a distilled version of this up there.

what's appropriately in the

Posted by sdboyer on April 23, 2010 at 4:48pm

what's appropriately in the domain of "picking a Dispatcher" vs. the routing work the Dispatcher itself does?

Wanted to try to express this a little better. If the difference between pluggable Dispatchers is primarily the difference between Drupal acting like a different type of application responder (HTTP/HTML, AJAX, SOAP, etc.), then that implies the logic for selecting the 'application' actually lives in core. If all of that decision-making is encapsulated within the Dispatcher, though, then there'll need to be inner layers for routing.

Hmm. My mind is gravitating towards something like a Dispatcher interface with major 'families' of implementations (one for an HTTP server, one for ReST, etc.), some of which may require complex multi-phase routing underneath, and others of which may not. Dunno if that's actually what we need but the architecture for it just kinda snapped into my mind :)

Whatever we go with, I think I'm tending towards thinking of Dispatcher as first needing to select a type of Application, then perform routing logic within that application.

OK, I take back the last bit.

Posted by sdboyer on April 23, 2010 at 7:54pm

OK, I take back the last bit. It is clearly and demonstrably a bad idea to represent something as potentially murky as 'application type' in a hard-architected outer layer. There's no way, or real reason, to clearly delineate something like that out from the individual working steps like $_GET['q'] processing & context data object switching; it's quite reasonable to suppose that those operations could be necessary for determining 'application' type. So we probably shouldn't be talking about application type until we're at the DC, or at least into some of the routing logic bits in the Dispatcher.

Which takes my thinking back a bit, and means that I think our major problem here is deciding on what's going to determine the logic executed in the dispatcher.

Wow

Posted by eigentor on April 23, 2010 at 6:05pm

Holy cow, this sounds visionary. If it comes true, it sounds even better.
Blocks and Panels meet in the middle... hehe. Well this is a front-end-guy's view, but we will also benefit a lot.

Life is a journey, not a destination

Actually not...

Posted by chx on April 23, 2010 at 10:12pm

we agreed on the step 1 being a simple context_get('question', $arg1, $arg2, ...) and not some superOOP nightmare (again).

Irrelevant.

Posted by eaton on April 24, 2010 at 12:36am

The difference between context_get('question') and $context->question() is a nonstarter at this point: the precise syntax of the call will reflect what makes sense, and the internals will be implemented with the tools that make sense. The critical decision points we face right now relate to the underlying architecture, spotting design pitfalls, and properly separating the phases so that we achieve recognizable, usable benefits at each point without painting ourselves into a corner.

Energy focused on that is most useful right now: do you have any feedback on that? (as opposed to OOP vs. Procedural call syntax preferences)

One interesting

Posted by merlinofchaos on April 24, 2010 at 1:16am

One interesting extension:

Functions like drupal_set_http_header() and drupal_goto() and drupal_not_found() will necessarily have to become an extension of the context object. After all, if these variables should only be read from context, they should not be written to directly. Perhaps this will all be part of a response object that also includes the page array.

The more I think about this,

Posted by merlinofchaos on April 24, 2010 at 5:08am

The more I think about this, there should be a RequestResponse object that goes with this. The response contains at least the following:

The page array
theme information
out of band data such as drupal_add_js(), drupal_add_css(), drupal_set_html_head()
State information such as response code (403, 404)
Any additional headers
Breadcrumb and active trail information

I think what we see here is that the display controller (which is a difficult name as it doesn't quite fit what we've been establishing -- I still like RequestHandler) returns this to the dispatcher. The dispatcher might then pass this on to something else, which would be today's delivery callback.

So I'm kind of envisioning (architecturally) (please note: When I refer to objects I am thinking in OO design but some of this could be procedural, either as an interim or intentionally, depending upon the needs. I'm talking systems here, not specific design)

Data objects

RequestContext: Contains all of the information stated in the arch document.
RequestResponse: Contains the page array and related metadata

These two items may well be contained together for simplicity. Perhaps $request->context and $request->response
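To make the data-object split a little more concrete, here is a minimal sketch of a RequestResponse paired with its context inside a single request container. The property and method names are illustrative guesses, and the context is reduced to a plain array; nothing here is an agreed design.

```php
<?php
// Hypothetical sketch: property and method names are illustrative guesses.
class RequestResponse {
  public $page = array();
  public $status = 200;
  public $headers = array();
  public $js = array();
  public $css = array();
  public $breadcrumb = array();

  public function setHeader($name, $value) {
    $this->headers[$name] = $value;
    return $this;
  }

  public function notFound() {
    $this->status = 404;
    return $this;
  }
}

// A simple container pairing the context data with the writable response.
class Request {
  public $context;
  public $response;

  public function __construct(array $context) {
    $this->context = $context;
    $this->response = new RequestResponse();
  }
}

$request = new Request(array('path' => 'node/5'));
$request->response->setHeader('X-Example', 'demo')->notFound();
```

In this shape, what drupal_set_http_header() and drupal_not_found() do today would become writes to the response object rather than direct output.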

Behavior objects

RequestDispatcher: Creates the initial context, handles routing, chooses an appropriate RequestHandler, contains the analog of menu_execute_active_handler(). In the initial versions it would be bootstrap + menu_execute_active_handler(). May be multiple behavior objects.
RequestHandler: Generates the data in the RequestResponse

Two additional behavior objects may be needed, and I am less certain about these at this time:

RequestRouter: Essentially this would be the data currently in the menu_router table. In particular it will have the page callback, plus contain the ability to manipulate a URL into context objects properly.
RequestRender: Would be the final piece, truly this is just the delivery callback. It takes the RequestResponse and actually renders it back to the browser. It would be the *only* piece allowed to actually print. In fact, we could theoretically enforce this by using the ob_ functions though that wouldn't stop direct manipulation of headers I don't think.

I also think Crell and I have

Posted by merlinofchaos on April 24, 2010 at 5:34am

I also think Crell and I have crossed streams on where the dispatcher actually is. He mentioned that menu_execute_active_handler() is the display controller. But if it is, that leaves a big void for everything that happens prior to it. Perhaps this is his intent.

More musing

Posted by merlinofchaos on April 24, 2010 at 6:43am

Here's some really pseudo pseudo code to illustrate the basic page construction flow using the above model:

<?php
// index.php
include 'request.inc';

// Gets context information from HTTP -- things like drush would use
// an alternate context object. This should load $conf settings.
$context = new RequestContextHTTP();

// Allow the context object to choose a pluggable dispatcher object.
// Fetch the class name first, then instantiate it.
$dispatcher_class = $context->get('dispatcher');
$dispatcher = new $dispatcher_class();

// Figure out how to route the context and get information.
$router = $dispatcher->get_router($context);

// Load a renderer, based upon context information. Typically this will
// render either HTML or JSON but additional renderers can do whatever
// other services we need. RSS, SOAP, etc.
$renderer = $dispatcher->renderer($context, $router);

// Allow the dispatcher to hand off to a RequestHandler
$handler = $router->handler($context, $renderer);
$dispatcher->route($context, $handler);

// Return whatever data the renderer wants.
$renderer->render($context->response);
?>

Notes:

Bootstrap actually happens in the context object in that model. Drush would, for example, have its own context object that does what it needs to do without an HTTP request.
Install would probably have a different context object. Alternatively, install could happen automatically because the context object could detect the not installed state and load the RequestDispatcherInstaller which contains all of the routing necessary to do install.

This could be done by using a bootstrap object as well. In that case, the context object is pure data, and bootstrap becomes hardcoded pluggable (i.e., not changeable via the UI or settings but changeable via the entry point).

The renderer appears to be loaded early. This is just so that the request handler has access to it if needed; the renderer will be necessary to call theming functions, I think.

It actually is kinda

Posted by sdboyer on April 24, 2010 at 9:10am

It actually is kinda refreshing to see code after banging my brain against this for a while.

The primary issue I have here is that it effectively means the way you make routing pluggable is to define a different context class and use that one instead. I don't really like that coupling - I'd rather see more of the meta-logic governing routing in procedural glue code, not encapsulated within the context object.

Not at all. It can delegate

Posted by merlinofchaos on April 24, 2010 at 2:52pm

Not at all. It can delegate that to the context discovery system which is a registered thing. Or it can pull a value from $conf. We are limited in plugin discovery mechanisms to find the dispatcher because of how early in the page we are, though.

Yeah 100% agreed. I was fried

Posted by sdboyer on April 24, 2010 at 3:35pm

Yeah 100% agreed. I was fried when I posted that.

Very nice

Posted by chx on April 24, 2010 at 5:57pm

and not Drupal. But sure, go ahead.

Stop exploding because I used OO.

Posted by merlinofchaos on April 24, 2010 at 6:58pm

Mentally replace terms like 'behavior objects' with 'systems'. Note that OO behavior objects are especially good for pluggability.

The same model, more or less, in procedural. It's kind of skipping the part about having a separate RequestRouter object, which I think is a nice addition.

<?php
// index.php
include 'request.inc';

// Get a fresh context object.
$context = drupal_load_context();

// Load context information from HTTP -- things like drush would use
// an alternate context object. This should load $conf settings.
drupal_bootstrap_http($context);

// Allow the context object to choose a pluggable dispatcher object.
// This would be, more or less, menu_get_active_handler()
$callback = drupal_get_dispatcher($context);
$output = $callback($context);

// Render the output based on things we know from the context:
$callback = drupal_get_delivery_callback($context, $output);
$callback($context, $output);
?>

Regarding context

Posted by dgoutam on May 2, 2010 at 4:47am

A transport-neutral context is what I would like to see.

I am very new to Drupal architecture, but in my past experience, when we need to consider the context of an application it is always better to have one that is protocol neutral. By protocol neutral I mean that the context loaded by the bootstrap should be an abstract one; depending on the client, it should be able to identify the actual context needed, which should be a subclass of the abstract context.

WebContext
CliContext
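A minimal sketch of that idea, assuming an abstract base class with one subclass per transport. The WebContext and CliContext names come from the list above; AbstractContext, path(), and context_for_sapi() are hypothetical.

```php
<?php
// Hypothetical sketch: WebContext and CliContext follow the comment above;
// AbstractContext, path(), and context_for_sapi() are assumptions.
abstract class AbstractContext {
  // Each transport reports the requested path in its own way.
  abstract public function path();
}

class WebContext extends AbstractContext {
  protected $server;

  public function __construct(array $server) {
    $this->server = $server;
  }

  public function path() {
    $uri = isset($this->server['REQUEST_URI']) ? $this->server['REQUEST_URI'] : '/';
    // Strip the query string and surrounding slashes.
    return trim((string) parse_url($uri, PHP_URL_PATH), '/');
  }
}

class CliContext extends AbstractContext {
  protected $argv;

  public function __construct(array $argv) {
    $this->argv = $argv;
  }

  public function path() {
    // Treat the first argument after the script name as the path.
    return isset($this->argv[1]) ? trim($this->argv[1], '/') : '';
  }
}

// Bootstrap picks the concrete subclass based on the client.
function context_for_sapi($sapi, array $server, array $argv) {
  return $sapi === 'cli' ? new CliContext($argv) : new WebContext($server);
}

$web = context_for_sapi('apache2handler', array('REQUEST_URI' => '/node/5?page=2'), array());
$cli = context_for_sapi('cli', array(), array('drush', 'node/5'));
```

Code that consumes the context calls path() without caring whether the request arrived over HTTP or from the command line.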

Java Servlet works like

Posted by sylvain lecoy on April 23, 2011 at 11:18pm

Java Servlet works like that.

When you subclass a Servlet and implement doGet(HttpServletRequest req, HttpServletResponse res), you get the request (with the same information as you guys specify in context) and the response.

Comments

Posted by gdd on April 24, 2010 at 3:10am

To be pedantic for a moment, there is no such thing as a 'REST response'. REST is an architecture which takes data available natively in HTTP and derives context from it to respond with appropriate data. Sound familiar? This proposal is already heading towards the direction of turning Drupal into a great big REST server implementation (optionally pluggable to change it out into something else.) Also in the REST world, form submission handling is not just via POST but also the PUT and DELETE methods (which is how it creates context for create/update/delete actions.) This could be useful to implement because you can create context based solely on HTTP with a minimum of added information.

So for reference, in case it is useful, here is how things work in Services right now. There are Server, Authentication, and Service components, all of which are pluggable (each is a module.) Server components handle requests and rendering, Authentication handles what it sounds like, and Services are the getting and returning of data. Services registers /services in hook_menu, then the individual Server components register a path under that (so XMLRPC lives at /services/xmlrpc.)

The addition of the standard context object is a great improvement I may steal for the next version of Services we're currently building. It could be a good way to work some of these issues out. We also don't separate the rendering mechanism into a separate pluggable at the moment; it is handled in the Server modules. Servers that implement multiple rendering mechanisms (like the REST server) do so internally as they wish. For instance the REST server pulls it out of the URL (services/rest/node/1/xml vs services/rest/node/1/json.)

I don't want to feature creep but pluggable authentication would be a great addition to this in my opinion. Damz brought this up to me in SF as well. This could lead to, for instance, the ability to create a whole area of your site that is a public REST API using oAuth, along with another area that is a public REST API using no authentication. This is seriously powerful stuff for sites wanting to expose content machine-readably. OpenData++. Where does authentication fit into this architecture right now? If Phase 3 replaces hook_menu(), then it appears to go there, although putting it under a section called Display Controllers does not seem right.

Just some thoughts and data.

Pluggable authentication: think WSGI

Posted by fgm on April 26, 2010 at 8:39am

One thing we might want to consider regarding pluggable authentication, and more generally any pluggable middleware would be a WSGI-like interface (like WPHP), which would allow such middleware to be completely pluggable, and not even necessarily in Drupal or even in PHP.

Naming & OOP

Posted by andypost on April 24, 2010 at 3:16am

Good trend. Where is a link to the Blocks TNG presentation?

I think $context->request and $context->response are more natural names for properties.

We could use the Factory pattern for properties and lazy loading.

Before dispatching, the context should take care of altering (duplicating, I suppose) the initial request data to provide overrides (custom_url_bound).

Not sure about mixing dispatching and execution of what (context, request) - something like $application->execute()

As Eaton stated above (and I

Posted by gdd on April 24, 2010 at 3:27am

As Eaton stated above (and I completely agree with) we should not worry about implementation details at this point and worry more about architecture and concepts.

This is sounding really

Posted by eojthebrave on April 24, 2010 at 5:57pm

This is sounding really awesome. I don't really have the background knowledge to be able to provide much useful input on design of the system but do recognize that the problems being addressed are super important. I'm sure I'll have more comments once I've had a chance to wrap my head around this a little more.

Really awesome job!

Posted by dixon_ on April 24, 2010 at 6:20pm

Really awesome job to have this battle plan up for early discussion! I can't wait to participate and review the concept once I've wrapped my head around this!

Context object and mocking

Posted by Crell on April 24, 2010 at 7:49pm

This is some dummy code that I put together with Eaton Wednesday night to demonstrate how the mocking would work:

<?php
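class Context {
  // Values overridden or already resolved at this level.
  protected $values;
  // The wrapped context, if this object is a mock.
  protected $inner;

  function __construct($overrides = array(), $inner = NULL) {
    $this->values = $overrides;
    $this->inner = $inner;
  }

  function get($foo) {
    // Only resolve values that were not overridden or cached here.
    if (!isset($this->values[$foo])) {
      if (isset($this->inner)) {
        // Delegate up the chain to the wrapped context.
        return $this->inner->get($foo);
      }
      else {
        $this->values[$foo] = $this->magicallyGetFromRegisteredRespondersSomehow($foo);
      }
    }
    return $this->values[$foo];
  }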

  function mock($overrides = array()) {
    return new self($overrides, $this);
  }
}

$c = new Context();
$m = $c->mock(array('primaryNode' => node_load(5)));
?>

Now one routine can pass a mocked context object to a nested child routine, and the child routine is none the wiser. Also, because we pass the resolution for non-overridden values all the way up the chain, each value need only be loaded once, and is then available to any other routines anywhere (give or take additional mocking). That reduces loading overhead.

i've been doing something similar

Posted by adrian on April 24, 2010 at 9:57pm

Been playing with something similar with the new drush 4.0 code i've been playing around with. each site alias, is a record represented by a file. and the options/arguments can specify other site aliases.

So i have a mapping table in the context object mapping which of these are OIDs. I'm currently playing with the magic getters and setters, and a static object cache. So d('@aliasname') will load and instantiate one and only one copy of the object, and returns the object directly. So you can do d('@aliasname')->property->method(). The __get on the core class just returns d($name) if the property is in the OID mapping table.

I find the code becomes really expressive, but have not even considered speed testing anything. it basically maps to something like $node->uid->name, where the 'uid' property is actually doing user_load($this->properties['uid']) silently.

My test code is here, but it's not even remotely complete or properly working or anything : http://git.aegirproject.org/?p=provision.git;a=blob;f=provision.environm...

Use a factory instead?

Posted by mbutcher on April 26, 2010 at 2:17pm

Warning: Lots of off-the-cuff pseudo-code is coming.

As I understand it, the suggestion is that there is a single Context instance, and that mocks would wrap the context.

Might a cleaner way to do this be something like this:

Context is a factory or an abstract factory
Context::instance() returns some implementation of the interface ContextInstance (or ContextInstanceInterface if you want to insist on a bizarre convention)
Some feature of the Context is responsible for determining which class is to be used as an instance of the ContextInstance interface

Now you gain a whole bunch of interesting options, such as...

The ability to set up one *type of* context for testing, and another for regular use. This is determined by how the context is bootstrapped. In testing, for example, bootstrapping loads the testing context, which is probably much more mutable to allow testers to overtly set up the environment. Then Context::get($variable) returns the mock -- no wrappers needed.
The ability to create a totally pluggable context system, where multiple different implementations could exist. (This could be really useful to something like Drush, which might be able to directly control many aspects of its context with a custom context implementation)
The ability to use either a Singleton ContextInstance (in one of the very few positive use cases for the Singleton pattern) or not -- we have options.
The ability to decorate or wrap contexts, since you can have ContextInstanceA serve as a wrapper around ContextInstanceB. This, for example, might let you do things like wrapping an ImmutableContextInstance with a MutableContextInstance, thereby treating some data as immutable, and others as mutable. (In context management, this can actually turn out to be tremendously helpful)

Now, instead of directly using Dependency Injection on every method and function that needs access to a context, we can code using the Context object directly, but without locking ourselves into a very narrow definition of a context.

In a drastically simplified OO idiom, I'm suggesting this:

<?php
class Context {

  protected static $contextInstance;
  protected static $contextConfig;

  // Presumably called once during bootstrapping.
  public static function init($config) {
    // Static properties are reached through self::, not $this.
    self::$contextConfig = $config;

    // Do whatever else we need...
  }

  public static function instance() {
    // Probably lazily load a context instance.
    // self::$contextInstance = ...

    // Now return the instance.
    return self::$contextInstance;
  }

  /**
   * A convenience function for getting context data.
   *
   * @param $contextVariableName
   *   The name of the context variable to fetch from the underlying instance.
   */
  public static function get($contextVariableName) {
    return self::instance()->get($contextVariableName);
  }

  // We might also have....
  public static function setInstance($replacementInstance) {
    self::$contextInstance = $replacementInstance;
  }
}
?>

We could even go one step further and turn a Context into a chain-like controller, where we can add different context instances to the Context. In this case, the main Context becomes a marshal for access to one or more different context instance backends, which it traverses like a chain. In other words, Context has a list of context instances that it treats on a prioritization model (e.g. highest weight is checked first).

Then, we have a model more like this:

<?php
class Context {

  // Everything is static, so the class is never instantiated.
  private function __construct() {}

  public static function addContextInstance($additionalContext, $weight = 0) {
    // ...
  }

  public static function removeContextInstance($contextToRemove) {
    // ...
  }

  public static function get($variableName) {
    // ...
  }

  // ...
}
?>

Given the model above, bootstrapping a context could look more like this:

<?php
// Global vars like $_SERVER stuff are handled 
Context::addContextInstance(new ImmutableContextInstance(), 10);

Context::addContextInstance(new MutableContextInstance(), 0);

Context::addContextInstance(new OtherContextInstance(), -2);

// Thousands of lines of code later, we can do something like this:

Context::get('myVar');
?>

Now in that last line, Context::get() will loop through the contexts (in order of weight) and return the first non-NULL value that it finds when doing a lookup on myVar. In this way, we get very cheap and extensible context chaining without the overhead of DI, and without much conceptual difficulty.
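As a rough illustration of that lookup, here is a minimal sketch of the weighted chain. It assumes a hypothetical ContextInstance interface with a single get() method and an array-backed ArrayContextInstance invented purely for the example; neither is part of any agreed API. It also uses a closure, so it assumes PHP 5.3 or later.

<?php
// Hypothetical interface; the name follows the sketch above.
interface ContextInstance {
  public function get($name);
}

// A trivial array-backed instance, for illustration only.
class ArrayContextInstance implements ContextInstance {
  protected $values;

  public function __construct(array $values) {
    $this->values = $values;
  }

  public function get($name) {
    return isset($this->values[$name]) ? $this->values[$name] : NULL;
  }
}

class Context {
  // Each entry is array('weight' => int, 'instance' => ContextInstance).
  protected static $instances = array();

  public static function addContextInstance(ContextInstance $instance, $weight = 0) {
    self::$instances[] = array('weight' => $weight, 'instance' => $instance);
    // Keep the list sorted so that the highest weight is checked first.
    usort(self::$instances, function ($a, $b) {
      return $b['weight'] - $a['weight'];
    });
  }

  public static function get($name) {
    // Return the first non-NULL answer found along the chain.
    foreach (self::$instances as $entry) {
      $value = $entry['instance']->get($name);
      if ($value !== NULL) {
        return $value;
      }
    }
    return NULL;
  }
}

Context::addContextInstance(new ArrayContextInstance(array('myVar' => 'fallback', 'other' => 'x')), -2);
Context::addContextInstance(new ArrayContextInstance(array('myVar' => 'from immutable')), 10);

$value = Context::get('myVar');
$other = Context::get('other');
?>

Here $value comes from the weight-10 instance, while $other falls through to the weight -2 instance because nothing heavier answers for it. One consequence of the "first non-NULL wins" rule is that a backend cannot deliberately report NULL for a key; a real design would probably need a separate "has" check.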

I know it's all a little rough in the sketch above, and I feel like I've abused statics a little too much, but the basic idea is workable, and has been used in many systems in the past. I think it's a far safer bet than locking into a single monolithic conception of a context -- and it's certainly more drupalish to allow this kind of flexibility. (I dare say that the chaining idea could even be accomplished with hooks, if we wanted.)

Blog: http://technosophos.com
QueryPath: http://querypath.org

not too shabby...

Posted by ronald_istos on April 26, 2010 at 2:42pm

Hi,

this looks an awful lot like Dependency Injection (which is great for me) - especially the notions of a factory / assembler that goes away and gets you the right implementation.

For anyone wanting more info on dependency injection: http://martinfowler.com/articles/injection.html

Question is, as Crell points out a few posts down, do we want to go the full blown pluggable way straight away or should we do the monolithic object to start with (it's ugly but it's ok if we know it's ugly) and then make it more sophisticated. Perhaps the "simple first, elegant afterward" approach would get us there quicker and with less friction (that inevitably will be caused by introducing huge changes all at once). We may also be in a better condition to have a truly elegant design only after we have better understood exactly how and when Drupal uses context - which right now seems to be scattered information across a few people's brains.

Also we don't want to end up like PHP6 - or at least the version of the story about php6 stalling that I heard! Apparently, they wanted to do something really ambitious (in their case Unicode support everywhere if I am not mistaken) and that stalled the whole PHP6 release to the point that now they are trying to figure out what to salvage and what to throw away ( http://lwn.net/Articles/379909/ ).

It is great to have the grand architectural discussion up front but then get to it via appropriate iterative steps that provide an incremental improvement at each step and allow others to realise what is going on and speak up if they are hating it or have a better idea.

a prototype

Posted by dikini on April 26, 2010 at 3:49pm

In my opinion a prototype, based around the essential part of core, or even just install.php and update.php should do as an evaluation step. The rest will follow from seeing what the thing will look like.

Yes, a lot of convincing will need to be done that this is the best thing since sliced bread and South Park.

So step 2

Posted by chx on April 25, 2010 at 1:15am

If you want a local context object then you better pass it to every function otherwise you need to figure out whether there is a possible code flow which includes firing a hook because, of course, every hook then needs that context. So, if you happen to link ... then l, url and we are at drupal_alter which surely wants a context.

And then I was put down by saying, we can have an interim where if you don't pass in a context then the called function can go shopping for a global context. For example if the function mymodule_foo() calls drupal_alter then the first argument can be just drupal_get_context(). When the transition is complete it'll become mymodule_foo($context) and can pass the local context to drupal_alter.
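To make the interim concrete, here is a minimal sketch of the fallback pattern. The body of drupal_get_context() is a stand-in written for this example, and a plain array plays the role of the context object; only the function names come from the comment itself.

<?php
// Hypothetical stand-in for the global accessor mentioned above.
function drupal_get_context() {
  static $context;
  if (!isset($context)) {
    $context = array('source' => 'global');
  }
  return $context;
}

// Interim signature: the context argument is optional.
function mymodule_foo($context = NULL) {
  if (!isset($context)) {
    // No local context was passed in, so go shopping for the global one.
    $context = drupal_get_context();
  }
  // Stand-in for handing $context on to drupal_alter() and friends.
  return $context['source'];
}

$from_global = mymodule_foo();
$from_local = mymodule_foo(array('source' => 'local'));
?>

Once every caller passes a context, the default value and the fallback branch could be deleted, leaving mymodule_foo($context).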

Now, this interim looks like it makes this plan executable but it's actually a gigantic hack and superbly ugly.

Can we just change every Drupal function without analysis to receive and pass the context as first argument?

I agree

Posted by dmitrig01 on April 25, 2010 at 1:17am

I agree

Not really

Posted by Crell on April 25, 2010 at 9:43am

Can we just change every Drupal function without analysis to receive and pass the context as first argument?

That's an even bigger hack, and is certainly likely to cause more issues than it solves. As we discussed last week, not every function needs a full context object. In fact, most probably don't.

I actually think the process of determining what parts of the system need context and which don't is a very important exercise for us as it forces a re-examination of several very important architectural questions. (See my Objectifying PHP / Classic Drupal session from Monday, particularly the part about making cleaner separation between components that should not be tightly coupled.)

funnily enough, I agree with

Posted by dikini on April 25, 2010 at 3:56pm

funnily enough, I agree with you.

As far as I understand, context is a kind of state, that is a map (array/object - doesn't really matter what, just in drupal currently we use arrays for that) which allows us to do key - value lookups.

The good thing in this proposal is the single point of reference. Threading a state through every significant function call is not a hack, it is a very sensible, time honoured strategy, and it is very similar to calling methods of an object - this is the implicit context, just solve the mixin problem in php, similar to extend in jQuery. Not sure it is a wise thing to do the oo way, as too many changes at once are too much trouble.

Actually, thinking a bit more about it, I think the drupal_function( $context, ... ) pattern will make even more evident the biggest, in my opinion, problem we have in drupal - we can't know reliably the order in which hook callbacks are executed. This is the single biggest problem of writing composable modules, as evidenced by the proliferation of contrib modules solving slightly different problems. Sometimes you just need to enforce a sequence of callbacks, implementing a hook, coming from different modules. Contexts won't solve it, but at least you can debug this situation more easily. And this will highlight it. (And no, weights won't do - they are arbitrary and a pain to work with.)

So I think drupal_fun($context, ...) is better for the time being. If turned into some kind of object + methods later, good. But do it in small steps, pretty please :)

edit corrected some typos, sorry

Not just key/value

Posted by Crell on April 25, 2010 at 8:20pm

Treating context as just a key/value store is insufficient. If a dumb key/value store is all we needed, well, that's $GLOBALS. We need a way to cleanly handle derived contextual information, lazy-derived information, potentially even clustered information (eg, information relating to a given organic group that is part of the context, of which there are potentially multiple elements). A simple key/value store quickly breaks down there as not sufficiently flexible or extensible.
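One way to picture "lazy-derived" information is a context whose values come from registered callbacks that run only on first request. The sketch below is an assumption for illustration: the LazyContext class and its register() method are hypothetical names, not a proposed API, and the closures assume PHP 5.3 or later.

<?php
// Hypothetical class and method names; nothing here is an agreed API.
class LazyContext {
  protected $handlers = array();
  protected $values = array();

  // Register a callback that derives a value the first time it is requested.
  public function register($key, $callback) {
    $this->handlers[$key] = $callback;
  }

  public function get($key) {
    if (!array_key_exists($key, $this->values)) {
      if (!isset($this->handlers[$key])) {
        return NULL;
      }
      // The handler receives the context so it can derive from other keys.
      $this->values[$key] = call_user_func($this->handlers[$key], $this);
    }
    return $this->values[$key];
  }
}

$calls = 0;
$context = new LazyContext();
$context->register('path', function ($c) {
  return 'node/5';
});
$context->register('node_id', function ($c) use (&$calls) {
  $calls++;
  $parts = explode('/', $c->get('path'));
  return (int) $parts[1];
});

$first = $context->get('node_id');
$second = $context->get('node_id');
?>

The node_id handler runs once even though the value is requested twice, and it is itself derived from another context key. That is the difference from $GLOBALS: nothing is computed until somebody asks for it.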

fair point

Posted by dikini on April 25, 2010 at 8:42pm

I'm simplifying it a bit more than needed. Although it doesn't change the interface of the beast much. As far as I understand this proposal, correct me if I'm wrong, you want a single incrementally built context, which should contain all required information, derived from the request and configuration. This will provide a single point of reference for all kinds of entities and other drupal artifacts.

So far I'm quite happy with that. Maybe it is off topic here, but I think this should be considered together with a response object, as Earl suggested on irc, containing the results of calls like drupal_add_js, drupal_add_css(),...

If that is the case, I think it would be good to bundle the context, response, and maybe an accumulator for intermediary state in an aggregate object. Then the drupal calls like invoke can have access via a single point to all required state. This will help not just with testing, but will allow invoke to properly control collisions in hook output. My usual example will be form_alter in different modules which update the same form fields - now we just let it be and pray, while this would allow sensible arbitration by invoke.

Not even close

Posted by chx on April 26, 2010 at 4:53pm

If you have a function then it can do lazy loading as necessary quite unlike $GLOBALS.

A little thought in a diff way

Posted by dgoutam on May 2, 2010 at 4:59am

Yes, you are absolutely right, that's a big task. But can't we provide the default in bootstrap and leave it to the module developers whether they need the context or not? The default loaded context is always available within Drupal; it is up to the implementor's choice whether they need the help of context or not, just by providing drupal_get_default_context().

And let there be a provision of enriching the default context with the prototypal property of moduleContext which could be injected by the module.info file.

So by that a module developer could easily put his own context related stuff via moduleContext and could also use the default Context.

So, step 3

Posted by chx on April 25, 2010 at 1:20am

Step 1 is creating a new API. That's a small doable patch. Step 2 (which is step 3 in this plan for i do not know what reason) is a bunch of parallel patches indeed changing the mess of context to the API committed in step 1. Step 3 is the apocalypse patch where every Drupal function call changes. Then we disagree.

Sort of.

Posted by eaton on April 25, 2010 at 4:54pm

Step 1 is creating a new API. That's a small doable patch. Step 2 (which is step 3 in this plan for i do not know what reason) is a bunch of parallel patches indeed changing the mess of context to the API committed in step 1. Step 3 is the apocalypse patch where every Drupal function call changes. Then we disagree.

Even as one of the primary advocates of the underlying context system, I agree that the 'and suddenly everything is decoupled and uses context' portion of the plan is still very tentative, with many unknowns and risks. While I feel those first two stages are very solid and have important wins for everyone, we need to tread lightly as we get closer to that 'branch point' in the work.

One of the benefits of the phased approach is that we can make it through those early phases in the plan and decide we need to put the brakes on -- and still have improved the system dramatically. Hopefully we won't need to, but it isn't an elephant we have to eat in one bite.

Most of the way

Posted by Crell on April 25, 2010 at 8:26pm

Another advantage is that it is quite possible that we will never be able to get all of Drupal to be entirely either context free or context-injected. However, if we can get, say, 70% of Drupal to be either context-free or context-injected, that's still a massive massive win that gets 70% more testability, pluggability, mockability, and flexibility. The remaining ugly parts are smaller and easier to insulate, which is still a major step up from where we are now.

Even if we can't eat the elephant's nose, there's still a lot of tasty meat on it. :-)

Concrete example

Posted by chx on April 25, 2010 at 8:42pm

If you want to change the signature of drupal_alter to get context injected then you need to change url and every function calling url. There is no simple or easy way to do the refactoring you suggest. The problem is that you need to jump from zero to say seventy percent in one step. If you put context into every function then you neatly jumped across all chasms in one. Refactoring / cleaning is a totally different issue...

Misunderstanding

Posted by eaton on April 26, 2010 at 1:29am

If you want to change the signature of drupal_alter to get context injected then you need to change url and every function calling url.

drupal_alter() is not inherently related to the question of context -- it is a mechanism by which we allow other modules to change a data structure before it is used. Some implementations of drupal_alter() will need contextual information, and they will be able to use the context system -- just as they use things like og_get_current_group() and so on.

Then I totally do not get it

Posted by chx on April 26, 2010 at 4:56pm

Wasn't the proposal that we have "local" contexts, and that somehow needs to reach the functions that need context, and if said function is called from a hook then how else will it get that context if not from the invoker?

Yep.

Posted by eaton on April 26, 2010 at 5:27pm

Wasn't the proposal that we have "local" contexts, and that somehow needs to reach the functions that need context, and if said function is called from a hook then how else will it get that context if not from the invoker?

Yeah, you're correct -- the idea of one level of the Drupal rendering pipeline being able to 'mock up' a context object and pass it down to something that it wraps (like, say, a block) is part of the later phases.

Our two options are to either do something ugly-ish (like, say, drupal_push_context($context) and drupal_pull_context() to add something to a 'context stack' and remove it, but ew) or allow all functions to accept a context object. The latter is definitely heavy; the idea of getting and setting a context when the 'wrapping' needs to be done is appealing to me for the early phases, at least. It's sort of a context version of ob_start() and ob_end_clean().

I have the sneaking suspicion that Crell wouldn't be cool with it, but it could be a shim solution for the issue of wrapping/etc in the very short term.
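For reference, a minimal sketch of what such a shim might look like. The function names drupal_push_context() and drupal_pull_context() come from the comment above; their bodies, the _drupal_context_stack() helper, the stack-reading drupal_get_context(), and the render_block() example are all assumptions made for illustration, with plain arrays standing in for context objects.

<?php
// Hypothetical helper: returns the shared stack by reference.
function &_drupal_context_stack() {
  static $stack = array();
  return $stack;
}

// Make $context the current context until it is pulled again.
function drupal_push_context($context) {
  $stack = &_drupal_context_stack();
  $stack[] = $context;
}

// Remove and return the current context, restoring the previous one.
function drupal_pull_context() {
  $stack = &_drupal_context_stack();
  return array_pop($stack);
}

// Return the context on top of the stack, or NULL if the stack is empty.
function drupal_get_context() {
  $stack = &_drupal_context_stack();
  return $stack ? end($stack) : NULL;
}

// Stand-in for something that is wrapped, such as a block.
function render_block() {
  $context = drupal_get_context();
  return 'node ' . $context['node'];
}

drupal_push_context(array('node' => 1));
$outer = render_block();

// Wrap the inner call in a mocked-up context, then restore the original.
drupal_push_context(array('node' => 5));
$inner = render_block();
drupal_pull_context();

$restored = render_block();
?>

The wrapped code never receives the context as an argument, which is what makes this cheap to retrofit. It is also what makes it fragile: a missed drupal_pull_context() leaves every later caller looking at the wrong context.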

unalterable contexts?

Posted by dikini on April 25, 2010 at 4:57pm

The context object will not be alterable. Once a given element of context has been determined, it will remain constant for the remainder of that context object's lifetime. The context object will, however, provide a mechanism to wrap itself in a mock context object that allows for overriding of selected context elements at creation time only.

I think removing the mutability makes very little sense in this context. It is better to require that the functions using context objects are referentially transparent - that is if we call them with the same arguments, they will return the same result.

The context is a state carrier; as such it makes sense for it to be changing during the lifetime of a request. Think of it as the stuff passed through unix shell pipes. In our case there is no concurrency, so everything is passed between the stages via the context pipe. It can be a very useful optimisation for reducing the overall memory use. It carries the evolving state of the request, which at the end is the response that needs to be sent to the client. As such it should provide appropriate getter and setter methods, but they should be 'sexless' - independent of the processing phase in which the request is at the moment. It is a neat design trick, as we can minimize the slots used in the object - hook_invoke_xyz and dispatch are the same thing, they differ only in when they are called, so on entering a new stage, you call the dispatch/invoke method, and on exit, you replace it with its continuation, i.e. what to call next. This is good for composition and modularity.

It will allow a lot of other interesting future optimisations - for example 'hook result fusion', sorry, I just coined the term to be, the technique I have in mind is a variation of stream fusion (it is a haskell specific technique, but definitely applicable to drupal with contexts). And probably much more declarative style, favouring configuration and pattern matching to the callback soup we seem to favour.

Off topic, sorry for the multiple edits of my previous comment, I just can't spell today =(

Different scope of context

Posted by Crell on April 25, 2010 at 8:40pm

I think you're envisioning the context object as more encompassing than proposed. I actually don't think that the "response object" should be part of it. I'm not sure yet how to handle that beyond the infinite-nesting-blocks approach used by Panels Content Panes and the Blocks TNG presentation at the Dev Summit. It's not an all-encompassing state carrier, but a "this is the environment in which we are running" indicator.

From that angle, immutability becomes very important because you don't want a block in the left rail changing the context out from under a block in the right rail. That introduces very weird dependencies that force blocks to care about their rendering order, which we want to avoid.

What we do want is mockability. That is important for testing and also for local-scope changes. Eg, a block in one location should think the "current node" is something different than it is, so we can show a field from, say, an OG that contains a given node.

yes

Posted by dikini on April 25, 2010 at 8:49pm

I run wild sometimes. But now is the time to safely do it, even if it is thrown out with the bathwater. After chatting with Earl and Dmitri in irc, I'm thinking that a separate response and maybe state is a better thing. They are intricately linked, but definitely different. An immutable context does have the benefit of eliminating the various hook races, a problem I think is high time we solve. But that is for a different, synchronised with this, discussion.

yep thats the thing

Posted by dgoutam on May 2, 2010 at 4:50am

+1 from me.

Additional background

Posted by Crell on April 25, 2010 at 8:58pm

The original "Blocks TNG" proposal from the Dev Summit: http://www.garfieldtech.com/blockstng.odp (I may move this elsewhere later for posterity.)

The "Classic Drupal" session from Monday, with video: http://sf2010.drupal.org/conference/sessions/objectifying-php

ContextService

Posted by ronald_istos on April 26, 2010 at 12:32am

This is a very interesting discussion and I think that its complexity probably merits breaking things down a bit so as not to confuse issues.

Crell did a great and necessary job in setting out a long term plan but before going into the relative merits of and possible ways to resolve step 2 and beyond it is probably worth getting a clear idea of how to resolve 1. Whoever is thus inclined can keep the other steps in mind when proposing solutions for 1 but the final solution for 1 should stand on its own merits as pure good design that is modular and forward thinking.

Now, if I got this right the first pressing concern is creating a shared context that is somehow a) not simply a $GLOBALS by a fancier name but something that can do some secondary investigation on our behalf and provide answers and b) built in such a way as to also enable better testing, introduce flexibility and lead to a better overall Drupal architecture.

If the above is correct we are essentially looking at context as a ContextService rather than a ContextRegistry. It is not just an array of values but something you ask questions of and, given that some answers may probably be derived in different ways, they may also have differing implementations (e.g. a live one vs one that provides lots of test context for ...testing).

Now one way to do this is via Dependency Injection, which despite its ugly name is really about just making dependencies explicit and local rather than referring to an obscure registry. Functions would make explicit what "stuff" they need and some context service object would be responsible for going away and getting that "stuff" for them.

That may sound like overkill for something as simple as "what is the current user", and more often than not the function would simply get back some sort of standard context object, but a little hard work up front will pay off a lot later on, as you are able to plug in and out how this context is determined and also send back pure test data, etc.

The typical OO way to implement dependency injection is using interfaces and an assembler class that goes to get the right interface implementations and passes that back to the function in question.
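A minimal sketch of that arrangement follows. Every name in it (ContextServiceInterface, LiveContextService, TestContextService, ContextAssembler, greeting) is hypothetical and chosen only to illustrate the interface-plus-assembler idea, using the "current user" case from the comment.

<?php
// All names below are hypothetical.
interface ContextServiceInterface {
  public function currentUserId();
}

// "Live" implementation: derives its answer from session data.
class LiveContextService implements ContextServiceInterface {
  protected $session;

  public function __construct(array $session) {
    $this->session = $session;
  }

  public function currentUserId() {
    return isset($this->session['uid']) ? (int) $this->session['uid'] : 0;
  }
}

// Test implementation: returns whatever the test set up.
class TestContextService implements ContextServiceInterface {
  protected $uid;

  public function __construct($uid) {
    $this->uid = $uid;
  }

  public function currentUserId() {
    return $this->uid;
  }
}

// The "assembler": decides which implementation callers receive.
class ContextAssembler {
  public static function build($mode, array $options = array()) {
    if ($mode === 'test') {
      return new TestContextService($options['uid']);
    }
    return new LiveContextService($options['session']);
  }
}

// Consuming code depends only on the interface, not on how it was built.
function greeting(ContextServiceInterface $context) {
  return $context->currentUserId() ? 'Welcome back' : 'Hello, guest';
}

$live = greeting(ContextAssembler::build('live', array('session' => array())));
$test = greeting(ContextAssembler::build('test', array('uid' => 42)));
?>

greeting() cannot tell the two implementations apart, which is the point: a test can feed it pure test data without touching sessions or globals.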

In a Drupal setting implementation will be interesting as we should avoid making code overly complex and even harder or more obscure to read.

Passing context as an argument to all functions that need it is relatively limiting. Why not have the function go ask for the context it needs, not from a dumb registry but from an intelligent context-answering service?

Hope some of above makes sense :-)

You're on the same page

Posted by Crell on April 26, 2010 at 5:48am

Although the proposal above doesn't include the phrase "Dependency Injection" (how did I overlook that?), all of the passing of context objects above is a form of dependency injection, quite deliberately. At least at this point it's injection of just those bits that are common to the overall Drupal context rather than allowing some components access to the database connection and others not, but it's a great first step. :-)

See the two links I posted above for more background on the sorts of coding patterns I'd like to see us using here.

calling functions with context

Posted by dikini on April 26, 2010 at 6:11am

I'm not sure that having a get_context() function buys you much. You are making the function dependent on a context, which will come and bite you later. I do think that the alternatives are better - either having the context explicit in the arguments of the function or lifting the function we want to call into the context. They are very similar as an end result and both require more typing than desired.

<?php
$result = some_function($context, $arg);

$module->with_context($context)->some_function($arg);

$state->with_context($context)->some_function($arg);

$context->some_function($arg);
?>

The first kind of call is simple and, interface-wise, a simple evolution of current drupal. If some_function is a state transformer, that is, it returns some kind of state object, we can do method chaining with the results. Threading the context is a bit of extra typing, but not much of a pain.

The second option is slightly more involved. It requires equipping a module with a method lifting contexts into scope. It is limiting, as it is harder to utilise cross-module collaboration without too much extra piping.

The third option is similar to option one, but it requires the called function to be a method of some class, i.e. modules converted to classes. It might be too much, but then again, it is a big cleanup compared to the names we use now.

I can't see a sensible way of doing the fourth option, unless you stuff all functions into the context as methods, and that buys you nothing.

Being explicit is good - it documents applications. It offers less surprises. We do need to make sure the protocols used in the calls are sensible.

My personal favourites are 1 and 2 - they are siblings, and they allow forming request-response pipelines, which is very good, as it opens the ground for a lot of smart optimisations.

EDIT: still learning to type

Stop gap

Posted by Crell on April 26, 2010 at 6:48am

The global accessor function is deliberately a stop gap mechanism. It's not where we want to eventually end up.

The goal there is to get the context object in place, with the registration mechanism, and get core using it. That forces us to think about how we're using context and where, and removes a lot of old ugly code.

Once we have that in place, then we can start injecting the context object rather than requesting it. And we can do that in phases, working our way down from the top-level (the dispatcher and display controller) to the components they call (blocks, or whatever they become), and so on down the line. That allows a step-wise conversion, getting us more injectability/testability/coolness in a steady, achievable stream. Some steps will undoubtedly be larger than others, but it's still less work than trying to do everything in one fell swoop.

Modules becoming classes is dead-on-arrival and shall not be mentioned again. :-) Modules are simply a convenient way to package and deliver related functionality.

The global accessor function

Posted by dikini on April 26, 2010 at 9:05am

The global accessor function is deliberately a stop gap mechanism. It's not where we want to eventually end up.
Unfortunately temporary solutions tend to last forever. And I think in this case it breaks modularity.

Once we have that in place, then we can start injecting the context object rather than requesting it.
Be careful with that. I think, even long term, it is good to be explicit in the use of context. It is a relatively expensive operation, and as a rule of thumb, it is good for such things to use expensive typing and be explicit, so that the programmer understands that there is something to pay for the use.

Modules becoming classes is dead-on-arrival and shall not be mentioned again.
I know. I think it is a bad design anyway. Messy, and doesn't give much more than a convenient namespace.

The testability is a good thing. Injectability - not so sure. With the current drupal design we have a kind of dependency injection by way of hooks. This is what it is - automatic callback discovery, where the choreography is done, mostly, by core. I think there is benefit in going further than the current proposal. One of the logical directions, based on current drupal tech, is unifying the page array and form apis, making them completely defunctionalised and driving the content build process from that. A sort of an interpreter for a page description language. In OO terms this object is just an iterator, and the pipeline of callbacks is applied appropriately to each element of the iterator. The context is the static component of the 'iterator'. The response is its output. If you prefer functional designs this is a classic build/fold loop. The latter reveals nice properties of the process, which will allow the elimination of most intermediate state automagically :) This paragraph is off-topic, but I'm writing it as an attempt to clarify where I'm coming from.

sorry, I meant to say 1 and 3

Posted by dikini on April 26, 2010 at 6:12am

sorry, I meant to say 1 and 3 above

Back on track, please

Posted by sdboyer on April 26, 2010 at 7:17pm

Discussion is getting off track, IMO. What first, and still, needs to be adequately addressed is WHAT each layer in this system is responsible for - context, dispatcher, display controller, and maybe a responder - we're getting way too much into how. And pedantics.

Nice write-up ... nice dialog

Posted by stephen.moz on April 29, 2010 at 4:08pm

High-level architectural plans like this always encourage a certain amount of "bikeshedding" in the conversation, but despite that fact it is encouraging to see this dialog going on.

It would be nice to see Drupal do less "re-inventing" of what native PHP 5 and above can already do.

(see e.g., [http://www.garfieldtech.com/blog/php-magic-call] could do away with a lot of "switch-statements-used-as-method-dispatch-code" in Drupal )

Practical examples using real code can help illustrate the point, as can looking at the approach of nearly every other mainstream PHP-based Web development framework.

Is it really against the Drupal culture to move forward in this direction?

I quite like the magic methods

Posted by adrian on April 29, 2010 at 4:27pm

But apparently they can be quite slow (or so I've heard).

Also, unless you have an object instance around, you probably want to be using a static to access the methods, and unless you depend on PHP 5.3 with late static binding (which behaves how you would expect __call to work) and the __callStatic method, stuff can become very weird.

Yes.

Posted by chx on April 29, 2010 at 4:42pm

Yes.

Edit: this is the answer to 'Is it really against the Drupal culture to move forward in this direction?'

well, yes

Posted by dikini on April 29, 2010 at 5:42pm

The write-up is fine, but as Adrian says, magic methods have some weird behaviour. Don't want to expand on it here.

Drupal is doing magic which is not generally used in other PHP frameworks and apps. If you want to see OO similar in spirit to what Drupal is doing, look further afield: .NET Rx and its interaction with Linq is the closest I can think of.

I would love to see more OO used in Drupal, as it simplifies the syntax, makes it easier to comprehend, but without sacrifices to the modularity which we have now.

I would love to see the problems we have in that respect cleared up. They are related to the gratuitous use of state, drupal_goto, and other sins of the same family. In that respect, the part of the proposal about maintaining an immutable Context object is a step in the right direction.

My objections are against get_context style functions: those can be easily abused, add an unnecessary dependency on yet another API, and reduce, albeit slightly, the ease of testing. I know that the intent is for this to be temporary, but I don't believe we should add temporary artifacts just to remove them later.

Anyway, if we are to move toward the logical conclusion of the better apis in drupal - fapi, dbapi, views we should completely de-functionalise them. I would speculate, that it will force the repair of a lot of ugly code in contrib and make a lot of happy bunnies as a result. We do need to improve the compositionality of modules and components. At the moment it happens by accident, but the potential, and the changes to achieve it are not that big ;) But I digress...

And there you are wrong

Posted by chx on April 29, 2010 at 7:05pm

I would love to see more OO used in Drupal, as it simplifies the syntax, makes it easier to comprehend, but without sacrifices to the modularity which we have now.

You can't do that because PHP does not let you extend a class once it's defined. This is the fundamental problem and going around it requires a drastically different design to what we have now.

Not sure...

Posted by dgoutam on May 3, 2010 at 1:42am

Not very sure about "PHP does not let extend a class once it's defined"...

See it here at Dynamically Add Functions to PHP Classes
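For readers who do not follow the link, the technique under discussion looks roughly like the sketch below. It is only an illustration: the class `Extensible` and its `addMethod()` method are hypothetical names, not Drupal code and not necessarily what the linked article does. Callbacks are stored in an array and routed through `__call()`, and the object is handed to the callback explicitly, so the callback only sees the public surface of the object.

```php
<?php
// Hypothetical sketch: "Extensible" and addMethod() are illustrative names,
// not part of Drupal or of any library.
class Extensible {
  private $secret = 'hidden';
  public $label = 'visible';
  private $methods = array();

  // Registers a callback under a method name after the class is defined.
  public function addMethod($name, $callback) {
    $this->methods[$name] = $callback;
  }

  // Called by PHP whenever an undefined method is invoked on the object.
  public function __call($name, $args) {
    if (!isset($this->methods[$name])) {
      throw new BadMethodCallException("No such method: $name");
    }
    // Pass the object as the first argument; the callback is ordinary
    // outside code, so it can read $label but not $secret.
    array_unshift($args, $this);
    return call_user_func_array($this->methods[$name], $args);
  }
}

$obj = new Extensible();
$obj->addMethod('describe', function ($self, $prefix) {
  return $prefix . $self->label;
});
$description = $obj->describe('Label: ');
echo $description, "\n";
```

Note that every such call goes through both `__call()` and `call_user_func_array()`, which is where the performance concerns in this sub-thread come from.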

Using the __call() magic

Posted by sdboyer on May 3, 2010 at 3:13am

Using the __call() magic method to simulate modification of class definitions is a far, far cry from a real, architecturally respectable solution.

Worthless + slow

Posted by chx on May 3, 2010 at 3:40am

You can't access private/protected plus magic methods are slow...

Not entirely true

Posted by Crell on May 3, 2010 at 4:42am

Your flippant dismissal of an entire branch of PHP development is really disappointing.

First of all, the fact that you can't access private/protected properties is not a deal breaker. There are many many use cases where you don't need to, and in fact it's good if you don't. Extended Context is, arguably, one of them since we don't want an extended context to suddenly become mutable when we expect it to be immutable. So in this case it's a feature, not a bug.

Secondly, there are ways around that. I implemented one three years ago: http://www.garfieldtech.com/blog/php-magic-call

The only catch is private arrays, but as above that's not a fatal issue, and there's probably ways around that, too. (If we used direct properties on the facade object rather than routing through __get(), that would probably resolve that issue.)

Third, "it's slow" is not a useful statement unless you can quantify it. Giant masses of arrays are slow, too, depending on what you're doing with them. Some of the things we do with arrays are horribly slow, for certain definitions of slow.

I have previously quantified the speed trade-offs of __call() and friends: http://www.garfieldtech.com/blog/magic-benchmarks

In short, while __call() does have a non-trivial performance cost, call_user_func_array() has nearly the same cost. Guess what function we execute probably over a thousand times in a given Drupal run? call_user_func_array() is the basis of the entire hook system. __call() is no slower.

So dismissing the entire concept of __call() as "worthless and slow" is a statement both ignorant and naive.

re: "It's slow": the relevant

Posted by sdboyer on May 3, 2010 at 3:23pm

re: "It's slow": the relevant comparison is the speed of calling a literally defined userspace method, versus the cost of calling a secondary function via the __call() method. Your own benchmarks use cufa() for that second call internally (which doesn't necessarily have to be the approach) but clearly are around twice as expensive as a single cufa() call.

We use cufa() a fair bit in drupal-land, sure. ~40 calls to it in core. On some requests those get hit frequently, creating the 1000+ calls you describe; other times the number of calls is much smaller. But what's being discussed here isn't a scattered set of calls to cufa() that occasionally get hit; we're talking about center-stage, critical path architectural foundations that get hit every time you take a proverbial step. We'd be idiots, I believe, to put any more logic into userspace than we absolutely must, which takes big userspace patterns like that off the table for me. But hey, if it could replace our endless big arrays...maybe.

Regardless, the only thing that's REALLY "disappointing" here is that we're wasting yet more time indulging intellectual-handwaving-discussions on implementation details that do nothing to advance agreement on or understanding of the broader architecture, which is still in question.

Dammit.

/me slaps self for contributing to essentially OT discussion.

Or: Class mixin chains via code generation

Posted by donquixote on October 3, 2010 at 2:45am

See http://groups.drupal.org/node/96849

Whether OOP is simpler

Posted by matt2000 on April 30, 2010 at 3:24am

Whether OOP is simpler syntax or more comprehensible is a huge matter of opinion.

I would argue that poorly-written and/or poorly-organized functional code is much more comprehensible than poorly written OO code. And since open source in general, and the Drupal ecosystem especially, encourages contributions from less experienced developers, we should cater to the lowest common denominator. I think the success of Drupal as a development platform as compared to other (OOP) CMSs proves my point.

OOP might make more sense to Drupal devs who come from a Computer science background, but I contend that functional makes more sense for Drupal devs who come from a traditional web design (HTML/CSS/Javascript only) background.

Put another way, better to create something easily abused than something not easily used.

But I digress. This should not devolve into another OOP vs functional debate. We're obviously already on the path of incorporating OOP where it makes sense (low level components like DB abstraction), and we should continue to do so. But we should never discount the valid point of the additional challenges of OOP for many developers.

Drupal.org user profile
Drupal Micro-blogging: http://twitter.com/matt2000

Translation

Posted by Crell on April 30, 2010 at 3:40am

"Here's all of my opinions on the subject... But we shouldn't discuss it." Nice. Very nice.

We should use whatever tool makes the most sense to use for the problem at hand. Anything else is noise. Now can we get back to architecture and not complaining about OO?

"Here's all of my opinions on

Posted by matt2000 on April 30, 2010 at 3:59am

"Here's all of my opinions on the subject... But we shouldn't discuss it."

Haha. Not my intent at all. Glad to discuss it in another thread. And I wasn't complaining.

We should use whatever tool makes the most sense to use for the problem at hand.

Here we agree completely.


:)

Posted by dikini on April 30, 2010 at 8:57am

I would argue that poorly-written and/or poorly-organized functional code is much more comprehensible than poorly written OO code.

I totally agree with that statement. At the risk of getting into a semantic bog, Drupal is not functional, although it would benefit from a more referentially-transparent-by-convention approach.

What classes and objects can do is offer a useful interface to an underlying machinery, but that is very hard to do. Especially in php. Especially if you get design inspiration from php staple OO like SPL.

An even more declarative approach will be good. It is hard to get right, just look at the evolution of fapi or php template. But as you've said it is for another thread.

Having a context object - (a php object, array, magic mushroom, I don't care what atm), with some of the properties described would be of a big benefit. Even if it just improves the code hygiene, although that is not the only benefit. It seems that there is a tentative agreement on that. The major opinions diverge at the later stages.

So what are the options?
* modify required functions to comply with a fun( $context, ...) protocol
* add a drupal_get_context() temporary function
* do magic patching of the context class and lift the required functions into the context

My personal favourite is the first option. It is not a hack - it is an explicit statement, which makes the interface obvious to the programmer. I don't see major problems with that bar interface inconsistency between functions which require context and those that don't.

The second option is a hack in my opinion. And a bad one to boot. It introduces a side channel, it is not obvious from looking just at the function documentation that the function uses a context. It breaks modularity, by introducing dependencies on an 'undocumented' feature.

The third is too fragile in PHP. It is probably a major hack, but it can be done in conjunction with the first option, to simplify the external interface, with some speed concessions.
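To make the difference between the first two options concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `RequestContext`, `example_get_context()`, and the two `greeting_*` functions are hypothetical names, not proposed APIs.

```php
<?php
// Hypothetical sketch: all names here are illustrative, not Drupal APIs.
class RequestContext {
  private $values;

  public function __construct(array $values) {
    $this->values = $values;
  }

  public function get($key) {
    return isset($this->values[$key]) ? $this->values[$key] : NULL;
  }
}

// Option 1: the dependency is visible in the function signature.
function greeting_explicit(RequestContext $context, $name) {
  return 'Hello ' . $name . ' at ' . $context->get('path');
}

// Option 2: a global accessor (a stand-in for drupal_get_context()).
// Passing a context stores it; calling with no argument returns it.
function example_get_context($set = NULL) {
  static $context = NULL;
  if ($set !== NULL) {
    $context = $set;
  }
  return $context;
}

// The signature gives no hint that a context is consulted.
function greeting_implicit($name) {
  return 'Hello ' . $name . ' at ' . example_get_context()->get('path');
}

$context = new RequestContext(array('path' => 'node/1'));
example_get_context($context);

$explicit = greeting_explicit($context, 'dikini');
$implicit = greeting_implicit('dikini');
echo $explicit, "\n", $implicit, "\n";
```

Both calls produce the same string; the difference is only in where the dependency is declared, which is the point being argued.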

Context stack

Posted by Crell on May 2, 2010 at 1:32am

So further up, eaton threw out the idea of a context stack to handle legacy procedural code. He's right that it kinda rubs me the wrong way, but it does offer a potential way to allow mocking to carry through to legacy code.

After some discussion with eaton and merlinofchaos in IRC, I did a trial implementation just to see if I could do it, and while I'm not entirely happy with the resulting code it is possible to do. In short, whenever you create a new mock context object you get back not one but two objects; the new mock context object and a tracking object. The context object is also pushed onto a central (global?) stack. The tracking object works much the same way as transactions do in Drupal 7. All it really does is wait to go out of scope, and when it does it pops the related context object off of the stack. That's necessary because of exceptions: It's possible for the process to suddenly bubble up "out of band" above where a given context object was mocked, and we need to auto-pop the mocked context objects along the way. context_get() then returns whatever the top-most context object is.
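The mechanism described above can be sketched roughly as follows. This is not the trial implementation mentioned in the post; it is a reconstruction under stated assumptions. The names `StackedContext`, `ContextStack`, `ContextTracker`, and `context_mock()` are hypothetical; only `context_get()` is named in the discussion.

```php
<?php
// Hypothetical sketch of a context stack with a scope-bound tracking object.
class StackedContext {
  private $values;

  public function __construct(array $values) {
    $this->values = $values;
  }

  public function get($key) {
    return isset($this->values[$key]) ? $this->values[$key] : NULL;
  }
}

class ContextStack {
  private static $stack = array();

  public static function push(StackedContext $context) {
    self::$stack[] = $context;
  }

  // Removes one specific context object (strict identity comparison).
  public static function remove(StackedContext $context) {
    $index = array_search($context, self::$stack, TRUE);
    if ($index !== FALSE) {
      array_splice(self::$stack, $index, 1);
    }
  }

  public static function top() {
    $count = count(self::$stack);
    return $count ? self::$stack[$count - 1] : NULL;
  }
}

// Does nothing except remove its context when it goes out of scope.
class ContextTracker {
  private $context;

  public function __construct(StackedContext $context) {
    $this->context = $context;
  }

  public function __destruct() {
    ContextStack::remove($this->context);
  }
}

// Returns two objects: the mock context and its tracker.
function context_mock(array $values) {
  $context = new StackedContext($values);
  ContextStack::push($context);
  return array($context, new ContextTracker($context));
}

function context_get() {
  return ContextStack::top();
}

// Legacy procedural code that reaches for the ambient context.
function legacy_path() {
  return context_get()->get('path');
}

function run_with_mock() {
  list($context, $tracker) = context_mock(array('path' => 'mock/path'));
  // The exception leaves this scope "out of band"; $tracker is destroyed
  // as the frame unwinds, which removes the mock from the stack.
  throw new RuntimeException(legacy_path());
}

ContextStack::push(new StackedContext(array('path' => 'real/path')));
try {
  run_with_mock();
}
catch (RuntimeException $e) {
  $inside = $e->getMessage();
}
$after = legacy_path();
echo $inside, ' then ', $after, "\n";
```

The sketch relies on PHP destroying an object as soon as its last reference disappears, the same property the Drupal 7 transaction objects depend on.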

Such an approach would have a number of important implications we need to consider:

The context object would actually be an object. I am fairly sure that the language semantics of arrays and objects make an array-based implementation of the context object, if not impossible, then so amazingly ugly that it would be foolish to even consider such an approach. (I don't actually consider this a downside and I think we will have to do this anyway, but I mention it for completeness.)
context_get() now handles mocked context; viz., if you've mocked up a context object then functions that call context_get() will get the mocked version. This is arguably a good thing, with the caveat in the next item.
That probably means that context_get() will never go away. Procedural code will just start using it and we won't be able to get rid of it. Although that reduces the amount of code that has to be rearchitected, I am still not sure if that's a good thing or not. We desperately need to revisit some of the decisions we've made in the past that have led to our current dependency spaghetti, and making it too easy to not need to robs us of a golden opportunity.
Because context_get() is so easy, we would need to establish a coding standard for when it may or may not be used. Specifically, *any use of context_get() in an object rather than passing it in the constructor must be considered a bug.* To be honest I question our ability to hold to such a standard when it's so easy to "just call it", and our track record for doing something right rather than "the simple way that works now" is not very good.
As dikini notes, it introduces another side channel we need to keep track of. While forcing the context object to be read-only does largely avoid side effects (which are generally a very bad thing in any architectural design), it doesn't avoid them entirely. It could introduce more "mystery magic", and that's something we want to avoid.

In short, then, a context stack is a potential approach, but it has some serious implications we need to think about; basically I would only support it on the condition that we all agree, explicitly and officially, that it is still a temporary measure and "let's convert this to the new system so we can eliminate a context_get() usage" is always treated as a valid reason to refactor something. "No, that's bloat" would not be accepted as a counter-argument, because context_get() would have to still be understood as a crutch and hack that is itself bloat. (Lines of code are not the way one should measure "bloat".)

I am not too confident in our ability to make that commitment.

you just about persuaded me :)

Posted by dikini on May 2, 2010 at 8:33am

Although you lost me in the description of the implementation details. For the sake of the argument I'm going to broadly call the non-module parts of Drupal the runtime. So if we make sure that only the runtime can modify the context directly, and the modules can at most ask for a modification to be performed on their behalf, then it is probably OK. It still rubs me the wrong way, as in being non-obvious, but we can require documentation as a half-hearted measure. The implementation details of context will still be opaque to the clients, so even if the first implementations are not good, we can always change that. The fact is we do have an ambient environment, the runtime, in which the code lives. Specifying what it can be and having control over it is a good thing. The client interface is what matters most and has to be done mostly right, as the cost of getting that wrong is high. Rewrites are painful.

From testability point of view this is a great improvement. The single point of reference guarantees that the testing harness has control, so it can split the dependencies of the client code and mock the world.

get_context() is not my preferred solution, see above, but I could live with it, I'll just have to remember that it is part of the ambient environment. Call me old school, but I prefer that to be explicit - helps code reading. PHP doesn't have helpful type signatures to help with implicit parameters :(

By the way, what is so heavy about passing it as an argument? In php, due to COW, arguments are always a "reference", so you probably don't mean that. I presume that this refers to the proliferation of arguments we've seen in the past. Which is ugly, I agree. The other side effect is the existence of different function signatures - but I find that good, it is documentation. If something does not need it at all, then it won't require it. The rest is interface, or call protocol clean up.

I wholeheartedly agree that we need to revisit some decisions made in the past. This is part of it, right?

just a clarification

Posted by dikini on May 2, 2010 at 8:36am

From the point of view of the client of a context, context stack and context are one and the same. That is the reason I avoided using the former term above.

I have been strongly thinking

Posted by merlinofchaos on May 2, 2010 at 4:05pm

I have been strongly thinking that a context stack is an acceptable solution that will allow us to transition smoothly, yet provide the mutable environment we need by mocking new contexts.

I realize that we have to accept that context_get() will probably never be able to go away, but we should be able to limit its usage to things like evaluated code and distant hook invocations where modules are failing to pass context along as they should.

agreed on evaluated code.

Posted by dikini on May 2, 2010 at 4:25pm

agreed on evaluated code. that is a very strong case. we should consider the second one as a bug, but it does make sense as an interim solution. the more I turn the context stack in my head, the more I think it is the best compromise we have at hand.

Evaluated code?

Posted by Crell on May 2, 2010 at 5:46pm

I'm not sure I follow what you mean by "Evaluated code". You mean eval() code? Doing anything to support that scares me. :-)

Also, I think it's important to distinguish between mutable context and mockable context. Those are different things. What we want is an "immutable, easily mockable context object". That lets us have the reliability of immutability and the testability of mockability, both of which we want.
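A minimal sketch of that distinction, assuming a hypothetical `ImmutableContext` class with a `mock()` method (neither name comes from the plan): the object rejects writes, and mocking produces a new object with overrides rather than changing the existing one.

```php
<?php
// Hypothetical sketch: ImmutableContext and mock() are illustrative names.
class ImmutableContext {
  private $values;

  public function __construct(array $values) {
    $this->values = $values;
  }

  public function get($key) {
    return isset($this->values[$key]) ? $this->values[$key] : NULL;
  }

  // Any attempt to assign a property from outside the class is refused.
  public function __set($name, $value) {
    throw new LogicException('Context is read-only.');
  }

  // Mocking never alters $this; it returns a new context in which the
  // overrides take precedence over the existing values.
  public function mock(array $overrides) {
    return new self($overrides + $this->values);
  }
}

$real = new ImmutableContext(array('path' => 'node/1', 'user' => 'admin'));
$mock = $real->mock(array('path' => 'node/2'));

$blocked = FALSE;
try {
  $real->path = 'node/3';
}
catch (LogicException $e) {
  $blocked = TRUE;
}

echo $real->get('path'), ' ', $mock->get('path'), ' ', $mock->get('user'), "\n";
```

Code holding `$real` can rely on it never changing, while a test can still hand a different context to the code under test.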

evaluated code == php filter

Posted by merlinofchaos on May 3, 2010 at 12:56am

evaluated code == php filter == drupal_eval()

Temporary measures

Posted by eaton on May 6, 2010 at 9:12pm

In short, then, a context stack is a potential approach, but it has some serious implications we need to think about; basically I would only support it on the condition that we all agree, explicitly and officially, that it is still a temporary measure and "let's convert this to the new system so we can eliminate a context_get() usage" is always treated as a valid reason to refactor something. "No, that's bloat" would not be accepted as a counter-argument, because context_get() would have to still be understood as a crutch and hack that is itself bloat. (Lines of code are not the way one should measure "bloat".)

Indeed. This is difficult, but in some ways I feel that consideration for the developer community means we must make some concessions. Even in the Drupal 4.6 to 4.7 transition, we had four months or so where a temporary formapi_legacy.inc file was provided, with legacy versions of our old form-building functions that worked but spat back admin notices warning that 'legacy code' needed to be converted.

Aspect could be the other option

Posted by dgoutam on May 2, 2010 at 5:10am

If we could use run-time weaving of the context via the aspect-oriented programming paradigm, we could get all the context we need without the pain of changing much of the code base.

Huh?

Posted by Crell on May 2, 2010 at 5:42pm

You'll have to explain what you mean, since I've never seen "Aspect oriented programming" thrown around with actual concrete examples of what it means; at least not in a language anyone uses in the real world. :-) I have no idea what you just said.

Thanks

Posted by dgoutam on May 3, 2010 at 1:55am

Yes, you are right, Crell. It was my mistake; I should have provided more details with my suggestion.
There are real implementations in the software engineering world, and they work well. We have used it successfully in various Java projects for our enterprise clients.

In the world of php this AOP also exists.
Aspect-Oriented Programming for PHP

We're kinda doing it

Posted by Crell on May 3, 2010 at 4:47am

chx pointed me at these links as well:

http://msdn.microsoft.com/en-us/library/aa288717(VS.71).aspx
http://blog.jonnay.net/archives/637-Aspect-Oriented-Programming-in-PHP-a...

It sounds very very much like what we're already doing in Drupal 7 with the SelectQuery extender objects, give or take some implementation details. I'm still iffy on extender objects, as they're the best solution we have found so far but still not as clean as I'd like. Maintaining interface declarations when wrapping is the hard part, since PHP doesn't have, as far as I know, a mechanism for creating a new class at runtime to model the class being aspected (or whatever the phrase is). If we don't need a complete override, though, it may be possible to use a variant of that approach to register extended context callbacks. See also: http://www.garfieldtech.com/blog/php-magic-call

Now then, back to the context stack concept and its pros and cons...

Lambda functions in Phase 6

Posted by tamasd on May 6, 2010 at 4:45pm

I have a comment for phase 6:

We could allow lambda functions (introduced in PHP 5.3) there. So besides the classic hooks, there would be "temporary hooks" (this may not be the best name), which could be registered like this:

<?php
if ($some_rare_condition) {
  // Register a closure as a temporary implementation of some_hook.
  $context->registerHook('some_hook', function ($context, $arg0, $arg1) {
    // do something here
  });
}
?>

Where ->registerHook takes two arguments: the hook name and a callback function.

Then ->invokeHook() can go through these registered hooks and the classic hooks (persistent hooks), which can provide some flexibility and maybe a performance boost (fewer hooks are defined globally).
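A rough sketch of how the two methods could fit together, under stated assumptions: the class name `HookContext`, the module list passed to its constructor, and the `example` module are all hypothetical. Only the method names `registerHook` and `invokeHook` come from the comment above.

```php
<?php
// Hypothetical sketch: HookContext and its constructor are illustrative.
class HookContext {
  private $modules;
  private $temporary = array();

  public function __construct(array $modules) {
    $this->modules = $modules;
  }

  // Registers a closure (or any callable) as a temporary hook.
  public function registerHook($hook, $callback) {
    $this->temporary[$hook][] = $callback;
  }

  // Invokes classic module_hook() functions first, then temporary hooks.
  // Extra arguments after the hook name are forwarded to every callback.
  public function invokeHook($hook) {
    $args = func_get_args();
    // Replace the hook name with the context as the first argument.
    $args[0] = $this;
    $results = array();
    foreach ($this->modules as $module) {
      $function = $module . '_' . $hook;
      if (function_exists($function)) {
        $results[] = call_user_func_array($function, $args);
      }
    }
    if (isset($this->temporary[$hook])) {
      foreach ($this->temporary[$hook] as $callback) {
        $results[] = call_user_func_array($callback, $args);
      }
    }
    return $results;
  }
}

// A classic (persistent) hook implementation for a module named "example".
function example_some_hook($context, $arg0, $arg1) {
  return $arg0 + $arg1;
}

$context = new HookContext(array('example'));
$context->registerHook('some_hook', function ($context, $arg0, $arg1) {
  return $arg0 * $arg1;
});

$results = $context->invokeHook('some_hook', 2, 3);
print_r($results);
```

In this sketch the temporary hooks live only as long as the context object that holds them, which is one possible reading of "temporary".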

Premature

Posted by Crell on May 6, 2010 at 4:54pm

As sdboyer has noted above, it is way premature to be discussing language-version-specific implementation details of a late-phase part of the process. Things that get hashed out in the earlier phase implementations will likely change later phases considerably.

Next phase of the process

Posted by Crell on May 10, 2010 at 5:47am

OK, after discussing this plan with Dries at CMS Expo in Chicago and getting a green light to continue working in this direction (yay!), I am going to break up the discussion a bit. This thread has gotten long enough, and there's too much side chatter going on to be able to track the important bits.

I think we're all on the same page about the overall plan, though. Now we can start talking in a bit more detail about each phase separately. See the new posts in this group. Let's focus on those now and try to stay on topic. :-) We'll leave this post here for posterity.
