Plugin research results

Crell's picture

For those who are researching other projects and how they handle plugins, please post your research results in this thread.

Note: If you are not talking specifically about the analysis of other projects and how they handle plugins, please stay out of this thread.

There have been way too many off-topic tangents lately that are only distracting us from the work that needs to be done.

Some off-topic comments have been unpublished. Please stick to presenting reports of other systems and evaluating the pros/cons of those systems.

Comments

How does Plone do

pounard's picture

Today I spoke with a colleague who is a Plone contributor about the concept of a plugin. After some discussion, explanations, and adjustments (we work on really different technologies here), we finally understood what each of us meant by a plugin.

As a reminder, Plone is a Python CMS running on top of the Zope application server. Its business is basically the same as Drupal's, which is managing content, but its architecture is radically different (because of the application server facet). In the end, though, under the hood there are a lot of pieces of API that implement, in their own way and adapted to the technology, high-level abstract concepts we also find in Drupal.

In Plone, they have a plugin concept that looks a bit like what Crell defines here; they call it "Components". A component in Plone is a specific object (Python is highly dynamic, so they do not talk about class definitions here but about objects, because their signature can change at runtime) that contracts with a specific interface. Unlike PHP, Python has no native language-level interface construct, so the interface concept is materialized at runtime by a specific framework (provided by Zope itself, if I'm not wrong). It is no different from PHP interfaces except for the dynamic aspect. It is a specific implementation of the design by contract pattern.

Modules in Plone define their own interfaces and components (objects that contract with those interfaces) in an XML configuration file; it's static information.

In order to use components in the code, they created a component registry (which is also a factory, as in the factory pattern) able to spawn these objects on demand. It's called z3c.baseregistry (I took this latter statement from the Plone documentation and not from my talk with my colleague). When a module needs an object that implements a particular interface, it asks the core framework for the specific registry that handles this interface, and asks that registry for a component. As I understood it (I might be wrong), they more often ask for "any implementing component/object" than for "this particular implementation", which means they do not have total user control over the implementation used at runtime.

Now back to the discussion I had with this colleague. He finally said that this component factory (he calls it a factory more often than a registry, but it remains both in the implementation) has a generic implementation. This means that any module that defines interfaces will implicitly create a registry, and a module that declares components contracting with one of those interfaces will implicitly register them, as soon as both are registered in a specific configuration file. Then the site integrator wires the components together within configuration files.
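To make the registry-as-factory idea above concrete, here is a minimal sketch in Python. It is not the zope.component or z3c.baseregistry API: the ComponentRegistry class, the IRenderer interface and both renderers are hypothetical names invented for illustration, and registration is done with explicit calls rather than an XML configuration file.

```python
from abc import ABC, abstractmethod


class ComponentRegistry:
    """Hypothetical registry that doubles as a factory: it maps an interface
    to the component classes declared for it and spawns instances on demand."""

    def __init__(self):
        self._components = {}  # interface -> {name: component class}

    def register(self, interface, name, component_cls):
        # Enforce the contract when the component is declared.
        if not issubclass(component_cls, interface):
            raise TypeError(
                f"{component_cls.__name__} does not implement {interface.__name__}"
            )
        self._components.setdefault(interface, {})[name] = component_cls

    def get(self, interface, name=None):
        """Return a new component for the interface. Without a name, any
        registered implementation will do (here: the first one registered)."""
        candidates = self._components.get(interface, {})
        if not candidates:
            raise LookupError(f"no component registered for {interface.__name__}")
        if name is None:
            component_cls = next(iter(candidates.values()))
        else:
            component_cls = candidates[name]
        return component_cls()


# Hypothetical interface and components, for illustration only.
class IRenderer(ABC):
    @abstractmethod
    def render(self, text):
        ...


class PlainRenderer(IRenderer):
    def render(self, text):
        return text


class ShoutingRenderer(IRenderer):
    def render(self, text):
        return text.upper()


registry = ComponentRegistry()
registry.register(IRenderer, "plain", PlainRenderer)
registry.register(IRenderer, "shouting", ShoutingRenderer)

any_renderer = registry.get(IRenderer)                # "any implementing component"
named_renderer = registry.get(IRenderer, "shouting")  # "this particular implementation"
```

The unnamed lookup is the "any implementing component" case described above; the caller only relies on the interface and leaves the choice of implementation to the registry.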

So, they do not use plugins as CTools would, in the sense that if we look at the Views module, the components (or plugins) being used are determined by a complex UI (and have a dynamically stored configuration); therefore you can have on your site a lot of instances of components contracting with the same interface but using different implementations, all running together at the same time.

He told me that the reason why they do not go further than providing the factory (registry) itself is that managing the UI in a generic way is nearly impossible. Each interface is spawned by a different module with a different business purpose, or is at least often used for many different use cases, and because those use cases cannot be generalized, neither can the UI. The only things those components have in common are the fact that they all contract with an interface ("an" in the indefinite sense, because they do not all answer to the same interface) and the fact that they can all be requested through a generic factory that itself has its own interface.

This is a part of my research. Hope it helps.

Forgot some links:
http://plone.org/documentation/manual/theme-reference/buildingblocks/com...
http://plone.org/documentation/manual/developer-manual/generic-setup/ref...
http://davisagli.com/blog/registering-add-on-specific-components-using-z...

I'll try to find more descriptive documentation about this. I'll ask my colleague, and I'll ask him to read this to ensure I'm not all wrong.

Pierre.

Thanks for this, it sounds

neclimdul's picture

Thanks for this, it sounds very useful despite being tied to some very specific parts of the Python language. I think this would be more useful though if you kept your discussion to the system you're discussing rather than comparing it to systems which you seem to not be intimately familiar with. Plone's components sound very much like some concepts I find very compelling about CTools and plan to discuss. For example, its XML sounds a lot like how CTools uses its PHP array definitions and its registry system sounds like its API for retrieving objects and functions.

And he's right, managing the UI in a fairly generic way is nigh impossible. That's why CTools plugins make almost no effort in this respect. It's up to the plugin implementation to handle this, i.e. views and views_ui handle implementing the management interface, bundling, and configuration storage of the plugins. It's also why, despite earl's best efforts, views UI is more tightly tied to views than would be the case in an ideal world.

Python is much different yes,

pounard's picture

Python is much different, yes, but the language is not that relevant; I tried to focus on the pattern itself. Thanks for the feedback.

Pierre.

This is a very important

eaton's picture

This is a very important point that we tend to be very bad at internalizing in the Drupal world:

"managing the UI in a fairly generic way is nigh impossible."

It's (relatively) easy to create generic interfaces for managing data if we're willing to accept that we're just building a streamlined version of PHPMyAdmin. Simply exposing a simple editing UI to modify the "raw" information in a known data model is something that's been done. The challenge is in building UX that makes sense for the different needs of various kinds of components: the admin screens for the Drupal Menu System are different than the admin screens for user notification emails because "generic is hard."

Leaving it up to the plugins to manage their configuration makes sense precisely because we don't have any idea what kind of crazy things they might be working with. We can get some consistency by establishing standards for where some kinds of config screens should be accessible, but anything beyond that starts feeling like a tar pit for developers.

So we need a couple of tiny

yoroy's picture

So we need a couple of tiny snowmen for each of those situations. Example use cases. I realize working on UI implications is still very early days. Please add examples, use cases or challenges to http://groups.drupal.org/node/137469 if you run into them. Thanks!

(What am I doing in the Butler discussion group anyway, right? :-)

Some adjustments

pounard's picture

Here is another URL about Zope implementation (which is the official documentation of zope.component package):
http://pypi.python.org/pypi/zope.component

Their "plugin registry" (which in fact is a "component" registry) covers a lot more use cases than I described in my first post.
The global "interface" registry uses factories behind the scenes, one for each interface (we could compare these with plugin interfaces in the case of our problem). Each factory uses a default implementation by default; nevertheless, any module that declares an interface can also provide its own factory, as long as it is compatible with the default one.

The component registry handles at least two use cases. Factories can provide what they call "adapters"; in their case, an adapter is a component that can be instantiated only once site-wide (e.g. it could be the database backend for us, even if that's not totally true with D7), so this adapter is a singleton. Or they can provide named components, where the real implementation is selected by a machine name and the components, once fetched, are always fresh new instances and not singletons.
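The two use cases described above (one shared instance site-wide versus fresh instances selected by machine name) can be sketched as follows. This is only an illustration of the behaviour as described in this post, not the zope.component API, whose own terminology may not line up exactly with this description; InterfaceFactory, DatabaseBackend and ImageStyle are hypothetical names.

```python
class InterfaceFactory:
    """Hypothetical per-interface factory covering both use cases."""

    def __init__(self):
        self._shared_cls = None
        self._shared_instance = None
        self._named = {}  # machine name -> component class

    def set_shared(self, component_cls):
        # Declare the single site-wide implementation.
        self._shared_cls = component_cls
        self._shared_instance = None

    def get_shared(self):
        # Created lazily on first request, then reused for every caller.
        if self._shared_instance is None:
            if self._shared_cls is None:
                raise LookupError("no shared component registered")
            self._shared_instance = self._shared_cls()
        return self._shared_instance

    def register_named(self, name, component_cls):
        self._named[name] = component_cls

    def create(self, name):
        # A fresh instance on every call, chosen by machine name.
        return self._named[name]()


class DatabaseBackend:
    """Hypothetical site-wide component."""


class ImageStyle:
    """Hypothetical component that callers want a private copy of."""


factory = InterfaceFactory()
factory.set_shared(DatabaseBackend)
factory.register_named("thumbnail", ImageStyle)

same_backend = factory.get_shared() is factory.get_shared()
distinct_styles = factory.create("thumbnail") is not factory.create("thumbnail")
```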

There are some other goodies and features; I didn't have time to look at them all.

Apart from the highly dynamic nature of Python, which allows them to modify objects at runtime, the rest of their design is quite simple and efficient, and I think it's a pretty good basis for the registry part of what could be the plugins TNG in D8. That's why, in a moment when I was bored, I did this simple implementation that works on basic use cases.

Pierre.

The purpose of this post.

neclimdul's picture

This is not a research result. Please read [#136089] to understand what we're looking for. Specifically this post is for aggregating the results of the plugin reviews and discussing them. If you have ideas for an implementation to review or have experience with one of those systems and wish to help, contacting one of those people is a better way to help further this project.

Also, please understand that views has always been a prime use case. It was discussed half a dozen times in the previous thread, is shown in Crell's presentation, and is something most people discussing this for the past 2 years are very familiar with. They are also built using CTools plugins in 7.x-3.x and, though not listed, there is a team preparing a review of them as well.

As far as UI/configuration/form ideas go, please be aware that, as pounard pointed out in this thread and as has been largely agreed upon, there is no generic plugin UI. There's no way to do it. And while views has good ideas on it, and that pattern could be used by plugins, there is no way to generically implement it. There is no need to discuss this further.

Symfony2

sdboyer's picture

OK, finally writing up my take on Symfony2. Sorry it took so long. Also, sorry to anyone who actually knows this system - I'm just dabbling here, so I'm quite positive I've glossed over tons of big important parts of symfony2. Hopefully I didn't get anything hugely wrong, but please do correct me.

First, it's really important to understand from the start that symfony2 is a framework. The baseline expectation is that you interact with it from the command line, hand-modify files, and generally do a lot more tweaking of the low-level bits of your site's code than we ever expect you to do in Drupal. And while Drupal is strengthening its framework elements (of which this initiative is a huge part), that's not our starting point. Generally speaking, we try to prioritize site builders tweaking pre-existing parts from the web UI. Frameworks, symfony2 included, are oriented towards expert developers trying to tie together loosely-coupled parts.

Symfony2 is centered around building controllers that are responsible for serving the different pages of your website. There are no required core concepts like "nodes," or even "users," - all of that is optionally added on top. And to manage all that optional-ness, they put MASSIVE emphasis on code generation so that it becomes easy to write exactly the controllers (and other code) you need for your particular case. These all get wrapped up into units they call bundles, which do a lot of specifying exactly what underlying code is needed for the bundle to operate. To quote them:

A bundle is similar to a plugin in other software, but even better. The key difference is that everything is a bundle in Symfony2, including both the core framework functionality and the code written for your application. Bundles are first-class citizens in Symfony2. This gives you the flexibility to use pre-built features packaged in third-party bundles or to distribute your own bundles. It makes it easy to pick and choose which features to enable in your application and to optimize them the way you want. (source)

The main application routes to a bundle's controller, the bundle's dependencies get lazy loaded as needed, and ipso facto. Since these bundles all have their own configuration and are often full of generated/custom code, AND since symfony makes such extensive use of autoloading, the notion of a 'plugin' in symfony is inherently quite different than that which we have in Drupal. From what I've gleaned, the level of abstraction, interacting-through-interfaces, and totally ubiquitous autoload in symfony sorta makes plugins out of both everything and nothing. Code that might otherwise be thought of as "plugin/pluggable" is defined explicitly, retrieved through intelligent autoload and strict filesystem layout requirements (that follow the PSR-0 standard), and then compiled into your app config.

Nevertheless, I think there are some parallels to be drawn and lessons to be had. I've got three biggies.

  • Symfony gets so much for free from its strict adherence to PSR-0. Like so, SO much. There are drawbacks - a lot of individual files to load (though symfony is capable of writing a single 'cache bootstrap' file with all that code), fairly complex naming conventions, and a LOT of abstraction. But it can just call a class from almost anywhere, and with a *very* small amount of autoloader code (maybe 100 lines of logic and 20 or so PSR-0-compliant base search locations, each attached to a particular namespace), can then autoload one of ~4300 classes & interfaces. And that's all with *nothing* in the database - symfony doesn't require one to operate.

    Now, I don't think that there's value for Drupal in trying to fully adopt PSR-0. As we've already discussed, it's less performant as well as just incompatible with how Drupal is laid out. However - and this is a point Larry and I have agreed on - PSR-0 was adopted as a standard for good reason, as it *really* works well for small, self-contained components (which is why it works well for symfony - that framework is basically just a shit-ton of such components). So making it possible to selectively utilize the standard would be excellent - especially because it could solve that pesky (read: HORRIBLE) problem D7 has with not being able to rebuild the registry. This is consistent with the talk of allowing multiple discovery/registration/loading methodologies for plugins, so I'm all well in support of that. Honestly though, this is one of the most striking things I found about symfony - there is seriously beautiful harmony that arises from the fact that all the code they could possibly need is just there, with such minimal work required.

  • The second big lesson is their service containers. These are dependency injection containers that act as a service locator for the entire application. They encapsulate services - which probably are the things most analogous to what we're calling plugins (interchangeable bundles of logic that all serve the same basic purpose). The base system constructs the service container object; code asks the container for objects, and only *then* are they lazy-loaded.

    There is a ton of potential for a service locator. Just tons. So, it's not surprising that we've already discussed them here. It's pretty clear that it should be separate from context, which is really about retrieving *data*, whereas a service locator is about retrieving *logic*. Unfortunately, that is something of an artificial distinction that I think can quickly devolve into that old grey area of "content vs. config." But if we decide to limit our service locator to locating our plugins, that could be a good dividing line.

  • Finally, even though a config-light all-PSR-0 will almost certainly never work for us, I think we should try very hard to shift the information required for plugin discovery, and maybe even some plugin conf, up into the config system. Having that stuff in flatfiles on disk (that can be dynamically updated) means we can fork our critical path really, really early. Sky's the limit from there.
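The PSR-0 lookup from the first point above is a purely mechanical mapping from class name to file path, which is why so little autoloader code is needed. A minimal sketch, written in Python rather than PHP and not taken from symfony's autoloader: the function names and the directories in the example are assumptions for illustration. The mapping rules are those of the PSR-0 standard (namespace separators become directory separators, underscores in the class name itself also become directory separators, and ".php" is appended).

```python
import posixpath


def psr0_relative_path(class_name):
    """Map a fully qualified PHP class name to its PSR-0 relative path."""
    class_name = class_name.lstrip("\\")
    namespace, _, short_name = class_name.rpartition("\\")
    path = ""
    if namespace:
        # Underscores have no special meaning inside the namespace part.
        path = namespace.replace("\\", "/") + "/"
    # Underscores in the class name itself map to directories.
    return path + short_name.replace("_", "/") + ".php"


def locate(class_name, search_locations):
    """Resolve a class against base directories keyed by namespace prefix.

    search_locations is a hypothetical dict such as
    {"Symfony": "vendor/symfony/src"}; the longest matching prefix wins.
    Returns None when no prefix matches."""
    name = class_name.lstrip("\\")
    for prefix in sorted(search_locations, key=len, reverse=True):
        if name == prefix or name.startswith(prefix + "\\"):
            return posixpath.join(search_locations[prefix], psr0_relative_path(name))
    return None


locations = {"Symfony": "vendor/symfony/src"}  # assumed layout
parser_path = locate("Symfony\\Component\\Yaml\\Parser", locations)
```

Nothing here touches a database or a registry table: the class name alone determines where the file must be, which is the property praised above.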

I think there's a lot of other stuff we can borrow from symfony2, actually. Their Components are things we ought to at least look through for solutions to common problems, and compare what we have. But it'll be particularly important when we get to the point of defining our low-level outermost request controllers, as well as when we start revisiting the routing problem (since the menu system is almost certainly going to go from being the routing system to being just one of many).

Nice

Crell's picture

Thanks, Sam. That's a great summary.

I can see the argument for using PSR-0 within /includes, or whatever that turns into. For modules, I don't think it would work at all unless we rewrite all of Drupal from scratch (even more so than we're already planning to do :-) ). I also would love to see as much "state" as possible pushed down to disk configuration via the Configuration initiative's work. The more knowledge we have about Drupal with just fopen(), the more we can optimize those crucial early steps in the request.

I'm also warming to the idea of having a central service locator for plugins. I hadn't originally wanted to go that far quite yet, but the more I think about it the more it makes sense to plan for it. That would give us much better separation between systems, which in turn forces us to write more modular code, which in turn makes it more robust and testable.
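As a rough illustration of what such a central service locator could look like, here is a minimal Python sketch of a container that only builds a service when code first asks for it. It is not symfony's container and not a proposal for Drupal's API: ServiceContainer, Cache, Mailer and the service ids are all hypothetical.

```python
class ServiceContainer:
    """Hypothetical lazy service locator."""

    def __init__(self):
        self._factories = {}  # service id -> callable(container) -> service
        self._instances = {}  # service id -> already built service

    def register(self, service_id, factory):
        # Registering is cheap: nothing is instantiated yet.
        self._factories[service_id] = factory

    def get(self, service_id):
        # Build the service on first request, then reuse it.
        if service_id not in self._instances:
            factory = self._factories[service_id]
            self._instances[service_id] = factory(self)
        return self._instances[service_id]

    def is_loaded(self, service_id):
        return service_id in self._instances


created = []  # records construction order, for demonstration only


class Cache:
    def __init__(self):
        created.append("cache")


class Mailer:
    def __init__(self, cache):
        created.append("mailer")
        self.cache = cache


container = ServiceContainer()
container.register("cache", lambda c: Cache())
# The mailer's dependency is itself resolved through the container.
container.register("mailer", lambda c: Mailer(c.get("cache")))

loaded_before = container.is_loaded("mailer")
mailer = container.get("mailer")
```

Because consumers only name the service they want, a test can register a mock under the same id, which is where the separation and testability mentioned above would come from.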

dependency injection

donquixote's picture

"dependency injection containers"

My question would be, does a consumer component (such as a controller with page callbacks) get the full container injected, or just specific services?

"It's pretty clear that it should be separate from context, which is really about retrieving data, whereas a service locator is about retrieving logic."

This is just one way to draw a line. There is another dimension, that is, where does the data or service come from, or depend on? Some data comes from the request, and thus is different on every page view. Some data comes from the session, and is thus always the same for this one user, during this one session. And some data comes from the database or from a file, and is thus the same for every user, on every request, at a given point of time. Services might depend on one or more of these categories of data.
This distinction might be more important than a distinction of data vs logic.

An interesting question would be, does symfony have anything like a "data locator" (for "context" data), that is separate from the service locator?

plugin

More and more I get the idea that the term "plugin" is rather arbitrary and meaningless.
What is the benefit in using this term?

Definitions

Crell's picture

"Plugin", as we're using the term for Drupal, is defined here: http://groups.drupal.org/wscci/definitions

All words have only the meaning we give them, and that is the meaning that we are giving them. It is the most appropriate term within the existing body of software engineering.

As for where to draw the line, "where it comes from" is, actually, one of the things we do not want a system to know. It just knows that it is getting data. We actively don't want it to know from where, because if it knows/cares then it makes it harder to cleanly mock it, test it, or reuse it in alternate ways. But data and logic are a very clear distinction, far far clearer than content vs. configuration (both of which are subsets of data).

Very interesting sum-up

sylvain lecoy's picture

Very interesting sum-up

Open Services Gateway initiative and Equinox

sylvain lecoy's picture

The Open Services Gateway initiative (commonly called the OSGi Alliance) is an organization created in 1999. OSGi is a Java-based framework targeted for use by systems that require long running times, dynamic updates, and minimal disruptions to the running environment. The core of this specification is a framework that defines a management model for the application life cycle, a service repository (registry), a runtime environment, and modules.

The framework implements a model of dynamic components which can be installed, stopped, started, updated and uninstalled remotely without needing a reboot. This part is very similar to modules in Drupal 7, which you can administer remotely; however, the service registry also allows bundles (a package containing both applications and components for deployment: a bit like our tar file) to detect the addition or removal of services and to adapt accordingly. The original objective was focused on service gateways, but it has expanded a lot since. The specifications are now used in mobile applications, the Eclipse IDE (Equinox), automobiles, industrial automation, PDAs, grids, and, more interestingly, application servers.

Equinox

The equinox/p2 provisioning framework is like our good old update module but has a lot more to offer end users. This tool is aimed mainly at builders: it handles the installation of the product, but also the building of profiles. The Equinox/p2 framework introduces terms to describe the way Eclipse components are packaged:

  • Installable units (IU) - Metadata that provides information, such as name, version, and requirements. The interesting thing is the dynamic loading of requirements thanks to repositories. P2 can handle dynamic checking of all your repositories to grab and install a dependency for instance.
  • Profiles - A profile is a set of plug-ins and features, which allows different instances or configurations of Eclipse to share common plug-in files. This is a very interesting concept when thinking of multi-sites.
  • Features - Technically, a feature is a set of plug-ins and features, in specific versions, that work together. Conceptually, it's very similar to a configuration in component-oriented engineering: it can encapsulate other components, connectors, and configurations as well. When a feature is deployed, it contains only links to plug-ins (which reside in their own plugins/ directory). Features are deployed in features/.
  • Plug-ins - A plug-in is the basic building block of a component, which provides functionality and extension points. It can be a Core plug-in, a UI plug-in, a Service plug-in, an Engine, a Toolkit, Common or Internal files. It can also be a Client: the product itself (in Drupal, basically all modules that do not provide an API are Client plug-ins; those are not extensible).

Installation of Eclipse through p2installer

One very interesting thing about this provisioning component is the ability to install the whole platform through a lightweight Standard Widget Toolkit (SWT) graphical installer which allows you to install Eclipse without downloading the entire package first.

After downloading the installer, run the p2installer application. The installer appears, and you can choose either a stand-alone installation or a shared installation. With a stand-alone installation, the entire IDE, including plug-ins, will be installed into one folder. In contrast, the shared installation puts plug-ins in a common folder you can share across instances of Eclipse. You need to accept the licence of each plug-in and feature before installing. This is very interesting when you think about Drupal distributions: this is precisely the point where an open-source platform can let enterprises benefit from a distribution.

If you agree to the terms of the license, accept them, then click Finish
[Image not included]
Source: http://www.ibm.com/developerworks/opensource/library/os-eclipse-equinox/index.html

Equinox Incubator - Provisioning concepts

  • Agent - The provisioning infrastructure on client machines is generally referred to as the agent. Agents can manage themselves as well as other profiles. An agent may run separate from any other Eclipse system being managed or may be embedded inside of another Eclipse system. Agents can manage many profiles (see below) and indeed, a given system may have many agents running on it. There is no such thing as "the p2 agent" or a bundle that goes by that name. This is because p2 is modular and what you need to run on an embedded device is not the same as what you need on a desktop or an autonomous server.
  • Artifacts - Artifacts are the actual content being installed or managed. Bundles, Entities, Modules, Themes, and Theme Engines are examples of artifacts.
  • Artifact Repository - Artifact repositories hold artifacts.
  • Director - The director is a high level API that combines the work of the planner and the engine. That is, the director invokes the planner to compute the provisioning operations to perform, and then invokes the engine with the planner's output to achieve the desired profile changes.
  • Engine - The engine is responsible for carrying out the desired provisioning operations as determined by a director. Whereas the subject of the director's work is metadata, the subject of the engine's work is the artifacts and configuration information contained in the IUs (Installable Units) selected by the director. Engines cooperate with repositories and transport mechanisms to ensure that the required artifacts are available in the desired locations. The engine runs by invoking a set of engine Phases and working with the various Touchpoints to effect the desired result.
  • Garbage Collection - Elements of repositories (metadata and artifacts) can be garbage collected by tracing reachability from a set of known roots. For example, the set of all profiles managed by an agent transitively identifies all IUs that are currently of direct interest to the provisioning agent. Similarly, the IUs identify the artifacts required to run the profiles. Any IUs or artifacts that are not in the transitive list are garbage and can be collected. Note: In PHP the concept of garbage collection is quite curious, but consider that a server instance is always up: once you have loaded admin tasks, you need to free them when you leave your administration panel so that client applications run smoothly. Applied to Drupal, this is equivalent to not loading every module that is ticked in the admin panel, but only the modules that are dependencies of a given installable unit (or module).
  • Metadata Repository - A metadata repository holds installable units.
  • Mirroring - The basic operation of distribution is mirroring. The key here is that metadata and artifacts are not downloaded, they are mirrored. The subtle distinction is that local mirrors are a) simple caches of something that is remote and b) potential sources of further mirroring. This means that locally held information can be deleted and replaced as needed by re-mirroring. Similarly, having local copies act as mirrors opens the path to natural peer-to-peer distribution. Note that metadata and artifacts are quite separable and having an IU mirrored in one repository does not imply that the associated artifacts are in/near/beside/... that repository.
  • Phase - Provisioning operations generally happen by walking through a set of steps or phases. At each phase a particular kind of activity takes place. For example, during the Fetch phase, the various artifacts required for the operation are Mirrored while during the Configure phase IUs are woven into the underlying runtime system by their associated Touchpoints.
  • Planner - The planner is responsible for determining what should be done to a given profile to reshape it as requested. That is, given the current state of a profile, a description of the desired end state of that profile and metadata describing the available IUs, a planner produces a list of provisioning operations (e.g., install, update, or uninstall) to perform on the related IUs.
  • Profile - Profiles are the target of install/management operations. They are a list of IUs that go together to make up a system. They are roughly equivalent to configurations. When an IU is installed it is added to a profile. That profile can then be run and the artifacts associated with the installed IUs executed (or whatever). Later the IU can be uninstalled or updated in that profile. The exact same IU can be installed simultaneously in many profiles.
  • Touchpoint - A part of the engine that is responsible for integrating the provisioning system to a particular runtime or management system. For example, the Eclipse Touchpoint understands how Equinox stores and manages bundles. Different platforms have different Native Touchpoints that integrate with the Windows desktop, RPMs, various registries etc.
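Two of the concepts in the list above are essentially small algorithms: the planner compares a profile's current state with the desired state, and garbage collection traces reachability from the profiles. A minimal sketch in Python, assuming a much simplified data model; this is not the p2 API, and the function names, the dict-based profiles and the Drupal-flavoured IU names are hypothetical.

```python
def plan(installed, desired):
    """Planner sketch: both arguments map an IU id to a version.
    Returns the provisioning operations needed to reshape the profile."""
    operations = []
    for iu_id in sorted(set(installed) | set(desired)):
        if iu_id not in installed:
            operations.append(("install", iu_id))
        elif iu_id not in desired:
            operations.append(("uninstall", iu_id))
        elif installed[iu_id] != desired[iu_id]:
            operations.append(("update", iu_id))
    return operations


def collect_garbage(repository, roots):
    """Garbage collection sketch: repository maps an IU id to the IU ids it
    requires; roots are the IUs listed by the managed profiles.
    Returns the set of IUs that no profile reaches, even transitively."""
    reachable = set()
    pending = list(roots)
    while pending:
        iu_id = pending.pop()
        if iu_id in reachable:
            continue
        reachable.add(iu_id)
        pending.extend(repository.get(iu_id, []))
    return set(repository) - reachable


# Hypothetical profile contents and repository metadata.
installed = {"views": "2.0", "ctools": "1.0", "devel": "1.0"}
desired = {"views": "3.0", "ctools": "1.0", "panels": "3.0"}
operations = plan(installed, desired)

repository = {"views": ["ctools"], "ctools": [], "panels": ["ctools"], "devel": []}
garbage = collect_garbage(repository, roots=["views"])
```

In p2 terms, a director would hand the planner's output to the engine; the sketch stops at producing the list of operations.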

There is a lot more to say about provisioning, but next I will talk about the runtime. We need those principles and concepts to understand the runtime.

Architecture

sylvain lecoy's picture

Architecture

Any framework that implements the OSGi standard provides an environment for the modularization of applications into smaller bundles. Each bundle is a tightly-coupled, dynamically loadable collection of classes, jars, and configuration files that explicitly declare their external dependencies (if any).

The framework is conceptually divided into the following areas:

  • Bundles - Bundles are normal JAR (Java Archive) components with extra manifest headers. Bundles are the equivalent of Drupal modules.
  • Services - The service layer connects bundles in a dynamic way by offering a publish-find-bind model for objects.
  • Services Registry - The API for services management (Registration, Tracking, Reference).
  • Life-Cycle - The API for life cycle management of bundles (install, start, stop, update, and uninstall).
  • Modules - The layer that defines encapsulation and declaration of dependencies (how a bundle can import and export code). This is very typical of Java, which allows class loading.
  • Security
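The publish-find-bind model of the service layer listed above can be sketched in a few lines. This is a Python illustration of the idea only, not the OSGi service API (which is Java); ServiceRegistry, ConsoleLogger and the "LogService" name are hypothetical.

```python
class ServiceRegistry:
    """Hypothetical registry: bundles publish services under an interface
    name, other bundles find them and are notified of changes."""

    def __init__(self):
        self._services = {}   # interface name -> list of service objects
        self._listeners = []  # callables(event, interface_name, service)

    def add_listener(self, listener):
        # Tracking: listeners hear about services coming and going.
        self._listeners.append(listener)

    def register(self, interface_name, service):
        # Publish.
        self._services.setdefault(interface_name, []).append(service)
        self._notify("registered", interface_name, service)

    def unregister(self, interface_name, service):
        self._services[interface_name].remove(service)
        self._notify("unregistered", interface_name, service)

    def find(self, interface_name):
        # Find: return a copy so callers cannot alter the registry.
        return list(self._services.get(interface_name, []))

    def _notify(self, event, interface_name, service):
        for listener in self._listeners:
            listener(event, interface_name, service)


class ConsoleLogger:
    def log(self, message):
        return f"[log] {message}"


events = []
registry = ServiceRegistry()
registry.add_listener(
    lambda event, interface_name, service: events.append((event, interface_name))
)

logger = ConsoleLogger()
registry.register("LogService", logger)
found = registry.find("LogService")  # bind: the caller now holds a reference
registry.unregister("LogService", logger)
```

The listener is what lets a bundle "detect the addition of new services, or suppression of them", as described earlier in this post.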

[Image not included: OSGi framework diagram]
Source: http://en.wikipedia.org/wiki/File:Osgi_framework.svg

Bundles

A bundle is a group of Java classes and additional resources equipped with a detailed manifest MANIFEST.MF file on all its contents, as well as additional services needed to give the included group of Java classes more sophisticated behaviors, to the extent of deeming the entire aggregate a component.

Below is an example of a typical MANIFEST.MF file with OSGi Headers:

Bundle-Name: Hello World
Bundle-SymbolicName: org.wikipedia.helloworld
Bundle-Description: A Hello World bundle
Bundle-ManifestVersion: 2
Bundle-Version: 1.0.0
Bundle-Activator: org.wikipedia.Activator
Export-Package: org.wikipedia.helloworld;version="1.0.0"
Import-Package: org.osgi.framework;version="1.3.0"

Life-Cycle

The Life Cycle layer adds bundles that can be dynamically installed, started, stopped, updated and uninstalled. Bundles rely on the module layer for class loading but add an API to manage the modules at run time. Drupal modules, for instance, rely on autoload capabilities when you declare your classes in the .info file. The life cycle layer introduces dynamics that are normally not part of an application. Extensive dependency mechanisms are used to assure the correct operation of the environment. Life cycle operations are fully protected with the security architecture.

Bundle State - Description
INSTALLED - The bundle has been successfully installed.
RESOLVED - All Java classes that the bundle needs are available. This state indicates that the bundle is either ready to be started or has stopped.
STARTING - The bundle is being started: the BundleActivator.start method has been called but has not yet returned. For Drupal this is similar to hook_init(). When the bundle has an activation policy, the bundle will remain in the STARTING state until it is activated according to that policy. This is extremely powerful for admin-only modules, which then never need to be started for an anonymous request.
ACTIVE - The bundle has been successfully activated and is running; its BundleActivator.start method has been called and has returned.
STOPPING - The bundle is being stopped. The BundleActivator.stop method has been called but has not yet returned.
UNINSTALLED - The bundle has been uninstalled. It cannot move into another state.

Authority: http://en.wikipedia.org/wiki/File:OSGi_Bundle_Life-Cycle.svg

Below is an example of a typical Java class implementing the BundleActivator interface:

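import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class Activator implements BundleActivator {
        private BundleContext context;

        // Called by the framework while the bundle is in the STARTING state.
        public void start(BundleContext context) throws Exception {
                System.out.println("Starting: Hello World");
                this.context = context;
        }

        // Called by the framework while the bundle is in the STOPPING state.
        public void stop(BundleContext context) throws Exception {
                System.out.println("Stopping: Goodbye Cruel World");
                this.context = null;
        }
}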

It feels like you've put a

sdboyer's picture

It feels like you've put a lot of effort into all this, so I feel a bit bad saying this. I also don't know if Crell solicited this information from you - if he did, I'll shut up. But...I struggle to see anything but the vaguest relevance to Drupal's plugin problem. The core problem is really right in your first paragraph:

OSGi is a Java-based framework targeted for use by systems that require long running times, dynamic updates, and minimal disruptions to the running environment.

  • Long-running times are not a requirement for us. In fact, they're more like a pipe dream, if they're anything at all. Drupal is only just now beginning to explore long-running processes.
  • We do not require dynamic updates to our codebase. Codebases operate statically, and updates/deployments are handled more or less manually.
  • We do not require minimal disruption to the running environment. Sure, it'd be nice, but realistically, that's why you have staging servers.
  • More generally, we're not Java. Analogues to other high-level scripting languages, say Python/Ruby/Perl, are good, but Java tends to approach & solve problems very differently.

So yeah, if this is just being put here for reference, OK - but reading these posts, I feel a bit like I'm attending a lecture. I believe the goal of this thread is to relate other systems to the specific plugin problems we're trying to tackle in Drupal.

I think there is a lot to

sylvain lecoy's picture

Crell didn't ask me for this; I posted it on my own because I think it has great value. In his own words:

Getting input from a variety of different backgrounds and perspectives is exactly what we should be doing right now.

I think there is a lot to learn from OSGi. Even if the language is different, sometimes you can solve problems by looking at how outsiders do it.

And you don't have to shut up; you have your own opinion about this, and that's a good thing. But I hope it is not just because it comes from "Java" that you don't want to see the benefits. Maybe it's also my fault for not having highlighted the possibilities enough. Let me give you some examples, and I'll provide more applications if needed and, obviously, if you guys welcome them.

For instance, the start level of a bundle is a very interesting concept. Let's start with the basics; it will sound familiar if you know module weights.

A start level is simply a non-negative integer value. The framework has a start level, called the active start level, that decides which bundles can be started. Bundles themselves have an associated bundle start level, which is used to determine when a bundle is started. When booted, the framework steps monotonically through each start level and starts the relevant bundles, all the way until the active start level is reached.

In the end, start levels are there to simply determine the start order of bundles.

  • Start order within the start level is indeterminate!
  • A bundle start level of 0 is associated with the system bundle and can’t be changed

You can put modules for admin tasks at a very high start level, so they won't be loaded for anonymous visitors. It’s important to treat start levels as a management issue. In one system, a bundle could be running at start level 1 and in another system it could be at start level 42. This all depends on who is bootstrapping the system, which means you shouldn't worry about start levels at development time! By analogy, the context object could have an active start level, and the framework would then load all relevant modules. This active start level would depend on the user (anonymous or admin), the website's caching policy, the path of the request, the browser characteristics, and so on. The framework would then load the minimal amount of code needed for that request, translating all inputs into a simple integer value that already integrates with our current module API.
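To make the start-level idea concrete, here is a minimal sketch in Java (to match the OSGi example above rather than Drupal's PHP). Everything in it is an assumption made for illustration: the class StartLevelSketch, the Module type, the module names and the level numbers. It is not the OSGi API; it only shows modules being selected by comparing their start level with an active start level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative model only: none of these names come from the OSGi API or from Drupal.
class StartLevelSketch {

    // A module together with the start level assigned by whoever deploys the site.
    static final class Module {
        final String name;
        final int startLevel;

        Module(String name, int startLevel) {
            this.name = name;
            this.startLevel = startLevel;
        }
    }

    // Returns the names of the modules whose start level does not exceed the
    // active start level, ordered by ascending start level. OSGi leaves the order
    // inside one level undefined; this sketch simply keeps the input order.
    static List<String> modulesToStart(List<Module> installed, int activeStartLevel) {
        List<Module> selected = new ArrayList<>();
        for (Module module : installed) {
            if (module.startLevel <= activeStartLevel) {
                selected.add(module);
            }
        }
        selected.sort(Comparator.comparingInt((Module m) -> m.startLevel));
        List<String> names = new ArrayList<>();
        for (Module module : selected) {
            names.add(module.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Module> installed = Arrays.asList(
                new Module("views_ui", 50),
                new Module("system", 0),
                new Module("node", 10),
                new Module("field_ui", 50));

        // Anonymous request: a low active start level skips the admin UIs.
        System.out.println(modulesToStart(installed, 10));
        // Administrator request: a high active start level starts everything.
        System.out.println(modulesToStart(installed, 50));
    }
}
```

The hard part discussed in this thread, deriving the active start level from the request context, is deliberately left out; the sketch takes that integer as given.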

Also, it's true that I speak a lot about provisioning, but that's because, as you said, Java and PHP are different. I think we can use OSGi as a deployment model, which is why I introduced the terms specific to this framework; I find the ideas very interesting:

Imagine Drupal 8. Today, a client has to download a 4MB archive, unzip it, upload a profile distributed by a Drupal company, go through each of its dependencies and download them to the all/modules/ directory, and then run the installer. Instead, you take a 200KB PHP file which is your installer. It takes care of downloading the latest version of Drupal and gives you a UI to download the profile that you bought from a company which gave you a license to use it. You then enter the repository URL and select the latest "Drupal Multimedia and blog Platform". This repository has dependencies on contributed modules that are not part of the package, but the provisioning component will download them from drupal.org (taking care to grab the required version certified by the company) and deploy them. Also, say a required module is OAuth, which uses the PECL native extension: with Touchpoint, your installer will be able to ask your server to deploy a new PECL package. Imagine the power of such a distribution system.

It would be very cool to do a

neclimdul's picture

It would be very cool to do a lot of that, but it's sort of out of scope for providing a plugin system for Drupal. I don't know how any of that would happen without a complete refactoring of almost every way Drupal operates. There are likely some useful ideas here we haven't seen yet, though, so thanks for adding to the discussion.

Its not out the scope at all,

sylvain lecoy's picture

It's not out of scope at all: speaking about a plugin system involves speaking about the whole life cycle of plugins, including their distribution. We don't need that much refactoring, and I believe it would be a mistake to do so. But look, we already have an auto-installer for modules: you give it the URL of a tar archive and Drupal deploys it automatically. We are not far from doing the same with core, so why talk about a complete refactoring?

For runtime issues, we already have a module weight; moving this weight to a bundle start level is just a variable to rename. All the hard work will be in classifying modules with metadata (admin, UI, Service, Field, ...) and computing the proper active start level. The concept is simple and based on what exists, but not simple to implement.

I've often turned to various

cleaver's picture

I've often turned to various Java architectures when thinking about how to solve Drupal problems. What I run up against is the fundamental difference in the way a Java webapp and a PHP webapp work. There's a real difference between a Java app running in a container where lifecycles can be minutes or hours and PHP where everything is bootstrapped and completed (hopefully) in well under a second.

Concurrency and managing the lifecycle is less of a concern in Drupal as I see it. There might be bits and pieces that we could apply to Drupal, but I'm not sure what. Maybe there would be the most benefit in the deployment scenario you discuss, but that's a separate task.

Did you work on OSGi ?

sylvain lecoy's picture

Did you work on OSGi? An Eclipse-based server? Or just classical EJB (EJB3, JPA) and J2EE applications?

There is one flaw that I proposed to address: Drupal currently loads all your enabled modules, even ones like Views UI or Field UI that a visitor's request will not use at all. Let's put provisioning aside, because you all agree that it is a separate task (though I do not fully agree), but what about the runtime solution with a 'module start level' and a 'drupal start level'? It's fully doable with the current architecture. Modules are not described by a manifest but by an .info file, which is clearly the same thing. Also, modules define hook_ functions; in OSGi they speak about extension points. Note the big difference: when a module implements an extension point, it is declared in the manifest, and not resolved at runtime (then cached). That means the developer has to make an extra effort, but it is worth it.

One last thing: I hope the goal of this thread is to seriously address the problem of modules in Drupal, and not to end up with one more cache layer to hide the newly introduced complexity. If we want to do better, we'll have to reconsider some key architecture points. Eclipse is solid and extensible because it was well thought out, and clearly OSGi played a part in that success. Not investigating the applicability of their patterns would be an oversight.

Startup level is highly

pounard's picture

Startup levels are highly doable: modules could start at various checkpoints during the bootstrap, or even after. For example, if I bootstrap Drupal using an 'ESI' dispatcher (which means no blocks, no page), modules that specifically alter pages or provide blocks do not need to be loaded.
This only requires a new parameter to a generic module_load_all() function, such as: module_load_all($checkpoint).
Module developers would then have the responsibility to tell Drupal when it should or should not load the module.
The pitfall here is that many modules would probably tell Drupal to load at "bootstrap" or "init" because their authors don't care about finer tuning, but review would help with that :)
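
As a rough illustration of the checkpoint idea, here is a small Java sketch (Java only to match the OSGi example earlier in the thread; the real thing would be PHP). CheckpointLoader, its methods, and the checkpoint and module names are all hypothetical; loadAll() merely stands in for the proposed module_load_all($checkpoint).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a Java stand-in for the proposed module_load_all($checkpoint).
class CheckpointLoader {
    // Module name -> the checkpoint its developer declared (hypothetical metadata).
    private final Map<String, String> declared = new LinkedHashMap<>();
    // Modules loaded so far during this request.
    private final Set<String> loaded = new LinkedHashSet<>();

    void declare(String module, String checkpoint) {
        declared.put(module, checkpoint);
    }

    // Loads every module declared for the given checkpoint that is not loaded yet,
    // and returns the names of the modules loaded by this call.
    List<String> loadAll(String checkpoint) {
        List<String> newlyLoaded = new ArrayList<>();
        for (Map.Entry<String, String> entry : declared.entrySet()) {
            if (entry.getValue().equals(checkpoint) && loaded.add(entry.getKey())) {
                newlyLoaded.add(entry.getKey());
            }
        }
        return newlyLoaded;
    }

    boolean isLoaded(String module) {
        return loaded.contains(module);
    }

    public static void main(String[] args) {
        CheckpointLoader loader = new CheckpointLoader();
        loader.declare("system", "bootstrap");
        loader.declare("user", "init");
        loader.declare("block", "page");

        // An ESI-style dispatcher passes through "bootstrap" and "init" only,
        // so the module declared for the "page" checkpoint is never loaded.
        loader.loadAll("bootstrap");
        loader.loadAll("init");
        System.out.println(loader.isLoaded("user"));
        System.out.println(loader.isLoaded("block"));
    }
}
```

The only per-module data here is one string, which is what keeps the approach cheap: no dependency introspection happens at request time.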

Pierre.

As I said previously, It’s

sylvain lecoy's picture

As I said previously,

It’s important to treat start levels as a management issue.

At development time, nobody should care about this. What developers should care about is simply helping the dispatcher, context, or whatever you call it make the right decision. They need to add metadata about what their module does. For instance, there are key features which can help the decision process:

  • Field module (and which file to load to activate the field feature)
  • Administrative UI (those are eating memory and can be easily avoided if not an admin)
  • System module (those always have a startup level of 0, which can't be changed)
  • API module
  • Content module (those are loaded only when saving/editing content: for instance Pathauto, xmlsitemap, ...)

The Features module allows exporting "Features". For instance, a slideshow is a feature composed of one or more image profiles, the image module, a content type, and the jQuery Update module. When you know that a page will display such a feature, by reading the metadata you automatically know that you need to load the image module, the content type API, and the jQuery Update module. So for that type of request, those modules will have a start level set to the global start level, and their dependencies will be resolved with lower start levels. I don't know how we can do this at a fine-grained level, but if we do it right I think it can help a lot with performance. I think the basic concept is simple enough for everyone to understand; the decision making can be tricky but exciting.

We already have a taxonomy of modules on drupal.org (file management, content, Admin, ...). If we unify this taxonomy and make it a basic concept of the .info file, together with a best-practices guide and a good review process, then I think it will not be a big change for developers.

Also, about the hook system: the problem can be not loading a module which contains an essential hook. We have three ways to proceed:

  1. It was on purpose: the developer wants this to be hooked only in certain conditions
  2. Parse all the module files and dynamically discover the hooks implemented
  3. Define "extension points" in the .info file. Those will be registered only when an .info file is read, and then activated when needed

Dynamic dependencies

pounard's picture

Dynamic dependency introspection is a heavy task; you have two ways of doing it:
- At module install/enable: which forces you to store some kind of heavy cache (it may have a huge impact on performance; don't forget that PHP scripts are share-nothing, and a cache get is expensive I/O against a remote server).
- At runtime: compute only what you need, but this may be even worse for performance (if dependencies are too complex to compute)

Any cache get (heavy I/O) at bootstrap will make it slower; the more you have in the code itself, the faster your bootstrap will run. That's why I like the checkpoint idea: it only adds a single string column (eventually) to the system table, and does not add any kind of dynamic runtime check or need for new caches.

So hardcoded fixed dependencies can be a good thing. This is where the PHP runtime is totally different from persistent software such as apps running on the Java VM, and where the full OSGi model of dynamic dependencies might be a huge FAIL for Drupal.

Pierre.

I do not fully understand

sylvain lecoy's picture

I do not fully understand your checkpoint idea. Do you mean that you let the developer tell the API whether or not it can load their module? Are you not afraid of the uncontrolled complexity that every developer could introduce in their checkpoint routines? It could be worse than if we fully control the process, couldn't it?

It's true that such a system would largely benefit from APC or memcache to keep the needed information close at hand. But it's worth a shot, don't you think? To me, going from "loading 50 modules" to "compute (or get from cache) a startup strategy, then load 15 modules" is something that can compensate for the overhead the strategy introduces. You know, I have learned a lot from Drupal, and especially about its API. And what I found out is that most performance issues involve compute-bound application bottlenecks (misconceived APIs, code duplication, hazardous data flow), not I/O-bound or network-bound ones.

But that's something which should work closely with the Context API. How do they bootstrap their object? How do they manage caching (if there is any)? This (to me) will work in synergy with Contexts.

I don't know much about the bootstrap process; I have to investigate. Does it load the list of modules to load from the database? How is that different from loading a startup strategy, if both consist of one single request?

I meant that if core has

pounard's picture

I meant that if core has multiple ways of bootstrapping (one for page build, one for ESI include, one for specific blocks), then you could just wake up the modules that are meant to interact with this particular "action/checkpoint", without going through field API introspection, implemented hooks introspection, etc. It was just a (maybe bad) idea.

It's often true that misconceived APIs are the worst bottleneck. Core by itself really does a lot of SQL queries, even without any contrib modules. Each cached item you fetch will add another one.

In OOP-oriented systems, module loading has no importance because a good autoloader will load any class only when needed. It's not about I/O, because with an opcode cache this particular I/O (PHP file loading) is fully skipped.

But it's worth a shot, don't you think? To me, going from "loading 50 modules" to "compute (or get from cache) a startup strategy, then load 15 modules" is something that can compensate for the overhead the strategy introduces.

I agree, but look at OSGi: it's a fully featured, powerful tool that probably computes its dependencies at runtime, because persistent software can afford a longer bootstrap. In PHP, if your runtime dependency check algorithm takes longer than the require_once() calls, then you are doing something wrong.

Drupal 7 bootstrap uses the system_list() function, which in the best case does one cache fetch, and in the worst case does an SQL query and builds the module list array. When bootstrapping, this SQL query uses a WHERE clause in order to fetch "bootstrap" modules only. When the full core is bootstrapped, the exact same algorithm is run again without the "bootstrap" clause, which doubles the SQL query or cache fetch. This is bad: in order to save some bytes of memory during bootstrap, they redo the exact same job once fully bootstrapped, which probably consumes more memory doing the SQL query or cache fetch than was saved at bootstrap time.

Pierre.

The start-up strategy can be

sylvain lecoy's picture

The start-up strategy can be cached as well, but in a different manner, since there will be a lot of different strategies, mainly based on the path (admin/*, node/*, ...) but also on the role (anonymous, administrator, ...) and so forth. In other words, based on the context.

In the best case it's fetched from the cache; in the worst case we build the module list array with the 'module start level' of each module. The key is to find an algorithm which performs well. In the worst case we can still load all the modules and skip the strategy builder step, queuing the attached context to compute the strategy later, as a batch operation for instance.

That means that in the best case the best configuration is loaded, and in the worst case all modules are loaded as we are used to. The context object is saved and queued into a batch operation which will be performed on the next cron run.

Do you think this is doable? Also, can you elaborate a bit more on your $checkpoint idea? I'm interested.
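
The best-case/worst-case behaviour described above could look roughly like this. It is a sketch in Java under loud assumptions: StartupStrategyCache, the "role|path" context key format, and runCron() as a stand-in for a cron-driven batch are all invented for illustration, and the strategy builder is passed in as a function because no such algorithm exists yet.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a per-context cache of module lists with a safe fallback.
class StartupStrategyCache {
    private final List<String> allModules;
    // Context key (hypothetical "role|path" format) -> modules to load.
    private final Map<String, List<String>> strategies = new HashMap<>();
    // Contexts seen without a cached strategy, waiting for the next "cron run".
    private final Deque<String> pendingContexts = new ArrayDeque<>();

    StartupStrategyCache(List<String> allModules) {
        this.allModules = allModules;
    }

    // Best case: return the cached strategy. Worst case: queue the context for
    // later computation and fall back to loading every module, as today.
    List<String> modulesFor(String contextKey) {
        List<String> strategy = strategies.get(contextKey);
        if (strategy != null) {
            return strategy;
        }
        if (!pendingContexts.contains(contextKey)) {
            pendingContexts.add(contextKey);
        }
        return allModules;
    }

    // Stand-in for the batch operation run on cron: builds and stores a
    // strategy for each queued context using the supplied builder.
    void runCron(Function<String, List<String>> strategyBuilder) {
        String contextKey;
        while ((contextKey = pendingContexts.poll()) != null) {
            strategies.put(contextKey, strategyBuilder.apply(contextKey));
        }
    }

    public static void main(String[] args) {
        StartupStrategyCache cache =
                new StartupStrategyCache(Arrays.asList("system", "node", "views_ui"));
        // First request for this context: cache miss, everything is loaded.
        System.out.println(cache.modulesFor("anonymous|node/*"));
        // Cron computes the strategy (hard-coded here for the demo).
        cache.runCron(key -> Arrays.asList("system", "node"));
        // Later requests for the same context load the reduced list.
        System.out.println(cache.modulesFor("anonymous|node/*"));
    }
}
```

The fallback means a missing or stale strategy can never break a request; it only costs the time that is already being paid today.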

Just telling that instead of

pounard's picture

I'm just saying that instead of having only "bootstrap" modules and other modules, there could be "bootstrap" modules, "init" modules, "dispatcher" modules (at menu router dispatch time), "api" modules, etc. It's a start towards loading modules on an as-needed basis. Not very efficient, but it seems better than just "bootstrap" and "others".

Pierre.

Yes that my original idea of

sylvain lecoy's picture

Yes, that was my original idea of having a start level.

We already have such a taxonomy of modules (api, ui, system, ...). If we translate it correctly into a start level for a given context, then we are there.

bootstrap

catch's picture

'bootstrap modules' exist because of page caching. We eat the extra cache get on full bootstrap, to avoid loading the full list of enabled modules for cached pages.

This is largely because Drupal can't figure out whether it's going to serve a cached page or not, without hook_boot() having run, so there is a circular dependency at the moment.

What's even better, in Drupal 6, we query the list of bootstrap modules from the database twice each request - once for hook_boot(), and once for hook_exit(), I didn't spot that one until two months ago.

Fully agree this is very, very bad architecture.

I've been thinking a lot about lazy loading of modules the past few months, it would be hard to come up with something faster than just loading all of them, but there is more than just i/o to calling require_once() - you have to run define() for all constants, the function and class lists have to be copied into shared memory, this is a measurable hit when running APC and having lots of modules enabled, although I need to do more to get solid numbers on the various bits. It'd be good to discuss where to take that in irc next time we're both around.

I think this "problem" is

donquixote's picture

I think this "problem" is already fixed, or the fix is planned.
What does "load a module" mean? It means you include / require_once the *.module file, and maybe fire a boot or init or whatever hook implementation, if one exists. The file inclusion has two purposes:
1. Expose hook implementations for function_exists() / module_implements().
2. Expose public API functions, to be called by other modules.

Somewhere in the d.o. core issue queue, I read about the plan to put specific hooks into specific files (or do we already have this in D7?), which will only be included if this specific hook fires. The same can happen for theme implementations, page callbacks etc.
Having this file structure, the only thing left for *.module files is public API functions.
We can expect that many modules will not need any *.module file, and thus don't need anything to be done on bootstrap.

It would be great to allow

pounard's picture

It would be great to allow modules without a .module file; this is a good way to go. For example, modules that provide field types basically don't need it at all. Most modules don't need it at all, except maybe pure procedural API modules that need to be loaded soon enough for others to use them. OOP API modules don't need it because of the autoloader.

Pierre.

Fully agree with that.

sylvain lecoy's picture

Fully agree with that.

Autoloading is sexy, but it won't address the problem of deciding what needs to be loaded and when, will it?

What part of the problem is not solved?

donquixote's picture

Autoloading does the job for classes.
The "dedicated file for a group of hooks" does the job for hook implementations.
For callbacks (page, theme) we already have the possibility to specify file locations.
What remains is public API functions. But we might even call these indirectly.

We can scan the hook and autoload information (which module has which files + explicit autoload information + whatever), and then cache this stuff in reverse lookup lists.

What else are we trying to solve with the "different types of modules" approach? What else is there about "loading a module", aside from loading PHP files, that is not already lazy-fired?

Cache hook lists is maybe a

pounard's picture

Caching hook lists is maybe a bit of over-engineering, I think. You risk, in long-term usage, consuming more CPU trying to load files than running the real code behind them. That's why I like OOP autoloading. Hooks "à la" Drupal 5 (meaning defining them with only a require_once and a my_real_hook_impl() call) are probably faster than trying to do an intelligent-file-loading-huge-cached-registry-that-must-be-initialized-at-bootstrap.

Pierre.

You risk, on a long term

donquixote's picture

You risk, in long-term usage, consuming more CPU trying to load files than running the real code behind them.

Not at all.
We have two lists (cached in files, ideally):
1) Which module has which ".inc" files. We only store those that are relevant for hooks. We can lazy-load this list when the hook is fired, but imo it won't hurt much to load it altogether on startup.
2) Which hook is supposed to be implemented in which file?

Now compare this to:
- As far as I remember, the Symfony autoloader has a cache file with a list of all class locations. This can be expected to be bigger than the proposed hook cache.
- In the "Cache module_implements()" issue, pwolanin is asking for a cache of all hook implementations. This is more than I was asking for, and still I don't think it would hurt.
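
For what it's worth, the two lists can be reduced to one reverse lookup, sketched here in Java (only to match the OSGi example earlier in the thread). HookFileIndex, its method names and the file names are hypothetical; in practice the input would come from scanning modules and the result would be written to a cache file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: builds a hook -> files reverse lookup from scan results.
class HookFileIndex {

    // Input: file -> hooks implemented in that file (what a module scan would find).
    // Output: hook -> files that must be included before the hook fires.
    static Map<String, List<String>> invert(Map<String, List<String>> hooksByFile) {
        Map<String, List<String>> filesByHook = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : hooksByFile.entrySet()) {
            for (String hook : entry.getValue()) {
                filesByHook.computeIfAbsent(hook, key -> new ArrayList<>())
                        .add(entry.getKey());
            }
        }
        return filesByHook;
    }

    // Files to include when the given hook fires; empty if nobody implements it.
    static List<String> filesFor(Map<String, List<String>> filesByHook, String hook) {
        return filesByHook.getOrDefault(hook, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        // Made-up scan result for two modules.
        Map<String, List<String>> hooksByFile = new LinkedHashMap<>();
        hooksByFile.put("node/node.menu.inc", Arrays.asList("menu"));
        hooksByFile.put("node/node.tokens.inc", Arrays.asList("tokens", "token_info"));
        hooksByFile.put("user/user.menu.inc", Arrays.asList("menu"));

        Map<String, List<String>> filesByHook = invert(hooksByFile);
        // Firing hook_menu() only needs the two *.menu.inc files.
        System.out.println(filesFor(filesByHook, "menu"));
        System.out.println(filesFor(filesByHook, "cron"));
    }
}
```

The lookup at hook-fire time is a single map access, so the cost being debated is the size of the cached map, not the work per request.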

Ok for auto loading, but you

sylvain lecoy's picture

OK for autoloading, but are you then assuming that modules are OO-based? Is there any draft or working version of such a refactoring? I would love to see it. For sure, having autoloading capabilities for modules will reduce some problems, but it may introduce new ones. For me it's not obvious how it works unless you hard-code the dependencies in your code, which is a very, very bad practice. Do we have good Inversion of Control containers in PHP? When I specify an interface in my client module (say IOAuthAdapter), how would I know that the OAuth native PECL extension will be injected? Which mechanism will instantiate and inject the OAuth object instead of an OAuthPHPLib object? You see, we still need a loading strategy, unless instead of using an interface (IOAuthAdapter) you hard-code your dependency on a specific implementation.

Maybe I'm not seeing something obvious, so please tell me what your point is, and illustrate it if possible, because I'm confused :-)

Also, there is no "dedicated file for a group of hooks"; this is just a convention to separate a field definition from a module, but if you look at the .module file, there is this line: "require_once blabla.field.inc".

For callbacks we can specify a file location, but the *.module is still loaded. Moreover, the menu builder doesn't load the specified file before checking access, meaning that we still assume *.module files are loaded prior to the specified callback file. See this issue: http://drupal.org/node/929506 for more info.

Also, there is no "dedicated

donquixote's picture

Also, there is no "dedicated file for a group of hooks"; this is just a convention to separate a field definition from a module, but if you look at the .module file, there is this line: "require_once blabla.field.inc".

I am talking about things that are planned / discussed.
http://drupal.org/node/557542#comment-1984972
The solution over there looks quite convincing, and will solve most of the "which modules to load on startup" problems.
I see little left to discuss for this group about "modules on startup", especially for this particular thread.

EDIT: actually.. hook_hook_info() already exists in D7!! See Crell's latest post here, he says it all.

Autoloading resolve all

pounard's picture

Autoloading resolves all the problems you can have with class loading; it's the best way to go. But for procedural code, it doesn't resolve any problem at all, since you are calling functions that must already be defined.

Pierre.

Still does not resolve the

sylvain lecoy's picture

It still does not resolve the problem of interface-related dependencies.

Simplenews is a good example, as it can render its mail as HTML. Two modules can do the job, MIMEMail and HTMLMail. It means that in an OO-centric approach, Simplenews will use an interface (say IHTMLMail), and the dependency will be resolved at runtime.

How do you resolve this particular dependency? How do you know whether you need to inject a MimeMail instance or an HtmlMail instance?

Larry has directed you to the

sdboyer's picture

Larry has directed you to the definitions page in a couple places already, so I'll re-refer. Much of what's being discussed again here is stuff we already have some increasingly-solid ideas on. On this specific question - how do we choose which actual object instance to inject? - see the 'Mapping' definition.

I read this one, but I'm

sylvain lecoy's picture

I read this one, but I'm afraid that it still does not answer my question; the 'mapping' definition just presents the process. How do you know which one to map (or to inject, or to instantiate, it does not matter)? At runtime? In a configuration?

EDIT: OK, it depends on the context object. Thank you sdboyer.

Context + configuration (as

sdboyer's picture

Context + configuration (as in the config core initiative), but yeah. So, some combination of hardcoded defaults, module-specified preferences, and site-specific configuration. The specifics of which will be encapsulated in context/config - atm I'm imagining that the config itself will probably end up attaching itself to context. In the tinkering ctools folks (really, neclimdul) have been doing, there's also a notion of a provider - an object which controls discovery/registration, loading, and instantiation of plugins with their appropriate configuration. That didn't quite make it into the definitions, but I think, given some of the ideas you've been putting forward here, you might be interested in how that idea evolves.
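
A minimal sketch of that kind of mapping, using the Simplenews mail example from earlier in the thread. It is written in Java to match the OSGi example, and nearly everything in it is assumed: the HtmlMailer interface, the two adapter classes and MailerMapping are invented names, and the precedence order (site configuration over module preference over hardcoded default) is a guess at one reasonable combination, not something decided in the definitions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical interface the client module (e.g. Simplenews) codes against.
interface HtmlMailer {
    String send(String body);
}

// Two made-up implementations standing in for MimeMail and HtmlMail.
class MimeMailAdapter implements HtmlMailer {
    public String send(String body) {
        return "mimemail:" + body;
    }
}

class HtmlMailAdapter implements HtmlMailer {
    public String send(String body) {
        return "htmlmail:" + body;
    }
}

// Illustrative mapping: decides which implementation to instantiate.
class MailerMapping {
    private final Map<String, Supplier<HtmlMailer>> factories = new HashMap<>();
    private final String hardcodedDefault;
    private String modulePreference; // null when no module states a preference
    private String siteConfig;       // null when the site has not configured one

    MailerMapping(String hardcodedDefault) {
        this.hardcodedDefault = hardcodedDefault;
    }

    void register(String key, Supplier<HtmlMailer> factory) {
        factories.put(key, factory);
    }

    void setModulePreference(String key) {
        this.modulePreference = key;
    }

    void setSiteConfig(String key) {
        this.siteConfig = key;
    }

    // Assumed precedence: site configuration, then module preference, then default.
    HtmlMailer resolve() {
        String key = siteConfig != null ? siteConfig
                : modulePreference != null ? modulePreference
                : hardcodedDefault;
        Supplier<HtmlMailer> factory = factories.get(key);
        if (factory == null) {
            throw new IllegalStateException("No implementation registered for " + key);
        }
        return factory.get();
    }

    public static void main(String[] args) {
        MailerMapping mapping = new MailerMapping("mimemail");
        mapping.register("mimemail", MimeMailAdapter::new);
        mapping.register("htmlmail", HtmlMailAdapter::new);

        System.out.println(mapping.resolve().send("hello"));
        // The site administrator overrides the default.
        mapping.setSiteConfig("htmlmail");
        System.out.println(mapping.resolve().send("hello"));
    }
}
```

The client only ever sees HtmlMailer; which class gets instantiated is decided entirely by data held outside the client, which is the part that would live in context/config.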

So what worth this group if

pounard's picture

So what is this group worth if code has already been written, and is pretty far advanced as I read it? I mean, I don't have time to waste if someone who doesn't even come and talk here has already written pretty much all of it.

Sometimes, the Drupal do-ocracy seems more like "thanks for coming, we already did it our way".

Pierre.

I've been working on code for

neclimdul's picture

I've been working on code for my own use in evaluating the needs of plugins and to provide a reference for what a CTools implementation of plugins might look like in core. While I'm actively trying to use the ideas put forth in these reports, it is not a golden implementation or even a recognized implementation. Crell hasn't even looked at it yet (to my knowledge anyway; it is public).

It has been of great help to me in identifying problems we're going to run into with the way Drupal works, and it has sparked several discussions and drawn interest from a number of people. So I think it's worthwhile that there's code for us to refer to, but it's disposable and quite possibly not a final implementation.

Furthermore, since CTools has been fighting a bit of an uphill battle for even consideration, we're fighting that same battle to be compared against these externally implemented plugin frameworks on equal footing, so I feel that having an in-core implementation roughly implemented helps us do this.

I've been burned by Drupal's do-ocracy far too many times, so I totally understand your concern, but it's not the case this time. It's just a misunderstanding of my (our) intentions with the sandbox.

Ok thanks for answering to

pounard's picture

Ok thanks for answering to this. No harm done.

I looked at your code, by the way. It seems oriented around a variation of the service locator pattern, where the provider and the discovery objects act as a registry and factory together; the code is rather clean. One thing I do not like is the fact that the procedural accessors almost always end up loading the full plugin registry. That might be really heavy and not really scalable once many modules provide their own.

EDIT: Another remark is that you use a lot of metadata inside huge arrays, where most of this data has no real use outside custom providers (such as class and so on). This seems redundant with what can be done using proper OO code. The global "factory registry" (i.e. mostly the procedural code) shouldn't care about what is made, and should just hand out custom factories/registries. This is a totally "drupalistic" way of doing things, but something that can be done otherwise quite easily.

Re-EDIT: The need to have one huge descriptive array that acts as a global registry of plugins reflects the past errors of CTools and the like: when the number of plugins grows, the site begins to suffer performance problems because of it. I saw a site that loaded thousands of default panels on each client hit (losing seconds doing it) because of this need for a complete registry. This is something that probably should not be duplicated.

That's a lot of criticism; don't take it personally. Your code is nice, I like it (most of it). Good use of OO for sharing code between various custom implementations (such as info-based and file-based implementations).

Pierre.

Thanks! That's very

neclimdul's picture

Thanks! That's very constructive.

I've put it in an issue on the sandbox so I can think about it, comment etc without getting OT here.

Nice, I'll look into it.

pounard's picture

Nice, I'll look into it.

Pierre.

Totally agree with cleaver,

pounard's picture

Totally agree with cleaver, Java application runtime workflow is not a good example. But where I'd agree with the OSGI summary is that some details and patterns can be good.

When you say "under a second" you should probably say "under 100ms". Drupal's bootstrap is already, today, something like 5 times too long compared to a lot of other PHP software (even software running heavy frameworks such as Zend).

Pierre.

Lazy-loading

Crell's picture

Java is an interesting case. Syntactically, it shares a great deal with PHP, especially where OO is concerned. Its runtime, however, as others have noted, is so completely different that it's difficult to draw anything from it.

What Sylvain is describing is actually two separate problems: 1) Automatic installation dependency resolution and 2) lazy-loading.

1) For automatic installation dependency resolution, that's currently being handled by install profiles and distributions, which can now be packaged on Drupal.org as a single, targeted install. I don't foresee that changing much in Drupal 8, but if it does it would be separate from the goals of WSCCI so I would say that is off topic for this group, and certainly for this thread.

2) Level-based autoloading is an interesting concept. As pounard notes, we currently have multiple bootstrap levels, although we have different module loading rules only for 2 of them (bootstrap and full). That's not particularly efficient. However, the complexity involved in trying to cherry-pick lots of different explicit levels is mind boggling. To try and put that work on the site administrator is a non-starter, as most will simply not have that level of expertise. As Sylvain notes, that's not something that can be decided on at development time. Given how dynamic PHP is, I am not convinced that a static list is even possible at runtime. That leaves automatic detection, aka true lazy-loading, as the only option.

And we already have systems in place for that. Two, in fact: autoloading and hook_hook_info().

For autoloading, yes, it only works on classes and Drupal 7 is, currently, a largely procedural system. However, most of the WSCCI work is currently slated to be OO-centric. (See the definitions page at the top of this group's home page.) As a result, assuming we're successful, a LOT of Drupal code will be moving to OO patterns, especially anything that leverages plugins or those bits that move into being context handlers. That should greatly reduce our code weight if we allow those classes to lazy-load.
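As a rough illustration of why autoloaded classes cost nothing until they are used, here is a minimal sketch in plain PHP. The class name Example_Plugin, the temporary directory and the one-class-per-file layout are all made up for the example, and it uses a closure, so it assumes PHP 5.3 or later. It is not how Drupal's own class registry works.

```php
<?php
// Write a throwaway class file so the example is self-contained.
// Example_Plugin and the directory layout are hypothetical.
$dir = sys_get_temp_dir() . '/lazy_demo_' . uniqid();
mkdir($dir);
file_put_contents(
  $dir . '/Example_Plugin.php',
  '<?php class Example_Plugin { public function run() { return "ran"; } }'
);

$loaded = array();
spl_autoload_register(function ($class) use ($dir, &$loaded) {
  $file = $dir . '/' . $class . '.php';
  if (is_file($file)) {
    // Record that the file was pulled in on demand.
    $loaded[] = $class;
    require $file;
  }
});

// The second argument FALSE means "do not trigger the autoloader",
// so this only reports whether the class is already in memory.
$before = class_exists('Example_Plugin', FALSE);

// First real use: PHP calls the autoloader, which includes the file.
$plugin = new Example_Plugin();
$after = class_exists('Example_Plugin', FALSE);
```

Until the `new` statement runs, the class file is never read, which is the whole point: code weight is only paid for classes a request actually touches.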

hook_hook_info() was added late in Drupal 7 as a way to define "this hook may be in this side file instead of in .module". It is not being leveraged much in core, mostly due to its late addition and developer resistance/weariness late in the cycle. Combined with caching of module_implements() information, it allows us to break up modules a lot more than we do now if we are smart about it. Then, hooks would only be loaded on the occasion that they are called, which could be any time.
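For anyone who has not seen it, an implementation looks roughly like the sketch below. The module name "mymodule", the hook name and the "example" group are invented for illustration; only the 'group' key is the documented Drupal 7 mechanism.

```php
<?php
/**
 * Implements hook_hook_info().
 *
 * "mymodule", "example_info" and the "example" group are hypothetical names.
 */
function mymodule_hook_info() {
  // Declares that implementations of hook_example_info() may live in
  // MODULE.example.inc rather than in MODULE.module. The module system
  // includes that file only when the hook is about to be invoked.
  $hooks['example_info'] = array(
    'group' => 'example',
  );
  return $hooks;
}
```

With this in place, a module could put its hook_example_info() implementation in its own .example.inc file and keep the .module file small.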

That would give us essentially "level-free lazy-loading", which is the ideal. In order to do that, however, we need someone to own the process of more fully implementing hook_hook_info() throughout core in Drupal 8. I fully support that, but it's out of scope for WSCCI. See:

http://drupal.org/node/983470
http://drupal.org/node/977052

There was also a related issue to make modules require only a .info file, not a .module file, so that they would have zero load impact unless one of their hooks was called. That sadly didn't make it into Drupal 7, but I'd love to see it in Drupal 8. Again, someone needs to own that but it's not really within scope for WSCCI right now.

http://drupal.org/node/340723

Sylvain, if you want to try and tackle that problem I am all for it. The paths are already largely in place, they just need someone to walk them. :-) That's not really something for WSCCI to deal with directly, though, as we'll be relying primarily on autoloading to handle level-free lazy-loading.

Let's get back to discussing our primary comparison projects.

I wasn't aware of this

sylvain lecoy's picture

@donquixote and @Crell I wasn't aware of this hook_hook_info().

However, most of the WSCCI work is currently slated to be OO-centric.

Are you referring to OOP hooks / plugin objects?

we'll be relying primarily on autoloading to handle level-free lazy-loading

No, I think you are right: different explicit levels are mind-boggling, which is why it is exciting, but I'm happy with level-free lazy-loading. If we have lazy-loading there is no need to introduce start levels, is there?

Where is that discussion taking place? I would love to contribute/help.

Just to add a little more material on OSGi, here is a look at a plugin manifest.

Highlighted below is the plugin-content section:

  • Dependencies, of course
  • But interesting, the libraries as well

The Extension / Extension Point content (similar to *.api.php files, but declared in the manifest):

  • Extensions: the hook implemented
  • Extension Points: the hook defined



Having those in the info file could reduce the time spent on introspection and make hook discovery and registration more efficient.

Off-topic

Crell's picture

No, but with all context handlers and plugins being objects vast swaths of Drupal become objects without us touching hooks. There may be cause to change the way hooks work, but that's not what we're going for right here.

Also, we're now again vastly off topic. Please keep this thread just for discussing the research results from other projects, not for speculating on how we could revamp hooks.

I'm trying to keep on track,

sylvain lecoy's picture

I'm trying to keep on track, you said we need to evaluate the pros and cons of a particular plugin system.

I noticed that it could possibly enhance the hook performance.

Hooks are not plugins

Crell's picture

Hooks and plugins serve two very different purposes. The proposal is not to replace hooks with plugins, but to replace the current myriad of unstandardized one-offs with unified plugins. Revamping hooks is off topic for this thread.

Zend Framework summary

justinrandell's picture

Sorry for the lateness!

Note: these notes are based on Zend Framework (ZF) 1.x.

Overview

ZF is a low level framework. Although it has tools to generate a recommended application structure, most of its components can be used stand-alone. It is expected you'll be building a custom application and writing a bunch of code if you use it.

Zend applications can be built any way you please. A Zend application (via Zend_Application) is really just some convenience configuration on top of a 'blessed' directory structure, but nothing requires your application to be built in the same way.

ZF is even more oriented to custom coding and configuration, and has even less out-of-the-box support for higher-level concepts than Symfony2 does.

ZF's concept of a module is really just a 'blessed' directory structure, and Zend applications can have any directory structure they like, so this is just convention with a matching configuration.

There are two types of plugin systems within ZF - one is basically a class-loader, and one is based around the request cycle within the Zend_Controller component.

Plugin loader

http://framework.zend.com/manual/en/learning.plugins.intro.html

ZF's Zend_Loader_PluginLoader component is really just a class autoloader with some sugar on top. Directories and class name prefixes are registered with the plugin loader, allowing easy loading and overriding of plugins throughout the framework.
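To make the "prefixes plus overriding" idea concrete, here is a small self-contained sketch. It is not Zend's implementation: Sketch_PluginLoader and the two helper classes are invented names, and it only resolves class-name prefixes, whereas the real component also pairs each prefix with a directory to load the file from. It assumes that the most recently registered prefix wins.

```php
<?php
// Two stand-in plugin classes that share the short name "FormButton".
class Core_Helper_FormButton {
  public function render() { return 'core button'; }
}
class App_Helper_FormButton {
  public function render() { return 'app button'; }
}

/**
 * Illustrative prefix-based loader (hypothetical, not Zend's code).
 */
class Sketch_PluginLoader {
  protected $prefixes = array();

  // Register a class-name prefix; later registrations take precedence.
  public function addPrefix($prefix) {
    array_unshift($this->prefixes, rtrim($prefix, '_') . '_');
    return $this;
  }

  // Resolve a short plugin name to the first matching class name.
  public function load($name) {
    foreach ($this->prefixes as $prefix) {
      $class = $prefix . ucfirst($name);
      if (class_exists($class)) {
        return $class;
      }
    }
    throw new RuntimeException("Plugin '$name' not found.");
  }
}

$loader = new Sketch_PluginLoader();
$loader->addPrefix('Core_Helper');
$core_class = $loader->load('formButton');

// An application overrides the core plugin just by adding its own prefix.
$loader->addPrefix('App_Helper');
$app_class = $loader->load('formButton');
```

The calling code keeps asking for "formButton" and never needs to know which implementation it ends up with.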

Controller plugins

http://framework.zend.com/manual/en/zend.controller.plugins.html

"The controller architecture includes a plugin system that allows user code to be called when certain events occur in the controller process lifetime. The front controller uses a plugin broker as a registry for user plugins, and the plugin broker ensures that event methods are called on each plugin registered with the front controller."

ZF controller plugins operate a lot like hook_boot, hook_init, hook_exit etc. in that they are called during stages in the request. They are completely unlike Drupal in that they need to be explicitly registered. The plugins are all class based, and it's easy to see how we could implement all hooks this way.
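Roughly, the broker pattern described in that quote can be sketched as below. All class names are invented, and the request is a plain array rather than a request object. The preDispatch/postDispatch names mirror ZF's event methods, but the signatures here are simplified assumptions.

```php
<?php
// Base plugin with no-op event methods, so subclasses override only
// the events they care about. All names here are illustrative.
abstract class Sketch_ControllerPlugin {
  public function preDispatch(array &$request) {}
  public function postDispatch(array &$request) {}
}

// The broker is the registry: plugins must be registered explicitly.
class Sketch_PluginBroker {
  protected $plugins = array();

  public function register(Sketch_ControllerPlugin $plugin) {
    $this->plugins[] = $plugin;
  }

  // Call the named event method on every registered plugin, in order.
  public function notify($event, array &$request) {
    foreach ($this->plugins as $plugin) {
      $plugin->$event($request);
    }
  }
}

class Sketch_LogPlugin extends Sketch_ControllerPlugin {
  public function preDispatch(array &$request) {
    $request['log'][] = 'pre';
  }
  public function postDispatch(array &$request) {
    $request['log'][] = 'post';
  }
}

$broker = new Sketch_PluginBroker();
$broker->register(new Sketch_LogPlugin());

$request = array('path' => 'node/1', 'log' => array());
$broker->notify('preDispatch', $request);
$request['log'][] = 'dispatch';   // Stand-in for the controller action.
$broker->notify('postDispatch', $request);
```

Compared with hooks, nothing is discovered by naming convention: a plugin that is not registered with the broker is never called.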

Summary

To be honest, I don't see too much to take away from this research. Zend is way, way at the other end of the spectrum from Drupal, so this is not 100% surprising.

If there's anything we can gain from looking at Zend, it's the composability of different components. ZF makes it very, very easy to use parts of the whole without buying in to the whole deal.
While we probably can't go too far towards this with Drupal, keeping the different bits of Drupal less tightly coupled is a worthy goal.

ZF is a pure framework, it do

pounard's picture

ZF is a pure framework, so it does not surprise me. I did more or less the same research on my side and ended up with the same conclusion. That said, they have a lot of extremely good components for smaller pieces of API.

About the Front Controller plugin, I think it fits what we would need for "context plugins". A context plugin in Drupal would inherit from input data, and input data comes from the controller, which carries the request. Pre-dispatching is probably the right time to spawn and build these contexts.

Pierre.

Symfony2 alikes

kika's picture

http://flow3.typo3.org - core of Typo3 next generation
http://silverstripe.org/sapphire-introduction/ - core of SilverStripe

Musings on Kohana

manarth's picture

Late, but hopefully better than never.

Oh, this is based on Kohana 3, which was pretty much a complete rewrite from Kohana 2.

If the question is: what constitutes a 'plugin' in Kohana, it's difficult to pin down, so here's a rundown of the key concepts in Kohana.

Classes/Code segmentation/inheritance and the cascading filesystem

Almost everything in Kohana is a class.
Kohana code is split into 3 sections: application, modules, and system.
Modules are essentially a way of encapsulating a set of classes, in a way that allows them to interact with Kohana. This boils down to a standard way of organising the class files (and supporting files, such as config, etc).

The 'Cascading filesystem' is Kohana's basis for allowing core classes to be replaced/extended.
The cascade is application > module list > system.

For example, if you try to use the helper class 'Url', Kohana will first look for a file in these locations:

  • application/classes/url.php
  • modules/module_name/classes/url.php
  • system/classes/url.php

The files found in the system folder are sparse, each containing a simple class definition: class Url extends Kohana_Url {}.
This pattern is used throughout Kohana: the 'core' class is empty, and the actual code is found at system/classes/kohana/xxx.php, with a class named Kohana_xxx. This allows Kohana to make 'Transparent extension' easier: your class at application/classes/xxx.php would define class xxx (completely replacing the empty class found at system/classes/xxx.php), and choose whether to extend Kohana_xxx (which would be auto-loaded as needed), or build all its functionality from scratch.
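In code, transparent extension looks something like the sketch below. Both classes are shown in one file so it can be run as-is. In a real install they would sit in the two files named in the comments. The slug() method and its behaviour are invented for the example and are not Kohana's actual Url API.

```php
<?php
// Would live in system/classes/kohana/url.php: the real implementation.
// slug() is a hypothetical method used only for this illustration.
class Kohana_Url {
  public static function slug($text) {
    $dashed = preg_replace('/[^A-Za-z0-9]+/', '-', $text);
    return strtolower(trim($dashed, '-'));
  }
}

// Would live in application/classes/url.php. Because the application
// directory comes first in the cascade, this file is found instead of
// the empty "class Url extends Kohana_Url {}" in system/classes/url.php.
class Url extends Kohana_Url {
  public static function slug($text) {
    // Reuse the core behaviour and adjust only what the app needs.
    return 'app-' . parent::slug($text);
  }
}

// Calling code always uses the short class name and picks up the override.
$slug = Url::slug('Hello World!');
```

The application could equally leave out "extends Kohana_Url" and build the class from scratch, since callers only ever refer to Url.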

Handling URLs

Kohana's Route class is invoked during bootstrap to define the mapping of URLs to methods of a controller class. Route URLs are defined as regexes.

The Request class is called to start processing an incoming request: it iterates each of the defined routes until it finds a route whose regex matches the URL. It then instantiates the relevant controller class and invokes the method.
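As a simplified sketch of that lookup: the route table, the sketch_execute() function and the controller class below are all invented, and Kohana's real Route and Request classes do considerably more (named segments, defaults and so on).

```php
<?php
// Hypothetical controller; Kohana prefixes action methods with "action_".
class Sketch_Controller_News {
  public function action_view($id) {
    return "news item $id";
  }
}

// Hypothetical route table: regex => array(controller class, method).
$routes = array(
  '#^news/(\d+)$#' => array('Sketch_Controller_News', 'action_view'),
);

/**
 * Walks the routes in order and dispatches to the first one that matches.
 */
function sketch_execute($uri, array $routes) {
  foreach ($routes as $regex => $target) {
    if (preg_match($regex, $uri, $matches)) {
      list($class, $method) = $target;
      $controller = new $class();
      // Captured groups become the method arguments.
      $args = array_slice($matches, 1);
      return call_user_func_array(array($controller, $method), $args);
    }
  }
  // No route matched.
  return NULL;
}

$body = sketch_execute('news/42', $routes);
$missing = sketch_execute('no/such/path', $routes);
```

Because dispatching is just a function of the URI, a controller can call it again for a sub-request, which is what makes the HMVC approach possible.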

This abstraction allows Kohana to do HMVC (Hierarchical MVC).
For example, a home page which displays latest news, a recent tweets widget, and recent comments: the initial request would be to the path '/' which might map to a homepage controller. That could then invoke sub-requests for '/news', '/widgets/twitter', and '/comments'. On a core Kohana build, this would simply look up and invoke the relevant controllers. With a minor change, each of the widgets could be delegated to an external server, and the Request handler would make an HTTP request for those widgets.

What can Drupal use?

Kohana's cascading filesystem allows all native classes to be extended/replaced, including the Route class. Drupal's initial route-handling is hard-coded to the Drupal menu system, which accepts a URL as input and responds by invoking a page callback function. REST, JS/AJAX callbacks, and content-negotiation are all more complex as a result. Drupal would benefit from the concept of a front-controller, which - by default - might be a class/file which provides Drupal's standard menu system, but it should be easy to swap the core front-controller for an alternative.

Drupal already uses the cascading filesystem concept to a certain extent: for example, sites/xxx/foo overrides sites/all/foo.
Around 28% of Drupal's core code is non-modular, not overrideable (approx, based on byte-count in /includes vs byte-count in /, for files ending in module, php, inc, or install). Kohana's ratio is 0.2% (index.php); every other core feature can be easily replaced/extended by an application. I'm not sure that Drupal would lose anything in moving core functionality from /includes to a structure that could be overridden.

Kohana's class-orientated structure also makes it much easier to make minor changes to core functionality, without needing to copy/paste swathes of code, but Drupal moving to class-based coding isn't something I'm expecting to see in a hurry!

HMVC would make things like ESIs and parallel handling much easier, and using a common API (Kohana's Request class) for both external HTTP requests and internal sub-requests potentially allows it to easily switch from internal processing to off-loading to an external server. On a similar topic, David Strauss has got some interesting research on both Kargo-event and another queue-based system (using Beanstalkd) that tries to emulate the event-driven approach of Node.js to process sub-sections of a page in parallel.

What's a Kohana plugin?

As a final summary, a plugin/module for Kohana is:

  • The use of Kohana's cascading filesystem and auto-loading
  • A directory-structure convention (required when interacting with other modules, but otherwise optional)
  • An arbitrary class, which need not conform to any API beyond class-naming to allow autoloading
  • A Route/Request/Controller pattern (and standard API), to make use of Kohana's HMVC practice

--
Marcus Deglos
Founder / Technical Architect @ Techito.
