Putting it all together

heyrocker's picture

This document is a proposal for a high-level gameplan for the configuration management initiative.

Measures of Success

If we succeed, we should have achieved several major goals:

  • Drupal configuration information will be able to be managed through standard tools.
  • Drupal content will be able to be staged between servers without ID collisions.
  • Drupal content will be able to be saved programmatically without relying on the Forms API and without a loss of functionality.
  • Configuration data is available to Drupal early in the bootstrap process without booting the database.
  • Configuration is revertable, either through the UI or code.
  • Links/References between content and configuration, or content and content, or configuration and configuration, can exist without being broken by being moved between servers.
  • The system will be flexible enough to handle complex use cases, including ones we can't predict.
  • The system should cause no significant performance regressions or, even better, should improve performance and scalability overall.

Content VS Configuration

FIGHT! There has been a lot of talk over the years (much of it by me) about what is and isn't content and what is and isn't configuration. While I have long been an advocate of the concept that Drupal does not meaningfully make this distinction, I have had a hard time imagining an architecture under which this can work reasonably. There has been talk of having a master interface that anything persisting data implements, but for the time being this is controversial and honestly I don't know if it causes more problems than it solves. Therefore, for the purposes of this discussion I propose the following idea:

* If it is an Entity it is content, if not it is configuration *

Ultimately, I think this gets to the crux of the problem, and I think we can get where we need to with this definition. If we have some brilliant ideas out of the blue, I don't think it will be too much work to refactor what follows as necessary in order to match what needs to be done. Also, by drawing this line in the sand now, we can get to work on some more isolated pieces of the problem and learn and iterate along the way, which I think will be very useful. Research and experimentation have already brought me a long way, but at some point we do need to actually get to work too.

So what pieces need to come into place to make this all work?

Content

There are basically two hurdles to be overcome on the content side.

1) We need UUIDs in core and we need Entity support for UUIDs. I have already written a proposal for this which can be read and reviewed at

http://groups.drupal.org/node/145614

2) We need to be able to save entities from code reliably, without invoking Forms API and without loss of functionality (i.e. we should be able to get the same validation, and all data should be submitted properly). The main thing I would like to see happen here is that validation (aka form_set_error()) be revamped such that error setting and displaying become separate interests, and thus we can use the same APIs for form submissions as well as saving from code (with the addition that form submissions will have some display code for their errors). I haven't really thought through an architecture for this, and I'll want to get someone who knows Forms API much better than I do involved. Volunteers?
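To make the "setting vs. displaying" split concrete, here is a minimal, language-neutral sketch in Python. It is not Drupal's API and not a proposed architecture; every name in it (ValidationError, EntityValidationFailed, validate_article, save_article, submit_article_form) is hypothetical. The idea shown is only that validation returns plain error data, the save path raises it, and the form path is the one place that turns it into messages.

```python
from dataclasses import dataclass


@dataclass
class ValidationError:
    """One problem found in the submitted values; says nothing about display."""
    field_name: str
    message: str


class EntityValidationFailed(Exception):
    """Raised by save_article() when validation reports any errors."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} validation error(s)")
        self.errors = errors


def validate_article(values):
    """Return a list of ValidationError objects. Nothing is displayed here."""
    errors = []
    if not values.get("title", "").strip():
        errors.append(ValidationError("title", "Title is required."))
    if len(values.get("body", "")) > 1000:
        errors.append(ValidationError("body", "Body is too long."))
    return errors


def save_article(values, storage):
    """Save from code: same validation as the form, errors surface as data."""
    errors = validate_article(values)
    if errors:
        raise EntityValidationFailed(errors)
    storage.append(dict(values))
    return values


def submit_article_form(values, storage):
    """Form path: reuses save_article() and adds only the display step."""
    try:
        save_article(values, storage)
        return []
    except EntityValidationFailed as exc:
        return [f"{e.field_name}: {e.message}" for e in exc.errors]
```

A deployment script would call save_article() directly and catch the exception; only the form wrapper knows how errors are rendered.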

The rest mainly centers around getting logic out of form submit functions, and removing the reliance on environmental data that may not always be available ($_GET, etc.). The former is just cleanup and testing, the latter is Larry's job :)

Configuration

I have also written a separate document that details my plans for configuration data at

http://groups.drupal.org/node/149744

One thing this document doesn't address is one of our biggest problems: when a piece of configuration references a specific piece of content, how do we resolve that reference given that IDs may differ between servers? What we need is something in the config data, perhaps a specific implementation of the config interface, that specifies 'I am a reference to an entity' and includes the entity type and UUID of that content. When a module sees this, it knows that it can go and pull that content out based on the identifying information. This should work out quite well, as long as we always use UUIDs for these references. If more operations need to be done on these content items, they can be resolved using the APIs that will be created during the UUID phase.
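As an illustration only, the 'I am a reference to an entity' marker could look something like the following Python sketch. The marker shape, the function names, and the load_by_uuid callback are all assumptions made for this example; the proposal does not define any of them.

```python
def make_entity_reference(entity_type, entity_uuid):
    """Build the hypothetical marker stored in config instead of a local ID."""
    return {
        "type": "entity_reference",
        "entity_type": entity_type,
        "uuid": entity_uuid,
    }


def is_entity_reference(value):
    """True if a config value is one of the markers built above."""
    return isinstance(value, dict) and value.get("type") == "entity_reference"


def resolve_references(config, load_by_uuid):
    """Walk nested config and swap each marker for the entity it names.

    load_by_uuid(entity_type, uuid) stands in for whatever lookup API the
    UUID work ends up providing.
    """
    if is_entity_reference(config):
        return load_by_uuid(config["entity_type"], config["uuid"])
    if isinstance(config, dict):
        return {k: resolve_references(v, load_by_uuid) for k, v in config.items()}
    if isinstance(config, list):
        return [resolve_references(v, load_by_uuid) for v in config]
    return config
```

Because the config only carries the entity type and UUID, the same file resolves correctly on two servers where the referenced node has different serial IDs.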

Put It Together

You'll note that none of this changes the user experience of deploying Drupal sites much at all, and that is by design. I feel it is important to get these underlying technologies and APIs in place before moving on to what we do or don't expose to users as a part of core. In fact, if this was all we got done in the D8 release cycle I would actually be really really happy and consider the project a success. That said, there should be nothing in the above that prevents any of the currently implemented contrib modules (like Features or Deploy) or anything people are considering down the road from being implemented. Spotting some holes? Now is the time to speak! We really want to get some reasonable amount of consensus around this before our sprint in Denver June 8-10 where we can hopefully start scaffolding things together.

Comments

Essential piece missing

sun's picture
  1. What I'm seriously missing in all of the exportables/configuration/whatnot discussions is: Maintenance updates

    If you put configuration for my module into code and my module is not able to hook_update_N() it, then the entire system and idea of configuration in code blows up and turns into an epic #fail.

    This is the topmost, first-of-all, crucial requirement that is going to be the primary benchmark against any implementation proposal for core. I actually don't really care what we're going to do under the hood. But configuration has to be update-able in a sane way from within update.php. Drupal's #1 principle is: Your data is safe.

  2. Because of 1), I can only repeat my support for Entity API module's idea of handling exportable configuration entities. By keeping and syncing them also in the database, all of our existing tools and workflows still apply. My module is able to perform maintenance updates without having to spend a single thought on whether things may live in code or in the database.

Daniel F. Kudwien
unleashed mind

"Exportable configuration entities"

eaton's picture

Because of 1), I can only repeat my support for Entity API module's idea of handling exportable configuration entities. By keeping and syncing them also in the database, all of our existing tools and workflows still apply. My module is able to perform maintenance updates without having to spend a single thought on whether things may live in code or in the database.

That's exactly how CTools' exportables framework operates. Modules don't have to think about whether something lives in code based exports, or admin/builder override tables. The API assumes that code captures defaults, the DB captures overridden versions, and a module that says, 'Get me X!' gets the proper 'X' regardless of where it resides.
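A rough sketch of that lookup rule, written in Python purely for illustration: this is not CTools code, and the class and method names are invented. It only models the behaviour described above, where code supplies defaults, the database supplies overrides, and callers never ask where a value came from.

```python
class ExportableStore:
    """Code holds defaults, the 'database' holds overrides (hypothetical model)."""

    def __init__(self, code_defaults):
        self._defaults = dict(code_defaults)  # what ships in code
        self._overrides = {}                  # what an admin saved via the UI

    def get(self, name):
        """'Get me X!' returns the override if one exists, else the default."""
        if name in self._overrides:
            return self._overrides[name]
        return self._defaults[name]

    def save(self, name, value):
        """Saving never touches code; it records an override."""
        self._overrides[name] = value

    def revert(self, name):
        """Reverting drops the override so the code default shows through."""
        self._overrides.pop(name, None)

    def status(self, name):
        """Report where an item currently lives."""
        in_code = name in self._defaults
        in_db = name in self._overrides
        if in_code and in_db:
            return "overridden"
        if in_db:
            return "database only"
        if in_code:
            return "default"
        raise KeyError(name)
```

An update function written against get() and save() works the same whether the item started in code or in the database, which is the property sun asks for.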

I'm not suggesting that we should simply grab CTools' mechanism and reuse it, just pointing out that there is nothing inherently "entity-oriented" about that override mechanism. We still need some sort of line in the sand to distinguish between what stuff the 'configuration management' system can punt on, and what stuff it needs to be able to capture.

At least IMO, one of the biggest divides in configuration is "stuff that implies schema changes or data alteration" and "stuff that just toggles or directs existing behavior." The former is where hook_update_N() is also the only really solid solution; 'exporting' content type definitions has always been a snake pit for that reason.

We still need some sort of

fago's picture

We still need some sort of line in the sand to distinguish between what stuff the 'configuration management' system can punt on, and what stuff it needs to be able to capture.

Using the same storage system for configuration and content doesn't mean we can't draw that line - we can still draw it at a higher level.

The point is we need a storage system for both configuration and content. While the storage backend implementations actually used might (or might not) differ, the needs are the same. By creating two totally separated worlds we not only have to solve the same storage-related problems twice (like reacting to updates), we also unnecessarily duplicate the interface (= API functions, hooks, ..) for module developers. That means modules have to either do extra work to support both worlds, or stick with one, which will just widen the gap between both worlds.

Instead of drawing a line in the sand and putting a wall around it, we should try to minimize the gap and ease building bridges between both worlds.

I pretty much think that

pounard's picture

I think manually coded variable overrides are pretty much for a strongly experienced audience, and your module wouldn't break unless the developer does bad stuff. Anything will break with a developer doing bad stuff, whatever effort you make to save your data; an insane man will always find a way to blow it up!

I think the targeted audience of hardcoded variables is pretty much the same as the people who extensively use Features. It's mostly about configuration and not always about data structure. It's probably meant for development and staging more than for end users :)

What I mean here is that most experienced developers would probably write the associated hook_update_N() by hand in their custom module somewhere if they really break things badly (it's their choice to assume).

That doesn't rule out hooks such as hook_variable_changed($original, $new) either. All you would probably need is to keep a cached copy of the variable storage somewhere along with a simple hash of it, test that hash when your cache goes down (so mostly at rebuild time), and fire the hook if you find differences.
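A minimal sketch of that hash-and-compare idea, in Python for illustration. The function names and the on_change callback (standing in for the suggested hook_variable_changed) are hypothetical, and the variables are assumed to be JSON-serializable.

```python
import hashlib
import json


def fingerprint(variables):
    """Stable hash of a variable set; key order does not affect the result."""
    encoded = json.dumps(variables, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def detect_changes(cached, current, on_change):
    """Fire on_change(name, original, new) for each variable that differs.

    The hash comparison is the cheap path: when nothing changed, no
    per-variable work is done. A variable missing on one side is reported
    with None for that side. Returns the number of callbacks fired.
    """
    if fingerprint(cached) == fingerprint(current):
        return 0
    fired = 0
    for name in sorted(set(cached) | set(current)):
        original, new = cached.get(name), current.get(name)
        if original != new:
            on_change(name, original, new)
            fired += 1
    return fired
```

In practice only the hash of the cached copy would need to be stored for the quick check; the full cached copy is needed only to report what the original value was.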

Pierre.

?

heyrocker's picture

I don't get this argument at all, perhaps you can explain it in more detail? Why wouldn't your module be able to hook_update_N() if its configuration is in code? There's an API for managing the configuration, the API writes to and reads from the file, your update function uses it. It's no different than if it was in the db (which it actually still could be since we'll have pluggable storage.) You will not have to spend a single thought on it at all.

There's an API for managing

fago's picture

There's an API for managing the configuration, the API writes to and reads from the file, your update function uses it. It's no different than if it was in the db (which it actually still could be since we'll have pluggable storage.) You will not have to spend a single thought on it at all.

This sounds very much like where the entity API is heading to.

Related Comments

tinyrobot's picture

Here are a couple of comments related to the configuration storage discussion, and unifying how to handle it.
http://groups.drupal.org/node/149744#comment-500634
http://groups.drupal.org/node/149744#comment-500624

If you put configuration for

donquixote's picture

If you put configuration for my module into code and my module is not able to hook_update_N() it, then the entire system and idea of configuration in code blows up and turns into an epic #fail.

If we treat those files as a simple data storage with read + write, then update should just work.
The file will be parsed into a nested array structure, then updated, then stored to the file.

Unsupported manual hacks, comments, or a custom order of settings, might get lost in this process. That's the price.
As a consequence, we can not make config file editing something recommended.
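The read, update, write cycle described here can be sketched as follows. This is a Python illustration under stated assumptions: JSON is used as a stand-in file format (the proposal does not fix one), and update_config_file and rename_key are invented names, not part of any Drupal API.

```python
import json


def update_config_file(path, update):
    """Parse the file into a nested structure, update it, and store it back.

    The file is rewritten from the parsed data, so any hand-made ordering
    or formatting in the original file is not preserved.
    """
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    update(config)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2, sort_keys=True)
        handle.write("\n")
    return config


def rename_key(section_name, old, new):
    """Build an update step, the kind of change an update function might make."""
    def update(config):
        section = config[section_name]
        if old in section:
            section[new] = section.pop(old)
    return update
```

An update function would then call something like update_config_file(path, rename_key("my_module", "limit", "max_items")) without caring that the storage happens to be a file.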

Yes this is true, I am

heyrocker's picture

Yes this is true, I am assuming you don't manually update these files. If you do you're on your own.

Content VS Configuration What

donquixote's picture

Content VS Configuration

What if instead we just ask "specific to this site instance" vs "to be migrated and deployed all over the place" ?
This distinction would be less artificial, and illustrates that anything can be a candidate for migration.

entities or configuration

Gábor Hojtsy's picture

Entities or configuration sounds like it could be a really great thing. There are interesting cases though. How do you imagine re-implementing our menu structure? Is that going to be configuration or entities? A forum structure? (Currently that is made up of taxonomy terms and therefore entities). The multilingual initiative that I'm working on would really do well with a definition of the set of objects it needs to work with. If all of it is going to be entities or configuration, our life there is much easier. If not, we need one-off implementations for whatever is not one of these two.

There's been talk of making

catch's picture

There's been talk of making menu links entities - separating them out from the router system entirely.

That would then mean having to fit the router into the configuration system.

I think translation is a good

fago's picture

I think translation is a good example that separating content and configuration into two totally separate APIs creates extra work for stuff that needs to be able to handle both. Instead of solving it once for our storage API, we might have to solve it twice.

multiples right now

Gábor Hojtsy's picture

Well, for translation, right now (D6, D7) we need to support dozens of different systems (think how your site name is saved, how your contact categories or aggregator feeds or field configuration are saved), while all these need translation and preferably in a unified but context sensitive UX. Now, if we can get down to just two different systems that need support, that is already leaps and bounds ahead of what we need to work with now :) Not that needing to support just one system wouldn't be even better. :)

now that I've been thinking more about this

Gábor Hojtsy's picture

Now that I've been thinking more about this, I'd like to throw my hand up for reusing fields as much as possible for the UI of configuration. Not sure how a Views UI can be made with fields, but all the core configuration pieces I can imagine being done with fields. I have in fact written a big piece about how i18n needs to reimplement lots of fields (and where it does not re-implement it all, it misses crucial functionality), so if fields are not used for the configuration UI, we'll need to reimplement fields for them in contrib anyway :| http://groups.drupal.org/node/152929

Update: I've posted a better discussion piece at http://groups.drupal.org/node/154434 please follow up there.

Is it in the plans to load

indytechcook's picture

Is it in the plans to load the configuration based upon the context? My main beef about the variable system is that it's all loaded on every page request. Seems wasteful IMHO.

Also, it feels like people are sticking configuration into Entities due to the CRUD and other helpers provided by the entity module. Making the configuration a standard implementation with a pluggable (not necessarily Larry's plugin system) storage mechanism seems like a huge win to me.

Whatever you do with your

pounard's picture

Whatever you do with your Drupal site, variables are what makes the site what it is. Loading variables on demand instead of loading the cached whole would probably be a really bad idea; you'll always end up loading almost all of them whatever you do.

The idea of loading them in big, business-oriented chunks may sound a bit more appealing, but it should nevertheless remain an optional and configurable behavior, because overall variable usage will be totally different for each site.

Pierre.

Content and configuration

Damien Tournoud's picture

Are we back in content vs. configuration debate again? It's the best recipe for a debacle.

Everything that is modifiable by the UI needs to be deployed in a consistent way. There is no way to properly determine what is content and what is configuration, so we should just stop trying to draw a line in the sand. I thought we agreed on that and I am very disappointed by where this is heading.

Damien Tournoud

Who agreed?

heyrocker's picture

Nobody ever agreed on that, in fact the topic has been immensely controversial since I started pushing it a year ago, and even those who did agree didn't agree on how it should be architected (is it all entities? is there another class above everything? is it your document system?) This line makes sense, it will move us forward, and we will learn a ton that we can possibly build something new off of in the future.

While I don't think its the

fago's picture

While I don't think it's the right decision, I agree we need a decision so we can move forward. I hope we are able to re-evaluate that at a later point. I fear this leads to quite some duplication and an increased separation though.

If it is an Entity it is

fago's picture
* If it is an Entity it is content, if not it is configuration *

I don't like how this sentence puts the entity API in a corner without any discussion (of where the entity API goes).

We need to be able to save entities from code reliably, without loss of functionality and without invoking Forms API. This should happen without a loss of functionality (IE we should be able to get the same validation, and all data should be submitted properly). The main thing I would like to see happen here is that validation (aka form_set_error()) be revamped such that error setting and displaying become separate interests, and thus we can use the same APIs for form submissions as well as saving from code (with the addition that form submissions will have some display code for their errors.) I haven't really thought through an architecture for this, and I'll want to get someone who knows Forms API much better than I do involved.

As the restws module proves, the entity API module has solved that already for d7. It doesn't feature a single validation API that magically is available to forms, but still it is solved.

One of the problems I have

heyrocker's picture

One of the problems I have right now is I have to work with what I've got, and what I've got is an incomplete entity API in core. I'm not going to plan for the work that others may or may not do. There were a couple things I was going to do in Drupal 7, and someone said 'Oh you shouldn't bother with that, that code is going to get completely reworked' so I didn't. Then those projects never happened and now we're stuck for another version.

The advantage of the approach I am taking is that we're building from the bottom up and learning as we go along, and in the meantime we can also keep an eye on the work going on elsewhere. So far as I can tell there is still a lot of discussion going on relating to what that API will look like and how it will get implemented. If we get a good performant entity API into core, and it appears that there will be advantages to using it as a master item for persistable data, we can probably add it on top of what we've built reasonably easily. If that doesn't happen, then we lose nothing, and we still have the system we're building and we will learn an immense amount from building it.

However, if we start planning now for entities everywhere, and that project falters, or the implementation doesn't work the way we need it to, then now I'm stuck with something I can't use at all and have to start over from scratch with months of time down the river.

Design from the top down, build from the bottom up. This will bring us a system we can build progressively and iteratively and which will be more likely to be shippable when we want to deliver D8.

pwolanin's picture

So, catch and I proposed making the core entity API more consistent and complete in our core conversation in Chicago. Honestly, I think that's a bit of a pre-requisite for success. I understand that you personally don't want to do that work, but perhaps you should make it part of the roadmap here and push those of us who are interested in it to get it done sooner rather than later?

Point by Point

eaton's picture

Drupal configuration information will be able to be managed through standard tools.

I'm assuming this means things like text editors or IDEs or what not? Or Drupal API calls?

Drupal content will be able to be staged between servers without ID collisions.

This would definitely be a spectacular win; realistically I think the content/configuration split ends up being less of a system-breaker if this is in place. That distinction is important but up to this point, it's been magnified because non-configuration elements are sitebound.

Drupal content will be able to be saved programmatically without relying on the Forms API and without a loss of functionality.

A++++ WOULD BUY AGAIN. There are few things in my entire life I regret more than drupal_execute_form(). Faking form submissions allowed us, temporarily, to work around certain system limitations in Drupal. Unfortunately, it also cemented a lot of terribly bad programming practices. Biting the bullet and splitting our forms from our CRUD would have been much better, and if we can do that now it will be a fantastic win.

Configuration data is available to Drupal early in the bootstrap process without booting the database.

Also a big plus one. We're used to this today in the form of the settings.php $conf array, and we all understand how useful it is. It only works for variable_set() data, though, and that means much more has to live in DB tables or post-bootstrap module-handled config files.

Configuration is revertable, either through the UI or code.

Features, Views and Panels exporting, and other modules that follow the paradigm have demonstrated the importance of that model. The trick is, as I mentioned earlier, handling the special 'configuration cases' that have sweeping impact on data schemas and other structural information. FieldAPI fields are not something that can be easily 'reverted', unless I'm missing something important. Figuring out how to cordon off those cases is important, I think.

Links/References between content and configuration, or content and content, or configuration and configuration, can exist without being broken by being moved between servers.

DRUIDs FTW.

The system will be flexible enough to handle complex use cases, including ones we can't predict

This one is sort of like asking for more wishes, and I fear that it could become an excuse for massively over-engineering things. I think that by emphasizing small, effective systems that can be used together smoothly (DRUID handling, convenient persistence/overriding/bundling of configuration, etc.) we can avoid locking ourselves in. The biggest danger, I think, is building a 'totalizing' system that attempts to anticipate every possible angle in one API. That's far MORE likely to result in forehead slapping down the road IMO.

The system should cause no significant performance regressions, or even better improve performance and scalability overall.

I think the biggest chance for that is the ability to shift more configuration handling earlier in bootstrap, before the database and other expensive storage mechanisms come online.

Very interested in seeing where things go; I think the direction that's being taken is very promising.

A couple remarks

yched's picture

A couple remarks :

Configuration is revertable, either through the UI or code.

[edited out - was a consideration about the 'override / revert' model, but reading http://groups.drupal.org/node/149744, I see you guys have sorted this out already]

The main thing I would like to see happen here is that validation (aka form_set_error()) be revamped such that error setting and displaying become separate interests, and thus we can use the same APIs for form submissions as well as saving from code

That's one thing I think we got right in D7's Field API. Field validation triggers an exception holding an array of errors. If this happens within a form submit, the errors are distributed back to the relevant widgets. Entity types could theoretically trigger field validation during their saves if they want to, even though no actual entity type currently does AFAIK. Works for fields only, of course.

Per discussion in IRC today,

pwolanin's picture

Per discussion in IRC today, catch and I would like to work on http://drupal.org/node/1018602 and related issues early in this initiative so we can move towards a unified set of APIs and code for entity handling and not hack uuids and other stuff on top of the currently broken and disparate APIs.

Deployment & Build Systems & Change Management
