Building a Drupal.org deployment pipeline

sdboyer's picture

Leading Drupal's migration from CVS to Git made something eminently clear to me (and eliza411, the migration's PM): there's a lot that's broken about how we manage, maintain, and improve drupal.org (and its subsites). Since then, I've been gradually chatting up more and more people with the idea that we could build a structured, participatory model for updating and adding new features to drupal.org. And if we do it right, it could become a best-practice model for open, participatory (Drupal) site management workflows.

There's a nice, articulate big picture summary & justification for this project that eliza411 is currently writing up, which'll be the unifying thing we rally behind. In the meantime, though, the document she and I drew up over the summer is still more-or-less accurate. Maybe the biggest scope difference is that we think this should extend beyond just d.o - if we do this, it should really be done for *.d.o.

Bottom line, I've now talked about this idea with way too many people to keep it under wraps any longer. So I'm posting this brief overview in order to kickstart public discussion and progress. Which, by the way, I'm hoping will go like this:

  1. Once eliza411 has gotten the new-and-improved overview written, we can fight about the broad goals and scope there.
  2. We collectively generate a backlog of tasks that accomplishing the full scope will require.
  3. We gradually morph this backlog into a collaborative scope of work document.
  4. We have regular IRC meetings to help hash things out. I've started #drupal-devops for this purpose as I can't think of another channel it belongs in, though if folks think #drupal-infrastructure is fine, that'll work. I'll keep devops open since I think it's a worthwhile channel for us to have, anyway :)
  5. We pair the scope of work with the overview + some other goodies, turning it into a proposal that we can bring to the DA for funding, and/or Kickstarter or this new http://www.drupalkata.com/community-initiatives/ idea.

That collaborative backlog generation should take place in this shared google spreadsheet. If you want to help out, give me a ping in IRC and I'll grant you some perms to edit the doc. Actually, I'll give you access to the whole collection, since we're putting all the docs related to this project in there. Please note, however, that everything in that backlog will eventually be translated to issues on drupal.org. That spreadsheet is merely a convenience for collaborative brainstorming right now.

Basically, we're talking about building a deployment pipeline, per the ideas discussed in the very hot-topic devops book, Continuous Delivery. For anyone seriously interested in this, I STRONGLY encourage you to read the sample chapter they've made available online. There are some things that I think don't quite fit for our requirements, but on the whole it's an excellent breakdown of the value to be gained here.

That spreadsheet has already been divided into the six work areas that Melissa and I believe adequately encapsulate the overall goals of the project.

  1. Bazaar -> Git: drupal.org infrastructure primarily uses bzr at the moment. We would shift over to git in order to keep volunteers from having to learn yet another skill, especially one that is unlikely to transfer to new Drupal projects. This area of work also entails choosing a hosting solution for our Git repos, as well as a branching strategy/rules.
  2. Puppetize Prod: Pervasive use of configuration management is essential to replicability. Puppet is the system the OSL has chosen, and we are following suit. So one of the first steps is necessarily rolling our existing infrastructure over into Puppet.
  3. Sanitization: While we want to make this process as public as possible, there are some limits. Some data (e.g., user emails), some settings (e.g., settings.local.php) and some code (e.g., Bluecheese...maybe) should not leave the drupal.org servers. However, some components require some normally sanitized data to function (e.g., Git integration requires emails for commit/user mapping). This work area is about figuring out where these limits are, and how we accommodate them.
  4. Environments: The meat-and-potatoes of this whole project: provisioning new, full-stack VM clusters for development, testing, or staging, either locally (on a developer's own machine) or centrally/shared (on drupal.org-owned servers). Vagrant (and therefore VirtualBox) will be used for the local instances; central is TBD. We'll need to keep the builds componentized, otherwise a full-stack local environment with all *.d.o properties could mean >15 GB of local disk space.
  5. CI and QA: Drupal.org has no integration tests, regression tests, etc. This lack of QA is a huge reason (at least for Git) why we don't see more progress. This area of work is about building out a real suite of tests for all *.d.o sites (that participate) and building a real CI process using those suites.
  6. Process: The other areas of work are relatively separate work blocs; this is about putting them together so it all runs like a finely tuned machine. How & when code moves through dev/qa/stage/prod; how new folks get set up with environments; this sorta thing. The goal is to create a clear process with good answers to the major user stories, and clear expectations around how and when different things happen within infra.
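To make the sanitization tension in item 3 a bit more concrete, here is one possible approach, sketched in Python. It is only an illustration, not a decided design: it assumes we replace each real email with a keyed-hash placeholder, so the same address maps to the same placeholder in both the user data and the Git commit data, and commit/user mapping can still work on a sanitized copy. The key, the function names, and the record fields (`mail`, `author_email`) are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real key would never leave the
# production servers, so placeholders cannot be recomputed from outside.
SECRET_KEY = b"example-only-secret"


def pseudonymize_email(email, key=SECRET_KEY):
    """Map a real email to a stable placeholder that cannot be reversed without the key."""
    normalized = email.strip().lower()
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    # .invalid is a reserved TLD, so these addresses can never receive mail.
    return f"user-{digest[:16]}@sanitized.invalid"


def sanitize_users(users):
    """Return copies of user records with the (hypothetical) 'mail' field replaced."""
    return [{**u, "mail": pseudonymize_email(u["mail"])} for u in users]


def sanitize_commits(commits):
    """Return copies of commit records with the (hypothetical) 'author_email' field replaced."""
    return [{**c, "author_email": pseudonymize_email(c["author_email"])} for c in commits]


if __name__ == "__main__":
    users = sanitize_users([{"uid": 1, "mail": "alice@example.com"}])
    commits = sanitize_commits([{"sha": "abc123", "author_email": "Alice@Example.com"}])
    # The sanitized values still match, so commits can be mapped to users.
    print(users[0]["mail"] == commits[0]["author_email"])
```

Because the placeholder is derived deterministically, a join on email keeps working in dev environments without real addresses ever leaving the servers. Whether that trade-off is acceptable is exactly the kind of question this work area would need to answer.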

eliza411 and I have tried to provide a good structure for this planning process, but we're figuring this out as we go along. So if folks have any suggestions about techniques for collaboratively defining & scoping an overhaul like this, they'd be welcome - though we're going to play a balancing act against things that would derail this overall effort.

Comments

These are all good things.

drumm's picture

These are all good things. I'd like to see the existing infrastructure communication channels used for everything, including the IRC room. Siloing efforts off to other places is not something I like doing. The existing infrastructure team will be participating heavily and will continue to use these things.

Distributing Bluecheese as a regular theme won't happen since that would mean GPLing the CSS and other non-PHP stuff. We would lose control over our branding and legal ability to deal with copycat sites. That means we unfortunately have to control read access to it and anything that uses it.

I quietly started adding scripts to http://drupal.org/node/107028/commits. I plan to do more whenever I touch things.

These are all good

sdboyer's picture

These are all good things.

Glad to hear it :)

I'd like to see the existing infrastructure communication channels used for everything, including the IRC room. Siloing efforts off to other places is not something I like doing. The existing infrastructure team will be participating heavily and will continue to use these things.

Yes, you've expressed this concern before. And to the extent that it makes sense, we will. Certainly the IRC room is fine, though as I said, I'm gonna keep -devops around since I think it's a good channel to have, regardless. However, when other communication routes make sense, I'm gonna opt for those. Relevant to this end:

  • In the Git migration, we tried scoping projects using only the issue queue. It worked really poorly; a spreadsheet made it much easier to a) guarantee that multiple people were looking at the same listing data at the same time (which can take wrangling with the issue queue views), and b) just delete stuff that we decided was irrelevant.
  • infra is intimidating. The IRC channel, the issue queue, everything. I still feel that way, even as an official team member with my picture in a diagram. There are a lot of people who'd like to help here, and we could REALLY benefit from the assistance, so ensuring there are low-pressure environments in which they can participate is important to me. And let's be clear, infra is intimidating for a reason - we have real, critical uptime responsibilities to the community. At any given moment, the channel or the queue can become the place where we need to deal with a crisis. Mixing high-priority critical communications with provisional, brainstormy ones in an amorphous team environment is a recipe for discord.
  • We can use other areas - like this very comment thread, and other posts on g.d.o - to hash stuff out and figure out what the really good ideas are before they get pushed into those high-priority discussion spaces. We're no doubt gonna have a lot of ideas we generate and throw out here, and we want to be able to do that without damaging the existing infra signal/noise ratio.

Distributing Bluecheese as a regular theme won't happen since that would mean GPLing the CSS and other non-PHP stuff.

Yeah, that's the basic understanding I have; certainly distributing it as a regular theme from d.o won't happen. So what we do need to figure out is the conditions under which we can allow its use in local dev vms. Really, it's a discussion that deserves its own thread. Whatever we come up with, it'll need to end up at a strategy where people can have at least vaguely functioning local vms.

Integrating google travel reservation API to a travel site

Canyor's picture

I am interested in starting a travel reservation site.
I am looking for a developer who would integrate either google reservation API or yahoo or both.

Anybody out there?

Good to see Bluecheese is out

mgifford's picture

Know it's not perfect yet, but happy to see movement in this:
https://drupal.org/project/bluecheese

[Archive] Drupal Association improvements to Drupal.org
