What's your change-management process for migrating very large Drupal installations?

amatusko's picture

Hello everyone,

I have been "messing" with Drupal for a while now, and have just taken my first job as a 100% committed Drupal developer and tech lead for a company that builds massive websites for large federal initiatives involving dense structures, complex role-management schemes, and mountains of content (often 10,000+ nodes).

So, one of my first tasks as we get started is to work with the site-building team and sysadmins to create a solid process for performing migrations from dev >>> staging, sometimes staging >>> dev, and, of course, the eventual staging >>> production.

I have done a bit of cursory reading on the topic, and it appears most of the popular solutions are a bit hack-ish (so to speak).

Can anyone tell me what sort of procedure they have for managing something like this? We use git, and I am in the process of getting the site-building team comfortable with the command line, drush (with aliases, etc.) + git, but there is that ever-nagging problem of configuration vs. content. It's a tough problem - and when you have 10,000 nodes or more, the hairiness seems to become nearly unmanageable.

Can anyone tell me about their process and whether they have had any luck with executing the process?

Some of the tools I am becoming aware of here include:

Aegir - is this effective? How can it help me compared to other solutions?

Hudson (now continued as Jenkins) - I'm not versed in this yet, but I have heard it mentioned as a solution for change management

Salt - I recently heard an interview on the FLOSS Weekly podcast about this tool, and it looks pretty cool. I'm still having a bit of trouble seeing how it would really help me compared with a solution using separate drush aliases for @dev, @stg, @prod, etc. + git. Am I missing something here?

Oh, yes, and Features. It has its uses, but I see it more as a hack than a rock-solid solution. Am I just too ignorant of its wonderful deployment uses?

Also, my company has invested in Acquia's enterprise-level suite, which I have yet to even start using. Are there options here as well?

Please feel free to offer suggestions if I am missing any useful technologies here. This is obviously an ongoing issue, but I know the real technicians out there still need a way to solve problems like this (even if the solutions are partly manual).

So, what is your process - if any - for accomplishing this? Where does it work well, and where does it fall short (if anywhere)? If the process needs to be run iteratively via some sort of automation, is there perhaps a way to use an instance of Open Atrium to manage the workflow between the platforms? I plan to set up OA to house our bug tracker, git links, project documents, etc., and I'm wondering if there's a way to wrap this process up into a series of steps that follow a specific pattern.

Thanks!

Drew

PS this is one of my first questions to the community, so I apologize if I missed any of the posting guidelines.

Comments

2 separate things

sanguis's picture

I don't have much to tell you about the tools in question, but I can tell you that your questions fall into two separate categories.
The first is configuration and the development environments; the second is content staging and the live environment.

For development, I am a strong git user. And Features is way more than a hack. If you implement a post-deployment hook (via git or a Jenkins script) to revert enabled features and update the database, then you never have to make manual tweaks to a live or staging site's database configuration.
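
A minimal sketch of what such a post-deployment step might look like (not necessarily the exact setup described above; the @staging alias is hypothetical, and drush plus the Features module are assumed to be available on the target):

    #!/bin/sh
    # Hypothetical post-deployment hook, e.g. triggered from a git hook or a
    # Jenkins build step after new code lands on the target site.
    set -e

    drush @staging updb -y    # run any pending database update hooks
    drush @staging fra -y     # revert all Features so exported config in code wins
    drush @staging cc all     # clear all caches so the changes take effect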

For the content question, there are a number of models that can be taken into consideration:

  • One site, with strict publisher/copywriter roles.
  • Two separate sites running the same configuration, with a Feeds/Services setup that lets editors create content on the staging server and then push it to the live site when it is flagged.

All the methods are well documented and have both pros and cons. I would just encourage you to look at them separately.

I use daily a combination

Alexander Allen's picture

I use daily a combination of:

  • Jenkins to manage deployments
  • Features for Software Configuration Management, e.g., managing exportables such as Contexts, Views, and Content Types, amongst others.

Jenkins scripts are flexible: you can run any commands you want in them for your deployment (such as reverting all Features). So far I have been able to use Features with both Jenkins and the Acquia hosting stack without problems. Overall it's the solution I recommend to everyone, but others may have different workflows to contribute.
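
As a rough illustration only (an assumed setup, not necessarily the one described above), a Jenkins "Execute shell" build step for this kind of workflow could look something like the following. The @prod alias is made up, and the code-deployment line is a placeholder for whatever your hosting actually uses (rsync, a git pull on the server, Acquia's own deployment mechanism, etc.):

    #!/bin/bash
    # Hypothetical Jenkins "Execute shell" build step for a Features-based deployment.
    set -e

    # 1. Ship the code to the target (placeholder -- replace with your host's mechanism).
    rsync -az --delete ./docroot/ deploy@example.com:/var/www/docroot/

    # 2. Run database updates and revert all Features so exported config wins.
    drush @prod updb -y
    drush @prod fra -y

    # 3. Clear all caches.
    drush @prod cc all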

Jenkins/Hudson is just a

cyberswat's picture

Jenkins/Hudson is just a dumb executor that runs the deployment/rollback steps in your script(s), so the challenge is more along the lines of what those deployment and rollback processes are actually doing.

I can't speak about Context/Views/CCK, etc., because we don't use those for performance reasons, but I can speak about general process. I think that when it comes down to it, your change management is ultimately successful or not based on the process. Here are a few of the tenets we try to adhere to:

  • If you're rolling out changes, manage the changes in smaller pieces. It's simple math to me ... the more you're trying to do at once to a production site, the more likely you are to have problems. If you can break your release cycle from something that happens every month or two into something much more rapid, the size of each release tends to decrease, making it more manageable.

  • Use $conf religiously. If your code depends on a set of configuration values, store them as variables. This practice should be enforced during code reviews to keep the release small when you hit release time. It gives you the ability to change key components quickly without a code change. Example: you have a function that performs actions based on certain referrers, but you don't want it to operate on 3 specific referrers. Put the referrers in an array that can be overridden using $conf in your settings files (see the settings.php sketch after this list).

  • Love Drush and remove developer access to your staging/testing environments. Basically, everything that needs to be changed on a site should be encapsulated in a drush command. Whether you are adding indexes, adding content, adding exportables, or anything else, you need to weigh both a deploy and a rollback strategy. This one also comes down to process planning, because you have to contrast the dev time needed to completely unroll the changes created by a drush command against deploying and then moving forward with possible bug fixes ... does the business want to invest the time in managing every piece of data? Is there content that needs to be moved over to the live site? Then export that content so that Drush can either create it or import it (a skeleton of such a command is sketched after this list).

  • Use automation ... to me, this is beneficial for process enforcement. You will know if a dev builds something that cannot be easily migrated to your production site, because you've blocked their access to the qa/staging environment: your automated scripts will either handle 100% of the new feature rollout or they will break. Automated testing should be used to exercise both the deployment of your feature and the rollback, if your business decides the rollback is worth the expense.
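
To illustrate the $conf point above, here is a rough sketch of what such an override might look like in a settings file. The variable names and values are invented for the example, and the module reading them is hypothetical:

    <?php
    // sites/default/settings.php, or a per-environment include pulled in from it.
    // A module reads its excluded referrers with
    // variable_get('mymodule_excluded_referrers', array()), so each environment
    // can override the list here without a code change or a database edit.
    $conf['mymodule_excluded_referrers'] = array(
      'partner.example.com',
      'internal.example.org',
      'legacy.example.net',
    );

    // Other environment-specific overrides can live alongside it, e.g.:
    $conf['cache'] = 1;           // enable page caching on this environment
    $conf['preprocess_css'] = 1;  // aggregate CSS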
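
And as a sketch of encapsulating a change in a paired deploy/rollback drush command: all names here are made up, and the bodies are left as comments because the actual work depends on the change being shipped.

    <?php
    // mymodule.drush.inc -- hypothetical skeleton of a release-specific command pair.

    /**
     * Implements hook_drush_command().
     */
    function mymodule_drush_command() {
      $items['mymodule-deploy-banners'] = array(
        'description' => 'Create the banner content required by release 1.4.',
      );
      $items['mymodule-rollback-banners'] = array(
        'description' => 'Remove the banner content created by mymodule-deploy-banners.',
      );
      return $items;
    }

    function drush_mymodule_deploy_banners() {
      // Import or create the content/config this release needs, e.g. from an
      // exported file shipped with the code.
      drush_log(dt('Banners deployed.'), 'ok');
    }

    function drush_mymodule_rollback_banners() {
      // Undo exactly what the deploy command did.
      drush_log(dt('Banners rolled back.'), 'ok');
    }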

There are a bazillion tools out there, each one likely more shiny and sparkly than the last ... but what it comes down to is that these tools only ever work well when you know your process inside and out.

New to Drupal - searching for the most viable CMS

KillerKlown's picture

Yo, 'sup. I am currently building a proposal for a large-scale CMS site that couples with a neuro-learning concept I am toying with for dimensional BI framework applicability. I am a complete novice with the Drupal system, but I am starting to see that this concept of core tables building core tables via dynamically generated control is pretty cool. Traditionally I worked on a similar concept, though nothing near this complex, via XML and C++, where our DB connection and ETL layers were dynamically built from a fact table set up by the admin console. That's enough about my excitement.
I am building a CMS site and I need to build an OLAP DB for matrix performance tools. What is the best method for extracting client input data into a backend DB for processing?
Can you give me some examples, please?

I really would like to get

atin81's picture

I really would like to get some information about this too.
