Infrastructure Requirements
Damien is setting up three instances we can use for this project: redesign.drupal.org, staging-1.drupal.org and staging-2.drupal.org. If you want access to these instances, post your public SSH key on this thread.
Here are details that we'll need to discuss:
- What are our development needs?
- What do we do about managing the database?
- How will we handle configuration changes?
- How can we make the database available for developers using local environments?

Community instance requirements
After some discussion, this is my current understanding of what is needed for the infrastructure:
It looks like what Damien is setting up (redesign, staging-1 and staging-2) will give us a good start, and we can evolve it as needs require.
Developer testing
I think starting off with two staging sites is fine for now. We can add more if the team grows and people start running into conflicts.
Managing database configuration settings
If we can, it would be great if we (as a development team) could work towards eliminating the need to make changes to the database manually, and instead use update hooks for everything we need. Automatic updates and passing around the database would both be tremendously easier.
In using this process, we would never make configuration changes through the web GUI. Is this development process possible?
(feel free to port this over to a "development practices" thread as needed)
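To make the idea concrete, here is a minimal sketch of the "every change is a numbered update" workflow. It is written in Python against an in-memory SQLite database purely for illustration; on d.o the real mechanism would be Drupal's update hooks running against MySQL. The update functions, the `schema_version` table and the `site_theme` setting are all made-up names for this example.

```python
import sqlite3

# Hypothetical incremental updates. Each one is a small, numbered change,
# loosely analogous to a Drupal update hook. Names and contents are invented.
def update_1(db):
    db.execute("CREATE TABLE variable (name TEXT PRIMARY KEY, value TEXT)")

def update_2(db):
    db.execute(
        "INSERT INTO variable (name, value) VALUES ('site_theme', 'redesign')"
    )

UPDATES = {1: update_1, 2: update_2}

def current_version(db):
    """Return the highest update number already applied (0 if none)."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = db.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0

def run_pending_updates(db):
    """Apply every update newer than the recorded version, in order."""
    applied = []
    start = current_version(db)
    for number in sorted(UPDATES):
        if number > start:
            UPDATES[number](db)
            db.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (number,)
            )
            applied.append(number)
    db.commit()
    return applied

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(run_pending_updates(db))  # first run applies both updates
    print(run_pending_updates(db))  # second run has nothing left to do
```

The point of the design is that a fresh copy of the database plus the full list of updates always produces the same result, so nobody has to click through the web GUI to reproduce a configuration change.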
Redesign project is an "update"
I wholly agree that this should be approached as an update to d.o. In addition, we would be effectively constantly testing the update process during development, significantly reducing errors that will crop up on the final update to d.o.
Distributing a sanitized Drupal.org database
In order to enable local development, a sanitized version of the Drupal.org database will need to be available for distribution. I normally recommend putting it into version control, but at 600MB zipped / 1.7GB unzipped it is obviously way too large for subversion :) If we can set up an rsync method for contributors, it could speed up the download process. We could also possibly set up a torrent.
If we are successful in eliminating the need to manually make configuration changes by putting everything into update hooks, then all we'll need for a database is an unmodified-yet-sanitized version of the Drupal.org db. This copy of the db would be created:
Ideally we'd have two copies, just in case a configuration change causes a big problem that the redesign project has to compensate for. With rsync, the download size for each updated version of the database would be significantly reduced, since only the changed parts need to be transferred. With a torrent, the contributor would have to re-download the whole database whenever they need the latest update.
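As a rough sketch of what the rsync option could look like from a contributor's side, the small Python helper below just assembles the command line. The remote location and local path are placeholders (no such rsync endpoint exists yet), and only standard rsync options are used.

```python
import shlex

def rsync_command(remote, dest):
    """Build an rsync invocation that resumes partial downloads and only
    transfers the parts of the file that changed since the last run."""
    return [
        "rsync",
        "--archive",   # preserve timestamps and permissions
        "--partial",   # keep partially transferred files so a retry can resume
        "--compress",  # compress data in transit
        "--progress",  # show transfer progress
        remote,
        dest,
    ]

if __name__ == "__main__":
    # Placeholder remote and local paths, for illustration only.
    cmd = rsync_command(
        "rsync.example.org::drupal-db/drupal_sanitized.sql",
        "./drupal_sanitized.sql",
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

One thing to keep in mind: rsync's delta transfer helps most when the file on the server is an uncompressed dump, because compressing the dump tends to change most of the file even after a small database change.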
Thinking longer term, having this distributable-version of the database available to the community could be useful for other needs outside of the redesign project.
DB staging requirements
How complicated or intensive are the update scripts in both a best-case and a worst-case scenario? Is it possible for a contributor to run these updates themselves?
In the scenario I was envisioning, a user would download the latest sanitized version of the d.o database as needed and run the updates themselves. If they do not need to have the latest version of the database, then they can simply run the latest updates. In development, we'd use a normal (if not more rapid) release process where each config change is incremental. This would reduce the download and restore process for the database to an as-needed basis, while updates would handle the rest. Is that possible with the type of development we're doing?
Sanitization requirements
What are the requirements for sanitization? At the least, we'd want to reduce the file size of the database. When I dump a database, I will keep the structure of the following tables, but truncate their data:
As you may have noticed, I do eliminate the search index. If it is needed, perhaps we could distribute the search index separately for just those who need it.
What other sanitization needs to take place?
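For anyone who wants to try the "keep the structure, drop the data" approach, here is one possible way to do it with two mysqldump passes, sketched as a Python helper that builds the commands. The table names in the example are placeholders and not the actual list of tables to truncate; the database name is also an assumption.

```python
def dump_commands(database, structure_only_tables):
    """Return two mysqldump commands:
    1. the structure of every table, with no row data;
    2. the row data of every table except the ones to be left empty.
    Restoring both dumps in order gives a full schema in which the
    listed tables exist but contain no rows."""
    schema_cmd = ["mysqldump", "--no-data", database]
    data_cmd = ["mysqldump", "--no-create-info"]
    for table in structure_only_tables:
        # --ignore-table expects the form database.table
        data_cmd.append(f"--ignore-table={database}.{table}")
    data_cmd.append(database)
    return schema_cmd, data_cmd

if __name__ == "__main__":
    # Placeholder names for illustration only.
    schema_cmd, data_cmd = dump_commands("drupal", ["search_index", "sessions"])
    print(" ".join(schema_cmd))
    print(" ".join(data_cmd))
```

This only shrinks the dump. It does not scrub personal data such as e-mail addresses or password hashes, which would need a separate step.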
Disk space
Disk space is apparently at an extreme premium. What can we do to reduce this need?
What are the infrastructure/system requirements for these instances? Is it possible to use another server for this development push? If so what is required?
For example, I have a shared host with lots of bandwidth and disk space, but low load requirements. If something like that can't work, can an EC2 instance work? I'm willing to pay for an EC2 instance for three months.
Disk Space and DB Requirements
Gerhard and I are working on getting this together. We will host 10-15 drupal.org databases on a special MySQL instance on db1.drupal.org. The databases will have some easy sanitization done to bring them down to a reasonable size, but nothing that would require massive SQL statements to preserve referential integrity. The MySQL instance will run 5.1.34 with the InnoDB Plugin, the Barracuda file format and compression to lower the disk space requirements. The instance will run at a higher nice value (that is, lower priority) than the production instances.
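For reference, this is roughly what enabling InnoDB compression per table looks like, sketched as a small Python helper that generates the SQL. It assumes the server is already running with innodb_file_per_table=1 and innodb_file_format=Barracuda, which compressed tables require. The table names and the 8 KB block size are illustrative and not the settings actually chosen for db1.

```python
VALID_KEY_BLOCK_SIZES = (1, 2, 4, 8, 16)  # sizes in KB accepted by InnoDB

def compression_statements(tables, key_block_size=8):
    """Generate ALTER TABLE statements that convert tables to the
    compressed InnoDB row format."""
    if key_block_size not in VALID_KEY_BLOCK_SIZES:
        raise ValueError(f"unsupported KEY_BLOCK_SIZE: {key_block_size}")
    return [
        f"ALTER TABLE `{table}` ENGINE=InnoDB "
        f"ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE={key_block_size};"
        for table in tables
    ]

if __name__ == "__main__":
    # Placeholder table names for illustration only.
    for statement in compression_statements(["node", "users"]):
        print(statement)
```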