Best practices for backing up necessary site files before updating D8 using the command line

Momseekingbalance's picture

Any advice for backing up the site using the command line before updating D8? I figured out how to back up the entire site to a .tar.gz file (after backing up the database first) but that seems...excessive? Unnecessary?

That being said, it was still faster than backing up files using FTP.

Thanks for any advice out there!

Comments

Because I have heard you are

cprofessionals's picture

Because I have heard you are using the Cloudways platform, know that their UI has a backup feature built in.

Here are some other ways:

Understand there are two sets of data that need to be backed up: Files and Database (export).

If you are using FTP, definitely tar or zip your files before FTP'ing them. This will save HOURS. Doing the same to the database export helps as well.
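A minimal sketch of the tar step, in case it helps. The function name archive_dir and the paths are made up for illustration; swap in your own.

```shell
# Pack a directory into one compressed archive before transferring it.
# Hypothetical helper. Usage: archive_dir SOURCE_DIR OUTPUT.tar.gz
archive_dir() {
    src=$1
    out=$2
    # -C switches to the parent directory first, so the archive stores
    # relative paths (drupal_site/...) instead of absolute ones.
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}
```

For example, archive_dir /path/to/drupal_site /path/to/backup_dir/site.tar.gz leaves you with a single file to download instead of thousands of small ones.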

"There is a module for that" - Backup and Migrate module: https://www.drupal.org/project/backup_migrate

Found this on the web:
Via CLI
This method uses the command line interface and is a bit more technical than the first method. However, it is more thorough in its application and less prone to errors. The steps are similar to the GUI method, and it is easy to perform, even for those who aren’t too familiar with CLI commands. Basically, only one command each is required for backing up the website’s folder and its database.

To back up your site’s folder, enter the following command without the quotes:
“cp -rp /path/to/drupal_site /path/to/backup_dir” where “drupal_site” is your site’s folder and “backup_dir” is the directory where you wish to copy it.

Similarly, to back up the database, use the following command, again without the quotes:
“mysqldump -u USERNAME -p'PASSWORD' DATABASENAME > /path/to/backup_dir/database-backup.sql”, where “USERNAME” and “PASSWORD” are your database credentials and “DATABASENAME” is your site’s database name. There must be no space between -p and the password.
Note: The mysqldump command is used to dump or copy the database to a specified location.

You can also use Drupal’s CLI, Drush, to make a backup of your database. To do that, the following command is used:
“drush sql-dump > /path/to/backup_dir/database-backup.sql”
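The two steps above could be wrapped into one small function that writes into a dated folder. This is only a sketch: the name backup_before_update and the folder layout are my own assumptions, and it assumes drush is on your PATH and the backup folder is outside the site folder.

```shell
# Hypothetical helper: dump the database and copy the site into a dated folder.
# Usage: backup_before_update /path/to/drupal_site /path/to/backup_dir
backup_before_update() {
    site=$1
    dest="$2/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # Database first, then the files. drush is run from inside the site
    # folder so that it picks up that site's settings.
    (cd "$site" && drush sql-dump) > "$dest/database-backup.sql" || return 1
    # -r copies recursively, -p preserves permissions and timestamps.
    cp -rp "$site" "$dest/" || return 1
    # Print where the backup ended up.
    echo "$dest"
}
```

Keeping each backup in its own timestamped folder means an older backup is never overwritten by a newer one.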

All or some files?

jhodgdon's picture

If you have all of the Drupal source code and modules in a Git repository, you should not need to back up the entire document root. You would want to back up the sites/default/files directory (or wherever you have the public/private files such as images that you have uploaded). And you would want to make sure that all of the Git files have been committed and pushed to a remote repository.

That would cut the number of files you need to back up down to what the Backup and Migrate module would back up. However, you would definitely need to make sure you have your Git repo updated, so that if you need to back out the update, you could do that.
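One way to check the "committed and pushed" part from the command line is sketched below. The function name git_is_clean is hypothetical, and it assumes your current branch tracks a remote branch. It compares against the remote state Git last saw, so it does not contact the server.

```shell
# Hypothetical helper. Usage: git_is_clean /path/to/repo
# Succeeds only if nothing is uncommitted and nothing is unpushed.
git_is_clean() {
    repo=$1
    # Any output from --porcelain means modified, staged, or untracked files.
    if [ -n "$(git -C "$repo" status --porcelain)" ]; then
        echo "uncommitted changes in $repo" >&2
        return 1
    fi
    # Count commits on the current branch that its upstream does not have.
    # Fails if the branch has no upstream configured.
    unpushed=$(git -C "$repo" rev-list --count '@{u}..HEAD') || return 1
    if [ "$unpushed" -ne 0 ]; then
        echo "$unpushed unpushed commit(s) in $repo" >&2
        return 1
    fi
}
```

If it succeeds, the code can be recovered from the remote, and only the uploaded files and the database still need a backup.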

And yes, the database!

Thank you both

Momseekingbalance's picture

I appreciate the direction from both of you. Very straightforward. Two thumbs up!

A few pointers

jchristi's picture

I've written and rewritten the backup and import script for Drupal database + files that my team uses in a production environment (a CLI script in Bash). As jhodgdon suggested, we have no need to back up the entire Drupal root directory, since our code is stored in Git. I also chose to forgo the backup_migrate module because I wanted a solution with as few dependencies as possible, one that does not require PHP, Drupal, or Drush (the script requires only the MySQL client package, which provides the mysql and mysqldump CLI commands). This is especially useful if you want to run the script in a containerized environment with a minimal container image. YMMV, but this made backups and imports much simpler from a system administrator's perspective (though not necessarily from a Drupal-centric developer's perspective).

A few other comments/suggestions not already mentioned:

  • First and foremost, it is worth mentioning that since database and managed files are two separate systems (file system and database), there is no way to absolutely guarantee your backup will be perfectly in sync (short of hosting both database and files on the same host and using atomic file system snapshots). I have seen corrupted backups that did not restore completely properly but that is very rare. In those cases, we use the closest backup, which is usually not a problem since we take backups every 2 hours.

  • Since I'm completely bypassing Drush and Drupal, I need to have the database credentials stored elsewhere. This is not a problem, since the mysql client reads the [client] ini config section in /etc/my.cnf or ~/.my.cnf, so I do not have to specify passwords on the command line (insecure).

  • Depending on your use case, you probably do not need to back up the contents of every database table. Cache tables and similar tables (batch, queue, session, watchdog, etc.) can be skipped, since most of those either regenerate or are environment-specific. This can greatly reduce file size as well as script run time.

  • If your drupal site generates image thumbnails and similar "generated" files that are saved into drupal managed files, you can optionally exclude these as well since they can be regenerated. This will save additional space on backup files. This can add up when you are saving and archiving backups on a regular basis.

  • At various times, we've stored our managed files in different backends, from the local file system to NFS to Amazon S3, which the script had to accommodate. At minimum, the script needed to accommodate specifying the paths to drupal public and private managed files. For importing files, using rsync instead of a naive copy will result in a much faster file importing experience (equivalent for S3 is the "aws s3 sync" command of the aws-cli)

  • Since the majority of drupal managed files are images, which already use their own compression algorithms (JPG, PNG, etc), running compression (gzip, zip, etc) on the files is mostly wasted CPU time for no size reduction. Therefore, I only compress the SQL file(s) for the database (huge file size reduction), and leave the drupal managed files as-is (your use case may differ though). I then archive (tar) those together to create my single backup archive file. My import script needs to understand this custom archive format in order to extract and import the database and files.

  • When you restore your drupal managed files, you will very likely have to change the file system user and group (chown -R myuser:mygroup /path/to/files) so that your web server user has the necessary read/write file permissions. If you run SELinux, you may also need to change the context (chcon) depending on where your files are located.

  • Performing backups and imports requires a lot of disk space, because you have to save and archive to (or extract and import from) a single large archive file. Ensure your environment has free disk space of at least two to three times the size of your database + managed files.

  • Lastly, to speed up the run time, my script forks several processes to back up or import managed files and database tables in parallel. This is not strictly necessary but certainly helpful.
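To make a few of those points concrete, here is a rough sketch of the backup side. It is not the script described above. The function names, the table list, and the assumption that generated images live in a directory called "styles" are all illustrative. It expects mysqldump to find its credentials in the [client] section of ~/.my.cnf, and absolute paths for the files directory and the SQL file.

```shell
# Illustrative list only; check which tables your own site actually has.
SKIP_DATA_TABLES="cache_default cache_render sessions watchdog"

# Hypothetical helper. Usage: dump_database DBNAME OUTPUT.sql.gz
# No password on the command line: mysqldump reads ~/.my.cnf by itself.
dump_database() {
    db=$1
    out=$2
    ignore=""
    for t in $SKIP_DATA_TABLES; do
        ignore="$ignore --ignore-table=$db.$t"
    done
    {
        # Structure for every table, so skipped tables still exist on import.
        mysqldump --no-data "$db"
        # Row data for everything except the skipped tables.
        # $ignore is unquoted on purpose so it splits into separate options.
        mysqldump --no-create-info $ignore "$db"
    } | gzip > "$out"
}

# Hypothetical helper. Usage: bundle_backup FILES_DIR SQL_GZ OUTPUT.tar
# Plain tar without -z: images are already compressed, only the SQL is gzipped.
bundle_backup() {
    files_dir=$1
    sql_gz=$2
    out=$3
    # --exclude leaves out regenerable image derivatives (assumed "styles").
    tar -cf "$out" \
        --exclude='styles' \
        -C "$(dirname "$sql_gz")" "$(basename "$sql_gz")" \
        -C "$(dirname "$files_dir")" "$(basename "$files_dir")"
}
```

Dumping the structure and the data in two passes is what lets the cache tables come back empty instead of missing. Note that the pipeline reports gzip's exit status, so a real script would want stricter error checking.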

If all the above is making you reconsider writing your own backup/import solution, then perhaps a pre-existing solution like backup_migrate might be more appealing to you. If still interested, I can try to find the time to make my script generic enough to post to github (with a huge zero-liability, zero-support disclaimer).
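In the meantime, the disk space and restore points from the list can be sketched roughly like this. Both function names are hypothetical, the default factor of 3 is taken from the "two to three times" rule of thumb, and the chown step normally needs root.

```shell
# Hypothetical helper. Usage: has_room_for PATH TARGET_DIR [FACTOR]
# Succeeds if TARGET_DIR's file system has FACTOR times the size of PATH free.
has_room_for() {
    src=$1
    target=$2
    factor=${3:-3}
    # du -sk reports the total size of PATH in kilobytes.
    need_kb=$(( $(du -sk "$src" | awk '{print $1}') * factor ))
    # df -P prints one portable line per file system; column 4 is free KB.
    free_kb=$(df -Pk "$target" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge "$need_kb" ]
}

# Hypothetical helper.
# Usage: restore_files BACKUP_FILES_DIR LIVE_FILES_DIR WEB_USER WEB_GROUP
restore_files() {
    # Trailing slashes make rsync copy the directory contents rather than
    # the directory itself; files that are already up to date are skipped.
    rsync -a "$1/" "$2/" || return 1
    # Give the web server user ownership of the restored files.
    chown -R "$3:$4" "$2"
}
```

Running has_room_for before a backup or import is a cheap way to fail early instead of filling the disk halfway through.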

Epic = homework but I love homework!

Momseekingbalance's picture

That was an epic answer!! My goal by the next meeting will be to understand the other half of the message that I confess is still over my head. But thank you!!!

May not be applicable

jchristi's picture

My use case is for a large company's corporate website, in which we take regular backups on a two hour interval. If your use case is a one-time backup before migrating from D7 to D8, then you can probably ignore most of what I wrote (hopefully others may find it useful).

If there is one piece of advice I could give you then, it would be to TEST that your backup can be imported and works before you proceed with your big migration. This is to avoid the situation of a failed D8 migration requiring the D7 database to be restored but you find out it is not working or you do not know how to make it work! Testing this requires that you have a separate hosting environment so as not to disrupt your production site.
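Short of a full test import, a quick sanity check on the archive itself catches the most obvious failures. The sketch below assumes a .tar.gz backup like the one in the original question, and the function name is made up. It does not replace actually restoring the backup somewhere.

```shell
# Hypothetical helper. Usage: check_backup_archive BACKUP.tar.gz
check_backup_archive() {
    archive=$1
    # The file must exist and be non-empty.
    [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
    # gzip -t checks the compressed data without extracting anything.
    gzip -t "$archive" || return 1
    # Listing the contents reads through every entry in the tar archive.
    tar -tzf "$archive" > /dev/null
}
```

A truncated upload or an interrupted tar run will fail this check, which is much better to find out before the update than after.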

Spokane, WA
