Drupal 6.x. site Migration from MySQL to PostgreSQL (howto)

Alex Tutubalin's picture

I'm not a MySQL fun, so some troubles with MySQL database (MyISAM data corruption and too slow InnoDB) force migration of three existing Drupal 6 sites from MySQL to PostgreSQL. Since Drupal 6.2 some PgSQL-related bugs (sample) in core modules has been fixed, so I have a try.

Drupal 6.5 and PostgreSQL 8.3 compatibility

Drupal Core Modules

I've installed clean Drupal 6.5 on top of PostgreSQL 8.3.3 (with UTF8 database encoding) and played with it a lot. There is no heavy troubles, only minor ones:

  1. My database instance has non-UTF charset set as default client_encoding (in postgresql.conf). It is not possible to change this setting due to large amount of legacy code. It is much simpler to explicitly set connection encoding after connect to database.
    Patch: drupal.pgsql.diff.gz (see also my bug report).
  2. Blob data (in cache_* tables) is not escaped correctly. PostgreSQL requires 'E' prefix before backslash-escaped values. So, database log is filled with warnings (2 lines on each cached element).
    Patch: drupal65-pgsql8x-patch2.diff.gz and and bug report (with same patch attached).
  3. MySQL truncates too long string values (longer than specified in DDL), that violates SQL standards (but more 'user friendly'). PgSQL refuses such records and raises an error. This problem occurs in many places, such as too long module name passed to watchdog() function, too long translated string in Locale module and so on. There is no easy way to fix it, you need to examine database logs and fix Drupal code step by step. I've fixed just one module name ('comment_subscribe' becomes 'comment_subscrib').

Contributed Drupal Modules

I use only ten 3rd-party modules. Most of them works under PgSQL without problems:

  1. Admin Block
  2. GeSHi Filter и GeSHi node
  3. Image, Image assist, Image Attach, ImageMagick Advanced Options
  4. Advanced Help
  5. Site Menu
  6. Taxonomy Menu
  7. CAPTCHA pack, Random CAPTCHA
  8. Tagadelic
  9. Views

Comment Subscribe Module changes

The Comment Subscribe is working with PgSQL only after heavy rewrite. This module uses many non-standard MySQL features such as if/ifnul (instead of case/coalesce), concat (instead of || operator) and multi-table update clause.
Patch: comment_subscribe-pgsql.diff and bug report.

MySQL to PgSQL Drupal Migration

There is simple: you need to convert database data structures, copy data, than change data source in drupal configuration. Sounds easy.

In real life it is not so easy:

  • Backup and Migrate module does not compatible with PgSQL. Amount of required changes is not small.
  • There is several mysql2pgsql conversion scripts does not do the job (I've tried several ones).

So, I opt to alternate way: create new empty site with same data structures, than move data by special migration script.

In Drupal any module may have own data tables in database. So real migration procedure is:

  1. Backup your existing site! Both code and database !!!
  2. Install Drupal from source, add patches (see above), add modules and module patches (if any). If you already have running Drupal site, you need only PgSQL-compatibility patches (see above).

    Also, your existing site should have separate directory under sites/, not just sites/default.
  3. Create any non-used DNS name (i.e. www-new.mysite.com), create new PostgreSQL database, than install Drupal for this hostname on fresh database. You can supply any site data on installation phase, this data will be overwritten on data copy phase.
  4. Enable all modules, enabled on migrating site. Module enabling will create all tables/sequences in your PostgreSQL database.
  5. Run Magic Migration Script (see below). Your new site now have all data copied (with exception of cache_* tables and locale).
  6. If you use Locale module, export your Locale data on source site and import on new site.
  7. Change data source ($db_url in settings.php) in your old site settings.php to new database.
  8. That's all, folks!

Note: You should have shell-access to your server.

Magic drupal-mysql2pgsql.pl Script

Download it here: drupal-mysql2pgsql.pl.gz (do not forget to gunzip it).

The script requires Perl 5.x (I use 5.8, but Perl 5.6 should work) with DBI, DBD:Pg and DBD::mysql libraries.

Usage: drupal-mysql2pgsql.pl source-db dest-db, source-db - source MySQL database,
dest-db - destination PostgreSQL.

You either should passwordless access to both databases or should change the script to add your username/password. This is one-time use script, so I have not added many command-line parsing code.

How it works:

  • Both databases are queried for list of tables. If any table exists in source DB, but not in destination one, the script will warn you.
  • All data in destination tables is erased.
  • Most data from source tables is copied into destination. Locale data and cache not copied.
  • All sequencer initialized to max(column);

All destination data changes is done within transaction.

The script is provided AS IS. It works for me, you can change it as you wish.

Comments

The best thing i've ever found

nileshgr's picture

Thanks a lot for this man !!! The script rocks. Finally my long awaited desire got completed. I converted my site to PGSQL. With PGSQL it has become much more fast than with MySQL.

Though I don't know perl, I managed to edit the thing to my requirements coz I know PHP well + Google ;)

Drupal core patches necessary?

pauleb's picture

This howto refers to Drupal 6.5. There were the core patches necessary. But is this with Drupal 6.15 still true?
I'd prefer not to patch the core files because it could cause problems next time a Drupal update is on the schedule.

@nilesh.3892: Did you apply the patches?

@ Alex Tutubalin: Thanks for this howto!
Did your patch make it into the core?

No i didn't do so. I'm using

nileshgr's picture

No i didn't do so. I'm using 6.15 and not 6.5. I don't have any problems and performance is much better than MySQL on PG 8.4 and PHP 5.3 (mod_fastcgi)

How to make use of the script?

marko3's picture

Hi and thank you for the script, which I badly need to make use of!
Excuse me for my ignorance but as someone who knows no Perl and new to Postgresql there are a number of ambiguities which prohibited me to deploy the script. I hope you could clarify:

  • Is the patch still needed for Drupal 6.22?
  • How to make "passwordless accounts"?
  • If I want to use credentials in the script, how it can be done. Say I made a fresh install of Drupal with a Postgresql database 'PGDB' which can be used by 'PGUSER' with password 'PGPASS' and I have my original site in mysql 'MYDB' whose database credentials are 'MYUSER' and 'MYPASS' . So how can I implement these credentials in the database connect lines:

my $srcdb = DBI->connect("dbi:mysql:database=$ARGV[0]") or die;
my $destdb = DBI->connect("dbi:Pg:dbname=$ARGV[1]" ) or die;

And what will be my perl conversion command?

I really appreciate your further explanations.

Very limited script

jaakl's picture

This script is quite naive, assumes 1:1 matching data types. It did not work for me as anything nontrivial like binary data (e.g. cache table) and geodata does not work this way.

ps. if you need schemas with password, just modify first lines to something like this:
my $srcdb = DBI->connect("dbi:mysql:database=$ARGV[0]","root","root") or die;