Migrating a MySQL Drupal to PostgreSQL

tunic

After several hours of uncertainty it seems that I have managed to migrate a MySQL-based Drupal site to PostgreSQL, so I want to share my experience with this group, where I think it could be useful.

There are several problems to solve: table structure translation, autoincrement values / sequences, serialized data, etc.

I tried several approaches, but I always ran into some problem.

Database schema

MySQL and PostgreSQL use different data types for their columns and define their tables in their own way, so you can't reuse the MySQL definitions.

One approach is to install a clean Drupal with the codebase of the MySQL site and enable all the modules in use. But that won't create the tables autogenerated by the Field API, for example. Sure, you could put all those fields in a Feature and use it to create the tables, but that is not very handy.

I used a Ruby script to do the job, mysql2postgres. This script claims to migrate structure and data, but at least when using PostgreSQL 9.x you'll have problems with PHP-serialized data.

So, use this script only to replicate the table structure of your MySQL site. It seems to set up the sequences correctly too.
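For reference, mysql2postgres is driven by a YAML config file. The sketch below is an assumption based on the gem's sample configuration (key names, including the gem's own spelling of "supress", should be verified against the version you install); setting `supress_data: true` copies only the structure and sequences, as suggested above:

```yaml
# Hypothetical mysql2postgres config sketch -- check the gem's bundled
# sample config for the exact keys supported by your version.
mysql:
  hostname: localhost
  port: 3306
  username: mysql_user
  password: mysql_password
  database: drupal_mysql

destination:
  postgres:
    hostname: localhost
    port: 5432
    username: pgsql_user
    password: pgsql_password
    database: drupal_pgsql

# Only replicate structure and sequences; data is migrated separately below.
supress_data: true
supress_ddl: false
```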

Data migration

This is the most difficult part. As I mentioned above, there are problems with PostgreSQL 9.x and serialized data, at least when you try to copy it from MySQL. There's an issue for this: http://drupal.org/node/926636
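To illustrate why serialized fields are so fragile (a standalone PHP sketch, not part of the migration script): serialize() embeds exact byte lengths into its output, so any copy step that alters the bytes in transit makes unserialize() fail.

```php
<?php
// PHP's serialize() records exact byte lengths, so serialized blobs must be
// copied byte-for-byte between databases.
$data = serialize(array('name' => 'café'));

// "café" is 5 bytes in UTF-8, and that length is baked into the string:
// a:1:{s:4:"name";s:5:"café";}
print $data . "\n";

// Simulate a copy step that alters the bytes (here, a charset-style change
// that shrinks the 2-byte "é" to a 1-byte "e"):
$corrupted = str_replace('é', 'e', $data);

// The declared length (5) no longer matches the data, so unserialize()
// fails and returns FALSE.
var_dump(@unserialize($corrupted)); // bool(false)
```

This is why the script below copies the raw serialized strings untouched instead of round-tripping them through unserialize()/serialize().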

So, what to do? I decided to trust Drupal and use the drupal_write_record() function to insert every record from my MySQL database into PostgreSQL, one by one. Using this function I benefit from Drupal's accumulated wisdom about inserting data into PostgreSQL (if you read the code you will see some tweaks applied when inserting into a PostgreSQL DB).

This is the script that does the trick:

<?php

print "Migrating from MySQL to PostgreSQL!\n";

// Add the PostgreSQL database connection.
$pgsql_database = array(
  'database' => 'YOUR_PGSQL_DATABASE_NAME',
  'username' => 'YOUR_PGSQL_NAME',
  'password' => '',
  'host' => 'localhost',
  'driver' => 'pgsql',
);
Database::addConnectionInfo('PGMigration', 'default', $pgsql_database);

// Setup ready, let's loop over existing tables.
$tables = db_query('SHOW TABLES');
foreach ($tables as $table) {
  // NOTE: the property name is "tables_in_<your MySQL database name>";
  // adjust "icomem" to match your own database.
  migrate_table($table->tables_in_icomem);
}

/**
 * Selects all records from the MySQL table and inserts them into PostgreSQL.
 *
 * Uses a modified version of drupal_write_record() to insert.
 */
function migrate_table($table) {
  $exclude = array('search_index', 'search_dataset', 'search_total');
  if (in_array($table, $exclude)) {
    return;
  }

  // Get the table dataset.
  $mysql_result = db_query("SELECT * FROM $table");

  // Activate the PostgreSQL connection.
  db_set_active('PGMigration');

  print "Processing $table\n";

  // Truncate the PostgreSQL table.
  db_query("TRUNCATE TABLE $table");

  // Walk the MySQL dataset.
  try {
    foreach ($mysql_result as $mysql_row) {
      drupal_write_record_no_autoserialize($table, $mysql_row);
    }
  }
  catch (Exception $e) {
    print "There's no schema info for $table, skipping!\n";
  }
  db_set_active();
}

/**
 * Modified version of drupal_write_record().
 *
 * It doesn't serialize fields declared as serialized (because we are copying
 * from one DB to another we never unserialized the data) and switches the
 * active connection to read the schema from MySQL but write into PostgreSQL.
 */
function drupal_write_record_no_autoserialize($table, &$record, $primary_keys = array()) {
  db_set_active();

  // Standardize $primary_keys to an array.
  if (is_string($primary_keys)) {
    $primary_keys = array($primary_keys);
  }

  $schema = drupal_get_schema($table);
  if (empty($schema)) {
    throw new Exception('No schema!');
  }

  $object = (object) $record;
  $fields = array();

  // Go through the schema to determine fields to write.
  foreach ($schema['fields'] as $field => $info) {
    if ($info['type'] == 'serial') {
      // Skip serial types if we are updating.
      if (!empty($primary_keys)) {
        continue;
      }
      // Track serial field so we can helpfully populate them after the query.
      // NOTE: Each table should come with one serial field only.
      $serial = $field;
    }

    // Skip field if it is in $primary_keys as it is unnecessary to update a
    // field to the value it is already set to.
    if (in_array($field, $primary_keys)) {
      continue;
    }

    if (!property_exists($object, $field)) {
      // Skip fields that are not provided, default values are already known
      // by the database.
      continue;
    }

    // Build array of fields to update or insert.
    $fields[$field] = $object->$field;

    // Type cast to proper datatype, except when the value is NULL and the
    // column allows this.
    //
    // MySQL PDO silently casts e.g. FALSE and '' to 0 when inserting the value
    // into an integer column, but PostgreSQL PDO does not. Also type cast NULL
    // when the column does not allow this.
    if (isset($object->$field) || !empty($info['not null'])) {
      if ($info['type'] == 'int' || $info['type'] == 'serial') {
        $fields[$field] = (int) $fields[$field];
      }
      elseif ($info['type'] == 'float') {
        $fields[$field] = (float) $fields[$field];
      }
      else {
        $fields[$field] = (string) $fields[$field];
      }
    }
  }

  if (empty($fields)) {
    return;
  }

  // Build the SQL.
  if (empty($primary_keys)) {
    // We are doing an insert.
    $options = array('return' => Database::RETURN_INSERT_ID);
    if (isset($serial) && isset($fields[$serial])) {
      // If the serial column has been explicitly set with an ID, then we don't
      // require the database to return the last insert id.
      if ($fields[$serial]) {
        $options['return'] = Database::RETURN_AFFECTED;
      }
      // If a serial column does exist with no value (i.e. 0) then remove it as
      // the database will insert the correct value for us.
      else {
        unset($fields[$serial]);
      }
    }
    db_set_active('PGMigration');
    $query = db_insert($table, $options)->fields($fields);
    db_set_active();
    $return = SAVED_NEW;
  }
  else {
    $query = db_update($table)->fields($fields);
    foreach ($primary_keys as $key) {
      $query->condition($key, $object->$key);
    }
    $return = SAVED_UPDATED;
  }

  // Execute the SQL.
  if ($query_return = $query->execute()) {
    if (isset($serial)) {
      // If the database was not told to return the last insert id, it will be
      // because we already know it.
      if (isset($options) && $options['return'] != Database::RETURN_INSERT_ID) {
        $object->$serial = $fields[$serial];
      }
      else {
        $object->$serial = $query_return;
      }
    }
  }
  // If we have a single-field primary key but got no insert ID, the
  // query failed. Note that we explicitly check for FALSE, because
  // a valid update query which doesn't change any values will return
  // zero (0) affected rows.
  elseif ($query_return === FALSE && count($primary_keys) == 1) {
    $return = FALSE;
  }

  // If we are inserting, populate empty fields with default values.
  if (empty($primary_keys)) {
    foreach ($schema['fields'] as $field => $info) {
      if (isset($info['default']) && !property_exists($object, $field)) {
        $object->$field = $info['default'];
      }
    }
  }

  // If we began with an array, convert back.
  if (is_array($record)) {
    $record = (array) $object;
  }

  return $return;
}

I think it's pretty straightforward. The script is executed on the MySQL site using drush php-script. It first sets up a connection to PostgreSQL, and then iterates over all available tables in the MySQL DB. For each table it selects all records and fields and inserts them into PostgreSQL.

For inserting the data I use a modified drupal_write_record() function. This is because drupal_write_record() serializes fields declared as serialized, but I don't unserialize those fields when I fetch them from MySQL (it's simpler to just pass the data through than to decide which data must be unserialized). I also need to change the active connection in the middle of drupal_write_record(): it must use the MySQL connection when retrieving the table schema and switch back to PostgreSQL when inserting data.

Tables without schema information are ignored; they usually correspond to uninstalled modules.

Some tables, such as the search ones, are excluded explicitly because they are large and I prefer to regenerate them later on the PostgreSQL site.

Problems

The only problem I have left is Draggable Views. I have several views using Draggable Views to allow admins to reorder their items. When you add the draggable view weight field to the sort criteria you can check the option 'New items appear at the bottom of the list'. If you do so, Views will produce an SQL statement like this:

SELECT node.title AS node_title, draggableviews_structure.weight AS draggableviews_structure_weight
FROM
{node} node
INNER JOIN ...
[...]
ORDER BY COALESCE(draggableviews_structure_weight, 10000) ASC

PostgreSQL says: 'SQLSTATE[42703]: Undefined column: 7 ERROR: column "draggableviews_structure_weight" does not exist LINE 7: ORDER BY COALESCE(draggableviews_structure_weight, 10000) AS... '.

If you uncheck this option the SQL statement turns into:

SELECT node.title AS node_title, draggableviews_structure.weight AS draggableviews_structure_weight
FROM
{node} node
INNER JOIN ...
[...]
ORDER BY draggableviews_structure_weight ASC

From what I've read, the problem is the COALESCE call: in PostgreSQL an output-column alias can only be used in ORDER BY as a bare name, not inside an expression, so the COALESCE has to be moved into the column selection list (just after the SELECT keyword).
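A rewrite along these lines works on both databases; it computes the COALESCE in the select list and sorts by its alias. The join condition below is illustrative (based on the Draggable Views 7.x schema), not taken from the actual generated query:

```sql
-- Fails on PostgreSQL: in ORDER BY, an output-column alias is only
-- recognised as a bare name, never inside an expression.
--   ORDER BY COALESCE(draggableviews_structure_weight, 10000) ASC

-- Portable form: compute the COALESCE in the select list, then sort by
-- the resulting alias.
SELECT node.title AS node_title,
       COALESCE(draggableviews_structure.weight, 10000) AS draggableviews_structure_weight
FROM node
INNER JOIN draggableviews_structure
        ON draggableviews_structure.entity_id = node.nid
ORDER BY draggableviews_structure_weight ASC;
```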

I'm not sure whether this is a Draggable Views bug or a Views bug (I still have to look at the handler for that sort field), but for the moment I can live without this option checked ;)

That's all, I hope it's useful for someone. Comments are welcome!

Comments

Use DBTNG Migrator

Josh Waihi

Assuming your PostgreSQL or MySQL database doesn't use datatypes that are not supported by Drupal, you can use DBTNG Migrator to migrate databases to and from PostgreSQL, MySQL, SQLite or any database that Drupal has a driver for (including MongoDB). See http://drupal.org/project/dbtng_migrator

Nice module

tunic

I searched for something like that (remembering the 'There's a Module for That!' slogan) but didn't find anything. I even searched this group, but the only post that talks about MySQL to PostgreSQL doesn't have a definite answer :(

But this module seems to do the migration without errors in one step, thanks for pointing it out.

I'll have to improve my search skills :D

Thanks again!

Drupal 7 on postgresql

yoonghm

Hi:

I am new to Drupal. Does it mean that there is no out-of-the-box support for Drupal 7 on PostgreSQL?

Lately, there has been a lot of bad news regarding MySQL and Oracle. I wonder whether Drupal 7 cares about migrating to PostgreSQL, or at least supports both in core as well as in contributed modules.

Drupal 7 runs fine on PostgreSQL

sbuttgereit

Hi--

I run Drupal 6 and 7 on PostgreSQL 9+ successfully and have been running Drupal on PostgreSQL for some time. Migration from a pre-existing MySQL Drupal installation to PostgreSQL is a very different matter than starting on PostgreSQL directly.

I have seen a few modules that can have problems with PostgreSQL, but most proper Drupal coding avoids these sorts of problems.

Steven C. Buttgereit
Managing Director
Muse Systems
www.musesystems.com

Drupal core supports MySQL,

Josh Waihi

Drupal core supports MySQL, PostgreSQL and SQLite, and there are drivers for Oracle and MSSQL. If you want to port from one database to another, presuming your database is standard Drupal, you can use DBTNG Migrator to do that: http://drupal.org/project/dbtng_migrator
