Another way to sanitize a database: DB Sanitizer

sinn's picture

Problem/Motivation.
Drupal stores the data for each entity in several tables: the base table, field tables, the revision table, and field revision tables. Maintaining a database-cleaning script becomes difficult when a project has many entities and their structure changes periodically.

The existing method of removing personal and critical data from a database, the drush sql-sanitize command, has several restrictions: you need to implement hook_drush_sql_sync_sanitize in your modules and write all the SQL commands manually, you need to keep the entity structure in mind, and it has no UI.

I would like to present another way to sanitize a database: DB Sanitizer.

DB Sanitizer key features

  • Supports configurations - you can create several configurations to sanitize the database for different use cases.
  • Manages single tables and entities separately.
  • Checks whether new tables or entities were added - the module reports when a table has been added, so you can decide what to do with it.
  • Handles entity revisions - the module generates SQL code for revision tables.
  • Drush support for creating an SQL script file and cleaning the database - this helps integrate the module into your CI workflow.
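
To make the revision handling and the new-table check more concrete, here is a rough sketch of the idea. It is not the module's actual code: the function names are hypothetical, and it assumes Drupal 7's default field SQL storage, where a field named field_phone lives in field_data_field_phone and field_revision_field_phone with a field_phone_value column.

```python
import re


def field_sanitize_sql(field_name, replacement, column="value"):
    """Build UPDATE statements for a field's data and revision tables.

    Assumes Drupal 7 default field SQL storage naming:
    field_data_<field> and field_revision_<field>, column <field>_<column>.
    """
    # Only allow simple machine names, since the name is placed into SQL.
    if not re.fullmatch(r"[a-z0-9_]+", field_name):
        raise ValueError("unexpected field name: %r" % field_name)
    target_column = "%s_%s" % (field_name, column)
    escaped = replacement.replace("'", "''")  # escape single quotes
    return [
        "UPDATE %s_%s SET %s = '%s';" % (prefix, field_name, target_column, escaped)
        for prefix in ("field_data", "field_revision")
    ]


def unreviewed_tables(existing_tables, configured_tables):
    """Return tables present in the database but missing from the config."""
    return sorted(set(existing_tables) - set(configured_tables))


if __name__ == "__main__":
    for statement in field_sanitize_sql("field_phone", "000"):
        print(statement)
    print(unreviewed_tables(["users", "node", "custom_log"], ["users", "node"]))
```

The point of generating both statements together is that a value scrubbed only from the data table would still be readable in the revision table.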

I'm really interested in feedback from the Drupal Support community about this module. Is it helpful or not? What can be improved? The module is in a sandbox but is fully working.

Comments

greggles's picture

This looks interesting, thanks for posting it.

There's also Scrambler, which Nico presented recently.

I wrote in this group a while ago about a feature I added to the Paranoia module: sanitizing database tables based on a whitelist approach.

I'm curious what your thoughts are on all these approaches. Seems like an impressive proof of the power of contrib that we have so many options for cleaning up databases :)

sinn's picture

Hi, Greg.

I've checked the modules, and my thoughts are below:

Paranoia:
+ easy to use
- we can't manage it.

Scrambler:
+ An API exists.
+ I like the idea of the methods they use.
-/+ Fields are sanitized for all entities. This is easier to manage, but what if we need to sanitize fields for only some entities?
- Custom code is needed to sanitize ordinary tables.
- Depends on X Autoload.
- Settings can't be exported.

DB Sanitizer:
- The UI looks quite difficult to use.
- Need to write pieces of SQL code.

It would be good for the Scrambler and DB Sanitizer modules to have a hook_drush_sql_sync_sanitize implementation.

greggles's picture

Can you clarify what you mean by this:

  • we can't manage it.

It's not clear to me what that means. Maybe there's a misperception about how it works?

sinn's picture

When I last looked at Paranoia, it had only predefined rules. Using hooks to build additional rules isn't simple. With DB Sanitizer, we can add sanitization rules in the back office.