Defining Schema data types

bjaspan's picture

Now that Schema almost has a driver for PostgreSQL I needed to resolve the data type issue. I like Frando's approach (I seem to say that a lot) of storing 'type' and 'size' separately because it makes a variety of things easier. For example:

$schema['mytable'] = array(
  'cols' => array(
     'col1' = array('type' => 'int', 'size' => 'big', ...)));

In a column specifier, 'size' is optional and defaults to 'normal'. For type varchar, 'size' is ignored. I changed a few of the constants Frando defined to make them more like what I think people will expect (e.g. 'varchar' instead of 'string') and I normalized the size constants so that all types use 'big' instead of some using 'long'.

So, what I have so far for MySQL:

    $map = array(
      'varchar:normal' => 'VARCHAR',
      'text:small' => 'SMALLTEXT',
      'text:medium' => 'MEDIUMTEXT',
      'text:big' =>  'LONGTEXT',
      'text:normal' => 'TEXT',

      'int:tiny' => 'TINYINT',
      'int:small' => 'SMALLINT',
      'int:medium' => 'MEDIUMINT',
      'int:big' => 'BIGINT',
      'int:normal => 'INT',

      'float:tiny' => 'FLOAT',
      'float:small' => 'FLOAT',
      'float:medium' => 'FLOAT',
      'float:big' => 'DOUBLE',
      'float:normal' => 'FLOAT',

      'blob:big' => 'LONGBLOG',
      'blob:normal' => 'BLOB',

      'datetime:normal' => 'DATETIME',

and for PostgreSQL:

    $map = array(
      'varchar:normal' => 'varchar',
      'text:small' => 'text',
      'text:medium' => 'text',
      'text:big' => 'text',
      'text:normal' => 'text',
      'int:tiny' => 'smallint',
      'int:small' => 'smallint',
      'int:medium' => 'integer',
      'int:big' => 'bigint',
      'int:normal' => 'integer',
      'float:tiny' => 'real',
      'float:small' => 'real',
      'float:medium' => 'real',
      'float:big' => 'double precision',
      'float:normal' => 'real',
      'blob:big' => 'bytea',
      'blob:normal' => 'bytea',

      'datetime:normal' => 'timestamp',



I definitely like

Frando's picture

I definitely like it!
Centralizing "big" and "long" is surely a good thing, as is adding a datetime type.
I'll have some more time to spend on this from tomorrow on.

Auto-increment type?

bjaspan's picture

I think we should add an auto-incrementing column type (e.g. "serial") or, equivalently, an auto-increment property that can be assigned to integer columns. I think I prefer the latter but the distinction isn't that important.

Drupal does not currently use auto-increment I thought "for database compatibility." However, MySQL 3+ (at least) and PostgreSQL 7.2+ (at least) support it and it seems unlikely that any other serious DBMS does not. For the database schema API to reach its potential we will need all tables to have a primary key and an auto-increment type will be the easiest way to add it.

I've asked on the developers list to find out why Drupal currently avoids it.

I asked on the developers

bjaspan's picture

I asked on the developers list and am getting a wide variety of answers. :-)

However, I have also discovered that Drupal already does use and depend on auto-incrementing columns. For example, the primary key of the roles table is auto-incr/serial.

I'm going to add a serial type to Schema to make replicating current Drupal behavior possible. We will force the issue to be decided one way or hte other. If the community decides against auto-incr, we can always take it out.

Yes to auto-increment.

bjaspan's picture

Discussion on the developers list, particularly Dries' contribution, makes it clear that an auto-incrementing type has a place in Schema and core.

Looks good.

sedave's picture

Only just noticed this list (and the corresponding module). Its looking good.

Quick question, will the documentation mention specifics of what is required of the type:size combo i.e. a text:big has to be able to store 2GB of character data, or int:medium has to be able to store a 32 bit signed integer (or values −2,147,483,648 to +2,147,483,647), etc...

This will help when future databases are being added so instead of saying MySQL uses longtext for text:big and can hold x amount, so I'll need a xxxx they can say Schema's text:big is specified as being required to handle at least 2GB of character data, so I'll use CLOB(2GB)

That's not a bad idea though

bjaspan's picture

That's not a bad idea though I worry that it will turn into a nightmare in practice. Consider that C and C++ don't guarantee minimum or maximum sizes for the good reason that it is impossible to do in a way that works for all systems. On the other hand, DBMSs are a much smaller and more unified space than all processing environments.

So, make a proposal. :-)