One of the first things we to do is define the data structure that modules will use for declaring their tables. Schema module, Frando's patch, dopry's patch, and CCK fields all use different structures though their underlying approaches are basically the same. We need to decide what kind of information the data structures must contain and can contain. This will obviously include at least columns (name, type, length, nullable, default), indices (name, columns, primary vs unique vs multi), and join/relationship information.
We should be database-independent enough to support a broad range of DBMS's but not try to be so general that we go crazy. The vast majority of Drupal uses int, varchar, and longtext columns, with a float here and there. We don't have to support everything.
Defining the join relationships is critical. Eliminating CREATE TABLE is a primary goal, of course, but I believe the functionality enabled by having join information available will be the killer app for the Schema API.
So, the floor is open. What should the table-definition data structure look like?

Comments
As to db types to support,
As to db types to support, the event module is going to start using a datetime field, so it would be good to support that, and generally a good idea to have a datetime field. I have worked with a number of different databases and I think all of them have a datetime field in the format YYYY-MM-DD HH:MM:SS. Also, there is a current issue on CCK discussing the need for a true decimal field instead of only a float field, and I think there is a consensus that it is needed.
not true for PostgreSQL?
As far as I can tell (though I havent' really used it) PostgreSQL does not have a datetime format:
http://www.postgresql.org/docs/7.4/interactive/datatype-datetime.html
The type that postgres calls
The type that postgres calls 'Timestamp' is in the format we want.
really?
But maybe the input/output is ISO format?
I know I can use SQL like "
I know I can use SQL like " <= '2007-03-07 12:45:00'" on those fields, and they seem to accept that format for input, but I guess I'm not sure how they're stored internally.
Defining table keys
Frando, your most recent patch at http://drupal.org/node/136171 shows that table definitions (what you call a model) can have separate elements called 'primary key', 'unique key', 'keys', and 'indexes'. Questions:
The name 'unique key' makes it sound like there can only be one but a table can have multiple unique keys, no? In fact, your code supports multiple unique keys.
What is the difference between 'keys' and 'indexes'? In MySQL at least, KEY and INDEX are synonyms, aren't they? Is this a distinction that other DBMS's have that MySQL doesn't?
In the Schema module, the single 'keys' element contains all the key definitions. It is an assoc array like 'cols' that maps key name to data about the key. The name 'PRIMARY' is defined to indicate the primary key and the numeracy of the key (unique vs multi) is encoded as a positional field in the array. It looks like this:
$schema['tablename'] = array('cols' => array(...),
'keys' => array(
'PRIMARY' => array('col1, col2'),
'key1' => array('col3, col4', 'unique'),
'key2' => array('col5, col6', 'multi')));
I now think the way I did this is wrong; the column names for a key should clearly be stored in an array, not as comma-separated values. Also, I think using a positional element for unique vs multi is wrong since everything else is associative.
What our approaches share is direct access to the primary key which, I've found while writing Schema, is important. However, I don't see the need for separate 'unique key', 'keys', and 'indexes' elements, unless you clue me in. So, I propose we use your structure but rename 'unique key' to 'unique keys' and combine 'keys' and 'indexes'. So:
$schema['tablename'] = array('cols' => array(...),
'primary key' => array('col1', 'col2'),
'unique keys' => array(
array('key1' => array('col3', 'col4')),
'keys' => array(
array('key2' => array('col5', 'col6')));
comments?
PostgreSQL support underway
I started working on the postgres engine for Schema module today. I've written most of the database-inspection code. Still need to decide on the support set of types and their mappings to MysQL and PostgreSQL.
In my previous comment I wrote, "What is the difference between 'keys' and 'indexes'? In MySQL at least, KEY and INDEX are synonyms, aren't they? Is this a distinction that other DBMS's have that MySQL doesn't?"
The answer is two the last two questions are "yes" and "yes." In MySQL, an INDEX and a non-unique KEY are the same thing. In PostgreSQL, all keys are unique; indexes are a different type of entity. Therefore, I think Frando's approach of defining them separately is a good one. I suggest that each table entry can define:
$schema['table'] = array('cols' => array('col1' => array(...), ...),
'primary' => array('col1', 'col2', ...),
'unique' => array('key1' => array('col3', 'col4', ...), ...),
'index' => array('idx1' => array('col5', 'col6', ...), ...),
);
Please don't forget comments
I think its important to allow for both table and field comments. This is a great place to build in some documentation. If you are using a tool that displays the comments it makes understanding a new schema element much easier.
Joe Moraca
http://www.webdevgeeks.com
Joe Moraca
WebDevGeeks.com
Excellent suggestion.
Yes, descriptions should be included. How about just adding an optional 'description' element to each level?
$schema['mytable'] = array('description' => t('mytable stores foo data...'),
'cols' => array(
'id' = array('description' => t('what I want to say about id'), 'type' => 'int'),
...),
);
Schema can then display these fields in appropriate admin overview pages. Particularly when we add join info, it will be able to draw out the self-documenting schema.