There is lots of concern about the performance of dates. There are actually two parts to this:
- the time it takes to retrieve date values stored in one method or another, a database issue.
- how much time it takes to massage the raw data into the desired format for display, a PHP issue.
The second one is the real bottleneck when working with dates, and is unrelated to the storage method being used. This page is to discuss the first topic, the part of the performance that is related to the storage method that is used.
I am setting up some performance tests using Apache Bench. I created a new installation that has three content types, one that has a date that uses the native datetime field for storage, one that has a date that uses the ISO date stored in a varchar field, and one that uses a unix timestamp that is stored in an int field. All of them have the same granularity (down to the minute), all have an optional end date, and are required fields. To minimize PHP's part in this issue, each of them are set up to use the 'Plain' formatter, which just displays the raw value and does not try to do any timezone conversion or massage the raw value back into a recognizable date.
I used devel generate to create 2,000 users and 5,000 nodes of each type. I created a view for each date field that displays an unformatted list of the title and datefield (the 'plain' format of the date), sorted by the date field. I also made sure that Views caching is turned off, and that page caching is turned off. The field data is still being cached, there is no way to turn field caching off that I know of. I am not using pathauto or any url aliases since I am trying to keep the page as simple as possible and limit it as much as possible to database queries that use the date fields.
I'm going to display a summary of the results of running Apache Bench on each of those views for comparison.
Hopefully this is a fairly simple way to assess 'real world' performance implications of the alternative storage methods. I plan to test some other views of this data as well, and I'm open to other ideas about ways to assess this.
Comments
Storage/query performance testing
In order to test the actual storage/query performance in various situations, you'll have to add some more zeros to the amount of nodes; e.g., make that 5,000,000 nodes.
The Drupal node/entity/views concepts are actually not really relevant for DB performance testing.
Rather important:
Alternatively, count all records created on each weekday.
To accomplish this, you could copy the node table structure into a temporary testing table, remove the
vid
column, change thecreated
column into acreatedISO
VARCHAR and add acreatedDateTime
DATETIME to it, create two separate indexes for both columns, and lastly, pollute the table with plenty of records using different creation date values (but the same within one record), e.g., by cutting off and reverse-counting from the currenttime()
:// Yields roundabout ~1.3m records.
$count = (int) (time() / 1000);
while ($count--) {
$createdISO = date('Y-m-d\TH:i:s', $count * 1000);
$createdDateTime = date('Y-m-d H:i:s', $count * 1000);
db_query("INSERT INTO node_copy (title, createdISO, createdDateTime)
VALUES ('$count', '$createdISO', '$createdDateTime')");
}
Then, perform the above test cases and measure the query performance.
Daniel F. Kudwien
netzstrategen