Performance Tests for Storage Alternatives

We encourage users to post events happening in the community to the community events group on https://www.drupal.org.
KarenS's picture

There is lots of concern about the performance of dates. There are actually two parts to this:

  • the time it takes to retrieve date values stored in one method or another, a database issue.
  • how much time it takes to massage the raw data into the desired format for display, a PHP issue.

The second one is the real bottleneck when working with dates, and is unrelated to the storage method being used. This page is to discuss the first topic, the part of the performance that is related to the storage method that is used.

I am setting up some performance tests using Apache Bench. I created a new installation that has three content types, one that has a date that uses the native datetime field for storage, one that has a date that uses the ISO date stored in a varchar field, and one that uses a unix timestamp that is stored in an int field. All of them have the same granularity (down to the minute), all have an optional end date, and are required fields. To minimize PHP's part in this issue, each of them are set up to use the 'Plain' formatter, which just displays the raw value and does not try to do any timezone conversion or massage the raw value back into a recognizable date.

I used devel generate to create 2,000 users and 5,000 nodes of each type. I created a view for each date field that displays an unformatted list of the title and datefield (the 'plain' format of the date), sorted by the date field. I also made sure that Views caching is turned off, and that page caching is turned off. The field data is still being cached, there is no way to turn field caching off that I know of. I am not using pathauto or any url aliases since I am trying to keep the page as simple as possible and limit it as much as possible to database queries that use the date fields.

I'm going to display a summary of the results of running Apache Bench on each of those views for comparison.

Hopefully this is a fairly simple way to assess 'real world' performance implications of the alternative storage methods. I plan to test some other views of this data as well, and I'm open to other ideas about ways to assess this.

Comments

Storage/query performance testing

sun's picture

In order to test the actual storage/query performance in various situations, you'll have to add some more zeros to the amount of nodes; e.g., make that 5,000,000 nodes.

The Drupal node/entity/views concepts are actually not really relevant for DB performance testing.

Rather important:

  • Plenty of records
  • Different value for each record
  • A few clearly defined test scenarios; e.g.:
    1. Query the 10 most recent records, ordered reverse-chronologically (newest first). (typical Recent X block)
    2. Query 20 records that have been created in May 2011, ordered reverse-chronologically, starting from 41th record. (typical archive listing with pager on page 3)
    3. Query all records that have been created in the past 12 hours (time-based), ordered chronologically. (typical ticker/cron/queue processing)
    4. Count all records that have been created on a Saturday. (typical statistics query)
      Alternatively, count all records created on each weekday.

To accomplish this, you could copy the node table structure into a temporary testing table, remove the vid column, change the created column into a createdISO VARCHAR and add a createdDateTime DATETIME to it, create two separate indexes for both columns, and lastly, pollute the table with plenty of records using different creation date values (but the same within one record), e.g., by cutting off and reverse-counting from the current time():

// Yields roundabout ~1.3m records.
$count = (int) (time() / 1000);

while ($count--) {
  $createdISO = date('Y-m-d\TH:i:s', $count * 1000);
  $createdDateTime = date('Y-m-d H:i:s', $count * 1000);
  db_query("INSERT INTO node_copy (title, createdISO, createdDateTime)
    VALUES ('$count', '$createdISO', '$createdDateTime')");
}

Then, perform the above test cases and measure the query performance.

Daniel F. Kudwien
netzstrategen

Date API

Group organizers

Group notifications

This group offers an RSS feed. Or subscribe to these personalized, sitewide feeds: