Tuning MySQL - Adding indexes, lots of them: Shotgun Strategy

We encourage users to post events happening in the community to the community events group on https://www.drupal.org.
mikeytown2's picture

This discussion got me thinking...
http://groups.drupal.org/node/56438#comment-160418

So I decided to create some code that will auto add indexes to your CCK fields where there are none. Works for me (using MySQL), may not work for you. Also at the bottom is code to add indexes elsewhere in Drupal.

<?php
$ret
= array();
// Add indexes to CCK fields
$result = db_query("SHOW TABLES LIKE 'content_%'");
while (
$table = db_result($result)) {
  if (
$table == 'content_group') {
   
db_add_index($ret, 'content_group', 'weight', array('weight'));
   
db_add_index($ret, 'content_group', 'type_name', array('type_name'));
   
db_add_index($ret, 'content_group', 'group_name', array('group_name'));
   
db_add_index($ret, 'content_group', 'weight-group', array('weight', 'group_name'));
    continue;
  }
  if (
$table == 'content_node_field') {
   
db_add_index($ret, 'content_node_field', 'type', array('type'));
   
db_add_index($ret, 'content_node_field', 'widget_type', array('widget_type'));
    continue;
  }
  if (
$table == 'content_node_field_instance') {
    continue;
  }
 
// Add indexes for CCK Fields
 
$data = db_query("SHOW COLUMNS FROM %s"$table);
  while (
$column = db_fetch_array($data)) {
    if (
$column['Key'] == '' && !stristr($column['Field'], 'data') && !stristr($column['Field'], 'format') && !stristr($column['Type'], 'text')) {
     
db_add_index($ret, $table, $column['Field'], array($column['Field']));
    }
  }
}

// Add indexes elsewhere
db_add_index($ret, 'access', 'type', array('type'));
db_add_index($ret, 'access', 'mask', array('mask'));
db_add_index($ret, 'access', 'status', array('status'));
db_add_index($ret, 'comments', 'timestamp', array('timestamp'));
db_add_index($ret, 'node_comment_statistics', 'comment_count', array('comment_count'));
db_add_index($ret, 'menu_links', 'external', array('external'));
db_add_index($ret, 'menu_links', 'updated', array('updated'));
db_add_index($ret, 'menu_links', 'customized', array('customized'));
db_add_index($ret, 'menu_links', 'cust-ext', array('customized', 'external'));
db_add_index($ret, 'menu_links', 'depth', array('depth'));
db_add_index($ret, 'menu_custom', 'title', array('title'));
db_add_index($ret, 'users', 'pass', array('pass'));
db_add_index($ret, 'users', 'status', array('status'));
db_add_index($ret, 'filter_formats', 'roles', array('roles'));
db_add_index($ret, 'term_data', 'name', array('name'));
db_add_index($ret, 'imagecache_preset', 'presetname', array('presetname'));
db_add_index($ret, 'blocks', 'module', array('module'));
db_add_index($ret, 'blocks', 'delta', array('delta'));
db_add_index($ret, 'system', 'name', array('name'));
db_add_index($ret, 'system', 'status', array('status'));
db_add_index($ret, 'system', 'type', array('type'));
db_add_index($ret, 'date_format_types', 'title', array('title'));
db_add_index($ret, 'node_counter', 'totalcount', array('totalcount'));
db_add_index($ret, 'node_counter', 'daycount', array('daycount'));
db_add_index($ret, 'node_counter', 'timestamp', array('timestamp'));
db_add_index($ret, 'files', 'filepath', array('filepath'));


// Output some data if not running from update.php
//echo str_replace('    ', '&nbsp;&nbsp;&nbsp;&nbsp;', nl2br(htmlentities(print_r($ret, TRUE))));
?>

Comments

Thanks

rjbrown99's picture

Thanks, that worked for me unmodified.

sorry for my nobism but where

ddorian's picture

sorry for my nobism but where to put this

Code Placement

mikeytown2's picture

If you have a custom module, add it in an update function and add in return $ret; at the end.
http://api.drupal.org/api/function/hook_update_N/6

If the above is over your head, then create a new node of type page, making sure to use the php filter. All you need to do is preview the node, saving the node would gain you nothing. Also uncomment the last line, so it outputs something. Code only needs to run once. Running the code multiple times is pointless; nothing bad will happen so don't be afraid. Run this code again after adding in new CCK fields or when a CCK field goes from storing 1 value to more then 1.

Thanks for that...

AndyF's picture

would it be worth bunging the bottom lot in a hook_install and the cck bits in a hook_cron and putting them in a contrib indexy module?

Suggestion

rjbrown99's picture

How about submitting this as a patch to the DB Maintenance module?
http://drupal.org/project/db_maintenance

That small module does things like optimize tables, seems like it may be a good fit to also have it add indexes.

More slow queries

rjbrown99's picture

Here are a few more slow queries from my log. I'm only posting the general purpose ones that probably apply to lots of other people.

EXPLAIN SELECT nt.type, nt.* from mydb.node_type nt ORDER BY nt.type ASC
EXPLAIN SELECT dft.type, dft.title, dft.locked from mydb.date_format_types dft ORDER BY dft.title
EXPLAIN SELECT cid, category, selected from mydb.contact ORDER BY weight, category
EXPLAIN SELECT MAX(totalcount) from mydb.node_counter

Index creation would then look something like this yes?

db_add_index($ret, 'node_type', 'type', array('type'));
db_add_index($ret, 'date_format_types', 'type', array('type'));
db_add_index($ret, 'date_format_types', 'title', array('title'));

Indexes

mikeytown2's picture

EXPLAIN SELECT nt.type, nt.* from mydb.node_type nt ORDER BY nt.type ASC

Selects all values from the table (nt.*). Drupal by default already has an index on the type row for node_type. db_add_index() will not help; index already exists.
http://api.drupal.org/api/function/node_schema/6

EXPLAIN SELECT dft.type, dft.title, dft.locked from mydb.date_format_types dft ORDER BY dft.title

Selects all values from the table as well (only 3 rows). Drupal by default already has an index on the title row for date_format_types. db_add_index() will not help; index already exists.

EXPLAIN SELECT cid, category, selected from mydb.contact ORDER BY weight, category

Drupal by default already has an index on weight & category, so having an individual index on each may not do anything.
http://api.drupal.org/api/function/contact_schema/6

EXPLAIN SELECT MAX(totalcount) from mydb.node_counter

Forgot about this one...
http://drupal.org/node/624000

I'll add it to the code at the top. It would look like this
http://api.drupal.org/api/function/statistics_title_list/6

<?php
db_add_index
($ret, 'node_counter', 'totalcount', array('totalcount'));
db_add_index($ret, 'node_counter', 'daycount', array('daycount'));
db_add_index($ret, 'node_counter', 'timestamp', array('timestamp'));
?>

your syntax is correct.

Hrmmmmm, I don't know about this...

dalin's picture

@mikeytown I think you've got something far too generalized here. You're basically adding indexes to as many fields as possible which has it's own performance drawbacks (UDPATE/INSERT/DELETEs become much slower, and some other SELECTs as well). Many of the indexes that you are adding could only improve queries on building select boxes in the admin section - at the cost of slower queries for more general pages. You are also adding several indexes that already exist which will also slow things down. Also I'm guessing that in some situations what you really need are a single index on multiple fields. Remember that a database can only utilize one index per query.

Here's my standard method of query improvement:

  • Every few months analyze the slow query log.
  • For the worst offenders (in both frequency of offence, and total sum of query time) see if the query can be improved either by rewriting, or adding indexes. Queries that run during cron, or in the admin section are generally not candidates for optimization unless they either cause locking/blocking or can be optimized without affecting general performance.
  • Make changes and test. Run EXPLAIN before and after your change, and run the query (with SQL_NO_CACHE) before and after the change. Often times an index that you think should help is not actually utilized. Some times you can re-arrange the query so that the EXPLAIN looks much worse, but the query actually performs much better.
  • If the query is being called from a module, submit a patch.

@rjbrown99 The better place to put index creation statements is in the module that is running the query that doesn't utilize indexes. If module whiz_bang does some strange query on the comments table that isn't using an index, then whiz_bang should be implementing hook_schema_alter to add an index.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

MySQL 5.0 & index_merge

mikeytown2's picture

Remember that a database can only utilize one index per query.

New in 5.0 is index_merge
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html

After playing around with the menu_links table I get an explain like so

id    select_type     table   type            possible_keys                              
1     SIMPLE          ml      index_merge     router_path,updated,customized,external  

key                          key_len  ref     rows    Extra
updated,customized,external  2,2,2    NULL    69      Using union(updated,intersect(customized,external)); Using where

The key is intersect(customized,external); This can be a single index. Updated top post with this multi-key index. Now the explain generates this

id  select_type   table   type            possible_keys 
1   SIMPLE        ml      index_merge     router_path,updated,customized,external,cust-ext  

key               key_len     ref     rows  Extra
updated,cust-ext  2,4         NULL    75    Using union(updated,cust-ext); Using where

Which looks like you haven't

dalin's picture

Which looks like you haven't gained very much (estimated rows increased, but a better merge algorithm). Did the query execute any faster? You may need to benchmark both scenarios.

What is the query that you are attempting to improve?

IIRC index_merge is only possible if all the necessary indexes are already in memory (so there has to be enough RAM available to MySQL to keep many indexes in RAM).

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Query is faster; but no index wins.

mikeytown2's picture

InnoDB:
0.0107 sec 3 indexes
0.0104 sec 2 indexes (2 combined)
0.0036 sec no indexes

MyISAM:
0.0219 sec 3 indexes
0.0216 sec 2 indexes (2 combined)
0.0036 sec no indexes

Query Cache:
0.0003 sec

Explain with no index selected.

id  select_type  table   type  possible_keys  key  key_len  ref   rows  Extra
1   SIMPLE       ml      ALL   router_path    NULL NULL     NULL  1386  Using where

Key Buffer size is 32M so the extra keys are all in memory. Also server is tuned for InnoDB, so the MyISAM numbers could be better.

Table: menu_links - 1,386 rows.
Query: Was in the "not using index" log & it looked interesting.
Query is found in _menu_navigation_links_rebuild
http://drupal.org/node/593682

SELECT ml.link_path, ml.mlid, ml.router_path, ml.updated from menu_links ml WHERE ml.updated = 1 OR (router_path NOT IN ('node','rss.xml','admin','blog','batch','contact','forum','logout','user','user/login','system/files','filefield/progress','masquerade/autocomplete-user','user/timezone','admin_menu/flush-cache','admin_menu/toggle-modules','masquerade/autocomplete','taxonomy/autocomplete','blog/feed','content/js_add_more','admin/by-module','admin/by-task','admin/compact','filter/tips','node/add','comment/delete','comment/edit','masquerade/unswitch','user/register','user/password','views/ajax','content/press-releases','wysiwyg/%','user/autocomplete','admin/content','blog/%','admin/reports','admin/build','admin/settings','admin/user','node/%','user/%','node/%/view','user/%/view','system/files/imagecache','masquerade/autocomplete/multiple','js/path_redirect/autocomplete_404','admin/settings/actions','admin/user/rules','admin/reports/updates','admin/build/block','admin/content/comment','admin/build/contact','admin/content/feed','admin/settings/imageapi','admin/build/imagecache','admin/settings/logging','masquerade/switch/%','admin/settings/menu_breadcrumb','admin/settings/nice_menus','node/%/refresh','node/%/purge','admin/reports/status','taxonomy/term/%','admin/build/path','admin/settings/admin_menu','blog/%/feed','admin/settings/clean-urls','user/%/contact','admin/content/node','admin/content/types','admin/settings/date-time','node/%/delete','user/%/delete','node/%/edit','admin/settings/error-reporting','admin/settings/feedapi','admin/settings/file-system','admin/content/forum','admin/settings/image-toolkit','admin/settings/filters','admin/settings/masquerade','admin/build/menu','admin/build/modules','admin/settings/performance','admin/user/permissions','admin/content/node-settings','admin/content/rss-publishing','comment/reply/%','node/%/revisions','admin/user/roles','admin/settings/site-information','admin/settings/site-maintenance','admin/content/taxonomy','admin/build/themes','admin/build/path-redirect','admin/user/settings','admin/user/user','admin/build/views','admin/settings/wysiwyg','user/%/edit','admin/content/page_title','admin/settings/admin','node/add/blog','node/add/book','node/add/feed','node/add/forum','node/add/in-the-news','node/add/news','node/add/page','node/add/press-release','node/add/homepage-scrolling-news','node/add/story','admin/build/block/list','admin/content/forum/list','admin/build/imagecache/list','admin/content/node/overview','admin/content/types/list','admin/build/path/list','admin/settings/imageapi/list','admin/settings/filters/list','admin/build/path-redirect/list','admin/build/modules/list','admin/content/taxonomy/list','admin/user/rules/list','admin/user/user/list','admin/settings/wysiwyg/profile','admin/content/comment/new','user/%/edit/account','admin/content/feed/list','admin/build/themes/select','admin/build/menu/list','admin/build/themes/settings','admin/build/contact/list','sites/datasphere.com/files/imagecache','admin/settings/actions/manage','admin/settings/actions/orphan','admin/build/modules/uninstall','admin/build/path/add','admin/settings/filters/add','admin/user/rules/add','admin/user/user/create','admin/content/comment/approval','admin/user/rules/check','admin/settings/clean-urls/check','admin/settings/imageapi/config','admin/settings/actions/configure','admin/settings/date-time/lookup','admin/build/path/edit','admin/user/roles/edit','admin/user/rules/edit','admin/content/feed/export_opml','admin/content/types/fields','admin/reports/updates/list','admin/reports/updates/check','admin/reports/status/php','admin/content/node-settings/rebuild','admin/reports/status/run-cron','admin/reports/status/sql','admin/build/block/add','admin/build/contact/add','admin/content/types/add','admin/build/menu/add','admin/build/imagecache/add','admin/content/node-type/blog','admin/content/node-type/book','admin/build/block/configure','admin/build/path/delete','admin/build/path/delete_bulk','admin/build/block/delete','admin/settings/filters/delete','admin/user/rules/delete','admin/content/node-type/feed','admin/content/node-type/forum','admin/content/feed/import_opml','admin/content/node-type/in-the-news','admin/content/taxonomy/%','admin/content/node-type/news','admin/content/node-type/page','admin/content/node-type/press-release','admin/content/node-type/homepage-scrolling-news','admin/build/contact/settings','admin/build/menu/settings','admin/reports/updates/settings','admin/content/node-type/story','admin/build/views/add','admin/build/views/list','admin/build/path-redirect/add','admin/build/views1/convert','admin/build/path/pathauto','admin/settings/date-time/configure','admin/build/views1/delete','admin/build/path-redirect/export','admin/content/taxonomy/export','admin/settings/date-time/formats','admin/content/taxonomy/import','admin/content/forum/settings','admin/build/path-redirect/settings','admin/build/views/import','admin/build/views/tools','admin/settings/filters/%','admin/build/imagecache/%','admin/build/menu-customize/%','admin/content/node-type/blog/edit','admin/content/node-type/book/edit','admin/content/node-type/feed/edit','admin/content/node-type/forum/edit','admin/content/node-type/homepage-scrolling-news/edit','admin/content/node-type/in-the-news/edit','admin/content/node-type/news/edit','admin/content/node-type/page/edit','admin/content/node-type/press-release/edit','admin/content/node-type/story/edit','admin/build/themes/settings/global','admin/content/taxonomy/%/list','admin/settings/filters/%/edit','admin/build/modules/list/confirm','admin/build/menu-customize/%/list','admin/build/modules/uninstall/confirm','admin/settings/date-time/formats/lookup','admin/build/themes/settings/bluemarine','admin/build/themes/settings/bulletin_board','admin/build/themes/settings/bulletin_board_v2','admin/build/themes/settings/chameleon','admin/build/themes/settings/corkboard','admin/build/themes/settings/datasphere','admin/content/node-type/blog/delete','admin/content/node-type/book/delete','admin/content/node-type/feed/delete','admin/content/node-type/homepage-scrolling-news/delete','admin/content/node-type/in-the-news/delete','admin/content/node-type/news/delete','admin/content/node-type/page/delete','admin/content/node-type/press-release/delete','admin/content/node-type/story/delete','admin/content/node-type/forum/delete','admin/content/taxonomy/edit/term','admin/build/themes/settings/garland','admin/build/themes/settings/hyperlocal_base','admin/build/block/list/js','admin/build/themes/settings/local_newspaper','admin/build/themes/settings/marvin','admin/build/themes/settings/pushbutton','admin/build/themes/settings/zen','admin/build/themes/settings/zen_classic','admin/build/themes/settings/zen_datasphere','admin/build/themes/settings/STARTERKIT','admin/build/themes/settings/hyperlocal','admin/content/forum/edit/%','admin/build/themes/settings/minnelli','admin/settings/imageapi/config/imageapi_gd','admin/build/menu-customize/%/add','admin/build/block/list/bluemarine','admin/build/block/list/bulletin_board','admin/build/block/list/bulletin_board_v2','admin/build/block/list/chameleon','admin/settings/filters/%/configure','admin/build/block/list/corkboard','admin/settings/date-time/formats/custom','admin/build/block/list/datasphere','admin/settings/actions/delete/%','admin/build/contact/delete/%','admin/build/menu-customize/%/delete','admin/content/node-type/blog/display','admin/content/node-type/book/display','admin/views/ajax/autocomplete/tag','admin/content/node-type/feed/display','admin/content/node-type/forum/display','admin/content/node-type/homepage-scrolling-news/display','admin/content/node-type/in-the-news/display','admin/content/node-type/news/display','admin/content/node-type/page/display','admin/content/node-type/press-release/display','admin/content/node-type/story/display','admin/build/contact/edit/%','admin/build/menu-customize/%/edit','admin/build/block/list/garland','admin/build/block/list/local_newspaper','admin/views/ajax/autocomplete/user','admin/build/block/list/hyperlocal_base','admin/content/node-type/blog/fields','admin/content/node-type/book/fields','admin/content/node-type/feed/fields','admin/content/node-type/forum/fields','admin/content/node-type/homepage-scrolling-news/fields','admin/content/node-type/in-the-news/fields','admin/content/node-type/news/fields','admin/content/node-type/page/fields','admin/content/node-type/press-release/fields','admin/content/node-type/story/fields','admin/build/block/list/marvin','admin/build/block/list/minnelli','admin/build/block/list/pushbutton','admin/settings/filters/%/order','user/reset/%/%/%','admin/build/block/list/zen','admin/build/block/list/zen_classic','admin/build/block/list/zen_datasphere','filefield/ahah/%/%/%','admin/build/block/list/STARTERKIT','admin/build/block/list/hyperlocal','admin/build/views/export/%','admin/build/views/clone/%','admin/build/views/disable/%','admin/build/views/enable/%','admin/content/forum/add/container','admin/settings/date-time/formats/add','admin/content/forum/add/forum','admin/content/taxonomy/add/vocabulary','admin/build/views/break-lock/%','admin/build/views/tools/basic','admin/settings/date-time/formats/configure','admin/build/views/delete/%','admin/build/views/edit/%','admin/build/views/tools/convert','admin/settings/date-time/delete/%','admin/build/path-redirect/delete/%','admin/build/path-redirect/edit/%','node/%/revisions/%/view','admin/build/imagecache/%/%','node/%/revisions/%/delete','node/%/revisions/%/revert','admin/build/imagecache/%/delete','admin/build/imagecache/%/export','admin/build/imagecache/%/flush','admin/build/imagecache/%/override','admin/content/node-type/blog/display/basic','admin/content/node-type/book/display/basic','admin/content/node-type/feed/display/basic','admin/content/node-type/forum/display/basic','admin/content/node-type/homepage-scrolling-news/display/basic','admin/content/node-type/in-the-news/display/basic','admin/content/node-type/news/display/basic','admin/content/node-type/page/display/basic','admin/content/node-type/press-release/display/basic','admin/content/node-type/story/display/basic','admin/content/node-type/blog/display/rss','admin/content/node-type/book/display/rss','admin/content/node-type/feed/display/rss','admin/content/node-type/forum/display/rss','admin/content/node-type/homepage-scrolling-news/display/rss','admin/content/node-type/in-the-news/display/rss','admin/content/node-type/news/display/rss','admin/content/node-type/page/display/rss','admin/content/node-type/press-release/display/rss','admin/content/node-type/story/display/rss','admin/content/node-type/blog/display/token','admin/content/node-type/book/display/token','admin/content/node-type/feed/display/token','admin/content/node-type/forum/display/token','admin/content/node-type/homepage-scrolling-news/display/token','admin/content/node-type/in-the-news/display/token','admin/content/node-type/news/display/token','admin/content/node-type/page/display/token','admin/content/node-type/press-release/display/token','admin/content/node-type/story/display/token','admin/build/menu/item/%/delete','admin/content/forum/edit/container/%','admin/content/forum/edit/forum/%','admin/build/menu/item/%/edit','admin/content/taxonomy/edit/vocabulary/%','admin/build/menu/item/%/reset','admin/build/views/%/add-display/%','admin/build/views/%/%/%','admin/build/views/%/analyze/%','admin/build/views/%/details/%','admin/settings/date-time/formats/delete/%','admin/build/views/%/preview/%','admin/content/taxonomy/%/add/term','admin/build/imagecache/%/add/%','admin/build/imagecache/%/%/delete','admin/settings/wysiwyg/profile/%/edit','admin/settings/wysiwyg/profile/%/delete') AND external = 0 AND customized = 1).

I wonder how well the MySQL 5.1 optimizer works on something like this. Server is using CentOS with MySQL 5.0.45.

Generalize

rjbrown99's picture

Thanks. I'm only working with a test DB at the moment so no worries with me.

What I understood as mikeytown2's goal was to provide for some generalized indexes where they might not otherwise exist. Perhaps the initial approach goes too far in terms of creating some specific indexes? If so, is there a consensus as to what indexes might be appropriate that would not otherwise exist without creating them?

If so, is there a consensus

dalin's picture

If so, is there a consensus as to what indexes might be appropriate that would not otherwise exist without creating them?

Any place that you have a View that has filters or arguments on non-primary-key, MySQL may require an additional index. But only if those queries are showing up in your slow query log. Beyond the default indexes I don't think you'll find a consensus as to what additional indexes to add (and if you do find a consensus then it should be in the appropriate module).

You are looking for a magic bullet to improve your SQL performance, but there isn't one. You need to analyze your slow query log, try to improve queries, and weigh the benefits/costs of your optimizations for overall performance.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

CCK & Views

mikeytown2's picture

Any place that you have a View that has filters or arguments on non-primary-key, MySQL may require an additional index.

The vast majority of use cases for CCK & Views IMHO is so you can create a view targeting that CCK field. Filefield is different, but then again the fid need to be joined to the files table so it's back in the slot for being indexed. The list field is a maybe, but if you want to create a view that only has files then this is needed. Creating an index for every CCK field is overkill, but IMHO the trade-off is worth it. Most people want faster page views & the penalty for inserts with the extra index is quite small.

The rest of the indexes are for core & 99% of optimizations for core don't get accepted. Granted I could do a better job at creating the index; using multiple indexes would be the ideal. The menu_links table looks like it could use multiple columns per index.

I wanted to get the talk about database indexes going. I know for a fact that indexes on CCK, if your using views in any creative way, will help a lot. Creating something that goes through every view looking for filters and applying an index to that field seems like it would be a more targeted approach to this.

FYI, we discussed

moshe weitzman's picture

FYI, we discussed auto-indexes for cck and views at http://groups.drupal.org/node/7614

Measurement

rjbrown99's picture

This is a good discussion, as I can tell you I personally know a heck of a lot more about the OS, webserver, and PHP than I do about databases. Scaling and tuning a database for Drupal is an interesting topic. I have been trying to cobble together some thoughts on storage engines as well based on reading lots of sources, but that's probably best left for a new thread.

For purposes of this discussion about indexes, how about starting at an appropriate threshold for measurement? The default value of long_query_time is 10, and mikeytown2 suggested in the other thread of setting it to 1. Let's say I am aiming for a load time of all pages under 5 seconds. Assuming that's the case and accounting for network transit of data to a client browser, it seems like it would be reasonable to set the long_query_time to anywhere from 1 to 3 seconds.

Can we also assume that for purposes of this discussion that we are not taking into consideration any external caching solutions such as memcached and are focused purely on in-database tuning?

References:
http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
http://groups.drupal.org/node/13844

I set long_query_time to 1.

dalin's picture

I set long_query_time to 1. You can go even lower if you are using the Percona flavour of MySQL.

If you want pages to load in <5 s then you probably need the page to be generated in about 0.5-0.75 s (most of page load time is CSS, JS, Images, Flash, etc.). SQL time can be anywhere from 10-40% of page generation time. There's often 100-300 queries per page, which means that any query that takes longer than ~10ms is possibly a problem.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Perhaps a Database Noob Request...

DeeZone's picture

This might be considered a database Noob request but I think it would be helpful also outline at a basic level how to check on the results of the script above, in effect a turn key solution. I'm guessing many will simply copy and paste, close their eyes and hope for the best thinking this might make things a little better in Drupal Land. Perhaps some explanation of how to check on what indexes have been applied via phpMyAdmin or the mySQL command line might be in order.

I'm curious because my quick poke at phpMyAdmin didn't provide a clear indication that the indexes had been applied correctly but PHP warnings at additional previews/submits of the script above suggested the indexes are in place.

phpMyAdmin

mikeytown2's picture

Go to the table
click on the structure tab
click on Details at the bottom

Thanks!

DeeZone's picture

Thanks! Painfully obvious now that you point it out. A little less mySQL mystery in my life is always a good thing.

Finding "Useless" Indexes

mikeytown2's picture

Run this SQL
http://arjen-lentz.livejournal.com/122399.html

SELECT s.table_name,
       concat(s.index_name,'(',group_concat(s.column_name order by s.seq_in_index),')') as idx,
       GROUP_CONCAT(s.cardinality ORDER BY s.seq_in_index) AS card,
       t.table_rows
  FROM information_schema.tables t
  JOIN information_schema.statistics s USING (table_schema,table_name)
WHERE t.table_schema='drupal'
   AND t.table_rows > 1000
   AND s.non_unique
GROUP BY s.table_name,s.index_name
HAVING (card + 0) < (t.table_rows / 3)

Another way
http://forge.mysql.com/tools/tool.php?id=85

SELECT
  t.TABLE_SCHEMA AS db
, t.TABLE_NAME AS table
, s.INDEX_NAME AS inde name
, s.COLUMN_NAME AS field name
, s.SEQ_IN_INDEX seq in index
, s2.max_columns AS # cols
, s.CARDINALITY AS card
, t.TABLE_ROWS AS est rows
, ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS sel %
FROM INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t
  ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
  AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
  SELECT
     TABLE_SCHEMA
   , TABLE_NAME
   , INDEX_NAME
   , MAX(SEQ_IN_INDEX) AS max_columns
  FROM INFORMATION_SCHEMA.STATISTICS
  WHERE TABLE_SCHEMA != 'mysql'
  GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS s2
ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
AND s.TABLE_NAME = s2.TABLE_NAME
AND s.INDEX_NAME = s2.INDEX_NAME
WHERE t.TABLE_SCHEMA != 'mysql'                         /* Filter out the mysql system DB /
AND t.TABLE_ROWS > 10                                   /
Only tables with some rows /
AND s.CARDINALITY IS NOT NULL                           /
Need at least one non-NULL value in the field /
AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /
Selectivity < 1.0 b/c unique indexes are perfect anyway /
ORDER BY sel %, s.TABLE_SCHEMA, s.TABLE_NAME          /
Switch to sel % DESC for best non-unique indexes */

As mentioned in the comments,

dalin's picture

As mentioned in the comments, there are numerous reasons why comparing cardinality to the total number of rows is not a great way to find unused indexes (though it may be one of the only ways). If you simply remove all the indexes suggested you're bound to get a huge decrease in MySQL performance. But the indexes may be worth looking into for tables that get a lot of writes.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Worth a module?

Vacilando's picture

A very crude summary -- while mikeytown2's initial script might have disadvantages, it seems it might come useful for sites with lots of CCK tables used in Views. As dalin pointed out, a disadvantage may be that some indexes will be doubled or not used or should better be on multiple columns. And mikeytown2 shows there are some approaches to weeding out unused indexes and maybe also for detecting where multiple indexes are needed.
And Moshe pointed out there was a similar discussion before - but its steam unfortunately petered out in 2008.

It seems to me that this could merit a new module. Of course, it won't be perfect, but it may provide a focus where further efforts on automatic adding of indexes can be fine-tuned. DB Indexer, or DBoost if you like :-)
Or, as rjbrown99 said, it could be something to add to the DB Maintenance module -- though that module seems to be stuck in terms of maintenance.

If for no other reason, it would be good to package this because a script like this needs to be run regularly as the site grows and CCK and other not indexed tables are added to the site.


---
Tomáš J. Fülöpp
http://twitter.com/vacilandois

Views & CCK Tuning Code: Laser Strategy

mikeytown2's picture

This loads every view, gets all filters and relationships, and creates an index if one doesn't exist. If you have ran the code at the top then this will do nothing. This is a vastly more targeted approach as it only adds indexes where there is a view filter or view relationship. Let me know how this works for you & if this can be improved.

<?php
/**
* Add an index if it doesn't exist
*
* @param $ret
*   Array to which query results will be added.
* @param $table
*   The table to be altered.
* @param $index
*   The name of the index.
*/
function autotune_index_exists(&$ret, $table, $index) {
  global
$db_type;
 
$column_exists = FALSE;
  if (
db_table_exists($table)) {
    if (
stristr($db_type, 'pgsql')) {
     
// Selecting a db schema table, don't put pg_indexes inside {}
      //$result = db_query("SELECT * FROM pg_indexes WHERE tablename = '{%s}'", $table);
      //while ($name = db_fetch_array($result)) {
        //if (stristr($name['indexname'], $index)) {
          //return TRUE;
        //}
      //}
   
}
    else {
     
$result = db_query('SHOW COLUMNS FROM {%s}', $table);
      while (
$column = db_fetch_array($result)) {
        if (
$column['Field'] == $index && $column['Key'] != '') {
          return
TRUE;
        }
        elseif (
$column['Field'] == $index) {
         
$column_exists = TRUE;
          break;
        }
      }
    }
   
// Index doesn't exists create it
   
if ($column_exists) {
     
db_add_index($ret, $table, $index, array($index));
    }
    else {
      Return
FALSE;
    }
  }
  return
FALSE;
}



// Get all content_type tables
$cck_content_types = array();
$result = db_query("SHOW TABLES LIKE 'content_type_%'");
while (
$table = db_result($result)) {
 
$cck_content_types[] = $table;
}

// Get all filters and relationships from views
$results = db_query("
SELECT id, name
FROM views_display
INNER JOIN views_view
USING ( vid )"
);
$indexes = array();
while(
$row = db_fetch_array($results)) {
 
// Load View
 
$view = views_get_view($row['name']);
 
$filters = $view->get_items('filter', $row['id']);
 
$relationships = $view->get_items('relationship', $row['id']);
 
 
// Build data structure
 
foreach ($filters as $name => $filter) {
    if (!empty(
$filters['type']['value'])) {
      foreach(
$filters['type']['value'] as $key => $value)
     
$indexes[$filter['table']][$filter['id']]['node-type'][$key] = $value;
    }
    else {
     
$indexes[$filter['table']][$filter['id']]['node-type']['-1'] = 'all';
    }
   
$indexes[$filter['table']][$filter['id']]['relationship'] = $filter['relationship'];
  }
  foreach (
$relationships as $name => $relationship) {
   
$indexes[$relationship['table']][$relationship['id']] = $relationship['relationship'];
  }
}

// Add indexes to tables if none exist.
$ret = array();
foreach (
$indexes as $table => $values) {
  foreach (
$values as $field => $extra) {
   
$cck_table = str_replace('node_data', 'content', $table);

   
// Not a cck field
   
if (db_table_exists($table)) {
     
autotune_index_exists($ret, $table, $field);
      if (
$extra['relationship'] != 'none') {
       
autotune_index_exists($ret, $table, $extra['relationship']);
      }
    }

   
// CCK field with own table
   
elseif (db_table_exists($cck_table)) {
     
autotune_index_exists($ret, $cck_table, $field);
      if (
$extra['relationship'] != 'none') {
       
autotune_index_exists($ret, $table, $extra['relationship']);
      }
    }

   
// CCK field in content type table
   
else {
      foreach (
$cck_content_types as $cck_content_table) {
       
// Place index on all content_type tables
       
if (!empty($extra['node-type']['-1']) && $extra['node-type']['-1'] == 'all') {
         
autotune_index_exists($ret, $cck_content_table, $field);
          if (
$extra['relationship'] != 'none') {
           
autotune_index_exists($ret, $table, $extra['relationship']);
          }
        }

       
// Place index only where needed
       
else {
          foreach (
$extra['node-type'] as $nodetype) {
           
autotune_index_exists($ret, 'content_type_' . $nodetype, $field);
            if (
$extra['relationship'] != 'none') {
             
autotune_index_exists($ret, $table, $extra['relationship']);
            }
          }
        }
      }
    }
  }
}

// Output some data if not running from update.php
//echo str_replace('    ', '&nbsp;&nbsp;&nbsp;&nbsp;', nl2br(htmlentities(print_r($ret, TRUE))));
//return $ret;
?>

Very instructional...

DeeZone's picture

This is fantastic mikeytown2, as vacilando asks, is this worthy of packaging? With my limited understanding I'm tempted to say "yes". Purely for personal educational reasons I'm seriously considering running with your fantastic work and creating a "verbose" module that attempts to encapsulate some of the discussion in this thread. Ideally options could be provided with pro/con explanations allowing the administrators to make a semi informed choice on how to tune their database.

To echo vacilando's question, would this be worthy of a module?

Module

mikeytown2's picture

I put out the request to the DB Maintenance module, since this code could fit in there http://drupal.org/node/755132
If I don't hear anything in a week, then I'll create a module called autotune; unless there are some better ideas on what to call it?

Fantastic!

DeeZone's picture

If you're interested in help on the documentation side consider my hat in the ring...

Autotune? Index Boost?

Vacilando's picture

@mikeytown -- your latest code seems even more useful, but I see that your proposal to DB Maintenance was turned down, unfortunately. So will we see a new module soon? Other than "autotune" it could be called "db tuner" (to bring "database" into the name), or even, tongue-in-cheek, "index boost"! Looking forward!


---
Tomáš J. Fülöpp
http://twitter.com/vacilandois

yeah

As If's picture

yeah, cuz "autotune" always makes me think of talentless bimbos with singing careers.

Just to add some fuel here,

moshe weitzman's picture

Just to add some fuel here, there are lots of routine queries that filter or sort from 2 different tables and thus unconditionally require a tmp table to resolve in mysql. this is a bad scenario, but very common. the materialized views module is our best attempt at solving that problem. it runs today on drupal.org and other sites. it needs more caring in order to become more plug and play.

More

dbtuner module

mikeytown2's picture

K so I'm going to go with the name dbtuner. I'm going to include code to change the DB's collation as well. Thoughts?
http://drupal.org/node/756364
http://www.phpwact.org/php/i18n/utf-8/mysql
http://stackoverflow.com/questions/367711/what-is-the-best-collation-to-...
http://stackoverflow.com/questions/637148/how-does-mysql-use-collations-...

Edit: don't want to conflict with all the db_* functions out there already, so name will be dbtuner.

What are your thoughts in

dalin's picture

What are your thoughts in changing collation. Google doesn't find me any benchmarks, but I'm guessing any gains would be minimal.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

It's Up

mikeytown2's picture

Let me know how it works!
http://drupal.org/project/dbtuner
Dev will be available in 3 hours

Thank you!

jcchapster's picture

Thank you for this.

  • johnc

No sign...

DeeZone's picture

There appears to be no sign of dev 24 hours after your last post. Any idea when it will be made available?

It was on the "View all

greggles's picture

It was on the "View all releases" tab. To make it easier to find the release I've edited the releases for the project so it shows "snapshot" releases. I hope that's ok, mikeytown!

Thanks!

mikeytown2's picture

Thanks for doing that! I remember doing it for http://drupal.org/project/location_display and I then checked on dbtuner and it was already done. Now I know why :)
I think this weekend was a record for me; 2 new modules in under 24hrs.

awesome

WhenInRome's picture

awesome

I concur, Awesome!

mikeefreedom's picture

mikeytown2 you are a legend!

Exactly what I've been searching for all morning!!