Posted by mikeytown2 on March 21, 2010 at 6:23am
This discussion got me thinking...
http://groups.drupal.org/node/56438#comment-160418
So I decided to create some code that will auto add indexes to your CCK fields where there are none. Works for me (using MySQL), may not work for you. Also at the bottom is code to add indexes elsewhere in Drupal.
<?php
$ret = array();
// Add indexes to CCK fields
$result = db_query("SHOW TABLES LIKE 'content_%'");
while ($table = db_result($result)) {
if ($table == 'content_group') {
db_add_index($ret, 'content_group', 'weight', array('weight'));
db_add_index($ret, 'content_group', 'type_name', array('type_name'));
db_add_index($ret, 'content_group', 'group_name', array('group_name'));
db_add_index($ret, 'content_group', 'weight-group', array('weight', 'group_name'));
continue;
}
if ($table == 'content_node_field') {
db_add_index($ret, 'content_node_field', 'type', array('type'));
db_add_index($ret, 'content_node_field', 'widget_type', array('widget_type'));
continue;
}
if ($table == 'content_node_field_instance') {
continue;
}
// Add indexes for CCK Fields
$data = db_query("SHOW COLUMNS FROM %s", $table);
while ($column = db_fetch_array($data)) {
if ($column['Key'] == '' && !stristr($column['Field'], 'data') && !stristr($column['Field'], 'format') && !stristr($column['Type'], 'text')) {
db_add_index($ret, $table, $column['Field'], array($column['Field']));
}
}
}
// Add indexes elsewhere
db_add_index($ret, 'access', 'type', array('type'));
db_add_index($ret, 'access', 'mask', array('mask'));
db_add_index($ret, 'access', 'status', array('status'));
db_add_index($ret, 'comments', 'timestamp', array('timestamp'));
db_add_index($ret, 'node_comment_statistics', 'comment_count', array('comment_count'));
db_add_index($ret, 'menu_links', 'external', array('external'));
db_add_index($ret, 'menu_links', 'updated', array('updated'));
db_add_index($ret, 'menu_links', 'customized', array('customized'));
db_add_index($ret, 'menu_links', 'cust-ext', array('customized', 'external'));
db_add_index($ret, 'menu_links', 'depth', array('depth'));
db_add_index($ret, 'menu_custom', 'title', array('title'));
db_add_index($ret, 'users', 'pass', array('pass'));
db_add_index($ret, 'users', 'status', array('status'));
db_add_index($ret, 'filter_formats', 'roles', array('roles'));
db_add_index($ret, 'term_data', 'name', array('name'));
db_add_index($ret, 'imagecache_preset', 'presetname', array('presetname'));
db_add_index($ret, 'blocks', 'module', array('module'));
db_add_index($ret, 'blocks', 'delta', array('delta'));
db_add_index($ret, 'system', 'name', array('name'));
db_add_index($ret, 'system', 'status', array('status'));
db_add_index($ret, 'system', 'type', array('type'));
db_add_index($ret, 'date_format_types', 'title', array('title'));
db_add_index($ret, 'node_counter', 'totalcount', array('totalcount'));
db_add_index($ret, 'node_counter', 'daycount', array('daycount'));
db_add_index($ret, 'node_counter', 'timestamp', array('timestamp'));
db_add_index($ret, 'files', 'filepath', array('filepath'));
// Output some data if not running from update.php
//echo str_replace(' ', ' ', nl2br(htmlentities(print_r($ret, TRUE))));
?>
Comments
Thanks
Thanks, that worked for me unmodified.
sorry for my nobism but where
sorry for my nobism but where to put this
Code Placement
If you have a custom module, add it in an update function and add in
return $ret;
at the end.http://api.drupal.org/api/function/hook_update_N/6
If the above is over your head, then create a new node of type page, making sure to use the php filter. All you need to do is preview the node, saving the node would gain you nothing. Also uncomment the last line, so it outputs something. Code only needs to run once. Running the code multiple times is pointless; nothing bad will happen so don't be afraid. Run this code again after adding in new CCK fields or when a CCK field goes from storing 1 value to more then 1.
Thanks for that...
would it be worth bunging the bottom lot in a hook_install and the cck bits in a hook_cron and putting them in a contrib indexy module?
Suggestion
How about submitting this as a patch to the DB Maintenance module?
http://drupal.org/project/db_maintenance
That small module does things like optimize tables, seems like it may be a good fit to also have it add indexes.
More slow queries
Here are a few more slow queries from my log. I'm only posting the general purpose ones that probably apply to lots of other people.
EXPLAIN SELECT nt.type, nt.* from mydb.node_type nt ORDER BY nt.type ASC
EXPLAIN SELECT dft.type, dft.title, dft.locked from mydb.date_format_types dft ORDER BY dft.title
EXPLAIN SELECT cid, category, selected from mydb.contact ORDER BY weight, category
EXPLAIN SELECT MAX(totalcount) from mydb.node_counter
Index creation would then look something like this yes?
db_add_index($ret, 'node_type', 'type', array('type'));
db_add_index($ret, 'date_format_types', 'type', array('type'));
db_add_index($ret, 'date_format_types', 'title', array('title'));
Indexes
EXPLAIN SELECT nt.type, nt.* from mydb.node_type nt ORDER BY nt.type ASC
Selects all values from the table (nt.*). Drupal by default already has an index on the type row for node_type. db_add_index() will not help; index already exists.
http://api.drupal.org/api/function/node_schema/6
EXPLAIN SELECT dft.type, dft.title, dft.locked from mydb.date_format_types dft ORDER BY dft.title
Selects all values from the table as well (only 3 rows). Drupal by default already has an index on the title row for date_format_types. db_add_index() will not help; index already exists.
EXPLAIN SELECT cid, category, selected from mydb.contact ORDER BY weight, category
Drupal by default already has an index on weight & category, so having an individual index on each may not do anything.
http://api.drupal.org/api/function/contact_schema/6
EXPLAIN SELECT MAX(totalcount) from mydb.node_counter
Forgot about this one...
http://drupal.org/node/624000
I'll add it to the code at the top. It would look like this
http://api.drupal.org/api/function/statistics_title_list/6
<?php
db_add_index($ret, 'node_counter', 'totalcount', array('totalcount'));
db_add_index($ret, 'node_counter', 'daycount', array('daycount'));
db_add_index($ret, 'node_counter', 'timestamp', array('timestamp'));
?>
your syntax is correct.
Hrmmmmm, I don't know about this...
@mikeytown I think you've got something far too generalized here. You're basically adding indexes to as many fields as possible which has it's own performance drawbacks (UDPATE/INSERT/DELETEs become much slower, and some other SELECTs as well). Many of the indexes that you are adding could only improve queries on building select boxes in the admin section - at the cost of slower queries for more general pages. You are also adding several indexes that already exist which will also slow things down. Also I'm guessing that in some situations what you really need are a single index on multiple fields. Remember that a database can only utilize one index per query.
Here's my standard method of query improvement:
@rjbrown99 The better place to put index creation statements is in the module that is running the query that doesn't utilize indexes. If module whiz_bang does some strange query on the comments table that isn't using an index, then whiz_bang should be implementing hook_schema_alter to add an index.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
MySQL 5.0 & index_merge
New in 5.0 is index_merge
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html
After playing around with the menu_links table I get an explain like so
id select_type table type possible_keys
1 SIMPLE ml index_merge router_path,updated,customized,external
key key_len ref rows Extra
updated,customized,external 2,2,2 NULL 69 Using union(updated,intersect(customized,external)); Using where
The key is
intersect(customized,external)
; This can be a single index. Updated top post with this multi-key index. Now the explain generates thisid select_type table type possible_keys
1 SIMPLE ml index_merge router_path,updated,customized,external,cust-ext
key key_len ref rows Extra
updated,cust-ext 2,4 NULL 75 Using union(updated,cust-ext); Using where
Which looks like you haven't
Which looks like you haven't gained very much (estimated rows increased, but a better merge algorithm). Did the query execute any faster? You may need to benchmark both scenarios.
What is the query that you are attempting to improve?
IIRC index_merge is only possible if all the necessary indexes are already in memory (so there has to be enough RAM available to MySQL to keep many indexes in RAM).
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Query is faster; but no index wins.
InnoDB:
0.0107 sec 3 indexes
0.0104 sec 2 indexes (2 combined)
0.0036 sec no indexes
MyISAM:
0.0219 sec 3 indexes
0.0216 sec 2 indexes (2 combined)
0.0036 sec no indexes
Query Cache:
0.0003 sec
Explain with no index selected.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ml ALL router_path NULL NULL NULL 1386 Using where
Key Buffer size is 32M so the extra keys are all in memory. Also server is tuned for InnoDB, so the MyISAM numbers could be better.
Table: menu_links - 1,386 rows.
Query: Was in the "not using index" log & it looked interesting.
Query is found in _menu_navigation_links_rebuild
http://drupal.org/node/593682
SELECT ml.link_path, ml.mlid, ml.router_path, ml.updated from menu_links ml WHERE ml.updated = 1 OR (router_path NOT IN ('node','rss.xml','admin','blog','batch','contact','forum','logout','user','user/login','system/files','filefield/progress','masquerade/autocomplete-user','user/timezone','admin_menu/flush-cache','admin_menu/toggle-modules','masquerade/autocomplete','taxonomy/autocomplete','blog/feed','content/js_add_more','admin/by-module','admin/by-task','admin/compact','filter/tips','node/add','comment/delete','comment/edit','masquerade/unswitch','user/register','user/password','views/ajax','content/press-releases','wysiwyg/%','user/autocomplete','admin/content','blog/%','admin/reports','admin/build','admin/settings','admin/user','node/%','user/%','node/%/view','user/%/view','system/files/imagecache','masquerade/autocomplete/multiple','js/path_redirect/autocomplete_404','admin/settings/actions','admin/user/rules','admin/reports/updates','admin/build/block','admin/content/comment','admin/build/contact','admin/content/feed','admin/settings/imageapi','admin/build/imagecache','admin/settings/logging','masquerade/switch/%','admin/settings/menu_breadcrumb','admin/settings/nice_menus','node/%/refresh','node/%/purge','admin/reports/status','taxonomy/term/%','admin/build/path','admin/settings/admin_menu','blog/%/feed','admin/settings/clean-urls','user/%/contact','admin/content/node','admin/content/types','admin/settings/date-time','node/%/delete','user/%/delete','node/%/edit','admin/settings/error-reporting','admin/settings/feedapi','admin/settings/file-system','admin/content/forum','admin/settings/image-toolkit','admin/settings/filters','admin/settings/masquerade','admin/build/menu','admin/build/modules','admin/settings/performance','admin/user/permissions','admin/content/node-settings','admin/content/rss-publishing','comment/reply/%','node/%/revisions','admin/user/roles','admin/settings/site-information','admin/settings/site-maintenance','admin/content/taxonomy','admin/build/themes','admin/build/path-redirect','admin/user/settings','admin/user/user','admin/build/views','admin/settings/wysiwyg','user/%/edit','admin/content/page_title','admin/settings/admin','node/add/blog','node/add/book','node/add/feed','node/add/forum','node/add/in-the-news','node/add/news','node/add/page','node/add/press-release','node/add/homepage-scrolling-news','node/add/story','admin/build/block/list','admin/content/forum/list','admin/build/imagecache/list','admin/content/node/overview','admin/content/types/list','admin/build/path/list','admin/settings/imageapi/list','admin/settings/filters/list','admin/build/path-redirect/list','admin/build/modules/list','admin/content/taxonomy/list','admin/user/rules/list','admin/user/user/list','admin/settings/wysiwyg/profile','admin/content/comment/new','user/%/edit/account','admin/content/feed/list','admin/build/themes/select','admin/build/menu/list','admin/build/themes/settings','admin/build/contact/list','sites/datasphere.com/files/imagecache','admin/settings/actions/manage','admin/settings/actions/orphan','admin/build/modules/uninstall','admin/build/path/add','admin/settings/filters/add','admin/user/rules/add','admin/user/user/create','admin/content/comment/approval','admin/user/rules/check','admin/settings/clean-urls/check','admin/settings/imageapi/config','admin/settings/actions/configure','admin/settings/date-time/lookup','admin/build/path/edit','admin/user/roles/edit','admin/user/rules/edit','admin/content/feed/export_opml','admin/content/types/fields','admin/reports/updates/list','admin/reports/updates/check','admin/reports/status/php','admin/content/node-settings/rebuild','admin/reports/status/run-cron','admin/reports/status/sql','admin/build/block/add','admin/build/contact/add','admin/content/types/add','admin/build/menu/add','admin/build/imagecache/add','admin/content/node-type/blog','admin/content/node-type/book','admin/build/block/configure','admin/build/path/delete','admin/build/path/delete_bulk','admin/build/block/delete','admin/settings/filters/delete','admin/user/rules/delete','admin/content/node-type/feed','admin/content/node-type/forum','admin/content/feed/import_opml','admin/content/node-type/in-the-news','admin/content/taxonomy/%','admin/content/node-type/news','admin/content/node-type/page','admin/content/node-type/press-release','admin/content/node-type/homepage-scrolling-news','admin/build/contact/settings','admin/build/menu/settings','admin/reports/updates/settings','admin/content/node-type/story','admin/build/views/add','admin/build/views/list','admin/build/path-redirect/add','admin/build/views1/convert','admin/build/path/pathauto','admin/settings/date-time/configure','admin/build/views1/delete','admin/build/path-redirect/export','admin/content/taxonomy/export','admin/settings/date-time/formats','admin/content/taxonomy/import','admin/content/forum/settings','admin/build/path-redirect/settings','admin/build/views/import','admin/build/views/tools','admin/settings/filters/%','admin/build/imagecache/%','admin/build/menu-customize/%','admin/content/node-type/blog/edit','admin/content/node-type/book/edit','admin/content/node-type/feed/edit','admin/content/node-type/forum/edit','admin/content/node-type/homepage-scrolling-news/edit','admin/content/node-type/in-the-news/edit','admin/content/node-type/news/edit','admin/content/node-type/page/edit','admin/content/node-type/press-release/edit','admin/content/node-type/story/edit','admin/build/themes/settings/global','admin/content/taxonomy/%/list','admin/settings/filters/%/edit','admin/build/modules/list/confirm','admin/build/menu-customize/%/list','admin/build/modules/uninstall/confirm','admin/settings/date-time/formats/lookup','admin/build/themes/settings/bluemarine','admin/build/themes/settings/bulletin_board','admin/build/themes/settings/bulletin_board_v2','admin/build/themes/settings/chameleon','admin/build/themes/settings/corkboard','admin/build/themes/settings/datasphere','admin/content/node-type/blog/delete','admin/content/node-type/book/delete','admin/content/node-type/feed/delete','admin/content/node-type/homepage-scrolling-news/delete','admin/content/node-type/in-the-news/delete','admin/content/node-type/news/delete','admin/content/node-type/page/delete','admin/content/node-type/press-release/delete','admin/content/node-type/story/delete','admin/content/node-type/forum/delete','admin/content/taxonomy/edit/term','admin/build/themes/settings/garland','admin/build/themes/settings/hyperlocal_base','admin/build/block/list/js','admin/build/themes/settings/local_newspaper','admin/build/themes/settings/marvin','admin/build/themes/settings/pushbutton','admin/build/themes/settings/zen','admin/build/themes/settings/zen_classic','admin/build/themes/settings/zen_datasphere','admin/build/themes/settings/STARTERKIT','admin/build/themes/settings/hyperlocal','admin/content/forum/edit/%','admin/build/themes/settings/minnelli','admin/settings/imageapi/config/imageapi_gd','admin/build/menu-customize/%/add','admin/build/block/list/bluemarine','admin/build/block/list/bulletin_board','admin/build/block/list/bulletin_board_v2','admin/build/block/list/chameleon','admin/settings/filters/%/configure','admin/build/block/list/corkboard','admin/settings/date-time/formats/custom','admin/build/block/list/datasphere','admin/settings/actions/delete/%','admin/build/contact/delete/%','admin/build/menu-customize/%/delete','admin/content/node-type/blog/display','admin/content/node-type/book/display','admin/views/ajax/autocomplete/tag','admin/content/node-type/feed/display','admin/content/node-type/forum/display','admin/content/node-type/homepage-scrolling-news/display','admin/content/node-type/in-the-news/display','admin/content/node-type/news/display','admin/content/node-type/page/display','admin/content/node-type/press-release/display','admin/content/node-type/story/display','admin/build/contact/edit/%','admin/build/menu-customize/%/edit','admin/build/block/list/garland','admin/build/block/list/local_newspaper','admin/views/ajax/autocomplete/user','admin/build/block/list/hyperlocal_base','admin/content/node-type/blog/fields','admin/content/node-type/book/fields','admin/content/node-type/feed/fields','admin/content/node-type/forum/fields','admin/content/node-type/homepage-scrolling-news/fields','admin/content/node-type/in-the-news/fields','admin/content/node-type/news/fields','admin/content/node-type/page/fields','admin/content/node-type/press-release/fields','admin/content/node-type/story/fields','admin/build/block/list/marvin','admin/build/block/list/minnelli','admin/build/block/list/pushbutton','admin/settings/filters/%/order','user/reset/%/%/%','admin/build/block/list/zen','admin/build/block/list/zen_classic','admin/build/block/list/zen_datasphere','filefield/ahah/%/%/%','admin/build/block/list/STARTERKIT','admin/build/block/list/hyperlocal','admin/build/views/export/%','admin/build/views/clone/%','admin/build/views/disable/%','admin/build/views/enable/%','admin/content/forum/add/container','admin/settings/date-time/formats/add','admin/content/forum/add/forum','admin/content/taxonomy/add/vocabulary','admin/build/views/break-lock/%','admin/build/views/tools/basic','admin/settings/date-time/formats/configure','admin/build/views/delete/%','admin/build/views/edit/%','admin/build/views/tools/convert','admin/settings/date-time/delete/%','admin/build/path-redirect/delete/%','admin/build/path-redirect/edit/%','node/%/revisions/%/view','admin/build/imagecache/%/%','node/%/revisions/%/delete','node/%/revisions/%/revert','admin/build/imagecache/%/delete','admin/build/imagecache/%/export','admin/build/imagecache/%/flush','admin/build/imagecache/%/override','admin/content/node-type/blog/display/basic','admin/content/node-type/book/display/basic','admin/content/node-type/feed/display/basic','admin/content/node-type/forum/display/basic','admin/content/node-type/homepage-scrolling-news/display/basic','admin/content/node-type/in-the-news/display/basic','admin/content/node-type/news/display/basic','admin/content/node-type/page/display/basic','admin/content/node-type/press-release/display/basic','admin/content/node-type/story/display/basic','admin/content/node-type/blog/display/rss','admin/content/node-type/book/display/rss','admin/content/node-type/feed/display/rss','admin/content/node-type/forum/display/rss','admin/content/node-type/homepage-scrolling-news/display/rss','admin/content/node-type/in-the-news/display/rss','admin/content/node-type/news/display/rss','admin/content/node-type/page/display/rss','admin/content/node-type/press-release/display/rss','admin/content/node-type/story/display/rss','admin/content/node-type/blog/display/token','admin/content/node-type/book/display/token','admin/content/node-type/feed/display/token','admin/content/node-type/forum/display/token','admin/content/node-type/homepage-scrolling-news/display/token','admin/content/node-type/in-the-news/display/token','admin/content/node-type/news/display/token','admin/content/node-type/page/display/token','admin/content/node-type/press-release/display/token','admin/content/node-type/story/display/token','admin/build/menu/item/%/delete','admin/content/forum/edit/container/%','admin/content/forum/edit/forum/%','admin/build/menu/item/%/edit','admin/content/taxonomy/edit/vocabulary/%','admin/build/menu/item/%/reset','admin/build/views/%/add-display/%','admin/build/views/%/%/%','admin/build/views/%/analyze/%','admin/build/views/%/details/%','admin/settings/date-time/formats/delete/%','admin/build/views/%/preview/%','admin/content/taxonomy/%/add/term','admin/build/imagecache/%/add/%','admin/build/imagecache/%/%/delete','admin/settings/wysiwyg/profile/%/edit','admin/settings/wysiwyg/profile/%/delete') AND external = 0 AND customized = 1).
I wonder how well the MySQL 5.1 optimizer works on something like this. Server is using CentOS with MySQL 5.0.45.
Generalize
Thanks. I'm only working with a test DB at the moment so no worries with me.
What I understood as mikeytown2's goal was to provide for some generalized indexes where they might not otherwise exist. Perhaps the initial approach goes too far in terms of creating some specific indexes? If so, is there a consensus as to what indexes might be appropriate that would not otherwise exist without creating them?
If so, is there a consensus
Any place that you have a View that has filters or arguments on non-primary-key, MySQL may require an additional index. But only if those queries are showing up in your slow query log. Beyond the default indexes I don't think you'll find a consensus as to what additional indexes to add (and if you do find a consensus then it should be in the appropriate module).
You are looking for a magic bullet to improve your SQL performance, but there isn't one. You need to analyze your slow query log, try to improve queries, and weigh the benefits/costs of your optimizations for overall performance.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
CCK & Views
The vast majority of use cases for CCK & Views IMHO is so you can create a view targeting that CCK field. Filefield is different, but then again the fid need to be joined to the files table so it's back in the slot for being indexed. The list field is a maybe, but if you want to create a view that only has files then this is needed. Creating an index for every CCK field is overkill, but IMHO the trade-off is worth it. Most people want faster page views & the penalty for inserts with the extra index is quite small.
The rest of the indexes are for core & 99% of optimizations for core don't get accepted. Granted I could do a better job at creating the index; using multiple indexes would be the ideal. The menu_links table looks like it could use multiple columns per index.
I wanted to get the talk about database indexes going. I know for a fact that indexes on CCK, if your using views in any creative way, will help a lot. Creating something that goes through every view looking for filters and applying an index to that field seems like it would be a more targeted approach to this.
FYI, we discussed
FYI, we discussed auto-indexes for cck and views at http://groups.drupal.org/node/7614
Measurement
This is a good discussion, as I can tell you I personally know a heck of a lot more about the OS, webserver, and PHP than I do about databases. Scaling and tuning a database for Drupal is an interesting topic. I have been trying to cobble together some thoughts on storage engines as well based on reading lots of sources, but that's probably best left for a new thread.
For purposes of this discussion about indexes, how about starting at an appropriate threshold for measurement? The default value of long_query_time is 10, and mikeytown2 suggested in the other thread of setting it to 1. Let's say I am aiming for a load time of all pages under 5 seconds. Assuming that's the case and accounting for network transit of data to a client browser, it seems like it would be reasonable to set the long_query_time to anywhere from 1 to 3 seconds.
Can we also assume that for purposes of this discussion that we are not taking into consideration any external caching solutions such as memcached and are focused purely on in-database tuning?
References:
http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
http://groups.drupal.org/node/13844
I set long_query_time to 1.
I set long_query_time to 1. You can go even lower if you are using the Percona flavour of MySQL.
If you want pages to load in <5 s then you probably need the page to be generated in about 0.5-0.75 s (most of page load time is CSS, JS, Images, Flash, etc.). SQL time can be anywhere from 10-40% of page generation time. There's often 100-300 queries per page, which means that any query that takes longer than ~10ms is possibly a problem.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Perhaps a Database Noob Request...
This might be considered a database Noob request but I think it would be helpful also outline at a basic level how to check on the results of the script above, in effect a turn key solution. I'm guessing many will simply copy and paste, close their eyes and hope for the best thinking this might make things a little better in Drupal Land. Perhaps some explanation of how to check on what indexes have been applied via phpMyAdmin or the mySQL command line might be in order.
I'm curious because my quick poke at phpMyAdmin didn't provide a clear indication that the indexes had been applied correctly but PHP warnings at additional previews/submits of the script above suggested the indexes are in place.
phpMyAdmin
Go to the table
click on the structure tab
click on Details at the bottom
Thanks!
Thanks! Painfully obvious now that you point it out. A little less mySQL mystery in my life is always a good thing.
Finding "Useless" Indexes
Run this SQL
http://arjen-lentz.livejournal.com/122399.html
SELECT s.table_name,
concat(s.index_name,'(',group_concat(s.column_name order by s.seq_in_index),')') as idx,
GROUP_CONCAT(s.cardinality ORDER BY s.seq_in_index) AS card,
t.table_rows
FROM information_schema.tables t
JOIN information_schema.statistics s USING (table_schema,table_name)
WHERE t.table_schema='drupal'
AND t.table_rows > 1000
AND s.non_unique
GROUP BY s.table_name,s.index_name
HAVING (card + 0) < (t.table_rows / 3)
Another way
http://forge.mysql.com/tools/tool.php?id=85
SELECT
t.TABLE_SCHEMA AS
db
, t.TABLE_NAME AS
table
, s.INDEX_NAME AS
inde name
, s.COLUMN_NAME AS
field name
, s.SEQ_IN_INDEX
seq in index
, s2.max_columns AS
# cols
, s.CARDINALITY AS
card
, t.TABLE_ROWS AS
est rows
, ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS
sel %
FROM INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t
ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
SELECT
TABLE_SCHEMA
, TABLE_NAME
, INDEX_NAME
, MAX(SEQ_IN_INDEX) AS max_columns
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA != 'mysql'
GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS s2
ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
AND s.TABLE_NAME = s2.TABLE_NAME
AND s.INDEX_NAME = s2.INDEX_NAME
WHERE t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB /
AND t.TABLE_ROWS > 10 / Only tables with some rows /
AND s.CARDINALITY IS NOT NULL / Need at least one non-NULL value in the field /
AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 / Selectivity < 1.0 b/c unique indexes are perfect anyway /
ORDER BY
sel %
, s.TABLE_SCHEMA, s.TABLE_NAME / Switch tosel %
DESC for best non-unique indexes */As mentioned in the comments,
As mentioned in the comments, there are numerous reasons why comparing cardinality to the total number of rows is not a great way to find unused indexes (though it may be one of the only ways). If you simply remove all the indexes suggested you're bound to get a huge decrease in MySQL performance. But the indexes may be worth looking into for tables that get a lot of writes.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Dropping unused indexes
Looks interesting
http://www.percona.com/docs/wiki/patches:userstatv2#index_statistics
http://www.mysqlperformanceblog.com/2008/09/12/unused-indexes-by-single-...
http://www.mysqlperformanceblog.com/2009/01/15/dropping-unused-indexes/
According to comments in the blog, InnoDB is better at inserts with multiple indexes then MyISAM is.
Worth a module?
A very crude summary -- while mikeytown2's initial script might have disadvantages, it seems it might come useful for sites with lots of CCK tables used in Views. As dalin pointed out, a disadvantage may be that some indexes will be doubled or not used or should better be on multiple columns. And mikeytown2 shows there are some approaches to weeding out unused indexes and maybe also for detecting where multiple indexes are needed.
And Moshe pointed out there was a similar discussion before - but its steam unfortunately petered out in 2008.
It seems to me that this could merit a new module. Of course, it won't be perfect, but it may provide a focus where further efforts on automatic adding of indexes can be fine-tuned. DB Indexer, or DBoost if you like :-)
Or, as rjbrown99 said, it could be something to add to the DB Maintenance module -- though that module seems to be stuck in terms of maintenance.
If for no other reason, it would be good to package this because a script like this needs to be run regularly as the site grows and CCK and other not indexed tables are added to the site.
---
Tomáš J. Fülöpp
http://twitter.com/vacilandois
Views & CCK Tuning Code: Laser Strategy
This loads every view, gets all filters and relationships, and creates an index if one doesn't exist. If you have ran the code at the top then this will do nothing. This is a vastly more targeted approach as it only adds indexes where there is a view filter or view relationship. Let me know how this works for you & if this can be improved.
<?php
/**
* Add an index if it doesn't exist
*
* @param $ret
* Array to which query results will be added.
* @param $table
* The table to be altered.
* @param $index
* The name of the index.
*/
function autotune_index_exists(&$ret, $table, $index) {
global $db_type;
$column_exists = FALSE;
if (db_table_exists($table)) {
if (stristr($db_type, 'pgsql')) {
// Selecting a db schema table, don't put pg_indexes inside {}
//$result = db_query("SELECT * FROM pg_indexes WHERE tablename = '{%s}'", $table);
//while ($name = db_fetch_array($result)) {
//if (stristr($name['indexname'], $index)) {
//return TRUE;
//}
//}
}
else {
$result = db_query('SHOW COLUMNS FROM {%s}', $table);
while ($column = db_fetch_array($result)) {
if ($column['Field'] == $index && $column['Key'] != '') {
return TRUE;
}
elseif ($column['Field'] == $index) {
$column_exists = TRUE;
break;
}
}
}
// Index doesn't exists create it
if ($column_exists) {
db_add_index($ret, $table, $index, array($index));
}
else {
Return FALSE;
}
}
return FALSE;
}
// Get all content_type tables
$cck_content_types = array();
$result = db_query("SHOW TABLES LIKE 'content_type_%'");
while ($table = db_result($result)) {
$cck_content_types[] = $table;
}
// Get all filters and relationships from views
$results = db_query("
SELECT id, name
FROM views_display
INNER JOIN views_view
USING ( vid )"
);
$indexes = array();
while($row = db_fetch_array($results)) {
// Load View
$view = views_get_view($row['name']);
$filters = $view->get_items('filter', $row['id']);
$relationships = $view->get_items('relationship', $row['id']);
// Build data structure
foreach ($filters as $name => $filter) {
if (!empty($filters['type']['value'])) {
foreach($filters['type']['value'] as $key => $value)
$indexes[$filter['table']][$filter['id']]['node-type'][$key] = $value;
}
else {
$indexes[$filter['table']][$filter['id']]['node-type']['-1'] = 'all';
}
$indexes[$filter['table']][$filter['id']]['relationship'] = $filter['relationship'];
}
foreach ($relationships as $name => $relationship) {
$indexes[$relationship['table']][$relationship['id']] = $relationship['relationship'];
}
}
// Add indexes to tables if none exist.
$ret = array();
foreach ($indexes as $table => $values) {
foreach ($values as $field => $extra) {
$cck_table = str_replace('node_data', 'content', $table);
// Not a cck field
if (db_table_exists($table)) {
autotune_index_exists($ret, $table, $field);
if ($extra['relationship'] != 'none') {
autotune_index_exists($ret, $table, $extra['relationship']);
}
}
// CCK field with own table
elseif (db_table_exists($cck_table)) {
autotune_index_exists($ret, $cck_table, $field);
if ($extra['relationship'] != 'none') {
autotune_index_exists($ret, $table, $extra['relationship']);
}
}
// CCK field in content type table
else {
foreach ($cck_content_types as $cck_content_table) {
// Place index on all content_type tables
if (!empty($extra['node-type']['-1']) && $extra['node-type']['-1'] == 'all') {
autotune_index_exists($ret, $cck_content_table, $field);
if ($extra['relationship'] != 'none') {
autotune_index_exists($ret, $table, $extra['relationship']);
}
}
// Place index only where needed
else {
foreach ($extra['node-type'] as $nodetype) {
autotune_index_exists($ret, 'content_type_' . $nodetype, $field);
if ($extra['relationship'] != 'none') {
autotune_index_exists($ret, $table, $extra['relationship']);
}
}
}
}
}
}
}
// Output some data if not running from update.php
//echo str_replace(' ', ' ', nl2br(htmlentities(print_r($ret, TRUE))));
//return $ret;
?>
Very instructional...
This is fantastic mikeytown2, as vacilando asks, is this worthy of packaging? With my limited understanding I'm tempted to say "yes". Purely for personal educational reasons I'm seriously considering running with your fantastic work and creating a "verbose" module that attempts to encapsulate some of the discussion in this thread. Ideally options could be provided with pro/con explanations allowing the administrators to make a semi informed choice on how to tune their database.
To echo vacilando's question, would this be worthy of a module?
Module
I put out the request to the DB Maintenance module, since this code could fit in there http://drupal.org/node/755132
If I don't hear anything in a week, then I'll create a module called autotune; unless there are some better ideas on what to call it?
Fantastic!
If you're interested in help on the documentation side consider my hat in the ring...
Autotune? Index Boost?
@mikeytown -- your latest code seems even more useful, but I see that your proposal to DB Maintenance was turned down, unfortunately. So will we see a new module soon? Other than "autotune" it could be called "db tuner" (to bring "database" into the name), or even, tongue-in-cheek, "index boost"! Looking forward!
---
Tomáš J. Fülöpp
http://twitter.com/vacilandois
yeah
yeah, cuz "autotune" always makes me think of talentless bimbos with singing careers.
Just to add some fuel here,
Just to add some fuel here, there are lots of routine queries that filter or sort from 2 different tables and thus unconditionally require a tmp table to resolve in mysql. this is a bad scenario, but very common. the materialized views module is our best attempt at solving that problem. it runs today on drupal.org and other sites. it needs more caring in order to become more plug and play.
More
Thanks for that last post. I found the original articles and threads about Materialized Views here. This is a shortcut to what-will-this-do-for-me.
http://groups.drupal.org/node/17644
http://fourkitchens.com/blog/2009/02/23/real-results-materialized-views
http://fourkitchens.com/blog/2009/02/17/developer-preview-materialized-v...
http://openlibrary.fkbuild.com/blog/2009/06/21/improvements-materialized...
dbtuner module
K so I'm going to go with the name dbtuner. I'm going to include code to change the DB's collation as well. Thoughts?
http://drupal.org/node/756364
http://www.phpwact.org/php/i18n/utf-8/mysql
http://stackoverflow.com/questions/367711/what-is-the-best-collation-to-...
http://stackoverflow.com/questions/637148/how-does-mysql-use-collations-...
Edit: don't want to conflict with all the db_* functions out there already, so name will be dbtuner.
What are your thoughts in
What are your thoughts in changing collation. Google doesn't find me any benchmarks, but I'm guessing any gains would be minimal.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
It's Up
Let me know how it works!
http://drupal.org/project/dbtuner
Dev will be available in 3 hours
Thank you!
Thank you for this.
No sign...
There appears to be no sign of dev 24 hours after your last post. Any idea when it will be made available?
It was on the "View all
It was on the "View all releases" tab. To make it easier to find the release I've edited the releases for the project so it shows "snapshot" releases. I hope that's ok, mikeytown!
knaddison blog | Morris Animal Foundation
Thanks!
Thanks for doing that! I remember doing it for http://drupal.org/project/location_display and I then checked on dbtuner and it was already done. Now I know why :)
I think this weekend was a record for me; 2 new modules in under 24hrs.
awesome
awesome
I concur, Awesome!
mikeytown2 you are a legend!
Exactly what I've been searching for all morning!!