Really slow query time and high load on CPU

Media Crumb's picture

Hey guys,

First time here but always reading and using your tips. I run a decent size site over at http://www.2old2play.com/

Page loads are kinda all over the place. I've got memcached, Varnish and APC all rolling and it's sped things up tremendously. However I still have issues with a delay to first byte and insanely long submission for new posts (up to 44 sec).

I've started using newrelic which is AMAZING, but I still don't know how to go about fixing the issue. This is what I've got so far....

node - SELECT
Action /views_page
Max time
12,894 ms

SELECT DISTINCT node.nid AS nid, node.type AS node_type, node.title AS node_title, users.name AS users_name, users.uid AS users_uid, node_comment_statistics.comment_count AS node_comment_statistics_comment_count, og_ancestry.nid AS og_ancestry_nid, node_comment_statistics.last_comment_timestamp AS node_comment_statistics_last_comment_timestamp FROM node node INNER JOIN users users ON node.uid = users.uid INNER JOIN node_comment_statistics node_comment_statistics ON node.nid = node_comment_statistics.nid LEFT JOIN og_ancestry og_ancestry ON node.nid = og_ancestry.nid INNER JOIN node_access na ON na.nid = node.nid WHERE (na.grant_view >= ? AND ((na.gid = ? AND na.realm = '?') OR (na.gid = ? AND na.realm = '?') OR (na.gid = ? AND na.realm = '?'))) AND ( node.status = ? )ORDER BY node_type DESC LIMIT ?, ?

12 seconds for just this one SQL is pretty dismal. Average load times are about 10sec and if someone is submitting content it can hang to almost 40+. It's a dedicated server with only a few tiny WP sites, but for the most part 2o2p is using all the resources.

I've also got:

UPDATE users SET access = ? WHERE uid = ?

running at almost 9 sec.

Any ideas?

Comments

Fun with layers

Eidolon Night's picture

I'd check to see what your apc and memcache usage is. Fragmenting in APC can really slow things down, and the reason you see times jump around is likely because Varnish tends to mask issues. When I'm trying to work on setups like this (my preferred setup is the same) I usually make Varnish super aggressive and then just bypass it by hitting the apache port directly.

Not sure which Drupal version you're on, but check out http://drupal.org/project/dbtuner/ or just use the script directly on your server. Oh, and make sure you're using innodb. Community sites especially benefit from that.

WebOzy: the web services company
liveBatavia: connecting local people to local business

I've got a great hit rate

Media Crumb's picture

I've got a great hit rate with APC (or so it seems) and memcached looks to be running fine. I"m currently on PressFlow D6. I'll see if DBtuner can help.

As for innodb are the improvements that great? Is it easy to change over my esisting db? Thanks so much for the help

http://i.minus.com/iuFc661sfEUff.jpg <-- link to apc

http://i.minus.com/iXRt9Ak9LbYta.jpg <-- link to mem

Awesome

Eidolon Night's picture

APC and memcache look great. I like the site. I have actually worked on a few gaming-related sites so I am familiar with the performance woes. You're using Advanced Forum and seem to have regular posters so, yes, innodb will a nice performance boost. I know there is a module that'll switch over all your tables... or at least I thought there was. It might actually be bundled into the db tuner module. You can also check out http://drupal.org/project/db_maintenance

When you check the performance times, are you using a regular user or your admin account? Admin users bypass certain caches and I've seen admin menu do strange things to load times. So when you're running tests, make sure you use a regular authenticated user.

If you have devel installed, can you post the page execution, query number, and query time for a normal user? That's usually where I get started.

On a side note for later, definitely check out https://www.cloudflare.com I use them for all my clients, but they especially shine on image-heavy gaming sites, photography sites, etc. Saves your server bandwidth and keeps it from having to process every single request.

WebOzy: the web services company
liveBatavia: connecting local people to local business

You could start by running an

pdrake's picture

You could start by running an EXPLAIN on the first query to get an idea why it is running so slowly. That said, from looking at these, it looks like maybe you have a configuration problem in MySQL or on your server. Are sessions currently being stored in memcached or MySQL? In addition to newrelic, tracelytics offers a free trial and with the tracelytics drupal module, you can get some good insights into performance.

I believe session are on SQL.

Media Crumb's picture

I believe session are on SQL. I have to admin I'm fairly noobish to all this so its likely ive run into some config issues. However I have followed the examples of a lot of great tutorials and the settings seem accurate.

D6 or D7?

mikeytown2's picture

Drupal 6 or Drupal 7 site?

If a Drupal 6 site:
change tables from MyISAM to InnoDB.
Look at what http://drupal.org/project/dbtuner says

If a Drupal 7 site OR tables are InnoDB:
Make sure innodb_buffer_pool_size is a reasonable size (not the default).

Run and be sure to put the values back for where you have question marks.

EXPLAIN SELECT DISTINCT
  node.nid AS nid,
  node.type AS node_type,
  node.title AS node_title,
  users.name AS users_name,
  users.uid AS users_uid,
  node_comment_statistics.comment_count AS node_comment_statistics_comment_count,
  og_ancestry.nid AS og_ancestry_nid,
  node_comment_statistics.last_comment_timestamp AS node_comment_statistics_last_comment_timestamp
FROM node node
INNER JOIN users users ON node.uid = users.uid
INNER JOIN node_comment_statistics node_comment_statistics ON node.nid = node_comment_statistics.nid
LEFT JOIN og_ancestry og_ancestry ON node.nid = og_ancestry.nid
INNER JOIN node_access na ON na.nid = node.nid
WHERE (
  na.grant_view >= ? AND (
    (na.gid = ? AND na.realm = '?')
    OR (na.gid = ? AND na.realm = '?')
    OR (na.gid = ? AND na.realm = '?')
  )
)
AND ( node.status = ? )
ORDER BY node_type DESC
LIMIT ?, ?

He mentioned "I'm currently

Eidolon Night's picture

He mentioned "I'm currently on PressFlow D6."

Thanks for clearing up the dbtuner stuff. I couldn't remember for sure if dbtuner did that or a different module. I know that Presfflow installs with innodb by default (even in D6), but not sure if he started with Presfflow or upgraded to it.

WebOzy: the web services company
liveBatavia: connecting local people to local business

Using Drupal 6 with

Media Crumb's picture

Using Drupal 6 with Pressflow.

Im installing DBtuner now.

When you say, "be sure to put the values back for where you have question marks" I"m not sure what you mean. Should I run this query as is?

Question marks

mikeytown2's picture

In your original post you posted a DB Query and replaced values with ?. Undo that step and you should be able to run explain on the query then.

Oddly enough this was the

Media Crumb's picture

Oddly enough this was the query i was given. I didn't replace it at all. I've been running New Relic to track all of this.

http://i.minus.com/ibuhFV7XStscwp.jpg

Above is the offending query and time it took to run it. Maybe I'm not getting all the information from Relic.

ah ok

mikeytown2's picture

well from what I see there, this query is coming from the views module so db tuner can be used to run explain on it.

Go to admin/settings/dbtuner/engine
What does "Engines in use: " say?

On to db tuner, get it installed and go to admin/settings/dbtuner/collation
What does "Collations in use: " say?

Go to admin/settings/dbtuner
Are there any drupal messages on this page? If so what do they say?

Go to admin/settings/dbtuner/mysqltuner
Paste output in code block.

1) Engines in use:

Media Crumb's picture

1) Engines in use: MyISAM

2)Collations in use: utf8_general_ci, utf8_unicode_ci, latin1_swedish_ci

3) Fatal error: Call to a member function get_option() on a non-object in /home/public_html/sites/all/modules/views/includes/view.inc on line 1995

Warning: Division by zero in eval() (line 1 of /home/public_html/sites/all/modules/dbtuner/dbtuner.mysqltuner.inc(84) : eval()'d code).

Queries
Uptime in seconds: 231306
Uptime: 2d 16h 15m 6s
Questions: 72766719
% slow queries: 0.0069454828655941
slow query rate: 0 per day
Long query time: 5
Slow query logging: ON
% reads: 88.166912345307
% writes: 11.833087654693
qps: 314.5907110062
reads per sec: 0.32933089615637 per day
writes per sec: 0.044200270350334 per day
Queries: 314.5907110062 per second
Connections: 464 Thousand
Bytes sent: 1 Billion
Bytes received: 1 Billion


versions
Supported Version: 5
Release Series: 5.0
version less then 5.1, upgrade!
    (substr("version",0,3) ne "5.1")
    (substr("5.0.92-community-log",0,3) '5.0' !== "5.1")
Minor Version: 92
Distribution: MySQL Community Edition (GPL)
Distribution: MySQL Community Edition (GPL)

MySQL Architecture: i686
MySQL is not compiled as a 64-bit package.
    ("version_compile_machine" !~ /64/)
    ("i686" dbtuner_strnistr('i686 ', array('64')))


Query cache
Query cache efficiency (%): 87.124004155192
% query cache used: 71.276748180389
The query cache is not being fully utilized.
    (Qcache_free_memory / query_cache_size * 100 <80)
    (95666032 / 134217728 * 100 71.276748180389<80)
Query cache low memory prunes: 0 per day
Query cache size: 128.0 Mb
Query cache min result size: 1.0 Mb
The max size of the result set in the query cache is the default of 1 Mb. Changing this (usually by increasing) may increase efficiency.
    (&hr_bytes(query_cache_limit) eq "1.0 Mb")
    (dbtuner_hr_bytes(1048576) '1.0 Mb' === "1.0 Mb")


Sorts
Total sorts: 1454806
% sorts that cause temporary tables: 0.31158793681082
rate of sorts that cause temporary tables: 0 per day
sort_buffer_size: 4.0 Mb
read_rnd_buffer_size: 8.0 Mb
Sort rows: 0 per day

Joins,scans
rate of joins without indexes: 0 per day
rate of reading first index entry: 0 per day
rate of reading fixed position: 0 per day
rate of reading next table row: 0 per day

temp tables
tmp_table_size-max_heap_table_size: 16777216
tmp_table_size and max_heap_table_size are not the same.
    (tmp_table_size-max_heap_table_size !=0)
    (33554432-16777216 16777216!=0)
tmp_table_size: 32.0 Mb
max_heap_table_size: 16.0 Mb
% temp disk tables: 32.808218867091
Too many temporary tables are being written to disk. Increase max_heap_table_size and tmp_table_size.
    (Created_tmp_disk_tables / (Created_tmp_tables + Created_tmp_disk_tables) * 100 >25)
    (650315 / (1331856 + 650315) * 100 32.808218867091>25)
temp disk rate: 0 per day
temp table rate: 0 per day


MyISAM index cache
MyISAM key buffer size: 384.0 Mb
max % MyISAM key buffer ever used: 32.198079427083
MyISAM key buffer (index cache) % used is low. You may need to decrease the size of key_buffer_size, re-examine your tables to see if indexes have been removed, or examine queries and expectations about what indexes are being used.
    ((Key_blocks_used)*key_cache_block_size/key_buffer_size * 100 <95)
    ((126608)*1024/402653184 * 100 32.198079427083<95)
% MyISAM key buffer used: 43.052164713542
MyISAM key buffer (index cache) % used is low. You may need to decrease the size of key_buffer_size, re-examine your tables to see if indexes have been removed, or examine queries and expectations about what indexes are being used.
    ((1-Key_blocks_unused*key_cache_block_size/key_buffer_size) * 100 <95)
    ((1-223928*1024/402653184) * 100 43.052164713542<95)
% index reads from memory: 99.998968051646


other caches
table open cache size (5.1+): table_open_cache
Size of the table cache
    (table_open_cache >-1)
    (table_open_cache table_open_cache>-1)
rate of table open: 0 per day
% open files: 27.432895825943
rate of open files: 0 per day
Immediate table locks %: 99.939885425499
Table lock wait rate: 0 per day
thread cache: 512
Total threads created: 25
thread cache hit rate %: 5.3772113781793E-5
Threads that are slow to launch: 0
Slow launch time: 2


Connections
% connections used: 9.8039215686275
Max connections used: 25
Max connections limit: 255
% aborted connections: 0.21143195139001
rate of aborted connections: 0 per day
% aborted clients: 0.28907888369092
rate of aborted clients: 0 per day


InnoDB
Is InnoDB enabled?: YES
% innoDB log size: 62.5
InnoDB log file size is not an appropriate size, in relation to the InnoDB buffer pool. Consider changing either\ninnodb_log_file_size or innodb_buffer_pool_size
    (innodb_log_file_size / innodb_buffer_pool_size * 100 >=0)
    (5242880 / 8388608 * 100 62.5>=0)


other
MyISAM concurrent inserts: 1

INSERT DELAYED USAGE

Delayed_errors 0

Delayed_insert_threads 0

Delayed_writes 0

Not_flushed_delayed_rows

Recommendations

mikeytown2's picture

Use MariaDB 5.5 or MySQL 5.6 if you're feel adventurous. We are running 5.6 in production on a small set of our database servers; once 5.6 is "stable" we will be transitioning all db servers over to 5.6.

Once you've made the switch to a newer version of MySQL, look into changing from MyISAM to InnoDB. First step is to set your innodb_buffer_pool_size to a correct value. See this for more ideas on correct values http://groups.drupal.org/node/232448. Do not switch db engines until you've increased the innodb_buffer_pool_size.

Collations. Looks like some of your tables are latin1_swedish_ci while others are utf8_*_ci. Recommend changing all to utf8_general_ci. Be sure to backup the data before doing this. This can be done today, just be sure to do it while your site is in maintenance mode (AKA a time when it's not being used a lot). This will help speed up joins if the tables are two different Collations.

Good luck!

New relic masks values in

tsphethean's picture

New relic masks values in queries to protect customer data. its a while since i looked but i think there's an option to turn this masking off which might be worth doing briefly to help troubleshooting. beware that this will mean any sensitive data used in queries will be sent to NR if you do this.

Other option is to check your mysql slow query log which should contain this and any other slow queries in full for you to then run through explain.

Where does this log reside?

Media Crumb's picture

Where does this log reside? Thanks

The link below should help

tsphethean's picture

The link below should help you find it and gives an overview of options for configuring it if you want more detail.

http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html

default is that it gets created in your data directory but depends on your conf.

Database optimization...

jdonson's picture
  1. Innodb all tabs!

  2. Distribute physical IO (ask if you want more info - this is juicy)

  3. use slow query log to find out what is up with both time pigs and AGGREGATE time pigs

  4. Use MySQL 5.1++

  5. Have dedicated database server(s [master and slave).

  6. Consider how to use a slave.

  7. Set up for min dump and restore times. (ask or you will miss some goodies)

  8. Consider benchmarking and load testing systems on dev host! Then you can talk supply and demand with REASONABLE expectations!

  9. Then optimize queries and database structure.

  10. Keep old data in _hist tables

  11. etcetcetcetc

Been doing this for 20 years. MySQL is RAD! Learn to use it, or you will LOSE data/sleep/sanity!

Jeremy Donson
Database and Systems Engineer
New York City

I'd love to, but to be quite

Media Crumb's picture

I'd love to, but to be quite honest there just isn't enough time in the day. :(

As a person with a front end background its hard enough to deal with UI choices while worrying about learning PHP, SQL, APC, Varnish, Mem, Apache configurations etc. Lucky for me I love to learn, although sleep would be good at some point too!

The other issue is money, community site generally don't make millions and so just rented the server alone puts a strain on our budget. The site survives off of members really.

Don't worry. While jdonson

Eidolon Night's picture

Don't worry. While jdonson has great suggestions, they're not necessary for your site. That's the sort of stuff larger corporate clients need. My typical setup for sites such as yours is:

  • linode for hosting
  • LAMP stack (optimized)
  • APC (just for opcode caching)
  • Memcache
  • Varnish
  • Cloudflare

I've found it get's me the most bang for my buck on sites like this. Once you have things set up, you usually don't have to touch them until your traffic goes crazy in direction or the other.

WebOzy: the web services company
liveBatavia: connecting local people to local business

Running off a single server,

Jamie Holly's picture

Running off a single server, instead of using Memcache and APC, consider using just APC. APC also has a user cache that is much faster.

You also need to consider the amount of RAM on the server. If you got a lot of big joins and sorts and they don't fit in RAM, then the server has to use the filesystem. This is a big performance killer. RAM is cheap anymore, so check the server stats (you can do this from TOP) and see if the server is having to do a lot of swap. Since you're running everything on a single server, you need to budget out how much memory can be used by Apache and MySQL.

The best and easiest thing to do on troubling queries is to do a dump of the database and import it locally. After that run the queries through PHPMyAdmin locally (setup a local environment using something like XAMPP. It's really easy to do). Then you can play with the queries locally and try adding new indexes to see if you can get it to improve. Once you figure out a fix, move it over to your live server.

Not a lot of people like load testing and that (I love it since most of my clients are high profile, high traffic and highly complex sites), but it is a must. Drupal is an extremely versatile platform and different combinations of modules and features can cause performance issues. Sometimes you can tune MySQL or change some indexes to remedy them. Other times you end up having to rewrite whole queries to get the performance back.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

MAKING THE MOST OF YOUR BACK END TIME

jdonson's picture

Well, you are talking about making the most with the least???? I feel ya!

Sounds like some prioritizations are in order. MOST of coding is putting things in sequence! :)

...Consider the paramount priorities for back end, in sequence:

  1. stability and recoverability - if performance is bad enough, then we are UNSTABLE!!!

  2. logic

  3. security

  4. performance

  5. Sleep!

THEN Regarding stability and performance and security,
consider the graph of severity x likelihood:

Likely and SEVERE? BE 110% PREPARED!

Likely and NOT SEVERE? So what?

Unlikely and SEVERE??!?!?! BE VERY PREPARED!! DO NOT MISS THIS QUADRANT!

Unlikely and NOT SEVERE? What are we even thinking about that?!?!?!

Hope I saved you years, now MAKE THE MOST OF YOUR TIME!

Jeremy Donson
Database and Systems Engineer
New York City

mikeytown2 knows the deal

jdonson's picture

He is good source of info - make good use of him! %-)

Jeremy Donson
Database and Systems Engineer
New York City

more silly ideas...

jdonson's picture

Multiple independent sources of verification are required for ANY measure of certainty.

True in journalism, science and law (and engineering).

Jeremy Donson
Database and Systems Engineer
New York City

Known problem...

fuzzy76's picture

I actually worked on a site with very similar problems in january this year. That DISTINCT query with 4 JOINS is a killer, and there's no easy way around it.

For some reason (you will find this by running EXPLAIN on the query, and interpreting the results) the DISTINCT keyword makes MySQL store all temporary tables it uses for the JOINs on disk instead of in memory. That is why the query is so mind-boggingly slow. Any amount of MySQL configuration cannot change this behaviour afaik.

You need to figure out where that query comes from (I'm guessing a view somehere) and cache it for what it's worth. If it is a "last post"-view for a forum (ours was, and yours look pretty similar) I know you don't want a high cache, but even a couple of seconds might help so it isn't generated on every page load. There is a setting somwhere (in devel.module I think?) for having views tag all its queries with the name of the view, that will probably help.

I'm actually not sure if the MySQL behaviour is the same for InnoDB (our site was on MyISAM, and changing it wasn't an option at the time), so you might want to check into that.

Can be solved

mikeytown2's picture

Looking at the handy wiki: http://groups.drupal.org/node/187209 there is a link to a module for breaking a views query into sub queries: http://groups.drupal.org/node/154564#comment-517159
Views Optimized is not user friendly by any standard, but it does work and we do use it in production.

What's new in MySQL 5.6 and available in MariaDB 5.5 is the join optimizer: http://www.mysqlperformanceblog.com/2012/04/04/join-optimizations-in-mys...

More info about MySQL 5.6 & MariaDB 5.5 optimizations:
http://www.mysqlperformanceblog.com/search/MariaDB+Optimization+MySQL/
http://www.percona.com/live/mysql-conference-2012/sessions/mysql-optimiz...

The use of DISTINCT does not

pdrake's picture

The use of DISTINCT does not necessarily cause a temporary table to be written to disk, though in this case I would say it is highly likely. What does cause a temporary table to be written to disk is the presence of a BLOB/TEXT column, using DISTINCT on a column that is more than 512 byte or a temporary table that exceeds either tmp_table_size or max_heap_table_size. There are some other (less common) conditions that may cause a temporary table to be written to disk as well. See http://dev.mysql.com/doc/refman/5.5/en/internal-temporary-tables.html for more detail.

Time to start pulling the query apart

murrayw's picture

It appears to me that the real problem is the nature of the query. Whilst it is important to make sure that all your indexes are in RAM (tune your mysql), you will still fail to get decent performance if the query has some inherent faults in it. As noted above, running an EXPLAIN is very helpful to see where you may be missing indexes. You just need to work out how to decipher the output :)

Looking at the query my initial hunches would be:

  • The OR clauses are stopping indexes from working causing full table scans. Try removing the ORs to see if they are the culprit. Sometimes in these situations you can do a UNION query to combine the results of two queries which are much faster when run separately. That may not be practical here though. Generally I avoid ORs unless I know that the results will be whittled down to a fairly small set before they are applied. ie. you don't want to be applying them to thousands of rows.

  • There may be problems with trying to use too many indexes on the node table. There are joins on n.nid and n.uid combined with the where for n.status and sort on n.nodetype. Try removing the sort, the where, and then take out the join to users. You may get lucky on this - one of them may be the culprit. In the past I have seen incredibly slow performance (on the content listing page) joining to the user table on n.uid where there has been a massive node table. This is the kind of data you get out of EXPLAIN.

  • Joining just to get data into the select can be problematic. Why join to users, node_comment_statistics and og_ancestry just for the data. You could rewrite the query to remove these joins and fields and then do separate lookups (selects) using the nid. It is often much better to run one hairy query with minimal joins (this one), and then do lookups for each row returned. The lookups on the unjoined tables will be super fast because you are using a single key which has been indexed.

  • Another possibility may be the number of joins you are doing. I can't give solid advice on this but I suspect that MySQL bogs down the more joins it does. This used to be the case many versions ago.

In a nutshell: get rind of the unneeded joins, hope that the ORs won't bog down the query too much, do extra lookups on users etc to get the field data out.

To test this I recommend:

  • grabbing a copy of the database locally

  • running the query with SQL_NO_CACHE option to stop the query cache from working. eg
    SELECT SQL_NO_CACHE * FROM TABLE

  • starting from the most basic of queries and then building up to the most complex. Time each one and see what is causing the problem - classic debugging. eg.
    SELECT SQL_NO_CACHE node.nid from node;
    SELECT SQL_NO_CACHE node.nid from node NNER JOIN users users ON node.uid = users.uid;
    etc

I don't profess to be an expert in MySQL queries - the advice above is just based on hunches without seeing the EXPLAIN. It is very rewarding to get to the cause of the problem - you will learn something about MySQL along the way. Getting this right is important because you will never get decent performance if you have chronic problems such as slow query like this.

As for the UPDATE query... That has me stumped. It looks simple and should be fast lookup on the key. This makes me suspect that there are no keys in RAM. Make sure that you have key buffer for myisam or the innodb buffer turned on. This simple advice may solve all your problems.

Best of luck with it.

Drupal Geek
morpht.com

Convert your database engine to Innodb

v1p3r1c3's picture

I follow all the steps that deals performance issues but changing to innodb makes my high traffic site less complicated. I have now lower CPU load and slow queries.

Try this one. http://highervisibilitywebsites.com/convert-your-mysql-database-myisam-i...

tmpfs

rjbrown99's picture

I also mount a tmpfs partition and use it for the mysql tmpdir. This is essentially a ramdisk so it keeps the queries that would otherwise swap to disk in memory. Made a difference for me so you may want to google around for more.

High performance

Group events

Add to calendar

Group notifications

This group offers an RSS feed. Or subscribe to these personalized, sitewide feeds:

Hot content this week