MySQL performance issue with heavy traffic - help!

Events happening in the community are now at Drupal community events on www.drupal.org.
Quarantine's picture

Hi there,

I'm running a site with about 13,000 nodes, and we currently have 3 clones of the same websites (with slightly different content) for different regions.

This is our current server:

Dual Nehalem Quad Core Xeon 5520 (8 x 2.26GHz + hyper-threading CPU)
2 x 500GB HDD
12GB RAM

It was functioning fine until recently we had a sudden influx of visitors that are causing the server to crash and slow down tremendously. Our server guy has been trying his best to optimise and finetune the server for us but to no avail - we were running anywhere between 6000-13000 queries per second on average during our peak hours which is absolutely insane.

He suggested that we get another server and move MySQL on it to make it a completely dedicated MySQL server, so we did and acquired the following:

Quad Core Xeon 3220 2.4GHZ / 4GB RAM

Unfortunately that didn't seem to help much as well as our server is still excruciatingly slow - MySQL uses a lot of CPU resources (80-150% CPU), while the memory usage is okay at about 20-30%.

We do have quite a number of Views on the site, and since the slowdown I've been using Pager to display 20 items at one go, and since I read from a thread hear about Views caching, only discovered that functionality and I just implemented it.

It's still somewhat slow.

We have Boost and Parallel modules enabled, and we've also tried using Litespeed 2-CPU (the 15-day trial) on my server guy's advice but it only seems to help a little.

Any ideas what I could possibly do? I'm really at my wits' end at this point.

Thanks in advance!

Comments

Are you using Memcache to

schnitzel's picture

Are you using Memcache to save the cache not on the MySQL Server?

Disk performance etc.

skifter's picture

Hi

We have had performance problems on Drupal solutions due to Hard disk performance. We switched to SAS disks. Try and look at the read/write performance of your disks. You can use dstat and iotop.

You could also add Varnish to your setup or make a load ballanced setup with multiple webservers.

Or setup a second server to serve all images, stylesheets etc for your site. Often called a static server or CDN server. (there is a drupal module for it)

/Skifter

Use an alternative caching

Jamie Holly's picture

Use an alternative caching backend, like Memcache or APC (both can be used with the cacherouter project).

Check that MySQL is tuned properly. Things like sort and key buffers, query cache, temp table size. Scripts like tuning primer can really help with this.

I would also check to see if certain queries are taking a lot longer to execute than others. You can do this in the devel module (enabling query output) or by enabling the slow query log on MySQL.

If you have a lot of writes going to the database, then changing the heavily updated tables to INNODB will help, but you will most likely need more RAM on the database server to see any improvement.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Schnitzel: I have to be

Quarantine's picture

Schnitzel: I have to be honest - I'm not very familiar with administrating a server, but IIRC my server guy (who isn't available now so I can't ask him) has mentioned to me before that he has installed some kind of caching mechanism in the server that may or may not have been memcache.

I noticed that there's a memcache module for Drupal and we're not using that at the moment.

I'll get back to you once I have a concrete answer on this - thanks!

skifter: Hmm, would hard disk performance cause the CPU usage by mysql to be insanely high? I'll have to look into that.

We currently do have a second server that we recently acquired that we're currently using purely for MySQL only, but I'm not sure if using it to serve images/stylesheets/etc would help or not as our issue now is MySQL that's consuming a ton of CPU resources. =/

Disk and mysql

skifter's picture

Hi

If you are using all your ram and the system is swap'ing a slow disk would definitely give you a high cpu load.
Look for IOwaits.

You could try and tune your MySQL with mysqltuner.pl.

Do you have many open connections in your MySQL database ?

/Skifter

be sure boost is running as expected

jayatdrupal's picture

You have probably already checked, but make sure that boost is running properly. Cron needs to be running to fill the boast files into the cache directory. Make sure you are seeing cached files in the cache directory. (go to the directory boast is configured for, and do -> ls -ltR ) .

Then make sure that the .htaccess file in your main directory is has the boost entries in it.

You can also view source on your pages in the browser and you will see the boost comment at the bottom of the page.

Did some checks

Quarantine's picture

skifter: Let me know if I'm doing this wrong, but I found the command to display MySQL's open connections via Google:

mysql> show status like 'Conn%';
+---------------+--------+
| Variable_name | Value |
+---------------+--------+
| Connections | 324351 |
+---------------+--------+
1 row in set (0.00 sec)

As for mysqltuner.pl, I'm currently using the module DBTuner in Drupal - is that a good equivalent?

intoxination:

Sorry I only saw your post now. I asked my server guy about converting our current MyISAM DB into InnoDB, having seeing it recommended a lot for high traffic sites, but he told me this (in verbatim):

"The problem with converting everything to innodb is innodb doesn't do well with certain types of indexes which actually can decrease performance. The module for drupal I believe just alters the table type which can be done from command line."

He also went on to saying that he's considering converting certain tables into InnoDB (selective InnoDB is what he calls it), but he didn't do it yet - his exact reply: "There is additional my.cnf modification requirements that need to be made for it to be useful, but that wont help locking in the query cache."

jayatdrupal:

Under Drupal's status report, Boost is running fine and my .htaccess has boost entries.

However.. when I used View Source on my pages in the browser, I don't seem to see any Boost comment at the bottom of the page. What do they look like exactly? I tried Ctrl+F for "boost" but couldn't find anything in my pages.

Sorry I missed out a question

Quarantine's picture

Sorry I missed out a question I meant to ask skifter. I'm not sure how many open connections is considered as a lot, but.. the amount of 324351 is a lot right? >_>

mysql connections

skifter's picture

Hi

Yep - it sounds like a lot.

Try this one for more info. "show status like '%onn%';"
Maybe you can see if many of the connections are droped ones.

Try this also. "show processlist;"

/Skifter

(No subject)

Quarantine's picture

-removed-

Thanks for that Skifter!

Quarantine's picture

Thanks for that Skifter! Here's the output of show status like '%onn%';:

+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| Aborted_connects         | 2726   |
| Connections              | 367898 |
| Max_used_connections     | 95     |
| Ssl_client_connects      | 0      |
| Ssl_connect_renegotiates | 0      |
| Ssl_finished_connects    | 0      |
| Threads_connected        | 7      |
+--------------------------+--------+
7 rows in set (0.00 sec)

And:

mysql> show processlist;
+--------+--------------+-------------------+--------------------+----------------+------+-------------------------------+------------------------------------------------------------------------------------------------------+
| Id     | User         | Host              | db                 | Command        | Time | State                         | Info                                                                                                 |
+--------+--------------+-------------------+--------------------+----------------+------+-------------------------------+------------------------------------------------------------------------------------------------------+
|     34 | DELAYED      | localhost         | saigohei_wordpress | Delayed insert |   13 | Waiting for INSERT            |                                                                                                      |
| 216487 | DELAYED      | localhost         | saigohei_wordpress | Delayed insert |   32 | Waiting for INSERT            |                                                                                                      |
| 369461 | root         | localhost         | NULL               | Query          |    0 | NULL                          | show processlist                                                                                     |
| 369518 | hiddenstreet | 10.6.158.84:42160 | hsnetwork          | Query          |    0 | Waiting for query cache lock  | SELECT * FROM maplestory_custom_breadcrumbs_paths WHERE specific_path = 'map/ossyria/eos-tower' AND  |
| 369532 | hiddenstreet | 10.6.158.84:42173 | hsnetwork          | Query          |    0 | storing result in query cache | SELECT 1 FROM maplestory_node_access WHERE (nid = 0 OR nid = 2746) AND ((gid = 0 AND realm = 'all')) |
| 369541 | hiddenstreet | 10.6.158.84:42182 | hsnetwork          | Query          |    0 | Waiting for query cache lock  | SELECT dst FROM maplestory_url_alias WHERE src = 'node/8416' AND language IN('en', '') ORDER BY lang |
| 369544 | hiddenstreet | 10.6.158.84:42185 | hsnetwork          | Query          |    0 | Waiting for query cache lock  | SELECT * FROM maplestory_custom_breadcrumbs_paths WHERE specific_path = 'items/food/sap-of-ancient-t |
| 369546 | hiddenstreet | 10.6.158.84:42187 | hsnetwork          | Query          |    0 | Waiting for query cache lock  | SELECT t.* FROM maplestory_term_node r INNER JOIN maplestory_term_data t ON r.tid = t.tid INNER JOIN |
| 369547 | hiddenstreet | 10.6.158.84:42188 | hsnetwork          | Sleep          |    0 |                               | NULL                                                                                                 |
| 369548 | hiddenstreet | 10.6.158.84:42189 | hsnetwork          | Sleep          |    0 |                               | NULL                                                                                                 |
| 369549 | hiddenstreet | 10.6.158.84:42190 | hsnetwork          | Sleep          |    0 |                               | NULL                                                                                                 |
| 369552 | hiddenstreet | 10.6.158.84:42193 | hsnetwork          | Query          |    0 | Waiting for query cache lock  | SELECT name, filename, throttle FROM main_system WHERE type = 'module' AND status = 1 AND bootstrap  |
+--------+--------------+-------------------+--------------------+----------------+------+-------------------------------+------------------------------------------------------------------------------------------------------+
12 rows in set (0.00 sec)

Unfortuantely I don't really know what all these mean. =/

The index performance issue

Jamie Holly's picture

The index performance issue really doesn't play with Drupal since it uses some of the most simple indexes. Having said that, it's still best to selectively switch some tables to InnoDB. This is where digging into what is actually happening helps. If you see a lot of queries waiting on write-locks, then the tables that are locking should go to InnoDB since InnoDB utilizes row-level locking instead of table-level like MyISAM.

Something else to look into is how many queries it takes to generate the different pages and what those queries are. There's other solutions out there that can help with that info. Say you are seeing a lot of queries from drupal_lookup_path(), you might be better off switching to Pressflow and utilizing the path caching mechanism built into it.

Also check to make sure your site is actually caching, both Boost and core. You can check core by using the Devel module and enabling query output and reloading the same page a couple of times. See if you are getting a bunch of cache_sets in there.

Going by your number of 6000-13000 qps and applying some simple math. Let's take the lower end of that and go with a very high number of queries per page, 1,000. That's 6 page views per second/360 minute/21600 hour/518400 per day. With Boost running that would also mean all those page views are people bypassing boost, so I would venture to say you have some other problems lying in there, like Boost not caching or misconfigured.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

but.. the amount of 324351 is

andribas's picture

but.. the amount of 324351 is a lot right?

No, this is not a lot. I have
Connections 162569
From yesterday - this number of total connections - http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html#stat...

If you can't get info about modules in drupal and you sure the problem is mysql right now i agree with intoxination that first you should start with logs of mysql server.
Enable bin log if it is not enabled:

log-bin = mysqld
log-bin-index = mysqld

Enable slow query log:
slow_query_log                          = 1
slow_query_log_file                     = /var/log/mysql/slow_query.log
long_query_time                         = 1

Get http://hackmysql.com/mysqlsla and http://www.mysqlperformanceblog.com/files/utils/mysql_slow_log_parser
First will analyze your bin log (--log-type binary) and the second one slow_log.
After that you will get the queries that have most write from bin log - if there are cache_* tables involved than you should move it to memcache, if others - tell us which ones.
And from slow query log you will get all queries with count and spended time that executed longer than 1 second.

For example, I get from there that I should "ALTER TABLE comments ADD INDEX timestamp ( timestamp );" because I'm using flat comments, and "ALTER TABLE term_node ADD INDEX tid (tid);" for module tagadelic - you will see what queries take much time.

But in your case, you will probably get cache_* tables from bin log, may be cache_views_data when you enabled views cache.
Also consider use of phpmyadmin (it's server status) - it shows in red your bottlenecks.
And set something like that in my.cnf

read_buffer_size                        = 4M
read_rnd_buffer_size            = 4M
myisam_sort_buffer_size         = 8M
key_buffer_size = 128M
table_cache = 1024
sort_buffer_size = 32M
tmp_table_size = 64M
max_heap_table_size = 64M
query_cache_size = 256M

if you stay with myisam like me.
I have a similar server and never get more than 6Gb used total of memory from 12G when run web-server and mysql on same server.

Now after tunig I'm really happy with drupal and mysql - it works really fast.

Connections

skifter's picture

Sorry - i got confused there. I thought of max concurring connections :-(

What about the "query cache lock" do you get that all the time ?

/Skifter

intoxination: Thanks for the

Quarantine's picture

intoxination: Thanks for the very helpful tips!

For some reason, the Devel module seems to be.. causing even more MySQL load on my server after I turned it on - is it me or does this happen to everyone?

Also I double-checked my Boost module again and everything seems to be running fine. I logged out, went to the View Source page and this time I did see the Boost comment at the bottom - the previous time I didn't see it because I didn't log out, something I forgot to do.

andribas: Thank you so much for the detailed post!

I have bin log enabled, so I guess all's that left for me to do is to run the first script. But I'm.. not entirely sure what command I should run for mysqlsla, so I ran:

mysqlbinlog bin.log | mysqlsla -lt binary -

And this is the result: http://pastebin.com/k8C61qhV

I have only just enabled slow query log so I assume I might need some time for it to generate first, cos when I ran mysql_slow_log_parser, it said "Starting from..." and then it was blank.

Also btw, my server guy just told me that we do have APC running on the server.

1 question, you mentioned that I can use phpmyadmin to check my bottlenecks - is that under "Show MySQL runtime information"? If so, I'm seeing a couple of things that are in the red:

Innodb_buffer_pool_reads 147
Handler_read_rnd 25 k
Handler_read_rnd_next 8,692 k
Binlog_cache_disk_use 203
Created_tmp_disk_tables 8,877
Key_writes 55 k
Select_full_join 1
Opened_tables 617
Table_locks_waited 223

Thank you so much for providing that my.cnf configuration! We applied that and now our MySQL server is blazing fast now. :D Although we'll be waiting for the next 12 hours and see if it's still the same, which is when our peak hours are. Even so, I do still think we need to check on our bottlenecks.

Just curious - do you maintain 2 servers like we do currently (separate MySQL into a different server?)?

Skifter:

Judging from a few "show processlist;" I've been doing so far, we do seem to be getting quite a bit of that "query cache lock". :/

"I have bin log enabled, so I

andribas's picture

"I have bin log enabled, so I guess all's that left for me to do is to run the first script. But I'm.. not entirely sure what command I should run for mysqlsla, so I ran:"

Here is the proof that you store cache in mysql:
UPDATE maplestory_cache_content SET data = 'S', created = N, expire = N, headers = 'S', serialized = N WHERE cid = 'S'
I think you store all the cache there. Also if you choose memcache API you can store sessions table in memcache. This will loadoff your mysql server.
for memcache use something like this in settings.php:

$conf = array(
  'cache_inc' => './sites/all/modules/memcache/memcache.inc',
  'session_inc' => './sites/all/modules/memcache/memcache-session.inc',
  'memcache_servers' => array(
    'localhost:11211' => 'default',
    'localhost:11212' => 'block',
    'localhost:11213' => 'content',
    'localhost:11214' => 'filter',
    'localhost:11215' => 'form',
    'localhost:11216' => 'menu',
    'localhost:11217' => 'session',
    'localhost:11218' => 'users',
  ),
  'memcache_bins' => array(
    'cache' => 'default',
    'cache_block' => 'block',
    'cache_content' => 'content',
    'cache_filter' => 'filter',
    'cache_form' => 'form',
    'cache_menu' => 'menu',
    'session' => 'session',
    'users' => 'users',
  ),
);

And start 8 servers (actually you can start from 3-4 servers, I prefer first look what tables were used in cache) memcached

I use memcache not so long, so I installed cacti and templates for memcached to get realtime stats in graph for memcached - it shows requests, hits, and memory use. First i installed every cache_* table in separate server memcached, to analyze, but now I can see that in my install i can move cache_block and cache_form to 'default' cache. Set to every memcached server about 16-32 Mb and see in graph how much memory it uses and what is requests/hits per second.

Memcache is used because it's the only storage that have LRU similar to mysql cache_query - least recent used entries deleted, other storage like APC do not assume any limits and you can run out of memory allocated for storage.
Also, APC is often used for opcode caching, in my install I have about 40Mb code for every site, and I do not want to mix it with drupal cache - share the same pool.

After you will install memcache module analyze again bin log and slow log and post here.

Just curious - do you maintain 2 servers like we do currently (separate MySQL into a different server?)?
No, I use 1 server, but it 2 Xeon E5620 and sas 15k raid10 hardware - but i think you could speed up your server just tuning well.

memcache settings if using memcache 1.5+

mikeytown2's picture

Newer versions of memcache only requires 1 server now. I would recommend using 1.5 or the very latest dev due to some performance issues. If using the latest dev, be sure to exclude cache_form. If using 1.5 I would recommend using memcache.db.inc.

<?php
$conf
= array(
 
'cache_inc' => './sites/all/modules/memcache/memcache.inc',
 
'session_inc' => './sites/all/modules/memcache/memcache-session.inc',
 
'memcache_servers' => array(
   
'localhost:11211' => 'default',
  ),
 
'memcache_bins' => array(
   
'cache' => 'default',
  ),
);
?>

If your wondering, I'm currently using 1.5.

Now using memcache dev

mikeytown2's picture

We switched to the latest dev; so using memcache locking & can now do this 'cache_form' => 'database',

<?php
$conf
['cache_inc'] = './sites/all/modules/memcache/memcache.inc';
$conf['lock_inc'] = './sites/all/modules/memcache/memcache-lock.inc';
?>

The 2 commits on the 28th is what allowed me to switch over.

Also have it running with this patch to catch and report memcache errors to watchdog.

1 question, you mentioned

andribas's picture

1 question, you mentioned that I can use phpmyadmin to check my bottlenecks - is that under "Show MySQL runtime information"? If so, I'm seeing a couple of things that are in the red:

Innodb_buffer_pool_reads 147
Handler_read_rnd 25 k
Handler_read_rnd_next 8,692 k
Binlog_cache_disk_use 203
Created_tmp_disk_tables 8,877
Key_writes 55 k
Select_full_join 1
Opened_tables 617
Table_locks_waited 223

I have almost the same, except I do not use innodb - so i can't tell you about commits and Binlog_cache_disk_use. Maybe someone else can explain. I think you should use another analyzer for innodb log.

Ah I see.. thanks so much for

Quarantine's picture

Ah I see.. thanks so much for the helpful input guys! I'll try installing memcache 1.5 later.

One quick question - right now I'm confused as to whether to install memcached on my old server (the one with the dual 5520) or on my new server, which currently houses MySQL only and nothing else?

This is my first time using two servers at once so I'm actually a little clueless - was using the installing instructions here for my CentOS 64bit (http://crazytoon.com/2008/01/20/memcached-how-do-you-install-memcached-c...) and configuring it here (http://www.lullabot.com/articles/installing-memcached-redhat-or-centos).

Under the "Install the Memcache PECL Extension" header in the Lullabot page, I have to add something to my php.ini and restart apache later, but obviously I can't do that on my new server as I don't have apache running there, only MySQL. On the other hand, I'm not sure if installing memcached on the old server would make sense as memcached was supposed to help with MySQL's load, which is currently located on my new server?

Or do I install memcached on both servers?

You got 3 things to install

Jamie Holly's picture

You got 3 things to install for memcache:

  • The actual memcache server
  • The PECL memcached extensions that provides the functions for PHP to communicate with the memcache server
  • The memcache Drupal module, which allows Drupal to communicate with Memcache via the memcache PHP/PECL extension.

So you can run memcache (the server) on the database server and then install the PHP/PECL extension on your web server. Honestly I would put it all on the webserver so that you can access it via localhost instead of having to go over the network.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

If I understand you

Quarantine's picture

If I understand you correctly, you're saying that I should install both memcache and the PHP/PECL extension on my web server right? So I can access it via localhost?

I'll be trying to do that now and will report back.

mikeytown2: Thank you for

andribas's picture

mikeytown2:
Thank you for pointing this!
I will try to change my 1.7 to 1.5 or dev in weekend.
I knew about 1 server to store all, but i'm new to memcached, and i tried to analyze what is going on, what data it stores, does it worth it, what's happening on peaks and so on.

Cacti shows me graphs like this:
Only local images are allowed.
Only local images are allowed.
Only local images are allowed.

  1. It is very useful to estimate how much memory do you need.
  2. In peaks, other tables cache_block, cache_menu, sessions etc. would not be purged from cache if cache_filter or cache_content will grow fast - of course, if you have no "unlimited" memcached with, say 1Gb memory, - but if you allocated for example 256Mb is it worth it?
  3. You can see Hits / misses and Requests get/set by table.

For example, I discovered in early setup that I have many requests and misses to cache_block and cache_views_data.
I'm having block caching = 5 min, no page caching and no views caching.

Then I found blocks where i use embed_view with argument term_id from node and particular vocaulary.
Blocks created by admin (not system of from modules) by default not cached, when I turn cache=1 (per role) for those blocks, that can be cached, I' ve got hits on cache_block and less requests to cache_views_data. Before that i tried to enable views caching, but it was grow and blocks still got misses.

This is why I suggested to start from sevelar memcached servers.
I agree that after you have estimated optimal size for cache, discovered it's performance in peaks, and tuned everything on site it's easy to use just 1 server.

Thanks guys! I've finally

Quarantine's picture

Thanks guys!

I've finally managed to install and activate memcache on all of my Drupal sites. One thing that I noticed is that the "Who's online" block doesn't function anymore - does anyone know how I could possibly get it to work again or if there's any decent alternative to it?

Also, I seem to be logged out automatically somewhat frequently - like every 15 minutes or so at this point. I tried Googling about this but couldn't find anything that could point me to the right direction.

I'll try and re-analyse my logs soon and post the results - currently I'm still stuck at work.

Incredibly bizarre - my

Quarantine's picture

Incredibly bizarre - my webserver crashed and I had to hard reboot it, and then I discover that the memcache module has been disabled by itself. I think I'm going to temporarily disable memcache first.

I'm not sure if I've done something wrong or not. I replicated the same memcache conf settings into each of my 4 sites' settings.php file - would that have caused any problems?

My code was:

$conf = array(
  'cache_inc' => './sites/all/modules/memcache/memcache.db.inc',
  'session_inc' => './sites/all/modules/memcache/memcache-session.inc',
  'memcache_servers' => array(
    'localhost:11211' => 'default',
  ),
  'memcache_bins' => array(
    'cache' => 'default',
  ),
  'maintenance_theme' => 'lithharbor',
);

probably you should add

andribas's picture

probably you should add prefix, or install different servers on each site

prefix

mikeytown2's picture

I use this to auto handle prefixes

<?php
  $conf
['memcache_key_prefix'] = array_pop(explode('/', str_replace("\\", '/', realpath(conf_path()))));
?>

http://groups.drupal.org/node/73463#comment-227488

For some reason, the Devel

dalin's picture

For some reason, the Devel module seems to be.. causing even more MySQL load on my server after I turned it on - is it me or does this happen to everyone?

Yes. Devel module adds a comment to each query to aid in debugging. An old bug in MySQL causes any query with a comment to bypass the query cache. You need to be running MySQL 5.5 or a recent build of Percona 5.1 to avoid this.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

I haven't heard of that mysql

moshe weitzman's picture

I haven't heard of that mysql bug. Thanks for the report. This would only affect users with 'access devel information' permission (or similar name - can't recall).

To be honest, the mysql query is usually disabled on large sites that I work on. The cache is so frequently cleared due to write activity that it is a net zero or negative. Drupal joins to the users table for almost every significant query and that table gets written to every 3 minutes by each user which means that it gets written to by one person or another every second. Thus, all those queries can't use cache. We reduced this write acitivity a ton in D7 so the query cache hit rate should be looked into again.

Anyway, I'd accept a patch that removes these comments from devel.

High performance

Group notifications

This group offers an RSS feed. Or subscribe to these personalized, sitewide feeds: