Fast and powerful servers, but slow site

We encourage users to post events happening in the community to the community events group on https://www.drupal.org.
wilmar81's picture

Hi!

We recently launched a new website with drupal and to avoid the site to often break, we decided to buy to servers. One who handles http requests, and the other where all queries occur.

We thought it would be easy, but the site is offline every our (almost when the cache is flush). Additionnaly, the site cannot handle more than users ( we consider a user is connected if his access time is lower than to 500s).

Here are the specs of our servers:

On the both servers, we have 16 cpus - Intel(R) Xeon(R) CPU - , and each has 2.27GHz speed.
Ram is at 12Go on both servers too.

We have created about 20 content types.

74 modules are installed.

As performance modules we have:

  • boost for non authenticated users
  • authcache for logged-in users
  • cacherouter user with authcache

The former site had about 30.000 visitors/day.
We didn't have statistics about the ratio loggued/non loggued.

I know this is not explained in details. But i can tell you more depending on your questions.

Hope you could help.

Thank you

Ps: the site is about classifieds ads.

Taxonomy, Cck and Views are installed amongst other modules.

Taxonomy has about 1463920 entries.

On tuesday, i noticed the following:

<

blockquote>The site breaks down almost every hour (i set up the cron to run every hour). Each time the site breaks, i rerun mysql and then the site is up again

<

blockquote>

Here is mysql conf file:

[mysqld]
safe-show-database

skip-locking
safe-show-database

interactive_timeout=100
wait_timeout=100
connect_timeout=10
max_user_connections=250
max_connections=500

key_buffer=64M
join_buffer=2M
record_buffer=2M
sort_buffer=4M
max_allowed_packet=24M

query_cache_limit = 2M
query_cache_size = 256M
query_cache_type = 1

tmp_table_size=256M
table_cache=2048
thread_cache=256
thread_concurrency=16
max_heap_table_size=256M
innodb_buffer_pool_size=1G

log_slow_queries = /var/log/slow-queries.log

[mysqld_safe]
open_files_limit = 8192

[mysqldump]
quick
max_allowed_packet = 16M

[myisamchk]
key_buffer = 64M
sort_buffer = 64M
read_buffer = 16M
write_buffer = 16M

Comments

On the both servers, we have

Garrett Albright's picture

On the both servers, we have 16 cpus - Intel(R) Xeon(R) CPU - , and each has 2.27GHz speed.
Ram is at 12Go on both servers too.

Throw more hardware at the problem. That will fix it.

Smart-assery aside, any number of things could be wrong here. What is it exactly that goes down? Is it the Drupal site itself, the PHP runtime, the web server daemon, the database daemon, the operating system, the physical hardware…? Do you see anything useful in the error logs for the affected programs?

What information do you need?

wilmar81's picture

Drupal itself don't go down. At the beginning, we used to have max_user_connections error. We set it 250, but the problems still occurs.

Concerning error_log, could you please tell me where to find error_log about mysql? I also tried to enable log for slow-queries, but it seems it doesn't work at all.

The main concern is mysql I think. Why? Because each time the site showed offline, I restarted mysql and it gone online the minutes after.

Php max_execution_time is 30
As web server, we're using apache
As db server we're using mysql
As operating system we use REDHAT Enterprise 5.5 i686

What information do you need about physical hardware?

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

i686?

jvandyk's picture

If you truly have 12GB of RAM, you might want to consider running RHEL5 x86_64 instead of i686, since unless you are running PAE extensions with i686 much of your RAM is not being used (only the first 4GB is addressable). Explanation:

https://bugzilla.redhat.com/show_bug.cgi?id=241314#c10

MySQL configuration

ronaldbradford's picture

If you think it is MySQL perhaps you would like to post your my.cnf configuration.

There are also rather trivial steps in gather MySQL instrumentation to identify if the database is indeed your issue.

Some of my public scripts are at http://ronaldbradford.com/mysql-dba/
You generally need more knowledge in being able to interpret this information, such as identifying your resource bottleneck (if any) however from a system perspective you need to determine if/when/where the DB is an issue.

Take a look at this thread,

gansbrest's picture

Take a look at this thread, the guy had similar problems after cache clear:

http://groups.drupal.org/node/63393

It might be Apache and not MySQL problem (check out thread comments).

Investigate what's going on after you clear cache, use SHOW PROCESSLIST to check database for Locks.

Mysql conf

wilmar81's picture

To see mysql conf file, read the former post.

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Ethernet speed?

kbahey's picture

Make sure that you diagnose before you treat.

Just like you added lots of hardware and it did not help, you can be chasing an imaginary reasons, and not really the one (or few) that is the cause of slowness.

Assuming it is in MySQL, a lot of times, we see that the connection between the servers is set to 100Mbps or even 10Mbps, while the hardware and the switch can do 1,000.

Check that the port is full duplex, and they are at 1,000Mbps.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.

I've seen very similar

Jamie Holly's picture

I've seen very similar problems numerous times in the past and usually you find a bad query locking things up to be the culprit. You can vertically scale as much as you want in this scenario and it won't do a thing. In cases that it isn't a bad query, it's usually something hardware related and a common one is something Khalid mentioned - something bad in the network.

When the site is acting up try to login into MySQL from the command line and run a:

SHOW PROCESSLIST;

See if you have a query that has been running a long time. If not then see if you have a ton of sleeping connections doing nothing (and maybe a few actually running queries that are less than 1 sec). If this is the case then it really could be a network issue not letting connections close out properly. I've seen this before and it took me a lot of hair pulling to find the problem.

If you do have queries that are running for a long time, then copy the query that is taking for ever to run, restart the server and run the query manually to see how long it takes. Also do an explain on it to see if it's returning a lot of rows that need temporary tables and a filesort. These things can kill a server very quickly.

Your MySQL error log should be in /var/log/mysqld.log

For your slow query log, first thing to do is add this in there with it:

long_query_time = 1

The default slow query time is 10 seconds, but this will drop it to 1. You really shouldn't have any queries taking over 1 second on a page view, so this is a good way to catch it. After you add that, restart MySQL and give it time to catch some queries. You might also want to flush the cache after enabling the slow query log, just to see if it's something that has to be regenerated at that time.

One of the things with Views is that it doesn't create indexes, so when your dataset starts growing you can run into problems from it. You really need to go through and find if any queries are really slow and figure up if you can better index your tables, or if you need some other solution like caching or rewriting the query into a custom module.

Finally, as someone mentioned above - with 12g of memory you really should be running 64bit.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

SHOW PROCESSLIST : First check

wilmar81's picture

Intoxination, thank you for your response. I started by running SHOW PROCESSLIST; and this is what i got ( at that time, we have less visitors than normally - it is night by here):
+-----+----------------------+--------------------+----------------+---------+------+------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+----------------------+--------------------+----------------+---------+------+------------------+------------------+
| 496 | wilmar_drupal | 192.168.23.1:57636 | wilmar_ma | Sleep | 79 | | NULL |
| 497 | wilmar_drupal | 192.168.23.1:57637 | wilmar_ma | Sleep | 51 | | NULL |
| 507 | wilmar_drupal | 192.168.23.1:57647 | wilmar_ma | Sleep | 76 | | NULL |
| 510 | wilmar_drupal | 192.168.23.1:57650 | wilmar_ma | Sleep | 37 | | NULL |
| 512 | wilmar_drupal | 192.168.23.1:57652 | wilmar_ma | Sleep | 11 | | NULL |
| 513 | wilmar_drupal | 192.168.23.1:57653 | wilmar_ma | Sleep | 3 | | NULL |
| 515 | wilmar_drupal | 192.168.23.1:57655 | wilmar_ma | Sleep | 8 | | NULL |
| 521 | wilmar_drupal | 192.168.23.1:57660 | wilmar_ma | Sleep | 8 | | NULL |
| 527 | dfk65_paper | 192.168.23.1:57664 | dfk65_journaux | Sleep | 2 | | NULL |
| 530 | wilmar_drupal | 192.168.23.1:57667 | wilmar_ma | Sleep | 7 | | NULL |
| 532 | wilmar_drupal | 192.168.23.1:57669 | wilmar_ma | Sleep | 101 | | NULL |
| 535 | wilmar_drupal | 192.168.23.1:57672 | wilmar_ma | Sleep | 94 | | NULL |
| 536 | wilmar_drupal | 192.168.23.1:57673 | wilmar_ma | Sleep | 5 | | NULL |
| 537 | wilmar_drupal | 192.168.23.1:57674 | wilmar_ma | Sleep | 18 | | NULL |
| 543 | wilmar_drupal | 192.168.23.1:57680 | wilmar_ma | Sleep | 12 | | NULL |
| 544 | wilmar_drupal | 192.168.23.1:57681 | wilmar_ma | Sleep | 3 | | NULL |
| 548 | wilmar_drupal | 192.168.23.1:57685 | wilmar_ma | Sleep | 6 | | NULL |
| 549 | wilmar_drupal | 192.168.23.1:57686 | wilmar_ma | Sleep | 74 | | NULL |
| 551 | wilmar_drupal | 192.168.23.1:57688 | wilmar_ma | Sleep | 31 | | NULL |
| 552 | wilmar_drupal | 192.168.23.1:57689 | wilmar_ma | Sleep | 7 | | NULL |
| 554 | wilmar_drupal | 192.168.23.1:57691 | wilmar_ma | Sleep | 22 | | NULL |
| 558 | dfk65_paper | 192.168.23.1:57695 | dfk65_journaux | Sleep | 75 | | NULL |
| 560 | wilmar_drupal | 192.168.23.1:57697 | wilmar_ma | Sleep | 8 | | NULL |
| 561 | wilmar_drupal | 192.168.23.1:57698 | wilmar_ma | Sleep | 45 | | NULL |
| 564 | wilmar_drupal | 192.168.23.1:57701 | wilmar_ma | Sleep | 5 | | NULL |
| 567 | wilmar_drupal | 192.168.23.1:57704 | wilmar_ma | Sleep | 25 | | NULL |
| 568 | wilmar_drupal | 192.168.23.1:57705 | wilmar_ma | Sleep | 7 | | NULL |
| 571 | wilmar_drupal | 192.168.23.1:57708 | wilmar_ma | Sleep | 62 | | NULL |
| 572 | wilmar_drupal | 192.168.23.1:57709 | wilmar_ma | Sleep | 33 | | NULL |
| 574 | wilmar_drupal | 192.168.23.1:57710 | wilmar_ma | Sleep | 41 | | NULL |
| 576 | wilmar_drupal | 192.168.23.1:57712 | wilmar_ma | Sleep | 0 | | NULL |
| 577 | wilmar_drupal | 192.168.23.1:57713 | wilmar_ma | Sleep | 7 | | NULL |
| 578 | wilmar_drupal | 192.168.23.1:57714 | wilmar_ma | Sleep | 7 | | NULL |
| 579 | wilmar_drupal | 192.168.23.1:57715 | wilmar_ma | Sleep | 59 | | NULL |
| 580 | wilmar_drupal | 192.168.23.1:57716 | wilmar_ma | Sleep | 11 | | NULL |
| 581 | wilmar_drupal | 192.168.23.1:57717 | wilmar_ma | Sleep | 35 | | NULL |
| 582 | wilmar_drupal | 192.168.23.1:57718 | wilmar_ma | Sleep | 58 | | NULL |
| 585 | wilmar_drupal | 192.168.23.1:57721 | wilmar_ma | Sleep | 42 | | NULL |
| 588 | wilmar_drupal | 192.168.23.1:57724 | wilmar_ma | Sleep | 8 | | NULL |
| 589 | wilmar_drupal | 192.168.23.1:57725 | wilmar_ma | Sleep | 23 | | NULL |
| 590 | wilmar_drupal | 192.168.23.1:57726 | wilmar_ma | Sleep | 7 | | NULL |
| 591 | wilmar_drupal | 192.168.23.1:57727 | wilmar_ma | Sleep | 19 | | NULL |
| 594 | wilmar_drupal | 192.168.23.1:57730 | wilmar_ma | Sleep | 21 | | NULL |
| 595 | wilmar_drupal | 192.168.23.1:57731 | wilmar_ma | Sleep | 23 | | NULL |
| 596 | wilmar_drupal | 192.168.23.1:57732 | wilmar_ma | Sleep | 39 | | NULL |
| 597 | wilmar_drupal | 192.168.23.1:57734 | wilmar_ma | Sleep | 18 | | NULL |
| 598 | wilmar_drupal | 192.168.23.1:57736 | wilmar_ma | Sleep | 7 | | NULL |
| 599 | dfk65_paper | 192.168.23.1:57738 | dfk65_journaux | Sleep | 16 | | NULL |
| 601 | wilmar_drupal | 192.168.23.1:57740 | wilmar_ma | Sleep | 3 | | NULL |
| 602 | wilmar_drupal | 192.168.23.1:57741 | wilmar_ma | Sleep | 20 | | NULL |
| 603 | wilmar_drupal | 192.168.23.1:57743 | wilmar_ma | Sleep | 33 | | NULL |
| 605 | wilmar_drupal | 192.168.23.1:57745 | wilmar_ma | Sleep | 30 | | NULL |
| 606 | wilmar_drupal | 192.168.23.1:57746 | wilmar_ma | Sleep | 4 | | NULL |
| 607 | dfk65_paper | 192.168.23.1:57747 | dfk65_journaux | Sleep | 2 | | NULL |
| 608 | wilmar_drupal | 192.168.23.1:57748 | wilmar_ma | Sleep | 25 | | NULL |
| 609 | wilmar_drupal | 192.168.23.1:57749 | wilmar_ma | Sleep | 16 | | NULL |
| 610 | wilmar_drupal | 192.168.23.1:57750 | wilmar_ma | Sleep | 8 | | NULL |
| 612 | wilmar_drupal | 192.168.23.1:57752 | wilmar_ma | Sleep | 9 | | NULL |
| 613 | root | localhost | NULL | Query | 0 | NULL | SHOW PROCESSLIST |
| 615 | wilmar_drupal | 192.168.23.1:57754 | wilmar_ma | Sleep | 6 | | NULL |
| 616 | wilmar_drupal | 192.168.23.1:57755 | wilmar_ma | Sleep | 4 | | NULL |
| 617 | wilmar_drupal | 192.168.23.1:57756 | wilmar_ma | Sleep | 9 | | NULL |
| 618 | wilmar_drupal | 192.168.23.1:57757 | wilmar_ma | Sleep | 6 | | NULL |
| 621 | wilmar_drupal | 192.168.23.1:57759 | wilmar_ma | Sleep | 3 | | NULL |
| 622 | unauthenticated user | 192.168.23.1:57760 | NULL | Connect | NULL | Reading from net | NULL |
| 623 | wilmar_drupal | 192.168.23.1:57761 | wilmar_ma | Sleep | 0 | | NULL |
| 624 | wilmar_drupal | 192.168.23.1:57762 | wilmar_ma | Sleep | 1 | | NULL |
| 625 | wilmar_drupal | 192.168.23.1:57763 | wilmar_ma | Sleep | 1 | | NULL |
| 626 | wilmar_drupal | 192.168.23.1:57764 | wilmar_ma | Sleep | 1 | | NULL |
| 627 | wilmar_drupal | 192.168.23.1:57765 | wilmar_ma | Sleep | 1 | | NULL |
+-----+----------------------+--------------------+----------------+---------+------+------------------+------------------+

Another one

+-----+---------------+--------------------+-----------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+---------------+--------------------+-----------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| 535 | wilmar_drupal | 192.168.23.1:57672 | wilmar_ma | Sleep | 169 | | NULL |
| 537 | wilmar_drupal | 192.168.23.1:57674 | wilmar_ma | Sleep | 195 | | NULL |
| 549 | wilmar_drupal | 192.168.23.1:57686 | wilmar_ma | Sleep | 175 | | NULL |
| 554 | wilmar_drupal | 192.168.23.1:57691 | wilmar_ma | Sleep | 177 | | NULL |
| 591 | wilmar_drupal | 192.168.23.1:57727 | wilmar_ma | Sleep | 166 | | NULL |
| 594 | wilmar_drupal | 192.168.23.1:57730 | wilmar_ma | Sleep | 160 | | NULL |
| 612 | wilmar_drupal | 192.168.23.1:57752 | wilmar_ma | Sleep | 159 | | NULL |
| 664 | wilmar_drupal | 192.168.23.1:57847 | wilmar_ma | Sleep | 136 | | NULL |
| 739 | wilmar_drupal | 192.168.23.1:57976 | wilmar_ma | Sleep | 99 | | NULL |
| 741 | wilmar_drupal | 192.168.23.1:57981 | wilmar_ma | Sleep | 96 | | NULL |
| 755 | wilmar_drupal | 192.168.23.1:34563 | wilmar_ma | Sleep | 86 | | NULL |
| 761 | wilmar_drupal | 192.168.23.1:34573 | wilmar_ma | Sleep | 81 | | NULL |
| 781 | wilmar_drupal | 192.168.23.1:34614 | wilmar_ma | Query | 33 | Copying to tmp table | SELECT DISTINCT(node.nid) AS nid,
node_counter.daycount AS node_counter_daycount,
node.changed |
| 795 | wilmar_drupal | 192.168.23.1:34637 | wilmar_ma | Query | 24 | Copying to tmp table | SELECT DISTINCT(node.nid) AS nid,
node_counter.daycount AS node_counter_daycount,
node.changed |
| 805 | wilmar_drupal | 192.168.23.1:34647 | wilmar_ma | Query | 30 | Copying to tmp table | SELECT DISTINCT(node.nid) AS nid,
node_counter.daycount AS node_counter_daycount,
node.changed |
| 820 | wilmar_drupal | 192.168.23.1:34676 | wilmar_ma | Query | 11 | Locked | UPDATE ma_node_counter SET daycount = daycount + 1, totalcount = totalcount + 1, timestamp = 1273104 |
| 823 | wilmar_drupal | 192.168.23.1:34678 | wilmar_ma | Query | 10 | Locked | SELECT DISTINCT(node.nid) AS nid,
node_counter.daycount AS node_counter_daycount,
node.changed |
| 824 | root | localhost | NULL | Query | 0 | NULL | show processlist |
| 831 | wilmar_drupal | 192.168.23.1:34692 | wilmar_ma | Query | 2 | Locked | SELECT totalcount, daycount, timestamp FROM ma_node_counter WHERE nid = 394325 |
+-----+---------------+--------------------+-----------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
19 rows in set (0.00 sec)

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Two problems ..

kbahey's picture

You have two problems.

One is the statistics module. Either disable it altogether, or convert the ma_node_counter table to InnoDB.

ALTER TABLE ma_node_counter Engine=InnoDB;

(Assuming that the InnoDB engine is enabled, check it via "show engines;")

Second, you have long running queries, also from the statistics module. Can't see the entire query, but try disabling the statisitcs module and see if things are better.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.

What Khalid said. Those

Jamie Holly's picture

What Khalid said. Those queries are killers.

Once you get that straightened out, you might also want to drop your interactive_timeout and wait_timeout to something lower, like 30. When you get really busy and you have a bunch of sleeping connections piling up, it doesn't take long before all your available connections on MySQL are full and that will give you the "site offline" message. MySQL can be really fickle when it comes to dealing with closing connections and allowing sleeping connections to build up.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

stats module

mikeytown2's picture

It can be "fixed" this is what I do since I like what the stats module does.
http://drupal.org/project/dbtuner - This can add indexes to the node_counter table & greatly improve the SELECT ... WHERE = ... queries. It can also convert every table to innodb; but this is not ideal. Need to have a list of highly recommended tables like comments, node_comment_statistics & node_counter; then recommended like node, node_access, node_revisions, sessions, history, files, term_node, url_alias, users, users_roles, watchdog, accesslog.
http://drupal.org/project/boost - Has the boost stats block which will cache the html if you tell it to; html gets regenerated on cron.

Doing this you can run the stats module on a site with millions of nodes. Just be aware that cron will take a long time, even with the indexes in place.

All to InnoDb?

wilmar81's picture

One question. Do you think it's a good idea to change our content types table to InnoDb? Some of them are still MyIsam.

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Depends ...

kbahey's picture

There are two schools of thought here. One says make everything InnoDB, which has advantages (e.g. non locking dumps), but can slow things up and risk higher recovery time, and difficulty of repairs if there is corruption.

The other school is to selectively change tables that you see locking on to InnoDB, and leave the rest as MyISAM.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.

I'm of the all-InnoDB school

dalin's picture

I'm of the all-InnoDB school of thought. Given that your average time between recoveries is measured in years, I'm ok if it takes a couple hours. And this problem is alleviated with the latest Percona builds. With everything InnoDB you can then reduce all the MyISAM-specific settings to near-zero and give more resources to the InnoDB-specific settings. With innodb_buffer_pool_size larger than your active dataset, innodb_flush_log_at_trx_commit = 0,
innodb_flush_method = O_DIRECT, on a well indexed database, running a recent version of InnoDB plugin or XtraDB, it's fast.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Do you happen to have PHP

choloepus's picture

Do you happen to have PHP persistent connections enabled?

Php persistent connections

wilmar81's picture

They were, but are disabled now. And it helped to disable it.

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

If you haven't already you

dalin's picture

If you haven't already you also should tune MySQL. The default configuration would be terrible for your site. I like the mysqltuner 2 script:
https://launchpad.net/mysqltuner

But I would strongly advise doing research to understand what a config value does before changing it and also benchmarking before and after to confirm that your changes actually made an improvement.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Thank you for your help

wilmar81's picture

Thank you for your advises.

I'm not able to log into admin space now. The server where drupal files are stored has been hardened last night, and i think the disable something on it. When accessing /user/login page, we get the following error:

warning: preg_match() [function.preg-match]: Compilation failed: this version of PCRE is not compiled with PCRE_UTF8 support at offset

I disabled all statistics module directly from mysql, and i disabled cron to avoid the displaying of the previous error.

Will come back to you soon i think.

Thank you.

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Sleeping queries

wilmar81's picture

Look at the table below please.

Can someone tell me what are sleeping processes? Are the important? Are they necessary? How many are acceptable?

+------+----------------------+--------------------+-----------+---------+------+------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+----------------------+--------------------+-----------+---------+------+------------------+------------------+
| 1496 | unauthenticated user | 192.168.23.1:57433 | NULL | Connect | NULL | Reading from net | NULL |
| 1504 | wilmar_drupal | 192.168.23.1:57441 | wilmar_ma | Sleep | 29 | | NULL |
| 1508 | wilmar_drupal | 192.168.23.1:57445 | wilmar_ma | Sleep | 2 | | NULL |
| 1512 | wilmar_drupal | 192.168.23.1:57449 | wilmar_ma | Sleep | 13 | | NULL |
| 1514 | wilmar_drupal | 192.168.23.1:57450 | wilmar_ma | Sleep | 14 | | NULL |
| 1515 | wilmar_drupal | 192.168.23.1:57451 | wilmar_ma | Sleep | 21 | | NULL |
| 1516 | wilmar_drupal | 192.168.23.1:57452 | wilmar_ma | Sleep | 20 | | NULL |
| 1517 | wilmar_drupal | 192.168.23.1:57453 | wilmar_ma | Sleep | 8 | | NULL |
| 1518 | wilmar_drupal | 192.168.23.1:57454 | wilmar_ma | Sleep | 10 | | NULL |
| 1519 | wilmar_drupal | 192.168.23.1:57455 | wilmar_ma | Sleep | 11 | | NULL |
| 1520 | wilmar_drupal | 192.168.23.1:57456 | wilmar_ma | Sleep | 11 | | NULL |
| 1521 | wilmar_drupal | 192.168.23.1:57457 | wilmar_ma | Sleep | 5 | | NULL |
| 1522 | wilmar_drupal | 192.168.23.1:57458 | wilmar_ma | Sleep | 10 | | NULL |
| 1523 | wilmar_drupal | 192.168.23.1:57459 | wilmar_ma | Sleep | 7 | | NULL |
| 1524 | wilmar_drupal | 192.168.23.1:57460 | wilmar_ma | Sleep | 3 | | NULL |
| 1525 | wilmar_drupal | 192.168.23.1:57461 | wilmar_ma | Sleep | 9 | | NULL |
| 1526 | wilmar_drupal | 192.168.23.1:57462 | wilmar_ma | Sleep | 8 | | NULL |
| 1527 | wilmar_drupal | 192.168.23.1:57463 | wilmar_ma | Sleep | 1 | | NULL |
| 1528 | wilmar_drupal | 192.168.23.1:57464 | wilmar_ma | Sleep | 3 | | NULL |
| 1529 | wilmar_drupal | 192.168.23.1:57465 | wilmar_ma | Sleep | 2 | | NULL |
| 1530 | wilmar_drupal | 192.168.23.1:57466 | wilmar_ma | Sleep | 2 | | NULL |
| 1531 | wilmar_drupal | 192.168.23.1:57467 | wilmar_ma | Sleep | 2 | | NULL |
| 1532 | root | localhost | NULL | Query | 0 | NULL | show processlist |
| 1534 | wilmar_drupal | 192.168.23.1:57469 | wilmar_ma | Sleep | 5 | | NULL |
| 1535 | wilmar_drupal | 192.168.23.1:57470 | wilmar_ma | Sleep | 1 | | NULL |
| 1536 | wilmar_drupal | 192.168.23.1:57471 | wilmar_ma | Sleep | 2 | | NULL |
| 1538 | wilmar_drupal | 192.168.23.1:57473 | wilmar_ma | Sleep | 2 | | NULL |
| 1539 | wilmar_drupal | 192.168.23.1:57474 | wilmar_ma | Sleep | 1 | | NULL |
| 1540 | unauthenticated user | 192.168.23.1:57475 | NULL | Connect | NULL | Reading from net | NULL |
+------+----------------------+--------------------+-----------+---------+------+------------------+------------------+

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Sleeping processes are

nvahalik's picture

Sleeping processes are connections where there is a client actively connected to MySQL, but is not sending, receiving, or querying the database on the MySQL Server. That is, the scripts that have opened these connections are not processing SQL data. They may be processing some data internally, blocking while reading an external file or web service, or the server could just be plain slow.

Sleeping processes in-and-of itself is not a bad thing. The larger the value in the time column the longer MySQL has been waiting. If you have lots of waiting connections for more than 10 seconds and you are using persistent connections, then these would be really normal and are nothing to worry about. If they are not persistent connections, then you definitely have an issue to look at.

Sleeping processes

wilmar81's picture

We deactivated persistent connections from php. I mean, on the web server, we didn't allow php to use persistent connections. But, i use mysqli extensions rather than mysql as i read it is better.

I think i see what can cause these sleeping processes. I wrote a script which regularly imports some data from our old db to the new one. I will remove it from the cron and see what it look like from mysql.

Thank you.

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Apache mod_php

kbahey's picture

If you are running Apache as a mod_php, and have a high maxclients number, then it is expected to get lots of processes with sleeps.

One way to lower them is to switch to FastCGI, and run Apache as MPM Worker (threaded). This has the benefit of making Apache faster and less resource greedy for static files, and having a lower number of PHP processes connected to the database.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.

Moving from mod_php to fastcgi

wilmar81's picture

Thank you Khalid. As suggested, we're about to switch to FastCgi. I'll let you know when it's done.

I looked at mysql and saw many processes taking more than 100s. I think we will move to rewrite views queries, many slow queries come from views module. But this must be accepted first by my superior.

Any suggestion apart from rewriting some queries?

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Here is an article on Apache

kbahey's picture

Here is an article on Apache fcgid with Drupal, and the impact on performance. Details on how to configure it is also included.

For slow views queries, enable caching for all of them. The latest views has caching, but it has to be configured on a per-display basis, not global.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.

Views' per-display cache

wilmar81's picture

Thank you Khalid for the article.

I set most of our most used tables to innoDb. Mainly, those who are used with cck.

I have a question about views cache. You said:

it has to be configured on a per-display basis, not global.

. Does it mean it is better to give up global configuration in views' cache and set a cache ONLY per-display?

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Slow queries from views

mikeytown2's picture

This will give you a hint at which views are slow & why.
http://drupal.org/project/dbtuner
It can add an index to the cck value thus speeding up the view since it will no longer require a table scan when your view does a "filter by CCK field" operation.

The sleeping connections are

dalin's picture

The sleeping connections are only a problem if you are hitting the MySQL max connections limit. Otherwise, as Khalid mentioned, it's fairly normal with the standard Apache MPM worker - they aren't necessarily actively waiting around for something. As mentioned previously on this thread you can set interactive_timeout and wait_timeout to something in the range of 30 seconds. This will work as long as you don't have some active process happening that takes longer than 30 seconds - ex. if cron processes some files for > 30 seconds and then tries to access the database again the connection will be gone.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Right Dave! I forgot to

wilmar81's picture

Right Dave!

I forgot to mention it in previous post, but i decreased those parameters you suggested. I set them to 30 both. But the site is still slow. It seems another site is running drupal code on the same server. I check it and come back latter.

Can you suggest me any tool or scripts which can help me how many connections the server handles by second?

Thank you

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Diagnose first.

stewsnooze's picture

When I have touched these things before I have temporarily added some file logging to cache.inc in cache_clear_all()

It would be useful for you to put a debug_backtrace() in that function out to a file. It is probably much more useful to only do this on $wildcard = true

That should allow you to see which modules are wanting to clear the cache. From there you'll be able to see whether they should be clearing the cache. Whether you move to fastCGI or tune MySQL if one of your modules is dumping the entire cache regularly then you'll hit this wall again.

I think that is an important step to take. If you end up needing to tune MySQL then I recommend switching your tables to innodb and giving the innodb buffer cache 75% of RAM on its dedicated server.

See this link for an example innodb config

http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimi...

Full Fat Things ( http://fullfatthings.com ), my Drupal consultancy that makes sites fast.

Well, a few things to be

jcisio's picture

Well, a few things to be considered:

  • boost: turn on the "don't use db" option so that your site doesn't go down each hour. Later, make sure you have a cronjob to remove cache file older than certain time (a script like 'find /path -atime xx blabla')
  • cacherouter: have you memcached installed and enable?
  • mysql: if you have only one site, why two different max_user_connections and max_connections? Conf seems ok, but you need check that with an analyser (like mysqlturner.pl). With your traffic, I think even 150 max conn is enough. 16 CPUs on db server don't help much, it's the HDDs that count (SCSI or RAID, or better: both of them?).
  • Finally, after some quick rescue step, you need to take more time investigate the cause.

Other thoughts: 30k visitors, but about pageviews, sure you have it? 100k pageviews is not a problem, but 1000k is yes. And I don't think there're too many registered on a classified ads site (ok, it depends).

A site with high traffic

wilmar81's picture

OK. Another wise advise. I'll investigate as you said. I'll try my best lol.

Yes, we do have memcache installed. You suggest then we use memcache with cache router. Where? Web server or db server?

The site has more than 400k pageviews a day, and 100k members. It is one of the first sites in morocco.

About the script we can write, do you mean something like this?

#!/bin/bash

find /path/to/files -type f -mtime +3 -exec rm {} \;

Here are informations about hdd server:


SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
sda: Write Protect is off
sda: Mode Sense: 73 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
sda: Write Protect is off
sda: Mode Sense: 73 00 00 08
SCSI device sda: drive cache: write back
sd 0:0:0:0: Attached scsi disk sda
sd 0:0:0:0: Attached scsi generic sg0 type 0

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

Hope you've turned off the DB

jcisio's picture

Hope you've turned off the DB option in boost, you'll see immediate effect. With too much term/node and Views, boost + db = nightmare (or "cauchemar").

The script: yes, but 3 min is too short. Something like 30 min or 1h is better.

HDD: it could be better, but don't talk about it now, the current hardware is enough :)

Cacherouter can be use with memcached, to replace database cache (the cache_abc tables). You may want to read more in cacherouter project/group.

You didn't say about APC, it should be enabled, too.

innodb_buffer_pool_size=1G Th

crea's picture

innodb_buffer_pool_size=1G
That's very low (if you have large enough DB). Make sure to run 64bit and increase it up to 60%-70% of RAM

Well, maybe, maybe not. You

dalin's picture

Well, maybe, maybe not. You need to increase it to slightly more than your active data set - big enough to keep the most-used data in memory. That may be 10% of RAM that may be 90% of RAM (if you can give that much). A tuning script like mysqltuner 2 can help you discover what that amount is.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Thank you everyone

wilmar81's picture

Hi everyone!

We really expected we could solve the problem we had. From the day I opened the ticket till friday, the site's performance has gone better. But it was not good enough for the site's owner. He finally decided to give up this new version. He returned to the previous one. It's a great deception. But we learnt many things:

  • Innodb helped in that case and is a good solution.
  • NEVER allow cacherouter to use files ( when I disabled cacherouter and allowed autcache to use drupal tables, the site gave a better feeedback ), use APC instead, or memcache if available.
  • Boost is REALLY recommended. Without boost, the site couldn't stand up for a minute.
  • Never use DISTINCT in views' displays. It can kill out your db.
  • FastCgi can improve performances. And it did.

And thank you for all your advises.

--Willy

Life worth it to live. No matter what you're going through. Only your weakness can overcome your strenght.

--Sites
http://www.telecomaroc.com/
http://www.squaresystems.co.ma/

jackhutton's picture

I'm trying to get my head around performance issues - this all over my head but very helpful to read through - thanks for the distillation as well. challenging.

High performance

Group notifications

This group offers an RSS feed. Or subscribe to these personalized, sitewide feeds:

Hot content this week