I`m trying to upgrade our hotel booking site. Each hotel has several characteristics, so I put them in integer CCK fields. After that I created a view with about 30 exposed filters according all the hotel fields.
After I put the site live (~1500 visitors per day), the mysql server went down almost immediately. The problem is crazy sql queries that views module makes. For example, http://pastebin.com/gxNTzwuR . It`s the real user query I`ve took from the query log.
I`ve added indexes to all that 'content_field_xxx' tables (field_xxx_value) - still no luck.
And yes - I use Pressflow on the dedicated server: Quad-core Xeon, 6Gb RAM. Apache is backed up with Varnish. All the tables use MyISAM engine.
Guys, what should I do? Is it my configuration fault, or Drupal just can`t handle such load? If so, it would be a catastrophe, because we`ve chosen Drupal as primary CMS about 3 years ago and we`ve got a lot of sites powered by it.
Oh, the site`s view URL is http://podmoskovie.info/s . But please, don`t check too many options - it will cause me to restart a server :).
Thanks in advance.
Comments
Solr
Might be time to think about using solr search
http://drupal.org/project/apachesolr
Grab views 3 while your at it
Oh, thanks!
So if I understood you correctly, I can use Apache Solr as a indexing engine for all my CCK fields, keep my 'bunch-of-checkboxes' user interface in views3, and this will give me much better performance than mysql?
yep
year old article on the subject
http://acquia.com/blog/views-3-apache-solr-acquia-drupal-future-search
Modules of interest
http://drupal.org/project/apachesolr_ajax
http://drupal.org/project/apachesolr_views
This might be of some assistance
http://drupal.org/project/dbtuner
Thanks, slavojzizek! Already
Thanks, slavojzizek! Already used it. Surprisingly, dbtuner told me that there`s too much full table scans and I should consider adding indexes to CCK tables :).
WAAAAAY OT
And for just a second, I got all excited that Slavoj Zizek used Drupal, as it is, of course, THE sublime object of ideology (or maybe Open Source in general is).
Blog: http://technosophos.com
QueryPath: http://querypath.org
Drupal MySQL Performance
Just looking at the query isn't really enough information to help you out. Some additional information is needed:
Size of indexes:
mysql> SELECT CONCAT(ROUND(SUM(index_length)/(1024*1024*1024), 2), ' GB') AS 'Total Index Size'FROM information_schema.TABLES
WHERE table_schema LIKE 'database';
Total Database size:
mysql> SELECT CONCAT(ROUND(SUM(data_length)/(1024*1024*1024), 2), ' GB') AS 'Total Data Size'FROM information_schema.TABLES
WHERE table_schema LIKE 'database';
Per table sizes:
SELECT CONCAT(table_schema,'.',table_name) AS 'Table Name', CONCAT(ROUND(table_rows/1000000,2),'M') AS 'Number of Rows',CONCAT(ROUND(data_length/(1024*1024*1024),2),'G') AS 'Data Size',
CONCAT(ROUND(index_length/(1024*1024*1024),2),'G') AS 'Index Size' ,
CONCAT(ROUND((data_length+index_length)/(1024*1024*1024),2),'G') AS'Total'
FROM information_schema.TABLES
WHERE table_schema LIKE 'database';
Note: Be VERY careful running the above commands on a busy server. Querying the I_S tables locks all the underlying MyISAM tables to query this information. If you can run this on a backup server that has a copy of your Drupal database that would be best. Replace the word database with the name of your Drupal database.
We would need to see the output of show variables and your currently used my.cnf file.
We would also need to see the output of show stats.
As for the query you posted, what does explain say about it? Adding indexes like you did more than likely caused more harm than good.
It sounds like you've enabled the slow query log. There are two types of queries: long running queries and very frequent queries. Both need to be investigated. We'll need to see this log too.
Once all the information has been posted, we can go from there.
HTH
Mark Schoonover
http://www.thetajoin.com
High Performance Drupal Hosting & IT Services
Hi Mark
I can`t EXPLAIN the query - it just hangs (I`ve waited for about an hour or so).
Here`s the output of sql commands: http://pastebin.com/CzEnxZED
And the slow query log: http://pastebin.com/m5xbbAPV
Maybe I should tune MyISAM engine somehow?
Thanks in advance!
InnoDB
If your going to spend the time, tune it using InnoDB. Did some rough testing via the dbtuner module. On one of my sites I had 50 views that where running "Slow"; I switch DB's over to MyISAM and then number jumped to 60; switched back to InnoDB and it went to 47 (this is because all tables are optimized now).
If you can't run EXPLAIN on a query, thats not a good sign...
But
Isn`t InnoDB slower than MyISAM on SELECTs?
This may have been true in
This may have been true in the days of MySQL 4.x. Or maybe it still is in a controlled setting. But on a real site InnoDB is the clear winner. Especially due to its row level locking, and especially if we are talking about InnoDB plugin or XtraDB.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
It's the query!
The main problem is with the query itself. It will never run well on MySQL. I wouldn't worry about Drupal, Views or MyISAM. The query is such that adding indexes even won't help!
The main reason for this is that MySQL can only use one index per table during joins/filtering/sorting. It tries its best to be smart about it and uses what it knows about the tables to pick the best one. In the case of "node n", I would guess that is has selected "n.vid" as it is being used in all of the joins to the content type tables. In this case these joins will be fast. Looking at the query in this light I would guess that you are getting slowdowns for all the clauses in your WHERE statement because the indexes have already been used up. The ORDER BY is probably giving you grief as well as the sort will not use an index. The returned dataset may be too big for a temp table in RAM and it is being dumped to file - resulting in your freeze.
Another insurmountable problem is the number of joins. My rule of thumb is to keep it to 4 joins or below. You could go to 5 but I think MySQL starts to complain when it gets beyond that. ie. massive slowdowns. Maybe things have changed recently with later versions but maybe not. The bottom line is that there are too many joins. This can be worked around sometimes if the JOIN is there to return data. Solution is to ditch the join and run another lookup select to get the data. However, in the case it's there for the WHERE and so is needed.
Your problems seem to be due to the way CCK has split out tables to hold the fields. If they are "one to many" fields then there isn't much you can do as the new table is needed. However, if the field has been split out because another content type has used the same field name, you could bring the data back into one table by changing the field names to something unique across content types. If this can be done then you will reduce the number of JOINs and you will have a fighting chance. You may also be able to come up with multicolmn indexes which can be used in a more flexible way than just n.vid. Probably not though...
If moving the fields back into one table can't be done naturally then you might consider introducing some redundant data into the table to aid lookup. This is a PITA and not recommended but it might work if it is the indexes which are causing you the grief. Look at hook_nodeapi update/insert to keep them synched. Not too hard.
What are all the AND clauses in the INNER JOINS for? I would have thought that a join across to a content type (field) table would not need this. It should be a one to one mapping between vids. Try removing these if you can. This may help matters but probably won't help much.
If you want to debug it, go back to the simplest of queries, "select nid from node" :) and then start adding the joins and see where it dies. Run an EXPLAIN and see why. Crack open the MySQL client and run "SHOW FULL PROCESSLIST;" whilst the query is running. Keep banging the up arrow and enter to watch it progress in real time. Exciting! It is very educational as you can see when MySQL is dumping stuff to the filesystem (and is hanging).
And remember to run RESET QUERY CACHE; just before your SELECT or else you will be fooled into thinking your query is fast when it isn't. If you have a big dataset doing a "LOAD INDEX INTO CACHE node" could help to give you a good baseline as well.
One last suggestion - if you don't like the SQL Views pumps out then it can be changed with hook_views_pre_execute. You can rewrite all the SQL there, just make sure it returns the fields the View expects.
At the end of the day I reckon mikeytown2's Solr suggestion is possibly the only viable solution for you as it allows fast lookups on a number of indexes. However, try these suggestions first as you may be able to get it sorted - no pun intended. Good luck!
Managing Director
Morpht
The main reason for this is
I don't think this is the case for MySQL >= 5.1
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
I don't think this is specifically a MySQL issue, rather it's a limitation with SQL. Each JOIN makes the query exponentially more complex. 31 JOINs will never perform acceptably fast. I agree with murrayw, finding ways to remove the JOINs will be your best option.
Another problem that you won't be able to get around is that the query is filtering on one table (well more than one, but that's beside the point) and sorting on another. In this scenario MySQL must create a temporary disk table. The only way to avoid this in MySQL is to use David Strauss' materialized views. But then you may as well just use Solr.
http://drupal.org/project/materialized_view
This sounds like a really bad idea.
That method isn't reliable because after you reset the query cache Drupal could run the query before you run your test query from the CLI. The better way is to use
SELECT SQL_NO_CACHE ...That way your production site won't suffer the performance hit of an empty query cache.
I will be very surprised if you manage to get this to an acceptable performance level. SQL is physically incapable of performing well for complex searches. Solr is the most well-integrated-with-Drupal alternative.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Thanks alot, guys!
I`ve installed Solr, imported my CCK fields to it and reindexed my nodes. The results are simply amazing! F.e., the query that choked MySQL is not a problem for Solr - it gives the answer in a split second. So now I`ll throw views and will write custom UI for my site.
Thanks again!