Extremely poor Views 2 performance with 30 exposed filters

inductor's picture

I'm trying to upgrade our hotel booking site. Each hotel has several characteristics, so I put them in integer CCK fields. Then I created a view with about 30 exposed filters, one for each of the hotel fields.
After I put the site live (~1500 visitors per day), the MySQL server went down almost immediately. The problem is the huge SQL queries the Views module generates. For example: http://pastebin.com/gxNTzwuR . It's a real user query I've taken from the query log.
I've added indexes to all of those content_field_xxx tables (on field_xxx_value) - still no luck.

And yes - I use Pressflow on a dedicated server: quad-core Xeon, 6 GB RAM. Apache is fronted by Varnish. All tables use the MyISAM engine.

Guys, what should I do? Is it my configuration's fault, or can Drupal just not handle such load? If so, it would be a catastrophe, because we chose Drupal as our primary CMS about three years ago and we have a lot of sites powered by it.

Oh, the site's view URL is http://podmoskovie.info/s . But please, don't check too many options - it will force me to restart the server :).

Thanks in advance.

Comments

Solr

mikeytown2's picture

Might be time to think about using Solr search:
http://drupal.org/project/apachesolr
Grab Views 3 while you're at it.

Oh, thanks!

inductor's picture

So if I understood you correctly, I can use Apache Solr as an indexing engine for all my CCK fields, keep my 'bunch-of-checkboxes' user interface in Views 3, and this will give me much better performance than MySQL?

Thanks, slavojzizek! Already

inductor's picture

Thanks, slavojzizek! Already used it. Surprisingly, dbtuner told me that there are too many full table scans and that I should consider adding indexes to the CCK tables :).

WAAAAAY OT

mbutcher's picture

And for just a second, I got all excited that Slavoj Zizek used Drupal, as it is, of course, THE sublime object of ideology (or maybe Open Source in general is).

Drupal MySQL Performance

ThetaJoin's picture

Just looking at the query isn't really enough information to help you out. Some additional information is needed:

Size of indexes:

mysql> SELECT CONCAT(ROUND(SUM(index_length)/(1024*1024*1024), 2), ' GB') AS 'Total Index Size'
FROM information_schema.TABLES
WHERE table_schema LIKE 'database';

Total Database size:

mysql> SELECT CONCAT(ROUND(SUM(data_length)/(1024*1024*1024), 2), ' GB') AS 'Total Data Size'
FROM information_schema.TABLES
WHERE table_schema LIKE 'database';

Per table sizes:

mysql> SELECT CONCAT(table_schema,'.',table_name) AS 'Table Name',
CONCAT(ROUND(table_rows/1000000,2),'M') AS 'Number of Rows',
CONCAT(ROUND(data_length/(1024*1024*1024),2),'G') AS 'Data Size',
CONCAT(ROUND(index_length/(1024*1024*1024),2),'G') AS 'Index Size',
CONCAT(ROUND((data_length+index_length)/(1024*1024*1024),2),'G') AS 'Total'
FROM information_schema.TABLES
WHERE table_schema LIKE 'database';

Note: Be VERY careful running the above commands on a busy server. Querying the I_S tables locks all the underlying MyISAM tables to query this information. If you can run this on a backup server that has a copy of your Drupal database that would be best. Replace the word database with the name of your Drupal database.

We would need to see the output of SHOW VARIABLES and the my.cnf file you are currently using.

We would also need to see the output of SHOW STATUS.

As for the query you posted, what does EXPLAIN say about it? Adding indexes the way you did more than likely caused more harm than good.
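For reference, a minimal sketch of the kind of EXPLAIN check being asked for - the table and field names here are purely illustrative, not taken from the actual site:

```sql
-- Hypothetical example: content_field_pool / field_pool_value are made-up names.
EXPLAIN SELECT n.nid
FROM node n
INNER JOIN content_field_pool p ON p.vid = n.vid
WHERE p.field_pool_value = 1
ORDER BY n.created DESC;
-- Read the output per table: "type" = ALL means a full table scan,
-- "key" shows which index (if any) was chosen, and "Extra" containing
-- "Using temporary; Using filesort" flags the slow path described below.
```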

It sounds like you've enabled the slow query log. There are two types of queries: long running queries and very frequent queries. Both need to be investigated. We'll need to see this log too.

Once all the information has been posted, we can go from there.

HTH

Mark Schoonover
http://www.thetajoin.com
High Performance Drupal Hosting & IT Services

Hi Mark

inductor's picture

I can't EXPLAIN the query - it just hangs (I've waited for about an hour or so).

Here's the output of the SQL commands: http://pastebin.com/CzEnxZED
And the slow query log: http://pastebin.com/m5xbbAPV

Maybe I should tune the MyISAM engine somehow?

Thanks in advance!

InnoDB

mikeytown2's picture

If you're going to spend the time, tune it using InnoDB. I did some rough testing via the dbtuner module. On one of my sites I had 50 views that were running "slow"; I switched the DB over to MyISAM and the number jumped to 60; I switched back to InnoDB and it went down to 47 (because all the tables are optimized now).

If you can't run EXPLAIN on a query, that's not a good sign...

But

inductor's picture

Isn't InnoDB slower than MyISAM on SELECTs?

This may have been true in

dalin's picture

This may have been true in the days of MySQL 4.x, or maybe it still is in a controlled setting. But on a real site InnoDB is the clear winner, especially due to its row-level locking, and especially if we are talking about the InnoDB plugin or XtraDB.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

It's the query!

murrayw's picture

The main problem is the query itself. It will never run well on MySQL. I wouldn't worry about Drupal, Views or MyISAM. The query is such that even adding indexes won't help!

The main reason for this is that MySQL can only use one index per table during joins/filtering/sorting. It tries its best to be smart about it and uses what it knows about the tables to pick the best one. In the case of "node n", I would guess that it has selected "n.vid", as it is used in all of the joins to the content type tables, so those joins will be fast. Looking at the query in this light, I would guess that you are getting slowdowns on all the clauses in your WHERE statement because the indexes have already been used up. The ORDER BY is probably giving you grief as well, since the sort will not use an index. The returned dataset may be too big for a temp table in RAM, so it is being dumped to file - resulting in your freeze.

Another insurmountable problem is the number of joins. My rule of thumb is to keep it to 4 joins or fewer. You could go to 5, but I think MySQL starts to complain beyond that, i.e. massive slowdowns. Maybe things have changed with later versions, but maybe not. The bottom line is that there are too many joins. This can sometimes be worked around when a JOIN is only there to return data: ditch the join and run a separate lookup SELECT to get the data. In your case, however, the joins are there for the WHERE clauses and so are needed.

Your problems seem to be due to the way CCK has split fields out into separate tables. If they are "one to many" fields then there isn't much you can do, as the extra table is needed. However, if a field has been split out only because another content type uses the same field name, you could bring the data back into one table by renaming the fields so they are unique across content types. If this can be done then you will reduce the number of JOINs and have a fighting chance. You may also be able to come up with multicolumn indexes, which can be used more flexibly than just n.vid. Probably not, though...
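To illustrate the multicolumn index idea, a minimal sketch - it assumes two fields have been merged back into a single content type table, and all names here (content_type_hotel, field_pool_value, field_sauna_value) are hypothetical:

```sql
-- Hypothetical: assumes field_pool and field_sauna now live in one
-- content_type_hotel table; table and column names are illustrative.
ALTER TABLE content_type_hotel
  ADD INDEX idx_pool_sauna (field_pool_value, field_sauna_value);
-- MySQL can satisfy filters on both columns with this single index,
-- and by the leftmost-prefix rule it also covers filters on
-- field_pool_value alone (but not on field_sauna_value alone).
```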

If moving the fields back into one table can't be done naturally, then you might consider introducing some redundant data into the table to aid lookup. This is a PITA and not recommended, but it might work if it is the indexes that are causing you grief. Look at hook_nodeapi update/insert to keep them in sync. Not too hard.

What are all the AND clauses in the INNER JOINs for? I would have thought that a join across to a content type (field) table would not need them - it should be a one-to-one mapping between vids. Try removing them if you can. It may help matters, but probably won't help much.

If you want to debug it, go back to the simplest of queries, "select nid from node" :) and then start adding the joins and see where it dies. Run an EXPLAIN and see why. Crack open the MySQL client and run "SHOW FULL PROCESSLIST;" whilst the query is running. Keep banging the up arrow and enter to watch it progress in real time. Exciting! It is very educational as you can see when MySQL is dumping stuff to the filesystem (and is hanging).

And remember to run RESET QUERY CACHE; just before your SELECT or else you will be fooled into thinking your query is fast when it isn't. If you have a big dataset doing a "LOAD INDEX INTO CACHE node" could help to give you a good baseline as well.
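Putting the debugging steps above into one session - this is just an illustrative sketch; only node is a real Drupal table name:

```sql
-- Illustrative debugging session against a copy of the database.
RESET QUERY CACHE;              -- start each timing run from a cold query cache
LOAD INDEX INTO CACHE node;     -- preload node's indexes for a fair baseline
SELECT nid FROM node LIMIT 10;  -- simplest query; re-add one JOIN at a time
                                -- and re-run EXPLAIN after each addition
SHOW FULL PROCESSLIST;          -- run from a second client while the query
                                -- executes; watch the State column for
                                -- "Copying to tmp table on disk"
```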

One last suggestion - if you don't like the SQL Views pumps out then it can be changed with hook_views_pre_execute. You can rewrite all the SQL there, just make sure it returns the fields the View expects.

At the end of the day I reckon mikeytown2's Solr suggestion is possibly the only viable solution for you as it allows fast lookups on a number of indexes. However, try these suggestions first as you may be able to get it sorted - no pun intended. Good luck!

Managing Director
Morpht

The main reason for this is

dalin's picture

The main reason for this is that MySQL can only use one index per table during joins/filtering/sorting.

I don't think this is the case for MySQL >= 5.1
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html

You could go to 5 but I think MySQL starts to complain when it gets beyond that.

I don't think this is specifically a MySQL issue; rather, it's a limitation of SQL. Each JOIN makes the query exponentially more complex. 31 JOINs will never perform acceptably fast. I agree with murrayw: finding ways to remove the JOINs will be your best option.

Another problem that you won't be able to get around is that the query is filtering on one table (well more than one, but that's beside the point) and sorting on another. In this scenario MySQL must create a temporary disk table. The only way to avoid this in MySQL is to use David Strauss' materialized views. But then you may as well just use Solr.
http://drupal.org/project/materialized_view

If moving the fields back into one table can't be done naturally then you might consider introducing some redundant data into the table to aid lookup. This is a PITA and not recommended but it might work if it is the indexes which are causing you the grief. Look at hook_nodeapi update/insert to keep them synched. Not too hard.

This sounds like a really bad idea.

And remember to run RESET QUERY CACHE; just before your SELECT or else you will be fooled into thinking your query is fast when it isn't.

That method isn't reliable, because after you reset the query cache Drupal could run the query again before you run your test query from the CLI. The better way is to use
SELECT SQL_NO_CACHE ...
That way your production site won't suffer the performance hit of an empty query cache.
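A small sketch of the SQL_NO_CACHE form, with a made-up filter for illustration (the 'hotel' content type name is an assumption):

```sql
-- Bypasses the query cache for this statement only; the cached copy of
-- the result stays available to the live site.
SELECT SQL_NO_CACHE n.nid, n.title
FROM node n
WHERE n.type = 'hotel'   -- hypothetical content type name
LIMIT 10;
```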

At the end of the day I reckon mikeytown2's Solr suggestion is possibly the only viable solution for you as it allows fast lookups on a number of indexes. However, try these suggestions first as you may be able to get it sorted - no pun intended. Good luck!

I will be very surprised if you manage to get this to an acceptable performance level. SQL simply cannot perform well on complex faceted searches like this. Solr is the alternative that is best integrated with Drupal.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Thanks a lot, guys!

inductor's picture

I've installed Solr, imported my CCK fields into it, and reindexed my nodes. The results are simply amazing! For example, the query that choked MySQL is no problem for Solr - it returns the answer in a split second. So now I'll drop Views and write a custom UI for my site.

Thanks again!

High performance
