I'm running a high traffic Drupal site and having trouble with large MySQL binary logs being created (up to 1Gb every ~30mins at peak times). MySQL server disk space gets filled up before logs can rotate so the site falls over.
The site is multi-lingual, with normal-mode page caching enabled. I wondered whether any other High Performance Drupal ninjas had any experience with such problems. Any help or insight much appreciated.
I've implemented a belt-and-braces cron job to purge the binary logs older than X time, which will keep us going for the time being. If necessary I will turn off the MySQL binary logs completely, but I'm reluctant to do that since we'll have to put in place some other procedures to protect us against db failure.
The problems seem to be similar to those outlined on the Agaric Design blog - http://agaricdesign.com/note/mysql-binary-logs-death - but there is no definite solution on offer there, other than turning off the log-bin.
Site set-up:
- Ubuntu 7.10 Gutsy Gibbon
- Drupal 5.7
- Apache/2.2.4 (Ubuntu)
- PHP/5.2.3-1ubuntu6.3

Comments
An idea, use different MySQL for cache tables
Maybe you could setup a different MySQL server to host the cache tables. On that server, you could disable the binlog. The cache tables are accessed from cache.inc, which can be replaced from settings.php, so you could create a version of it that uses a different DB server to do its job.
Depending on what the site is doing, it might be updating variables, or other big records for each page, so every change in the DB has to be logged into the binlog. You could use the devel module to see which SQL statements are executed for each page, so this could give you an idea of what's going on.
Or... if you don't have a procedure to performs point in time recoveries, which is the main benefit of using the binlog, then you could disable the binlog completely. I think it depends on whether it can be assumed data loss since last full backup.
Different DB Same Server
You could follow what Markus said, but also do it on the same db server. Just add a new database, then do a binlog-ignore-db=cache_db. The other option, if you have the memory, is to use something like cache_router then set up memory based caching. If your using APC for op-code caching, I would suggest using that for Drupal caching, as APC seems faster then memcached, but it's only good if your running on a single Apache server. Otherwise you need to go with a memory caching type server that can share the data.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
Exclude watchdog, etc.
I would inventory your tables and see which one do not require replication.
You may loose some administrative functionality which you can correct by have an instance of Drupal pointed to your master for both reads and writes.
Check for example how often you are writing to watchdog -- I would exclude that.
You may have other high update tables such as accesslog.
If you want to trade more notes ping me.
Or just nuke them
You don't have to keep those logs around to save you from failure if you're replicating somewhere, which makes backing up easier anyway. So you know, you're not the only person who chose to simply delete logs on a regular basis; we simply delete all but the last 3 at any time.
Check also that this issue
Isn't causing your cache to be constantly refreshed.
http://pajamadesign.com/2008/06/29/the-t-that-took-down-a-webserver/
benjamin, Agaric Design Collective
benjamin, agaric
Binlogs don't allow table exclusion
"Exclude watchdog, etc."
I am in the same situation as joe-b, and we thought of limiting the tables to be included, but from what I understand the binlogs are by database only and cannot select only some of the tables.
162 Unlimited - Drupal Development and Web Design
binlog exclude table
binlog-ignore-db=table list...
what is binlog for?
MySQL binlog is to keep an audit trail for ALL DB writes.
ALL!!!
If you want to convert that to ascii and then parse out the statements for particular tables, that would be fine.
Otherwise, you risk people using an incomplete binlog to restore a DB! NIGHTMARE!!!!!!! :-O
Jeremy Donson
Database and Systems Engineer
New York City
binlog for backups?
If you are using binary logs, like running through them query by query for backups, your problems are already deep. A site with even medium sized traffic that involes writes will outscale the point where you can keep a whole set of binlogs very quickly. If the Drupal cache needs to go into a backup, something that is easily rebuilt, then I wonder if that makes sense.
a way not use binlog
I got same problem.if you can use 2 server, then my solution is use drbd+lvm+mysql. In this way you can backup all db by snapshot,and get a standby server. Noticed it will take more diskIO and networkIO.
drbd
I have seen examples of that, it can work, but it is more plates to keep spinning vs using Percona-InnoDB and using their Hotbackup(they call it xtrabackup) tool. I know there is reason to avoid InnoDB on older Drupals. I think that has largely gone away on 7, but ymmv. What you get for the switch though is a tool that in most cases can backup from a running server with very light overhead(a final lock near the end to get the last few transactions).. Even if you can not do InnoDB, I would still tend towards using a slave machine for that second machine vs drbd, mainly for the plate spinning reason and the disk and network effect are smaller and fairly consistent. Being able to have a snapshot approach is another point though that is a little more difficult with a replication slave(not that you couldnt do it). I would be interested in how you choose your snapshot periods and why.
Thanks,
Eric
It's not necessary to switch
It's not necessary to switch MySQL to Percona, XtraBackup will work with plain-old InnoDB as well.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his