Load Balanced Boost?

mburak's picture

Hi guys,

I've been working with the Boost module for a long time on a single server and have had no problems. Our traffic is increasing day by day and I'm thinking of adding a second Apache server to make a load-balanced environment. But here is my concern: what's the best way to configure Boost so it still works OK? Is NFS for the "cache" folder a good approach? What do you think?

Comments

If you're still using Apache,

jcisio's picture

If you're still using Apache, switch to nginx. In my environment, Apache can handle a maximum of 1,000-1,200 req/s (with mod_cache), but nginx can go over 13k req/s.

You will be surprised!
http://groups.drupal.org/nginx

Have a look at my server 15 hours after setting up nginx (Apache is still the backend; nginx just serves CSS/JS and Boost-cached pages):
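For anyone wanting to try the same thing, here is a minimal sketch of that kind of front end; the paths, port and Boost cache layout are assumptions, not the exact config used above. nginx serves static assets and Boost's cached HTML straight from disk and proxies everything else to Apache.

server {
  listen 80;
  server_name example.com;
  root /var/www/html;

  # Serve CSS/JS and other static assets directly.
  location ~* \.(css|js|gif|jpe?g|png|ico)$ {
    expires 30d;
  }

  # Try Boost's page cache first, then the filesystem, then Apache.
  location / {
    try_files /cache/normal/$host${uri}_$args.html $uri @apache;
  }

  location @apache {
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
  }
}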

Apache with mod_php?

kbahey's picture

Are you using Apache with mod_php?

If so, it's no wonder.

Apache can also be configured as a threaded server with very low overhead, using mod_fcgid to run PHP.

See our article "Apache fcgid: acceptable performance and better resource utilization" for details.
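A rough sketch of what that looks like in Apache config, with the worker MPM plus mod_fcgid handing PHP off to php-cgi (the numbers and paths here are illustrative, not the article's exact settings):

<IfModule mpm_worker_module>
  StartServers          2
  MaxClients          150
  MinSpareThreads      25
  MaxSpareThreads      75
  ThreadsPerChild      25
  MaxRequestsPerChild   0
</IfModule>

# Run PHP through php-cgi via mod_fcgid instead of embedding mod_php.
AddHandler fcgid-script .php
FcgidWrapper /usr/bin/php-cgi .php

<Directory /var/www/html>
  Options +ExecCGI
  AllowOverride All
</Directory>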

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.

Sorry for still being off topic.

jcisio's picture

Sorry for still being off topic. I've read your article several times ;) So I'll give some stats here: almost no change in memory usage, but netstat connections peak (I don't know why; maybe the reverse proxy), and CPU iowait is very low.

So it seems that nginx is a good choice. In your benchmark you only got up to 700 req/s, but I get 1,300 req/s with Apache (specs: X3220, 4 GB RAM, single 7200 rpm SATA disk). And nginx blows them all away.

Off topic ...

kbahey's picture

I think you should post these to another topic about nginx or fcgid, and then we can discuss it there.

Otherwise, we are taking the discussion away from the original topic and not being respectful of the original question.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.

No, I use mod_fcgid with

jcisio's picture

No, I use mod_fcgid with suPHP, but Apache uses the prefork MPM as recommended by cPanel. I haven't benchmarked the worker MPM either.

In fact, I made two optimizations at the same time: MySQL optimization of Views queries and switching to nginx on the frontend. So it's not only nginx that made this happen.

NFS or rsync

kbahey's picture

Actually, either will work, NFS or rsync. The devil is in the details though.

With NFS you can add CPU load on the hosts, as well as lag in serving the files. It depends on your specific setup (data size, network speed, number and speed of CPUs, configuration, etc.).

Another option is to rsync the cache directory from cron every minute or every 5 minutes.
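For example, a crontab entry along these lines would push new Boost cache files to a second web node every minute (the paths and hostname are assumptions; expirations and cache flushes still need their own handling, which is where the details bite):

# Every minute; use */5 in the first field for every 5 minutes instead.
* * * * * rsync -a /var/www/html/cache/ web2:/var/www/html/cache/ >/dev/null 2>&1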

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.

Thread & Ideas

mikeytown2's picture

Here is a d.o thread on the subject:
http://drupal.org/node/660598

  1. What is being proposed is some sort of framework that makes the database aware of which server has which cached file. If a server gets a request for a file that is cached but not present locally, it can copy that file locally, inject the HTML contents into the PHP output buffer, and close the connection. The next request will not hit PHP.
  2. The other option is to make each cache write hit all servers. This is fine since, for an uncached hit, the DOM is sent and the connection closed before the files are written.

Cache flushing would need to tell the other servers which files are going to be nuked. Use an MD5-hashed random string for authentication (like I use in the crawler).
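A hypothetical sketch of what that flush notification could look like, not anything Boost actually ships: the originating server tells each peer which cached file to drop, authenticating with an MD5 hash of a shared random string plus the file path. The boost-flush.php endpoint, hostnames and paths are made up for illustration.

SECRET="long-random-shared-string"
FILE="/cache/normal/example.com/node/1_.html"
TOKEN=$(echo -n "${SECRET}${FILE}" | md5sum | cut -d' ' -f1)
for peer in web1 web2; do
  # Each peer recomputes the token from the same shared secret before unlinking the file.
  curl -s -G "http://${peer}/boost-flush.php" \
    --data-urlencode "file=${FILE}" \
    --data-urlencode "token=${TOKEN}"
done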

Any other ideas on the best way this could work? I like option 2 right now.

Interesting Article - tmpfs

mikeytown2's picture

This is for WP Super Cache, but since the two work the same way, it should cross over nicely:
http://www.askapache.com/web-hosting/super-speed-secrets.html

I only quickly skimmed this, but it looks interesting.

But they still use only one

jcisio's picture

But they still use only one server, don't they? nginx can serve Boost's static cache directly from memcached, too.

Technosophos reports a 53,900% speedup, and with a small modification it acts like nginx => memcache => Boost => Drupal, so the server doesn't go down again after a reboot, when there is nothing in memory and thus no cache.
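A minimal sketch of that memcached front, assuming the cached page bodies are stored under keys built from host plus request URI (the key scheme, port and fallback name are assumptions):

location / {
  set $memcached_key "$host$request_uri";
  default_type text/html;
  # Serve the page straight from memcached if the key exists ...
  memcached_pass 127.0.0.1:11211;
  # ... otherwise fall through to the on-disk Boost cache / Apache backend.
  error_page 404 502 504 = @fallback;
}

The @fallback location would then do the usual try_files against Boost's cache directory before proxying to Apache/Drupal.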

I'm sceptical whether tmpfs

dalin's picture

I'm sceptical whether tmpfs will make Boost faster or more scalable. The author of that article assumes that because clearing the cache was 30x faster, page loads should be too. The OS will cache commonly used files, which is effectively the same as using tmpfs, except that tmpfs locks a portion of RAM for this. OS file caching only happens if there's RAM available, which in my mind is the preferable way to do it. You don't want to be blocking RAM from more intensive processes (e.g. MySQL). I'd need to see benchmarks to be convinced.
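For reference, the tmpfs approach being debated boils down to something like this (the mount point and size are illustrative):

# Put the Boost cache directory on a fixed-size RAM-backed filesystem.
mount -t tmpfs -o size=256m tmpfs /var/www/html/cache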

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

We've been running drupal in

jonvk's picture

We've been running Drupal in a distributed environment, and we have indeed found that putting nginx in front of Apache to serve Boost's cached files helped enormously. We are still sharing the cache and files directories using GlusterFS to have them replicated. I've written a blog post about it here: http://evolvingweb.ca/story/drupal-cloud. The gist of it is that the cache files are replicated on all Apache and nginx nodes, and adding extra nodes on the fly becomes quite easy.

On the other hand, if you are only having trouble with anonymous users, we have found that nginx's bottleneck is typically not CPU usage but simply network bandwidth; indeed, it can serve 10k-16k requests per second of cached files from Gluster. Adding Apache nodes helps for logged-in users.

Sweet Setup

mikeytown2's picture

Sweet setup! How much slower is the cluster with GlusterFS vs ext3? Also, at what point does adding more compute nodes to Gluster affect nginx and the cache? Would a memcache layer be better than disk? The problem with a memory cache is that when you restart, you lose the cache.

want to know glusterfs vs

joetsuihk's picture

I want to know GlusterFS vs ext3 also. GlusterFS will add complexity to server admin. Is it a huge difference?

GlusterFS config

tjwallace's picture

jonvk and I worked together on the above-mentioned deployment, and I specifically worked on the configuration of Gluster. The complexity of it depends on a few things, but mostly on how often your server infrastructure changes. Every time you add a new Gluster server, you have to edit all of the client config files. All of the replication and distribution is handled by the clients; the server is fairly "dumb" and just acts as a place to store data, and all of the replication configuration is defined on the clients. Below is a snippet from a Gluster client config.

volume vol-0
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume iothreads
  option username glusterfs
  option password glusterfs
end-volume

volume vol-1
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume iothreads
  option username glusterfs
  option password glusterfs
end-volume

volume mirror-0
  type cluster/replicate
  subvolumes vol-0 vol-1
end-volume

Here you can see that I have two volumes, vol-0 and vol-1, coming from servers server1 and server2. These volumes are then mirrored into the mirror-0 volume, which is what ends up being mounted on the file system. If I add another server, I would have to add another volume to the config (see the sketch below). In our case we have used Puppet to automate most of this for us, so adding a new web node is as easy as editing a Puppet config file and then spinning up and registering a new machine. The config changes get picked up by the other nodes when they check in with the Puppet master.
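For instance, adding a hypothetical third server (server3) would mean appending another client volume and listing it in the replicated subvolumes, along these lines:

volume vol-2
  type protocol/client
  option transport-type tcp
  option remote-host server3
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume iothreads
  option username glusterfs
  option password glusterfs
end-volume

volume mirror-0
  type cluster/replicate
  subvolumes vol-0 vol-1 vol-2
end-volume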

I have just started playing around with Chef, and adding new web nodes becomes even easier. Chef has the ability to get a list of all the nodes acting as Gluster servers (more specifically, to search their attributes) and build the config file from that information. I plan on publishing my cookbooks to GitHub soon to get feedback and for others to check out.

Tools like Puppet or Chef almost become a necessity when dealing with distributed systems. They keep configuration consistent and make setting up new boxes easy.

Chef Cookbooks

tjwallace's picture

As a follow-up, I have posted our cookbooks on GitHub here. I can work on getting our Puppet modules up as well if anyone is interested.

Puppet modules

tjwallace's picture

Some of our Puppet modules have been posted to GitHub as well: http://github.com/evolvingweb

glusterfs is slower, but there is an nginx workaround

jonvk's picture

In some quick benchmarks, GlusterFS is ~20 times slower than ext3 for both reads and writes with small files (10-500 KB), and about 40 times slower when nginx is reading directly from it. This is using full replication.

However, the trick we use is to set nginx's document root to the brick, so it reads off the ext3 file system. Since every read or write to the GlusterFS volume will update the brick, nginx will be up to date. If nginx gets partitioned from the network, it can also still serve the stale data, and once it comes back, running an ls -R on the GlusterFS volume or remounting GlusterFS will sync everything up again.
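Concretely, it looks something like this on each web node, with assumed paths: Apache and Boost write through the GlusterFS mount so the cache replicates, while nginx's root points at the local brick so reads come straight off ext3.

# /mnt/boost-docroot -> GlusterFS mount (Apache/Boost write the cache here)
# /export/brick1     -> this node's local ext3 brick backing the same volume
server {
  listen 80;
  server_name example.com;
  root /export/brick1;
  location / {
    try_files /cache/normal/$host${uri}_$args.html $uri @apache;
  }
  location @apache {
    proxy_pass http://127.0.0.1:8080;
  }
}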

As for using memcache for the cached files, the other disadvantage is that the cache must be rebuilt if a server goes down or is added, since the hashing is done over the server pool. I've also read in many places that it's better to leave memory caching to the OS, so for now I haven't experimented with that.

Some benchmarks with GlusterFS (4 nodes, all on Rackspace 2 GB instances). We haven't played with more than 4 nodes, so GlusterFS was never much of a problem.

Using ab to benchmark the same page being read by nginx from both ext3 and glusterfs:

ext3:
17k requests/second
393 MB/second

glusterfs:
435 requests/second

File system I/O (with PostMark)
Current configuration is:
The base number of files is 200
15 subdirectories will be used
Transactions: 8000
Files range between 9.77 kilobytes and 488.28 kilobytes in size

Ext3
Data:
1284.86 megabytes read (183.55 megabytes per second)
1349.71 megabytes written (192.82 megabytes per second)

GlusterFS
Data:
159.17 megabytes read (7.96 megabytes per second)
206.96 megabytes written (10.35 megabytes per second)