Apache server configurations for high availability, load balancing

Amazon's picture

Currently Drupal.org has multiple web servers that I believe are synchronized to load balance and provide a degree of high availability.

I believe this is done by rsyncing the file directories served by the web servers. I know there has been criticism of this design on the infrastructure list. I am curious what configurations people are recommending for Apache servers. I am assuming the databases are on separate servers, and that different web servers are used for static or cached pages versus dynamic pages.

Are people using Rsync, NFS mounts, S3?

What has worked best for you and why does it meet your needs?

Kieran

Comments

Mirroring with rsync

Amazon's picture

I believe this is how D.O. is configured. Please correct me if this is incorrect.

http://www.howtoforge.com/mirroring_with_rsync

Kieran

To seek, to strive, to find, and not to yield

Support the Drupal installer, Install profiles, and module install forms: http://www.youtube.com/watch?v=COg-orloxlY
http://ia310107.us.archive.org/1/items/organicgroups_og2list/dru

Yes, that is how D.O. is configured

Boris Mann's picture

And there are issues every time there is an update, and issues whenever you add new sites and don't set up Rsync correctly.

NFS is another point of failure, yes, but if you do a mount from a SAN or a NAS, it's also highly redundant storage, so you're not relying on the internal disks on your cheapo web front ends.

We use NFS/SAN too, but not on Drupal, yet.

markus_petrux's picture

We also use NFS, installed on a simple PC with 5 volumes (10GB each) connected to a SAN and then shared to the LAN. The bottleneck of this NFS server is, of course, its interface to the LAN, but we haven't come close to that limit, not even during backups, which are dumped to local disk, gzip'd and moved to NFS.
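For reference, the server side of such an NFS share is a one-line export; the path and subnet below are assumptions, not their actual configuration:

```shell
# Hypothetical /etc/exports entry on the NFS box: share the SAN-backed
# volume read-write with the private LAN:
#
#   /srv/san/share  192.168.1.0/24(rw,sync,no_subtree_check)
#
# After editing, re-export and verify:
#   exportfs -ra
#   showmount -e localhost
```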

The site is pretty big: during September we had about 45 million page views and 1.9 million unique visitors. There are 2 separate databases (on 2 separate MySQL boxes, no master/slave, all tables InnoDB); one DB is about 5GB, the other a bit less.

Our problem with the NFS filesystem is that we have millions of static pages and images stored on it. That makes a huge directory structure, which is really hard to manage.

This site is NOT using Drupal, yet. It uses a proprietary CMS that's causing some other headaches, so I'm fighting those until we get enough stability, and planning to move to Drupal. I hope our numbers may also help in this context.

Cheers

GFS

meba's picture

Red Hat GFS is an option too, plus MySQL master/slave replication. I tested it once and it worked; you can contact me for details.

NFS failure equals web server death

Amazon's picture

Ok, I asked around. The complaint with NFS is that when it fails, all your web servers fail. I guess you need really good NFS infrastructure.

I am still researching this issue.

Kieran

To seek, to strive, to find, and not to yield

Support the Drupal installer, Install profiles, and module install forms: http://www.youtube.com/watch?v=COg-orloxlY
http://ia310107.us.archive.org/1/items/organicgroups_og2list/dru

will123195's picture

You can use rsync for the code base, but most people will probably be using SVN updates to keep the code base in sync. As for web app data (like user images), I like to take advantage of the database for distributing images and other files in real time, sometimes actually storing binaries in the database and then saving the data to the distributed file systems when needed. For things like video, fancier coding in the application layer can intelligently distribute files on demand... instead of hoping the rsync has been done. Here is a link to a network diagram of my preferred server architecture.

NFS

moshe weitzman's picture

I have worked with Bryght on a large project and we used NFS. rsync is OK if you have sticky sessions enabled on the load balancer; otherwise an uploaded file will be unavailable if you happen to switch web servers after upload.

Rsync and Unison

techsoldaten's picture

We use rsync for actually loading files and unison for syncing changes between servers. While unison is really no longer supported, it is still a great tool and is good at maintaining changes across servers over time.
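For context, unison is driven by a profile rather than a script. A minimal profile for two-way sync between two web servers might look like the following; the hostnames and paths are hypothetical, not the setup described above:

```shell
# ~/.unison/default.prf -- hypothetical two-way sync profile
#
#   root = /var/www/html
#   root = ssh://web2.example.com//var/www/html
#   batch = true      # run unattended, no interactive prompts
#   prefer = newer    # on conflict, keep the most recently modified copy
#   times = true      # propagate modification times
#
# Run it (e.g. from cron) with:
#   unison default
```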

M

Post your unison script

pearcec's picture

Can you post your unison script for reference?

Christian


We use rsync at

rkerr's picture

We use rsync at StandardInteractive but currently have the luxury of sending all our site editors to a single "edit" site.

we hacked the file.inc

slantview_old's picture

We FTP every file that is uploaded to a second, redundant load-balanced server. Once we add another server we are going to have to re-evaluate, most likely moving to an NFS shared directory.

Steve Rude

update

slantview's picture

For the site where we had hacked file.inc, this has now been moved to an NFS setup. Now all changes are instantly propagated to all of the front-end web servers.
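The per-front-end side of such an NFS setup is usually just an fstab entry. The server name, export path, and options below are assumptions:

```shell
# Hypothetical /etc/fstab line on each front end: mount the shared
# files directory from the NFS server:
#
#   nfs1:/export/files  /var/www/html/sites/default/files  nfs  rw,soft,timeo=60,retrans=3  0 0
#
# 'soft' bounds how long a dead NFS server can hang requests; 'hard'
# (the default) retries forever but never returns partial I/O errors.
```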

steve

What non-interactive websites like TheOnion do?

sgottlieb's picture

What do the really large, less interactive websites (like www.theonion.com) do? Do they use the Boost module? Or do they wget the whole site to a bunch of static web servers?

For more interactive sites, is anyone using a hardware based SAN (storage area network)?

Thanks, Seth

cache

slantview's picture

After a small amount of research I found some publicly available information about theonion.com. In a press release dated 11/2005, Mirror Image (www.mirror-image.com) announced that they were serving theonion.com on their content delivery network.

I connected directly to the servers and they are serving up their content via Red Hat Linux (looks like RHEL4) running Apache 2.0.52 and PHP 5.1.5. They appear to have load balanced servers (at least 3, likely more) running at Rackspace.com for their host. For a pretty static site without a lot of interaction, they can easily have mirror image cache the site and make browsing very fast and have low overhead.

For a more interactive site you might want to split up the content for your site and have all static content served out of a fast web cache and have all php driven content served out of a load balanced application server farm.

For scaling the database layer you would have to rewrite the backend and split all read/write functionality. This could be accomplished by creating a new database.cluster.inc file. You would then send all writes to the master DB (or master-master DBs) and balance all reads across the slaves. For more information, see the MySQL replication page.

For the storage layer, you can use a NAS or SAN at first, but eventually you will hit a roadblock, and that roadblock is called disk I/O. This is a much larger problem, and you will likely have at least 20 million pageviews per month at this point (possibly more). This can be solved several ways (MogileFS, GoogleFS, or roll your own).

A lot of this information can be found in Cal Henderson's book (of Flickr fame), "Building Scalable Web Sites", or in various sources on the internet.

Here are the headers from their own app/web servers:

HTTP/1.1 200 OK
Date: Fri, 26 Jan 2007 17:56:38 GMT
Server: Apache/2.0.52 (Red Hat)
X-Powered-By: PHP/5.1.5
Last-Modified: Fri, 26 Jan 2007 16:25:37 GMT
ETag: "6dc8dd7b72b755267057a5fb49af7a6a"
X-Generator: 94676-migrationcontent3
Expires: Fri, 26 Jan 2007 18:16:42 GMT
Connection: close
Content-Type: text/html; charset=utf-8

and here are the headers from their content delivery network:

X-Powered-By: PHP/5.1.5
Last-Modified: Fri, 26 Jan 2007 16:25:37 GMT
X-Generator: 94676-migrationcontent3
Accept-Ranges: bytes
Cache-Control: public
Date: Fri, 26 Jan 2007 17:36:32 GMT
Etag: "6dc8dd7b72b755267057a5fb49af7a6a"
Expires: Fri, 26 Jan 2007 17:56:32 GMT
X-Message: ret x-front
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Via: 1.1 ics_server.xpc-mii.net (XLR 2.3.0.2.23a)
Age: 278
Content-Length: 45486

*** disclaimer ***
I am in no way affiliated with theonion.com, cal henderson, or mysql.com.

steve

a few options

moshe weitzman's picture

For the flat HTML site, try http://drupal.org/project/fastpath_fscache or a Squid proxy in front of Drupal. Squid might require some Drupal changes to HTTP headers; I haven't tried this.

I have had clients using a SAN, and that's nice, fast disk access. Sometimes disk access can be the bottleneck. I can't really comment on details of their setup though; it was a black box to me.

TexasNomad's picture

THIS IS NOT A DRUPAL SETUP but soon will be.

At my day job,

We have 2 clusters- one for the loadbalancers, one for the back-end NFS mounts. (The NFS mounts are only used for big FTP files)

Everything is replicated every hour (or on demand by a dev) by running an rsync script on the active back-end cluster node, which pushes to the 4 web servers: basic LAMP stacks on CentOS. This setup has worked very well for 5 years, with very little downtime, even with the single point of failure at the DB server. Our next rollout will include MySQL Cluster of some kind, alleviating that single point of failure.

We did have a slave mysql server running on each web server but it became hard to manage.

In considering a move to Drupal, I am concerned about some of what I am reading about SQL locks, as we ran into those and similar issues with osCommerce. We moved some tables to InnoDB storage and tweaked some SQL code in osCommerce. We found some design flaws in that product: the stats for each item were kept in the same table as the item itself, meaning that to update an item's view count we had to lock the table. Readers block writers and writers block readers, so it brought the cart to a slow crawl.

anisotropic's picture

community.activestate.com is running on a single box that is otherwise part of a 3 node cluster that serves our other UNIX apps. We tried loading Drupal (4.7) on all of the boxes but ran into two problems:

  1. File uploads break. We had a fancy solution where all file-upload POSTs and download requests were directed at a single server through redirection, but this didn't work. NFS and rsync solutions are prone to failure, and it is hard to say what the best solution is. I had thoughts of triggering rsync jobs as needed via an Apache not-found handler.

  2. MySQL (5) master <--> master replication breaks due to problems in the Drupal data model, I think specifically with the cache and session tables.

For a small-to-medium-size business like ours, having a cluster in this configuration is ideal for redundancy with relatively few front-end boxes. Master <--> master replication in this case is also ideal because the data is redundant and you eliminate the point of failure of having your network connection die on a dedicated MySQL server box. Having the MySQL database running locally on each machine is just plain faster and less prone to breakage. For this site in particular, speed is not the issue; redundancy is.

NFS/rsync ok

moshe weitzman's picture

I know many sites successfully using NFS or rsync to share the files directories. No big deal there. NFS is ideal but rsync every 1-2 minutes works too.

edit: rsync is only ok if it is acceptable for uploads not to appear on node views on other web servers for a brief period.

hrmz, hacky...

anisotropic's picture

Our IT department has dismissed NFS as hacky and prone to breaking other sites running on each machine. rsync, and particularly spawning an rsync job from an Apache 404 event for the missing file, makes a little more sense to me.
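The 404-triggered idea could be wired up roughly as below. The paths, script name, and master host are all hypothetical; REDIRECT_URL is the environment variable Apache sets for ErrorDocument handlers.

```shell
# Hypothetical Apache wiring: route 404s under /files to a CGI that
# pulls the missing file from the master, then retries the request.
#
#   Alias /files /var/www/files
#   <Directory /var/www/files>
#       ErrorDocument 404 /cgi-bin/fetch-missing.cgi
#   </Directory>
#
# fetch-missing.cgi (sketch):
#
#   #!/bin/sh
#   FILE=${REDIRECT_URL#/files/}
#   rsync -a master.example.com:/var/www/files/"$FILE" /var/www/files/"$FILE"
#   echo "Status: 302"
#   echo "Location: $REDIRECT_URL"
#   echo
```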

I think if we dug into it we could come up with a better solution for static files, but there isn't much point without master/master replication capabilities. If we only have the database on one server, we still have a single point of failure and a reliance on networking to serve content. Our idea of redundancy is for each of our three front-end servers to be completely independent.

iSCSI SAN

blender1968's picture

One option is to use an iSCSI SAN with a parallel file system that allows concurrent read/write access. Someone mentioned GFS; there is also OCFS2. Each web node mounts the device, which is formatted with an OCFS2 filesystem. Make sure the SAN itself is fault tolerant.
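On Linux, the per-node side of this might look roughly like the following open-iscsi and ocfs2-tools commands. The target name, portal, and device are placeholders, and OCFS2 additionally needs its o2cb cluster stack configured on every node before mounting:

```shell
# Hypothetical commands on each web node:
#
#   iscsiadm -m discovery -t sendtargets -p san.example.com   # find targets
#   iscsiadm -m node -T iqn.2007-01.com.example:web-storage -l # log in to one
#
# Format once, from a single node, with slots for 4 cluster nodes:
#   mkfs.ocfs2 -N 4 /dev/sdb1
#
# Then mount the same device on every node:
#   mount -t ocfs2 /dev/sdb1 /var/www/shared
```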

Cheers

iSCSI Enterprise Target?

markus_petrux's picture

Looks interesting, though it might be a bit expensive... Could a software iSCSI target such as iET work as an alternative? Not an alternative to an iSCSI SAN, of course, but maybe to NFS? :-)

http://iscsitarget.sourceforge.net/

Hosting Provider should offer iSCSI SAN option

blender1968's picture

Your hosting provider should offer iSCSI access to their SAN. Since you access a block device, you can format it with the filesystem of your choice; in this scenario, OCFS2.

Would be costly to build your own or buy an appliance.

It's pretty easy to do a proof of concept with iET in your own sandbox; a hosting provider will have a commercial SAN (with an iSCSI target).
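A sandbox iET target can be described in a few lines of /etc/ietd.conf. The target name and backing file below are made up:

```shell
# Hypothetical /etc/ietd.conf: export a sparse file-backed LUN.
#
#   Target iqn.2007-10.net.example:storage.lun1
#       Lun 0 Path=/srv/iscsi/lun1.img,Type=fileio
#
# Create a 10GB sparse backing file and start the target daemon:
#   dd if=/dev/zero of=/srv/iscsi/lun1.img bs=1M count=0 seek=10240
#   /etc/init.d/iscsi-target start
```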

Cheers

Not our hosting provider :-(

markus_petrux's picture

We're in Spain, and happy with our hosting provider, but... they don't offer this kind of service, at least not yet. Maybe we have to push them a little bit... :-)

iSCSI Enterprise Target + GFS + 1Gbps Ethernet?

markus_petrux's picture

I'd like to find a way to store static content better/faster than NFS, which has problems because changes are not reflected synchronously among all NFS clients. Decreasing cache times is not possible because of the big performance penalty, at least in our environment. As for connections, we have a private LAN (1Gbps Ethernet) and 11 servers that need to share a common storage resource.

I'm even thinking of using a dedicated MySQL box for caching static content and things that would fall into Drupal's files directory, such as user files/images, plus a proxy cache in accelerator mode in front, which is something we already use and which works pretty well (Squid). MySQL with a big enough InnoDB buffer pool is quite fast. I think it could be a valid and scalable storage alternative as well.

Then, I've also been looking at iET, though if you need to share a target read/write, you need something like GFS. However, I have no experience with GFS; I've been reading a lot about it recently. It requires Red Hat Cluster installed on all boxes, which makes the whole infrastructure more complex. I could try to get a couple of boxes to play around with, but I'm not sure it would really be worth it.

I think we'll finally use MySQL as file storage for things that change often (caching, user files, etc.) and keep NFS for stable files, such as programs, theme images, JS and CSS.

Does anyone have experience with iSCSI Enterprise Target + GFS + 1Gbps Ethernet? I haven't found anything out there about this particular combination. Our current hosting provider does not offer any similar iSCSI/SAN solution. :(

Cheers

MogileFS?

blender1968's picture

You might consider MogileFS. You would most likely incur some development burden, as you access the storage via an API, but it might be a potential solution.

Excerpt from the Wikipedia entry:

It is designed for high volume applications, such as high traffic websites, to spread storage across cheaper machines without relying on technologies such as NFS.

Cheers

AmazingGroups.com

Netzarim's picture

We are currently deployed with multiple Apache servers with memcache, running as VMs behind a load balancer with caching. The servers are images of each other, with changes to IP and server name. Apache maps to a common NFS mount holding all the site information. We have one backup NFS server that stays offline with the exact same config as the production one, and we make backups of the export on a schedule. Although we currently run off a single MySQL server, we are looking to move to either a cluster or a master/slave arrangement, but are still determining suitability due to some cache being held in the database.

The biggest issue we've seen is that if NFS goes offline or loses connectivity for any reason, the NFS mounts need to be redone and Apache restarted. We have a script that monitors for this condition every few minutes.
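Such a watchdog can be sketched as below. The mount point, reliance on /etc/fstab, and restart command are assumptions, not their actual script; the default mount point is /tmp only so the sketch can run anywhere.

```shell
#!/bin/sh
# Watchdog sketch: check_mount asks for a directory listing with a
# 5-second cap, which hangs (and times out) on a dead NFS handle.
check_mount() {
  timeout 5 ls "$1" >/dev/null 2>&1
}

MOUNTPOINT="${1:-/tmp}"   # demo default; point at the NFS mount for real use

if check_mount "$MOUNTPOINT"; then
  echo "mount OK: $MOUNTPOINT"
else
  umount -l "$MOUNTPOINT"   # lazy-detach the dead handle
  mount "$MOUNTPOINT"       # remount using the /etc/fstab entry
  apachectl graceful        # restart Apache workers gracefully
fi
```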

All servers are VMs except the iSCSI server and MySQL. We are in the process of converting the MySQL server to a VM as well. The iSCSI target may go that way too, as a clustered solution, but I am not seeing a real advantage to that right now. Data storage for the VM system currently comes from the dedicated iSCSI target, using bonded gigabit ports with a large in-RAM cache for read-ahead.

This has allowed us to create a very portable system that is easily and quickly expanded based on potential growth. VMs can be brought up across several hardware platforms and if needed all on one.

We are considering some restructuring options, but are in the process of building a new VM host system right now. We would still like to see the site perform much faster, but we have tweaked beyond belief.

NFS plus local backups

adamfranco's picture

Another alternative (that I've detailed in this blog post) is to make use of an NFS share for file storage, but to also maintain a local read-only copy of the uploaded files on the file-system of each web-server. A script monitors the NFS share every minute and if it becomes unavailable, changes a symbolic link to point at the read-only backup. This setup provides the immediate availability of an NFS share under normal operation, but with graceful degradation (files are still readable, just not writable) if the NFS server goes down.

In our testing we found that if we yank out the network connection from the NFS server the web-servers hang for 2 minutes (the default 'soft mount' timeout for NFS) and then we get switched to the backup files. After reconnecting the network for the NFS server we are back to read/write on the NFS share within one minute.

This setup meets our requirement for high availability where we aren't too worried about super high traffic. Hope this can help someone in the same position. (My thanks to all for the previous comments in this thread that helped me to get to this solution.)
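The symlink switch described above can be sketched as follows. All paths are made up, and local directories stand in for the NFS mount and the backup so the sketch can run anywhere; in the real setup the script would run from cron every minute.

```shell
#!/bin/sh
# Symlink failover sketch: "files" is the path the web server serves;
# point it at the NFS share when the share responds, or at the local
# read-only backup when it doesn't.
NFS_DIR=/tmp/nfs_share        # stands in for the real NFS mount
LOCAL_DIR=/tmp/local_backup   # read-only copy kept on local disk
LINK=/tmp/files               # path Apache actually serves

mkdir -p "$NFS_DIR" "$LOCAL_DIR"
if timeout 5 ls "$NFS_DIR" >/dev/null 2>&1; then
  ln -sfn "$NFS_DIR" "$LINK"    # normal operation: read/write NFS
else
  ln -sfn "$LOCAL_DIR" "$LINK"  # degraded: read-only local copy
fi
readlink "$LINK"
```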

  • Adam

Presentation GlusterFS

ducdebreme's picture

File serving in a clustered environment is a very difficult thing. At DrupalCon Paris there was a very interesting presentation about high availability. They recommended GlusterFS for file sharing. Info about the presentation can be found here:
http://paris2009.drupalcon.org/session/performance-and-high-availability...
... and I hope the video will be available soon!
Stefan
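For reference, a minimal replicated GlusterFS volume looks roughly like this; the hostnames and brick paths are made up:

```shell
# Hypothetical two-node replicated volume (needs glusterfs-server on
# both nodes):
#
#   gluster peer probe web2
#   gluster volume create shared replica 2 web1:/export/brick web2:/export/brick
#   gluster volume start shared
#
# Then on each client, mount the volume over the files directory:
#   mount -t glusterfs web1:/shared /var/www/html/sites/default/files
```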

Video from drupalcon paris

pmcdougl's picture

Here is the video that ducdebreme mentioned in his post over 4 years ago. I'm posting it here for people who find this old thread.

http://archive.org/details/BuildingScalableHighPerformanceDrupalSitesInT...
