If you want to run multiple webservers for a Drupal site, you need to keep the code and the files directory in sync across all of them.
- For synchronizing the code, tools like Capistrano, drush deploy, Fabric, and Phing all work well. Sharing the code over a network file system usually performs poorly because the PHP files are re-read (or at least stat'd) on every page load, which is slow.
There have been a few discussions of this topic over the years.
- November 2008: Load Balanced Servers Questions - Suggestions focus on NFS plus symlinking the sites directory.
- March 2011: Best practice shared file system - NFSv3 is working but hitting some limitations; the poster was considering GlusterFS, MogileFS, NFSv4, and Lustre. Most commenters recommended GlusterFS.
- December 2011: GlusterFS and Drupal 7 horizontal scaling - Suggestions include keeping the code local but sharing the files directory (via NFS, lsyncd, sshfs/FUSE, or DRBD), and offloading the files directory to Varnish or a CDN so that each file request doesn't hit the shared directory.
- March 2013: Shared DocRoot For Redundant Web Servers - The idea was to put the Drupal root in S3 instead of using NFS; one commenter disliked the idea.
- November 2013: The Drupal module S3 File System adds the ability to store uploaded files in an Amazon S3 bucket by adding a new Drupal filesystem alongside the public and private ones.
- June 2014: The Drupal module Storage API adds the ability to store uploaded files in multiple containers, such as the local filesystem (which can be combined with other mounted directories), FTP, S3, and the database. Since it is an API, other contrib modules extend it with additional backends. Storage API adds stream wrappers and has a bridge to core file and image fields (including image styles). Because files can be stored in multiple containers, various configurations are possible: storing uploaded files locally first and then migrating them to other backends, providing backups and failover, or adding capacity by marking only a new server for population with new files while still serving existing files from the old backends.
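Several of the discussions above converge on the same pattern: keep the code local on each webserver and share only `sites/default/files` over NFS. A minimal sketch of that mount, using a hypothetical NFS server named `filer` and hypothetical export and docroot paths:

```shell
# /etc/exports on the NFS server (hypothetical host "filer"):
#   /export/drupal-files  10.0.0.0/24(rw,sync,no_subtree_check)

# On each webserver, mount the export over the files directory:
sudo mount -t nfs filer:/export/drupal-files /var/www/drupal/sites/default/files

# Or persist it in /etc/fstab:
#   filer:/export/drupal-files  /var/www/drupal/sites/default/files  nfs  rw,hard  0 0
```

This keeps PHP reads fast (code stays on local disk) while uploads written on any webserver are immediately visible to all of them.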
Comment on technology choices:
- GlusterFS: Used by Acquia and several other companies.
- NFS: Seems to be a commonly recommended solution. Drupal.org uses NFS.
- lsyncd: works well if you like Lua (its config files are Lua scripts).
- rsyncing in cron plus sticky sessions in the load balancer: good if you like rsync, but may not work well for aggregated JS/CSS unless it is served from a CDN.