Drupal 8: Load Balancing with HAProxy

AkshayKalose's picture

Hi, I am a Google Code-in 2014 student. I wrote a great blog post and recorded a screencast on setting up load balancing with 4 servers, and I wanted to share it with the community. :D

Here is the link: http://www.kalose.net/oss/drupal-8-load-balancing-haproxy/

Any feedback is highly appreciated!

Comments

a few thoughts

tloudon's picture

Hi Akshay,

I'm not sure where you are looking for feedback--the presentation or the content. Overall, I thought the presentation was good--the writing seemed clear and well-organized.

In terms of the content, I would probably not go w/ HAProxy and have not seen it used much in the wild. At the levels of traffic that require a load balancer, I have mostly seen companies use a hardware load balancer like F5 BIG-IP and/or Varnish, since Varnish offers simple round-robin load balancing plus very flexible, well-understood, and widely used caching. I do see HAProxy used quite a bit w/ Rails systems and understand it has quite a few more load-balancing features; the article would be more interesting to me if it went into HAProxy's relative merits. (I would probably gloss over the Drupal install and MySQL stuff as well, so the article would be more focused, but I can also see how you would want it for completeness.)
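For readers unfamiliar with the term, round-robin balancing just hands each new request to the next backend in a fixed rotation. The sketch below is only an illustration of that idea in Python, not how Varnish or HAProxy implement it; the backend addresses and the `assign` helper are made up for the example.

```python
from itertools import cycle

# Hypothetical backend addresses; they are not taken from the article.
BACKENDS = ["10.0.0.11:80", "10.0.0.12:80", "10.0.0.13:80"]


def assign(requests, backends):
    """Pair each request with the next backend in a fixed rotation."""
    rotation = cycle(backends)  # endlessly repeats the backend list in order
    return [(request, next(rotation)) for request in requests]


if __name__ == "__main__":
    for request, backend in assign(["/", "/node/1", "/user", "/node/2"], BACKENDS):
        print(f"{request} -> {backend}")
```

Note that plain round-robin ignores how busy each backend is, which is one place the extra balancing features mentioned above would come in.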

Additionally, using a network file share is IMO a must. Syncing, especially with a shell script, seems far less robust, and I wouldn't want to use that in a setup w/ more than 2 web servers. Most of the higher-traffic setups I've seen use GlusterFS, but I know of a few w/ NFS as well. The only major issue I've seen is when the entire Drupal root is mounted, so every PHP file read (i.e. every request) hits networked disk. I remember a disastrous setup a client had w/ 23 web servers and everything on NFS; the only thing that kept the site from crashing was that APC was set to cache the PHP until Apache was restarted, and their actual authenticated traffic was minimal too. That setup, coupled w/ the knowledge that a large part of the bootstrap process is spent including module files, makes me super wary of mounting anything but static assets. (See index.php -> drupal_bootstrap() -> _drupal_bootstrap_variables() -> module_load_all(TRUE). This runs drupal_get_filename() to list all of the modules and then runs drupal_load(), which PHP-includes the .module file.)
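To get a rough feel for how many code files would sit on networked disk if the whole Drupal root were mounted, something like the following Python sketch could be run against a docroot. The extension list and the `count_code_files` name are assumptions for illustration; it only counts files and says nothing about how many of them a given request actually includes.

```python
import os

# Extensions assumed to hold PHP code in a Drupal tree (an illustrative guess).
CODE_EXTENSIONS = (".php", ".module", ".inc", ".install")


def count_code_files(docroot):
    """Count files under docroot whose names end in a code extension."""
    count = 0
    for _dirpath, _dirnames, filenames in os.walk(docroot):
        count += sum(1 for name in filenames if name.endswith(CODE_EXTENSIONS))
    return count


if __name__ == "__main__":
    # Counts code files below the current directory.
    print(count_code_files("."))
```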

Minor nitpicks aside, I thought it was a neat article. How did you end up going w/ D8 and HA Proxy?

Cheers,
tim

for nfs or glusterfs mount

Andre-B's picture

For NFS or GlusterFS, mount only the sites/default folder and keep the regular files on the machines themselves. I had some issues with NFS (chmod, permissions, etc.) and stopped using it; if you have any information on that, let me know.

public files in D7

mikeytown2's picture

When it comes to public files in Drupal 7, I highly recommend using http://drupal.org/project/advagg/ for CSS/JS aggregates and https://www.drupal.org/project/imageinfo_cache for image style derivatives. They make the generation of these files more robust and were designed with a network-mounted filesystem in mind.

NFS/GlusterFS for public files is a must

fabianx's picture

Thanks for the nice article.

Use rsync or a similar deploy mechanism for core and contrib (code), and use GlusterFS or NFS for the public files (data).
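As a rough sketch of that code/data split, the Python below builds one rsync command per web server that pushes the code tree while leaving the public files directory alone. The host names, paths, and function names are hypothetical, and the exclude assumes public files live in the default sites/default/files location; by default it only prints the commands.

```python
import subprocess


def build_rsync_command(source_dir, host, target_dir, excludes=("sites/default/files",)):
    """Build an rsync command that pushes code but skips the excluded paths."""
    # -a preserves permissions and times, -z compresses, and --delete removes
    # files on the target that no longer exist in the source. Excluded paths
    # are also protected from --delete on the receiving side.
    command = ["rsync", "-az", "--delete"]
    for pattern in excludes:
        command.append(f"--exclude={pattern}")
    # The trailing slash copies the directory contents, not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(f"{host}:{target_dir}")
    return command


def deploy(source_dir, hosts, target_dir, dry_run=True):
    """Print (dry run) or execute the rsync command for each host in turn."""
    for host in hosts:
        command = build_rsync_command(source_dir, host, target_dir)
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical host names and docroot.
    deploy("/var/www/drupal", ["web1", "web2"], "/var/www/drupal")
```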

Unison might lead to more trouble than it is worth, especially with public files.

inotify WILL most likely lead to corruption problems and de-synchronization. It could also easily lead to an infinite loop.

I think you should add something like Jenkins into the mix.

Really, when using several servers, the workflow should be optimized to deploy automatically.

Keep up the good work and make it even better.

High performance
