GlusterFS with Boost - any known good methods?

timdeeson's picture

We frequently use GlusterFS (http://www.gluster.org) in load-balanced clusters to synchronise the /files directory between the web heads. That works great, but we haven't found (or fully investigated) a way to use the Boost module easily alongside it. I'm hoping someone is already doing this and can provide a proven working example.

Boost in this scenario is being used to complement memcache and standard page caching, providing static page delivery for high volumes of anonymous users. I'm open to other equivalent suggestions appropriate to the environment too.

The three issues I could think of are:

a) The Boost paths need to be mapped to Gluster so that the heads all share and manipulate the same static files.

b) Could Boost generate so many file changes that Gluster causes a service-impacting I/O or CPU hit when it propagates them between the heads? We've scaled Gluster fairly far before, but it is a virtual filesystem layered on top of a real one and synced over the network.

c) Are there potential concurrency issues if several Boost changes are triggered at once and multiple heads start interacting with the Boost store?
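On point (c), one general way to avoid a reader on one head seeing a half-written file from another head is to write to a temporary file and then rename it into place. The sketch below illustrates that technique in Python; it is not how Boost itself is implemented, the function name and paths are hypothetical, and it assumes the shared filesystem honours POSIX rename semantics, which would need to be verified for a given GlusterFS volume configuration.

```python
import os
import tempfile


def write_cache_file_atomically(path: str, html: str) -> None:
    """Write html to path so readers never observe a partially written file.

    Hypothetical helper for illustration only; not part of Boost or Drupal.
    """
    directory = os.path.dirname(path) or "."
    # exist_ok=True tolerates another head creating the directory first.
    os.makedirs(directory, exist_ok=True)

    # The temp file lives in the target directory so the final rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".html")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(html)
            handle.flush()
            os.fsync(handle.fileno())
        # Replaces any existing file in a single step on POSIX filesystems.
        # Assumption: the shared mount preserves this behaviour.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave stray temp files behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If two heads regenerate the same page at once, each writes its own temp file and the last rename wins, so the cost is duplicated work rather than a corrupt page. It does nothing for point (b): every write and delete still has to be propagated by the shared filesystem.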

If someone has a proven setup and can share its configuration, that would be ideal and would save exploring some of the hypothetical problems too much!

Thanks in advance

Tim

Comments

dalin's picture

I think most people who are running multiple web heads use Varnish rather than Boost.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

timdeeson's picture

Thanks. Thinking about it, I believe that is the more fundamentally appropriate solution in a multi-web-head environment. Boost addresses the wrong part of the stack and keeps the problem at the wrong layer.

Seen (b) in action

kbahey's picture

I have seen case (b) that you listed in action: Boost becoming a bottleneck even without the cache being shared via NFS, GlusterFS, etc.

Posting a comment to a node was taking 20 seconds. They had many web heads (I think 12 of them) running as virtual machines on one beefy physical server, but all of them shared the same RAID-5 disk.

When a comment was posted, Boost trying to delete files caused so much I/O on the disk that everything was slow for tens of seconds.

By eliminating Boost from the stack in this case, we were able to get that down to 6 seconds, I think.

With GlusterFS, the bottleneck will just move to the server where the files are hosted.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.

High performance
