Can a pull-based CDN be used in place of a shared file system for multiple web nodes?

ajayg's picture

If you have multiple web nodes, you need some kind of shared file system to sync uploaded files, static files, etc.

Instead, what if you use a pull-based CDN? All your files could be served from cdn1.yourdomain.com, and all web nodes would point to the CDN.
The benefits are:
1) You don't need to worry about managing a shared file system.
2) You also get a free front end to serve static files.

This looks like a much more tempting option than a shared file system. Do you agree? Or are there gotchas I am not thinking through?
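
To make "all web nodes point to the CDN" concrete, here is a minimal sketch of the URL rewriting each web node would do. It is only an illustration: the function name and the /sites/default/files/ prefix are assumptions, and in Drupal a module such as CDN would normally handle this for you.

```python
from urllib.parse import urlsplit, urlunsplit

CDN_HOST = "cdn1.yourdomain.com"        # CDN hostname from the post
FILES_PREFIX = "/sites/default/files/"  # assumption: default public files path


def to_cdn_url(url: str) -> str:
    """Point file URLs at the CDN host; leave every other URL untouched."""
    parts = urlsplit(url)
    if parts.path.startswith(FILES_PREFIX):
        # Keep scheme, path, query and fragment; swap only the host.
        return urlunsplit(
            (parts.scheme, CDN_HOST, parts.path, parts.query, parts.fragment)
        )
    return url


print(to_cdn_url("http://www.yourdomain.com/sites/default/files/a.jpg"))
print(to_cdn_url("http://www.yourdomain.com/node/1"))
```

Note that a pull-based CDN still has to fetch each file from an origin server the first time it is requested, so the rewriting alone does not decide which web node actually holds the file.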

Comments

I recently set up a network

As If's picture

I recently set up a network of Drupal sites using separate Drupal installs and some shared DB tables across multiple domains. There are 4 public sites on the network and a 5th site (the "controller" site) that handles the shared tables (users and sessions, and a few others). The client was accustomed to CDNs and wanted something similar to what you are describing for images on all the public sites.

What I did was bind-mount (mount --rbind) every public site's /sites/default/files folder to /var/www/controller_site/sites/default/files, so that all uploads go to the controller site's files folder. Then I installed the CDN module (pull-based) on all sites and pointed the CDN at a CNAME for the controller site. It works nicely.

I think one potential gotcha

timdeeson's picture

I think one potential gotcha would be that you'd still need a 'master' web head where all content with file assets is uploaded, deleted, etc. Normally, with a load balancer, requests go to different web heads all the time, but you'd need to pin anyone undertaking these operations to the master, as the other heads won't have read/write access to that /files directory. This may or may not be possible or acceptable, depending on your use case.

I'm also not sure that CSS/JS aggregation would work, as the aggregates could be generated on any head and this would fail if not done on the master. There are probably other operations like this that could have problems.
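
As a rough sketch of the pinning idea in this comment: anyone who manages files is always sent to the master, and everyone else is spread round-robin across the heads. The head names, the is_editor flag, and the function name are all hypothetical; a real setup would do this in the load balancer configuration rather than in application code.

```python
import itertools

MASTER = "web1"                      # hypothetical master web head
WEB_HEADS = ["web1", "web2", "web3"]  # hypothetical pool behind the balancer

_round_robin = itertools.cycle(WEB_HEADS)


def pick_web_head(is_editor: bool) -> str:
    """Pin users who upload or delete files to the master head."""
    if is_editor:
        return MASTER
    # Everyone else is distributed round-robin across all heads.
    return next(_round_robin)


print(pick_web_head(True))
print([pick_web_head(False) for _ in range(4)])
```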

Module that could solve the CSS/JS issue

mikeytown2's picture

http://drupal.org/project/advagg can be put into async mode, where the CSS/JS file gets generated on request; in this module you can select the IP address that all of these requests are sent to. Doing this should make sure the files only get generated on the master web head. One more thing to think about is a module I just created that forwards a request made to the files directory to any IP address: http://drupal.org/project/files_proxy

Files Proxy looks very

ajayg's picture

Files Proxy looks very interesting, but I could not understand how it is set up or how it works.
How does it solve the issue? If a file is uploaded on node A, how is node B going to get it, even if it goes through the CDN? Do you mind providing more details, as this may be the solution I am seeking?

How files proxy works

mikeytown2's picture

A request comes in to x.example.com on 192.168.0.2 and looks like this: x.example.com/sites/y.example.com/files/imagecache/resize/a.jpg. The file doesn't exist, so Apache hands the request to Drupal.

Files Proxy picks it up, sees that the host in the files path does not match the Drupal files path (y.example.com vs. x.example.com), and forwards the request to the correct host: y.example.com/sites/y.example.com/files/imagecache/resize/a.jpg. If a different IP address is set, for example 192.168.0.8, it uses that instead (think multiple web heads). This allows all imagecache generation to happen on one box, if that is desired, and allows a single CDN host to be used for a multisite.

If this IP address doesn't have the file we are looking for, we can download and save it (or the root file, if using imagecache) from another server. So the last request will be forwarded to: cdn.example.com/sites/y.example.com/files/imagecache/resize/a.jpg. This does a DNS lookup, so you don't have full control over which server IP gets picked. I use this setting on our dev boxes and have it 301 to the CDN so that my dev box has all pictures displayed; I could have it download the root file if I wanted to.
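
The host comparison described earlier in this comment can be sketched roughly as follows. This is not the module's actual code, just an illustration of the decision: the function name, the regular expression, and the ip_overrides mapping are assumptions made for the example.

```python
import re
from typing import Dict, Optional, Tuple

# Matches paths such as /sites/y.example.com/files/imagecache/resize/a.jpg
_FILES_PATH = re.compile(r"^/sites/([^/]+)/files/")


def forward_target(
    request_host: str, path: str, ip_overrides: Dict[str, str]
) -> Optional[Tuple[str, Optional[str]]]:
    """Return (host, ip) to forward to, or None to handle the request locally.

    ip is None when no override is configured for the host.
    """
    match = _FILES_PATH.match(path)
    if match is None:
        return None  # not a files-directory request
    files_host = match.group(1)
    if files_host == request_host:
        return None  # host in the path matches this site
    # Host mismatch: forward to the host named in the path,
    # using a fixed IP address if one has been configured.
    return files_host, ip_overrides.get(files_host)


print(forward_target(
    "x.example.com",
    "/sites/y.example.com/files/imagecache/resize/a.jpg",
    {"y.example.com": "192.168.0.8"},
))
```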
