Amazon S3 file serving with nginx & proxy_pass

mkalbere's picture

Hello,
Problem: Drupal is hosted on a VPS, and files (images, CSS, PDFs, etc.) are served from Amazon S3.

I'm aware of the CDN module, but instead of rewriting URLs I would like to instruct nginx to serve the Amazon S3 files transparently. Is that possible?

It seems to be a basic question, but I wasn't able to set this up and didn't find any reference, how-to, or tips on it.

Does somebody have an idea of what could be wrong?

location /system/files/ {
    resolver MY_DNS_SERVER;
    rewrite ^/system/files/(.*)$ static/$1;
    proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
    proxy_set_header Authorization '';
    proxy_set_header Host MY_BUCKET.s3.amazonaws.com;
    proxy_pass_request_body off;
    proxy_set_header Content-Length '';
    proxy_set_header Accept-Encoding "";
    # disable buffering
    proxy_buffering off;
    proxy_max_temp_file_size 0;
    proxy_pass http://MY_BUCKET.s3.amazonaws.com;
    break;
}
++
Marc

Comments

With some more info

mkalbere's picture


2011/03/13 21:41:44 [debug] 6443#0: *4 http finalize request: -4, "static/test.jpg?" a:1, c:2
2011/03/13 21:41:44 [debug] 6443#0: *4 http request count:2 blk:0
2011/03/13 21:41:44 [debug] 6443#0: *4 http run request: "static/test.jpg?"
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream check client, write event:1, "static/test.jpg"
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream recv(): -1 (11: Resource temporarily unavailable)
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream request: "static/test.jpg?"
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream send request handler
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream send request
2011/03/13 21:41:44 [debug] 6443#0: *4 chain writer buf fl:0 s:1192
2011/03/13 21:41:44 [debug] 6443#0: *4 chain writer in: 0000000000D24B90
2011/03/13 21:41:44 [debug] 6443#0: *4 writev: 1192
2011/03/13 21:41:44 [debug] 6443#0: *4 chain writer out: 0000000000000000
2011/03/13 21:41:44 [debug] 6443#0: *4 event timer del: 9: 1300052564671
2011/03/13 21:41:44 [debug] 6443#0: *4 event timer add: 9: 60000:1300052564748
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream request: "static/test.jpg?"
2011/03/13 21:41:44 [debug] 6443#0: *4 http upstream process header
2011/03/13 21:41:44 [debug] 6443#0: *4 malloc: 0000000000D19F50:4096
2011/03/13 21:41:44 [debug] 6443#0: *4 recv: fd:9 121 of 4096
2011/03/13 21:41:44 [debug] 6443#0: *4 http proxy status 400 "400 Bad Request"
2011/03/13 21:41:44 [debug] 6443#0: *4 http proxy header: "Content-Length: 0"
2011/03/13 21:41:44 [debug] 6443#0: *4 http proxy header: "Date: Sun, 13 Mar 2011 21:41:44 GMT"
2011/03/13 21:41:44 [debug] 6443#0: *4 http proxy header: "Connection: close"
2011/03/13 21:41:44 [debug] 6443#0: *4 http proxy header: "Server: AmazonS3"
2011/03/13 21:41:44 [debug] 6443#0: *4 http proxy header done
2011/03/13 21:41:44 [debug] 6443#0: *4 xslt filter header
2011/03/13 21:41:44 [debug] 6443#0: *4 HTTP/1.1 400 Bad Request
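
One possible reading of the log above, offered as a guess rather than a confirmed diagnosis: the request is logged as "static/test.jpg?" with no leading slash. That suggests the rewrite replacement "static/$1" produced a URI that is not a valid absolute path, which would explain the 400 Bad Request from S3. Below is a minimal sketch of the same location with the slash restored. It assumes the objects live under a "static/" prefix in the bucket; MY_DNS_SERVER and MY_BUCKET are the placeholders from the original post.

```nginx
location /system/files/ {
    # Placeholder from the original post; replace with a real DNS server.
    resolver MY_DNS_SERVER;

    # Keep the leading slash so the upstream request line carries a valid path.
    # The "break" flag stops rewrite processing and keeps the request in this location.
    rewrite ^/system/files/(.*)$ /static/$1 break;

    # S3 selects the bucket from the Host header.
    proxy_set_header Host MY_BUCKET.s3.amazonaws.com;

    # Do not forward client credentials or a request body to S3.
    proxy_set_header Authorization "";
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";

    # No URI part here, so the rewritten URI is passed upstream as is.
    proxy_pass http://MY_BUCKET.s3.amazonaws.com;
}
```

With this in place, a request for /system/files/test.jpg should be forwarded upstream as /static/test.jpg. Because the Authorization header is cleared, the objects would presumably need to be publicly readable in the bucket.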

What we did

mikeytown2's picture

We mounted the files directory as an S3 mount with http://code.google.com/p/s3fs/wiki/FuseOverAmazon
Then used these modules to help with the performance of images/css/js files by eliminating calls to file_exists.
http://drupal.org/project/advagg
http://drupal.org/project/imageinfo_cache

Used the CDN module to do the rewrite magic.

Why not use CloudFront to

jmccaffrey's picture

Why not use CloudFront to cache, using your server as the origin, and using the CDN module to rewrite to your CF distribution URL?

Older setup

mikeytown2's picture

That wasn't available 6 months ago. Times are changing. Your option is a better way to do it.

Just wondering what the

dalin's picture

Just wondering what the benefit of S3 would be over a real CDN. I think a real CDN would bring you better performance, reliability, and scalability, with less effort to set up and maintain. Are there significant cost savings? CDNs start at about $40/mo.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

Amazon offers their

jmccaffrey's picture

Amazon offers their CloudFront CDN, discussed above. It is priced the same way as S3 and also has commit levels with pricing similar to other CDN providers.

@mikeytown2 : s3fs is a very

mkalbere's picture

@mikeytown2: s3fs is a very interesting tool; it could have simplified some of my migration procedures.

@jmccaffrey: CDN vs. S3. In my case S3 is just perfect for handling a large number of rarely served files without having to pay the price of a VPS extension ;-) The main point was money.

Both: To tell the truth, my first question was not entirely honest; the "configuration" shown was a kind of proof of concept.
Nginx is pretty powerful at handling complicated conditional rewriting. If proxying requests through nginx with proxy_pass is possible, it would allow me to write a couple of extra rules to handle private/public serving and more.

I'm aware of CDN module, but

Garrett Albright's picture

I'm aware of the CDN module, but instead of rewriting URLs I would like to instruct nginx to serve the Amazon S3 files transparently. Is that possible?

I don't think it's a very good idea. You're basically doubling the network traffic going on (the visitor requests a file from nginx, which requests the file from Amazon), so it will slow things down and increase the traffic to your VPS - two things you probably got a CDN to avoid. Also, rewriting the URL will allow the Amazon stuff to be served from a cookie-free domain name - another reason to use a CDN.

So basically you seem to want to use a CDN in a way that negates most of the benefits of doing so.

You are right, I realized

mkalbere's picture

You are right, I realized that. That method was too complicated and not very efficient. I had to manage some "private files", so:
- I use hook_file_url_alter
- I created an extra hook_file_path_alter, called from cdn.advanced.inc/cdn_advanced_get_servers, that allows me to modify the original "searched" file path
