I currently have an Nginx (SSL, GZIP) -> Varnish -> Pressflow/Apache setup (more detailed config @ bottom).
Atm, all GZIP compression is done @edge by Nginx.
Varnish caches UN-gzip'd content, and Nginx GZIPs at every request.
Couple of questions aimed at further improving caching/compression performance:
(1) Iiuc, if I move GZIP'ing to Apache, Varnish will cache GZIP'd content ... removing the gzip-at-every-request load.
True?
(2) It seems I've got 2 options for GZIP'ing @ the backend --either in/via Apache mod_deflate, or using Drupal's Performance config, i.e.,
Page compression: Disabled --> ENabled
GZip JavaScript: Disabled --> ENabled
Given my config, if I turn off GZIP @ the nginx edge, which is better performing?
(3) With GZIP @ nginx, I've managed to get my headers setup correctly so that content arrives -- at least as seen by LiveHTTPHeaders firefox module -- at my browser properly GZIP'd.
I've experimented a bit with turning mod_deflate "ON" in Apache, instead. In a direct-to-Apache test, all's well. But, with Nginx & Varnish still in the loop in front of Apache, no content seemed to be arriving @browser GZIP'd. Likely I've got my configs screwed up.
Anyone have some guidance on making GZIP headers "behave" so that (a) varnish caches them properly, and (b) nginx passes the Apache-managed GZIP headers to the outside world properly? Happy to provide any/all config detail if it helps with the question.
Thanks,
BenDJ
details of my setup are:
Nginx (0.8.42)
  SSL on
  GZIP on
  redirect to Varnish backend
Varnish (trunk)
Memcached (1.4.5)
PHP (5.3.2)
  +php-memcached (php5-pecl-memcache-3.0.3-2.3.i586)
  +APC
Pressflow (6.17.85)
  module(varnish-6.x-1.x-dev)
  module(memcache-6.x-1.x-dev)
  cache_inc -> memcache.inc
  session_inc -> memcache-session.inc
  reverse_proxy -> varnish
  @ Performance
    external caching
    Page compression: Disabled
    Block cache: Disabled
    Optimize CSS files: Enabled
    Optimize and Minify JavaScript files: Enabled
    GZip JavaScript: Disabled
    Use JSMin+ instead of JSMin
Apache (2.2.15, prefork MPM)
  mod_php
  !SSL
  !GZIP
Comments
I'm a big fan of Varnish, but
I'm a big fan of Varnish, but in this config what purpose is it serving? Nginx can do all the caching Varnish is doing, so why have the additional layer of overhead?
Simply, choice. Apart from
Simply, choice.
Apart from the general matter that there's a LOT more people "in community" using Varnish to cache, in conjunction with Pressflow, and, as such, a discussion can be more readily had ...
(1) Varnish is set up as a ReverseProxy to Pressflow. Afaict, that's -- at least -- recommended, and better documented, if not unique to Varnish.
(2) Varnish is using malloc storage. I.e., caching's in RAM. Iiuc, nginx caches to disk.
(3) I've implemented a 'faux' parallel CDN in Varnish.
(4) I deploy other front ends elsewhere (Pound, Apache TrafficServer). It's convenient for me to keep my 'backends' separate and portable.
(5) When I'd initially benchmarked a with- vs without-Varnish setup, the with-Varnish setup simply blew the doors off of Nginx alone. I went with performance & stability.
But, tbh, it may all be doable in Nginx. I simply find splitting Nginx's & Varnish capabilities more straightforward, and the communities more responsive to usage questions in/re: this arch.
In any case, though fine points, they've nothing to do with my GZIP-related questions ... or have I misread your intent?
Fair enough. :-) It struck me
Fair enough. :-) It struck me as a bit odd to have requests hit both nginx and Varnish when nginx can do it all - I'm pretty sure nginx can do malloc storage (although odds are the "disk" storage is already in the kernel cache anyway).
As for the GZIP question, my first impression is to do it at the Apache/mod_deflate level and compress it before it's cached - therefore squeezing a bit more capacity out of your cache space.
I'm pretty sure nginx can do
if you can find it,
http://wiki.nginx.org/Special:Search?search=cache
let me know -- it sure didn't leap out at me.
Forgot to mention a significant driver for me -- Poul-Henning Kamp's behind Varnish (which is, additionally, backed by a company that'll sell support, as req'd). When it comes to memory management & kernel interactions, his FreeBSD credentials are, imo, a big plus!
Now, onwards ...
Sure, that's what I'm attempting. Not for the reasons of capacity, but rather for performance -- in this scenario, the CPU isn't tasked -- I think -- with RE-compressing every request served from Varnish. Rather, the user's simply served the already-compressed content from the Varnish cache.
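A minimal Python sketch of that reasoning, not Varnish's actual implementation: a toy cache that stores the gzip'd body on the first request and hands back the stored bytes afterwards, with a counter showing how often compression really runs. The `CompressedCache` class and its `render` callback are hypothetical names used only for illustration.

```python
import gzip
from typing import Callable, Dict


class CompressedCache:
    """Toy in-memory cache that stores gzip'd bodies (illustration only)."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render          # stands in for the Apache/Pressflow backend
        self._store: Dict[str, bytes] = {}
        self.compressions = 0          # how many times gzip actually ran

    def get(self, url: str) -> bytes:
        # Compress only on a cache miss; hits return the stored gzip bytes as-is.
        if url not in self._store:
            self._store[url] = gzip.compress(self._render(url))
            self.compressions += 1
        return self._store[url]


cache = CompressedCache(lambda url: ("<html>" + url * 100 + "</html>").encode())
for _ in range(1000):
    body = cache.get("/node/1")

print(cache.compressions)  # 1: compression ran once for 1000 requests
```

With gzip at the edge instead, the equivalent counter would grow with every request, which is the per-request CPU cost being discussed.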
That said, I'm having a heck of a time -- per my OP -- getting my "this content is GZIP'd" headers to survive the traverse from Apache's mod_deflate, thru Varnish, and Nginx. Something's wrong/missing ... atm, I'm clearing caches and staring at Firebug output :-/
That said, I'm having a heck
Have you monitored the headers coming out of Varnish with something like a "varnishlog -c -o ReqStart (client IP address)"? That'll show you the entire conversation between Varnish and nginx. You should be able to see the headers there. Or simplify for testing and temporarily bypass one or two of the elements in the stack to see where along the line the headers are getting stripped. Hit Varnish directly (i.e. take nginx out of the loop). Do the headers make it through Varnish? Alternately, if nginx is set to point directly at Apache and bypass Varnish, do the headers survive?
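The "bypass one layer at a time" test can also be scripted. The following is a rough sketch only: it sends the same request, advertising gzip support, to each layer and prints the compression-related response headers. The addresses in `HOPS` are assumptions (placeholders, not taken from the setup above), and `describe_encoding`, `probe` and `report` are hypothetical helper names.

```python
import urllib.request
from typing import Dict, Mapping


def describe_encoding(headers: Mapping[str, str]) -> str:
    """Summarize the compression-related response headers of one hop."""
    lowered = {key.lower(): value for key, value in headers.items()}
    encoding = lowered.get("content-encoding", "(none)")
    vary = lowered.get("vary", "(none)")
    return f"Content-Encoding: {encoding}; Vary: {vary}"


def probe(url: str) -> str:
    """Request `url` while advertising gzip support and report what comes back."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return describe_encoding(dict(response.headers.items()))


def report(hops: Dict[str, str]) -> None:
    """Probe every layer in turn and print one line per layer."""
    for name, url in hops.items():
        try:
            print(name, "->", probe(url))
        except OSError as error:  # URLError and socket errors are OSError subclasses
            print(name, "-> request failed:", error)


# Hypothetical addresses: replace with the real listener of each layer.
HOPS = {
    "apache": "http://127.0.0.1:8080/",
    "varnish": "http://127.0.0.1:6081/",
    "nginx": "https://127.0.0.1/",
}
```

Calling `report(HOPS)` prints one line per layer; the first layer whose line shows `Content-Encoding: (none)` is where to start looking.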
Have you monitored the
actually, no, I hadn't. I'd been using Firebug ... but that was not looking at the right part of the transaction.
Turns out step 1 was easy (obvious?) enough -- removing from nginx conf:
proxy_set_header Accept-Encoding "";
for my SSL listener. With that, I see the gzip'd content & headers properly traverse the stack and 'out' to the client browser.
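Why that one directive mattered can be shown with a small simulation. This is a sketch under stated assumptions, not real nginx or Apache behaviour in full: `backend` stands in for Apache with mod_deflate and compresses only when the request offers gzip, and `proxy` stands in for the nginx hop. Both function names are hypothetical.

```python
import gzip
from typing import Dict, Tuple

Headers = Dict[str, str]


def backend(request_headers: Headers, body: bytes) -> Tuple[Headers, bytes]:
    """Stand-in for Apache + mod_deflate: gzip only if the request offers it."""
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}, gzip.compress(body)
    return {"Vary": "Accept-Encoding"}, body


def proxy(request_headers: Headers, body: bytes, blank_accept_encoding: bool) -> Tuple[Headers, bytes]:
    """Stand-in for the nginx hop, optionally dropping Accept-Encoding upstream."""
    upstream_headers = dict(request_headers)
    if blank_accept_encoding:
        # Same effect as: proxy_set_header Accept-Encoding "";
        # nginx does not pass a header whose value is an empty string.
        upstream_headers.pop("Accept-Encoding", None)
    return backend(upstream_headers, body)


browser = {"Accept-Encoding": "gzip, deflate"}
page = b"<html>hello</html>" * 50

blanked_headers, blanked_body = proxy(browser, page, blank_accept_encoding=True)
passed_headers, passed_body = proxy(browser, page, blank_accept_encoding=False)

print(blanked_headers.get("Content-Encoding"))  # None: the backend never saw "gzip"
print(passed_headers.get("Content-Encoding"))   # gzip
```

With the header removed on the way upstream, the backend has no reason to compress, so nothing gzip'd can reach the cache or the browser no matter how mod_deflate is configured.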
Unfortunately, there's some cache issue -- @ access to new pages, e.g. after login -- the actual result page is not displayed in the browser. Seems I get something from "the" cache (in Varnish? in Drupal? in browser?) history. One-to-multiple page refreshes clear the problem, and I eventually see the 'right' page.
If I switch back to my original, gzip-in-nginx config, page refreshes are all OK. Something's wrong in my setup if/when gzip is @ the Apache backend.
Oddly, perhaps coincidentally, I see a similar behavior -- occasionally -- when logging into THIS site to post a reply. Sometimes I have to multiple-refresh for the "Comment" box to appear, even though I'm logged in.
Which leads me to suspect, atm, caching "versus" authentication. Still don't know/see enough yet to know.
A couple of points on your
A couple of points on your OP.
First, if you have GZip enabled anywhere in your stack, you should definitely disable Drupal's GZipping ("Page compression" on the performance page). This can theoretically cause problems where your site content is double-compressed and appears as gibberish once it hits browsers. Either way, if something further down in the stack is doing the compression, then we don't want to do it at the PHP level.
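The double-compression failure is easy to reproduce with Python's standard `gzip` module. This is only an illustration of the effect; the sample page content is made up.

```python
import gzip

page = b"<html><body>Hello from Pressflow</body></html>"

once = gzip.compress(page)     # e.g. compressed at the PHP level
twice = gzip.compress(once)    # compressed again by another layer in the stack

# A browser undoes only the one layer announced by Content-Encoding: gzip.
seen_by_browser = gzip.decompress(twice)

print(seen_by_browser == page)             # False
print(seen_by_browser[:2] == b"\x1f\x8b")  # True: still starts with the gzip magic bytes
```

What the browser would try to render is itself a gzip stream, hence the gibberish.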
Second, how Lighty's GZip implementation works is that it can just GZip stuff on every page load, but it can also cache the GZipped data as a file and then just serve the pre-compressed file instead of redundantly compressing something when it's served the second time. It uses hashes and file paths to determine what data it may have compressed before and can serve from cache. I'm not certain, but it's possible that Nginx's GZip implementation can similarly cache like this and avoid redundantly compressing things on every page load. Peek through the configuration documentation, or perhaps post a query to the Nginx group here on g.d.o.
The Boise Drupal Guy!
garrett, you should
garrett,
yep. done.
Sure nginx caching's an option, but I wanted to have that @Apache/Varnish.
I finally managed to get where I wanted to be ... Apache doing all compression on the Backend, compressed results being RAM-cached in Varnish, Nginx doing no compress/cache, and headers surviving correctly.
Req'd appropriate tweaks in Apache mod_deflate/filter conf, Varnish VCL, and nginx conf.
Thanks for the comments!
Okay, I think I get what you
Okay, I think I get what you were going for now. Please forgive me if this is a dumb question, as I'm still learning a lot of this stuff myself, but if Apache is doing the PHP and compression work and Varnish is doing its Varnish thing, then what are you using Nginx for? Just to serve static files?
I'm sure Apache fans are sick of us Apache haters asking this, but… Have you considered using Nginx to do the PHP work as well?
The Boise Drupal Guy!
garrett, what are you using
garrett,
i use nginx primarily for three things: (1) conditional redirection, (2) SSL handshake, and (3) convenient header management.
e.g.,
handle the SSL handshaking
(re)direct all port:80 traffic to port:443 SSL sites
provide throttling, and offline redirection -- to local/static nginx sites, as well as other servers
add "Vary: Accept-Encoding" headers, etc.
etc.
until this recent exercise, I also used it to GZIP content on the fly.
I'm never sick of asked questions -- what goes around, comes around.
Personally I find the "haters" perspective ... self-defeating and somewhat foolish. ;-)
Of course I have. I found it underperformed, was unstable under load, and was far less widely supported/documented than nginx. All these things, of course -- in my case. I.e., "Different Strokes ..." & "YMMV".
bottom line -- use what works best, for you.
Sorry to raise from the dead,
Sorry to raise from the dead, but did your experiment reap any rewards?
We used Drupal's gzipping
We set up a site using nginx and Boost for the caching. We found that having Drupal store the gzip version was much faster.
During benchmarking, having nginx serve the pre-gzipped file versus compressing it every time gave a roughly 3-4 fold improvement on small pages. This was a few months ago, and I don't think I have the exact numbers anymore. Also, this was with keep-alive and on a local network, serving the same file over and over again, so not the most realistic stress test. If the cache is invalidated frequently, so that the read/write ratio is low, it might be better to let nginx deal with the gzipping rather than PHP.
There is a module for nginx that does this: compile with --with-http_gzip_static_module and set "gzip_static on". Ubuntu Lucid's package of Nginx comes with this module enabled. When serving a file, it checks whether a file with the same name plus a .gz extension exists and, if so, serves that instead. I'm not sure how well that will interact with Varnish though, since it will need to check whether the file exists first.
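The lookup described above can be sketched in a few lines of Python. This is an illustration of the idea only, not nginx's code: `choose_file` is a hypothetical helper, and the Accept-Encoding check is deliberately naive (it ignores q-values).

```python
import gzip
import os
import tempfile
from typing import Tuple


def choose_file(path: str, accept_encoding: str) -> Tuple[str, bool]:
    """Return (file to serve, whether it is gzip-encoded)."""
    candidate = path + ".gz"
    client_accepts_gzip = "gzip" in accept_encoding.lower()  # naive: ignores q-values
    if client_accepts_gzip and os.path.isfile(candidate):
        return candidate, True   # serve with Content-Encoding: gzip
    return path, False           # fall back to the uncompressed file


with tempfile.TemporaryDirectory() as root:
    css = os.path.join(root, "style.css")
    with open(css, "wb") as handle:
        handle.write(b"body { margin: 0 }")

    before = choose_file(css, "gzip, deflate")    # no .gz sibling yet

    with gzip.open(css + ".gz", "wb") as handle:
        handle.write(b"body { margin: 0 }")

    after = choose_file(css, "gzip, deflate")     # .gz sibling now exists
    plain_client = choose_file(css, "identity")   # client did not offer gzip

print(before[1], after[1], plain_client[1])  # False True False
```

The cost per request is one extra file-existence check, in exchange for never compressing the same file twice.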
Performance of Drupal Sites Event
A Silicon Valley Drupal User Group meeting is planned on "How to Improve the performance of Drupal sites".
Discussion with Kieran Lal: performance optimization, the latest updates on Drupal 7, and current performance-improvement features:
o Drupal caching: pages, blocks, views, etc
o CSS & Javascript aggregation
o PHP memory usage
o MySQL queries
o About Core and performance specific improvements in D7
o Module selection and activation
o 3rd party scripts and styles
o Use of asynchronous requests
o Gzip compression
o Core search in use
o Enabled Apache & PHP modules
o Representative page inspection
o Load testing
o Views query performance
o Monitor server load
o Database engine
Mike Mayo: Mike has been an evangelist for and a contributor to open source projects. He will field the hosting queries and may demonstrate best practices and scenarios and discuss ecosystems fit for your Drupal sites.
Mike is a mobile developer too and is credited with building the Rackspace and Slicehost apps for iPhone, iPad, and Android.
Scalability pointers from the server side: using Chef to set up Drupal clusters in the cloud.
Also, Halosys engineers will demonstrate a few recent launches and how they achieved a drastic improvement in the performance of Drupal sites.
For more details please visit http://www.meetup.com/DrupalGroup/calendar/14109188/
Cheers
Vish