How to use nginx proxy_cache to cache all matched URIs?

404's picture

I have several Drupal sites that rarely change. I want nginx to keep every URI in proxy_cache for x days, except those matched by

  location ^~ /admin/

For example, a user requests /node/3 and nginx looks for the URI in the proxy cache. If it finds it, it serves it directly. If it can't find it, it proxies the request to the backend (apache+php in my case). When apache returns the content, nginx puts it in the proxy cache and sends it to the user.

My current conf works (I don't use imagecache) but doesn't achieve my goal. Here is my configuration file in whole.

  server {
          listen       80;
          server_name  example.org;
          root   /var/www/pf;
          index  index.html index.php;

  #######  PART 1 ######

          location ~* .+\.(ico|jpg|gif|jpeg|css|js|flv|png|swf)$ {
           expires max;
           proxy_cache cache;
           proxy_cache_key $host$uri$is_args$args;
           proxy_cache_valid 200 304 12h;
           proxy_cache_valid 302 301 12h;
           proxy_cache_valid any 1m;
          }

  ############ PART 2 ################
          # we need this 'location /' since all drupal requests are handled by index.php,
          # and simply using 'location ~* .php$' doesn't seem to work
          location / {
           proxy_pass       http://apache;
           proxy_set_header  X-Real-IP  $remote_addr;
           proxy_set_header Host $http_host;
          }

  }
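The server block above refers to a cache zone named "cache" and an upstream named "apache", neither of which is defined in the post. A minimal sketch of the http-level context it presumably relies on is shown below; the backend address, cache path, zone size and inactive time are assumptions chosen only for illustration.

```nginx
http {
    # Hypothetical backend address: wherever apache+php is listening.
    upstream apache {
        server 127.0.0.1:8080;
    }

    # Defines the zone referenced by "proxy_cache cache;".
    # Path, sizes and inactive time are illustrative, not from the original post.
    proxy_cache_path /var/cache/nginx/pf levels=1:2
                     keys_zone=cache:10m max_size=1g inactive=7d;

    # The server { ... } block from the post goes here.
}
```

Note that proxy_cache only takes effect in a location that also proxies the request with proxy_pass, which is why PART 1 on its own never populates the cache.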

I think PART 1 puts static files in the proxy cache. Static files are served by nginx, but I can't verify whether they come from the proxy cache.

Is there a way to inject a special header value before a file is put into the cache, so I can tell whether a file is coming from the cache simply by looking at the headers?

PART 2 seems to me to make ALL requests go to apache. But I want nginx to try proxy cache first.

How do I do that? Thanks in advance!

Comments

aussielunix's picture

Hello,

You may want to look at combining the Boost module with nginx. Whilst I haven't tried this myself it does look like a nice solution.
You may be able to find some useful info in the following articles.
http://www.go2linux.org/linux/2010/04/how-install-nginx-run-drupal-boost...
http://groups.drupal.org/node/46404

Cheers
Mick

Regards
Mick Pollard

www.lunix.com.au


brianmercer's picture

Boost module is definitely the easier way to go if you're on a single server. I'd recommend using Boost also.

However, if you really want to use nginx caching you'll want to:
1. replace Drupal with Pressflow and set Drupal caching to 'external',
2. install my little header module, http://www.brianmercer.com/nginx-header-module for sending the 'X-Accel-Expires: 0' header so logged in pages aren't cached, and
3. use a configuration more like this:

          location ~* .+\.(ico|jpg|gif|jpeg|css|js|flv|png|swf)$ {
           expires max;
          }

          location / {
           proxy_pass       http://apache;
           proxy_set_header  X-Real-IP  $remote_addr;
           proxy_set_header Host $http_host;
           proxy_cache cache;
           proxy_cache_key $host$request_uri$cookie_NO_CACHE;
           proxy_cache_valid 200 304 12h;
           proxy_cache_valid 302 301 12h;
           proxy_cache_valid any 1m;
           proxy_ignore_headers Cache-Control Expires;
           proxy_pass_header Set-Cookie;
          }
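Two points from the original question are not covered by the configuration above: keeping admin pages out of the cache, and telling whether a response came from the cache. A minimal sketch of both follows. The header name X-Cache-Status is an arbitrary choice, and the /admin prefix assumes clean URLs (with ?q=admin style URLs the request would still land in "location /").

```nginx
# Admin pages: always proxied, never cached (no proxy_cache directive here).
# ^~ stops nginx from checking the regex locations for these URIs.
location ^~ /admin {
    proxy_pass        http://apache;
    proxy_set_header  X-Real-IP  $remote_addr;
    proxy_set_header  Host       $http_host;
}

location / {
    proxy_pass        http://apache;
    proxy_set_header  X-Real-IP  $remote_addr;
    proxy_set_header  Host       $http_host;
    proxy_cache       cache;
    proxy_cache_key   $host$request_uri$cookie_NO_CACHE;
    proxy_cache_valid 200 304 12h;

    # $upstream_cache_status holds values such as HIT, MISS or EXPIRED.
    # The header name is an illustrative choice, not an nginx convention.
    add_header X-Cache-Status $upstream_cache_status;
}
```

Requesting the same page twice with "curl -I http://example.org/node/3" should then show the header changing from MISS to HIT if the response is being cached.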


404's picture

Thank you, I will try your configuration.

I love Boost and use it on sites that change frequently, because it gives me more control over purging and over what to cache and what not to cache.

But for sites that don't change at all, I want to skip Boost and use just Drupal and nginx. Boost still invokes apache, which adds another layer, and it also leaves static files on the server. I just want to solve all the caching/serving issues in nginx.

I learned that as of nginx 0.8.44, nginx doesn't cache content when a Set-Cookie header is present. Since Drupal gives out a cookie even to anonymous users, the following might help us.

Igor Sysoev wrote:

  *) Change: now nginx does not cache by default backend responses, if they have a "Set-Cookie" header line.

  What's the recommended method to cache even with the Set-Cookie header present?

  proxy_ignore_headers Set-Cookie;
  proxy_hide_header Set-Cookie;

  Note that Set-Cookie header no longer hidden by default so you have to explicitly hide it via proxy_hide_header if it's not intended to be sent to all users.

Thanks, Igor. As awesome as usual.
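As a rough sketch of where those two directives would go, assuming the "cache" zone and "apache" upstream from the configuration earlier in the thread:

```nginx
location / {
    proxy_pass         http://apache;
    proxy_set_header   Host $http_host;
    proxy_cache        cache;
    proxy_cache_valid  200 12h;

    # Cache the response even though the backend sent Set-Cookie...
    proxy_ignore_headers Set-Cookie;
    # ...and strip the cookie so a cached copy does not hand one
    # visitor's cookie to everyone else.
    proxy_hide_header    Set-Cookie;
}
```

Hiding Set-Cookie in a location that also serves the login form would probably prevent users from logging in, so this only suits locations that serve anonymous content.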

Also, would you please explain this directive

   proxy_cache_key $host$request_uri$cookie_NO_CACHE;


brianmercer's picture

There have indeed been some changes in nginx since I used that config.

Also, would you please explain this directive
proxy_cache_key $host$request_uri$cookie_NO_CACHE;

Pressflow sets a cookie called NO_CACHE with a value of "Y" when a user logs in and changes something, which persists after they log out. So to keep them from getting cached responses, the cache key checks for that cookie. For example: the page http://example.com/node/1 is in the cache with a key of "example.com/node/1", so for an anonymous user nginx will check for "example.com/node/1" and get a cache hit. However someone who is logged in or who made a change and logged out will have the NO_CACHE cookie and nginx will check the cache for "example.com/node/1Y" which will produce a cache miss. No cached pages for people with the NO_CACHE cookie.
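The key composition described above can be written out as comments next to the directive. The host and path are the ones from the example; nothing else is assumed.

```nginx
proxy_cache_key $host$request_uri$cookie_NO_CACHE;

# Request without the cookie:
#   $host = example.com, $request_uri = /node/1, $cookie_NO_CACHE = ""
#   key   = "example.com/node/1"    -> matches the cached anonymous page
#
# Request with "Cookie: NO_CACHE=Y":
#   $host = example.com, $request_uri = /node/1, $cookie_NO_CACHE = "Y"
#   key   = "example.com/node/1Y"   -> different key, so a cache miss
```

This only changes which entry is looked up. Keeping the response for the "…/node/1Y" key out of the cache still depends on the backend sending something like 'X-Accel-Expires: 0', as in step 2 of the earlier comment.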

I saw Igor has since added a new directive, proxy_no_cache, which may or may not solve the problem with a line like "proxy_no_cache $cookie_NO_CACHE". I haven't tested that, and I'm not sure whether he added it to the stable branch. (I'm not even sure that's what it does; the translated docs are a little ambiguous.)

Since Drupal gives out a cookie even to anonymous users

Pressflow does not send session cookies for anonymous users. That is one of several ways that Pressflow is more cache-friendly. However, most people are still going to send google_analytics cookies so the config may require another proxy_ignore_headers Set-Cookie directive if the new behavior will prevent caching with any Set-Cookie header.

I worked out that config several months ago before some of the nginx changes. It did work fine for me at the time.

Because boost still invokes apache and that adds another layer, boost also leaves static files on server. Just want to solve all the caching/serving issues in nginx.

These days I'm using Boost exclusively. Boost will bypass apache completely if nginx is configured to check for boost static files before going to apache. Besides, nginx caching also creates static files. And if you have enough free memory, your OS will cache frequently requested static files in memory to increase speed and decrease disk IO, but that applies equally to using Boost or nginx caching. If you're using multiple apache backends, then nginx caching may be easier than syncing up Boost files, but for a single apache/php server, there's not much advantage to nginx caching over Boost.
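A minimal sketch of "check for Boost static files before going to apache" is shown below. The cache path layout (cache/normal/<host>/<path>_<query>.html under the Drupal root) and the DRUPAL_UID cookie for logged-in users are assumptions about a default Boost setup; adjust both to whatever your installation actually uses.

```nginx
location / {
    # 418 is used only as an internal signal to jump to @apache.
    error_page 418 = @apache;

    # Only GET/HEAD requests from visitors without a login cookie
    # are candidates for a Boost page.
    if ($request_method !~ ^(GET|HEAD)$) { return 418; }
    if ($http_cookie ~ "DRUPAL_UID")     { return 418; }

    # Serve the Boost file if it exists, otherwise fall through to apache.
    # Assumed default Boost layout; the path is relative to "root".
    try_files /cache/normal/$host${uri}_${args}.html @apache;
}

location @apache {
    proxy_pass        http://apache;
    proxy_set_header  X-Real-IP  $remote_addr;
    proxy_set_header  Host       $http_host;
}
```

With this arrangement apache is only contacted when no Boost file exists for the requested page.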


brianmercer's picture

I checked out

proxy_no_cache $cookie_NO_CACHE

and it does what we need in that it bypasses a cached page if a client has that cookie. However, from some discussion on the Russian mailing list this week, it looks like proxy_no_cache is going to disappear and be replaced by two new directives. One which bypasses the cache on client requests, and one which does not cache the response from the server.
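In current nginx documentation the two behaviours described here are covered by proxy_cache_bypass (skip the cache lookup for this request) and proxy_no_cache (do not store this response); proxy_no_cache kept its name with the narrower meaning rather than disappearing. A minimal sketch, assuming the "cache" zone and "apache" upstream used earlier in the thread; check the documentation for the nginx version you run:

```nginx
location / {
    proxy_pass         http://apache;
    proxy_set_header   Host $http_host;
    proxy_cache        cache;
    proxy_cache_key    $host$request_uri;
    proxy_cache_valid  200 12h;

    # Do not answer from the cache when the client sends a NO_CACHE cookie...
    proxy_cache_bypass $cookie_NO_CACHE;
    # ...and do not store the response generated for such a client.
    proxy_no_cache     $cookie_NO_CACHE;
}
```

Both directives treat any value that is non-empty and not "0" as true, so the cookie value "Y" triggers them. With this approach the cookie no longer needs to be part of proxy_cache_key.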

http://translate.google.com/translate?js=y&prev=_t&hl=en&ie=UTF-8&layout...


404's picture

The nginx-header-module mentioned here (http://groups.drupal.org/node/79714#comment-247474) works. It's absolutely great!

Thank you brianmercer! I just found out you are the editor of English version of Igor's articles on nginx: http://nginx.org/en/docs/introduction.html . The articles helped me a lot, too.
