Drupal 6 - Boost - Imagecache. FreeBSD & Apache 2.2 & PHP (as Apache module - no FastCGI) & Nginx 0.8.5.4

stanislaw's picture

Please help me with my nginx.conf file.
For now only the root page ("/") is working; all others give me "The page isn't redirecting properly".
I have already tried more than 10 config samples without success.

I removed .htaccess from the root of my drupal folder.

I'd love to hear from people who really understand nginx.conf, rather than get two lines copied from some tutorial; I've tried plenty of those already.

Thanks!

Here's my conf:

user www www;
worker_processes  1;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  /var/log/nginx-access.log  main;

    reset_timedout_connection on;
    sendfile        on;
    aio sendfile;
    tcp_nopush     on;

    keepalive_timeout  65;
   
    gzip  on;

    upstream backend {
        server 77.72.19.19:81;
    }

    server {
        listen       77.72.19.19:80 default accept_filter=httpready;
        server_name  psyhonetika.org;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
       
        gzip  on;
        gzip_static on;
        gzip_proxied any;

        gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;

        set $myroot /usr/local/www/apache22/data/alfa;
        root $myroot;
      
        location ~ ^\. {
            deny all;
        }

        location ~ \.(engine|inc|info|install|module|profile|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)$|^(code-style\.pl|Entries.*|Repository|Root|Tag|Template)$ {
            deny all;
        }

        set $boost "";
        set $boost_query "_";

        if ( $request_method = GET ) {
            set $boost G;
        }

        if ($http_cookie !~ "DRUPAL_UID") {
            set $boost "${boost}D";
        }

        if ($query_string = "") {
            set $boost "${boost}Q";
        }

        if ( -f $myroot/cache/normal/$http_host$request_uri$boost_query$query_string.html ) {
            set $boost "${boost}F";
        }

        if ($boost = GDQF){
            rewrite ^.*$ /cache/normal/$http_host/$request_uri$boost_query$query_string.html break;
        }

        if ( -f $myroot/cache/perm/$http_host$request_uri$boost_query$query_string.css ) {
            set $boost "${boost}F";
        }

        if ($boost = GDQF){
            rewrite ^.*$ /cache/perm/$http_host/$request_uri$boost_query$query_string.css break;
        }

        if ( -f $myroot/cache/perm/$http_host$request_uri$boost_query$query_string.js ) {
            set $boost "${boost}F";
        }

        if ($boost = GDQF){
            rewrite ^.*$ /cache/perm/$http_host/$request_uri$boost_query$query_string.js break;
        }

        location ~* \.(txt|jpg|jpeg|css|js|gif|png|bmp|flv|pdf|ps|doc|mp3|wmv|wma|wav|ogg|mpg|mpeg|mpg4|htm|zip|bz2|rar|xls|docx|avi|djvu|mp4|rtf|ico)$ {
            root $myroot;
            expires max;
            add_header Vary Accept-Encoding;
            if (-f $request_filename) {
                break;
            }
            if (!-f $request_filename) {
                proxy_pass "http://backend";
                break;
            }
        }

        location ~* \.(html(.gz)?|xml)$ {
            add_header Cache-Control no-cache,no-store,must-validate;
            root $myroot;
            if (-f $request_filename) {
                break;
            }
            if (!-f $request_filename) {
                proxy_pass "http://backend";
                break;
            }
        }       
        if (!-e $request_filename) {
            rewrite  ^/(.*)$   /index.php?q=$1  last;
            break;
        }

        location / {
            proxy_pass http://backend;
        }

    }
}

UPD:
If I replace

        if (!-e $request_filename) {
            rewrite  ^/(.*)$   /index.php?q=$1  last;
            break;
        }

with:
        try_files $uri $uri/ @drupal;
        location @drupal {
            rewrite ^ /index.php?q=$uri last; # for Drupal 6
        }

Then all non-root pages give me 404 "The requested URL was not found on this server".

Comments

Issue is half-solved!

stanislaw's picture

Having nginx configured this way:
https://github.com/stanislaw/config_files/blob/master/nginx.conf led Drupal to the following behaviour:
Everything worked while the site was in maintenance mode: navigating to any non-root URL (signed in as admin) was fine. But once I put the site back online, I again got "The page isn't redirecting properly" on all of them.

Then I tested the same configuration on a fresh Drupal installation: it worked in both offline and online modes, and navigation to all clean URLs worked.

I think the config is indeed working, but the site I'm trying to deploy contains so many modules (I am not the author!) that some of them may be causing these improper redirects.

I ended up handling clean URLs in the vhosts section for my site, leaving this problem for the future. My present config is very similar to the one on GitHub (see the link), but the clean_urls section is commented out (I now use location / instead of location ~ .php).

Anyway, I would be very thankful for any advice and comments on my current config (see nginx.conf on GitHub, link above).

Looking to your config

perusio's picture

there are things that just don't make sense to me:


        location ~* \.(html(.gz)?|xml)$ {
            add_header Cache-Control no-cache,no-store,must-validate;
            root $myroot;
            if (-f $request_filename) {
                break;
            }
            if (!-f $request_filename) {
                proxy_pass "http://backend";
                break;
            }
        }

What's this supposed to do? Do you understand that the second if will never be reached? You're saying that if there's no file then stop all rewrite phase directive processing with that break in the first if. Do you understand that an if is an implicit location?

I also think that the regex is incorrect, shouldn't it be:

location ~* \.(?:html(?:\.gz)?|xml)$ {

Also, you can use the gzip_static directive, which makes Nginx do a stat() call looking for a gzipped version of the file before attempting to serve the uncompressed one.

You can enable it in the http, server or location context. Therefore there's no need for a regex-based location with .gz.

location ~* \.(?:(?:ht|x)?ml)$ {
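
A minimal sketch of how that suggestion might look in context. The root path and the backend upstream are taken from the original config; the @proxy name and the simplified regex are only illustrative, and gzip_static is only available when nginx is built with the gzip_static module (--with-http_gzip_static_module).

```nginx
server {
    root /usr/local/www/apache22/data/alfa;

    # For a request such as /page.html, nginx first looks for
    # /page.html.gz on disk and serves it to clients that accept gzip.
    gzip_static on;

    # No .gz alternative is needed in the regex any more.
    location ~* \.(?:html|xml)$ {
        try_files $uri @proxy;
    }

    # Hypothetical named location: anything not on disk goes to Apache.
    location @proxy {
        proxy_pass http://backend;
    }
}
```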

I used this section

stanislaw's picture

I used this section from:
http://lab.redmallorca.com/deploying-drupal-in-the-cloud-with-nginx-and-...

I think it is for .gz files generated by boost.

The first if checks whether the file is present. If yes, then it does a break (what happens then? Is it trying to serve the existing .gz from root?), and if not, it goes to the second if, which proxy_passes the request to Drupal, right?

Can you explain, why second if will never be reached?

Sorry, I'm just trying to understand.

What do you mean by "if is implicit location"?

You're right

perusio's picture

I "overlooked" the small detail of the !. That's a very un-nginxy way of doing a config. You're thinking in terms of the way mod_rewrite does things.

Yes, it's reached, but there's no need for the first if. The second if will only be used as a location if there's no file.

Note that having an if block that does something other than a rewrite, set a variable or do a return is considered bad style. Why? Because if is a rewrite phase directive and proxy_pass is a content phase directive, meaning that the latter defines how the content is going to be served.

I suggest you use a try_files directive:

location ~* \.(?:(?:ht|x)?ml)$ {
    add_header Cache-Control no-cache,no-store,must-validate;
    root $myroot;

    try_files $uri @proxy;
}

location @proxy {
  proxy_pass http://backend;
}

If is an implicit location, meaning that it is just a way to express a location by means other than paths, regexes or names. In fact, when Nginx enters an if it processes all rewrite phase directives available there and inherits (mostly) all the content handlers from the outside environment.

If you have something like:

location /drupal-test {
    set $a 1;
    if ($a) {
        rewrite ^ /top-if.html;
    }

    if ($a) {
        rewrite ^ /bottom-if.html;
    }
}

Both if blocks will be used and the last rewrite will be the one that is used. Check the If is Evil wiki page.
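
For contrast, a small sketch of the same location with the last flag added. This is an illustration, not part of the original reply: the If is Evil page lists return and rewrite ... last as the only things considered fully safe inside an if in a location, and last stops rewrite-phase processing as soon as it fires.

```nginx
location /drupal-test {
    set $a 1;

    # "last" stops processing the remaining rewrite-phase directives
    # and restarts the location search with the new URI.
    if ($a) {
        rewrite ^ /top-if.html last;
    }

    # Not evaluated for this request, because the rewrite above
    # already ended rewrite-phase processing in this location.
    if ($a) {
        rewrite ^ /bottom-if.html last;
    }
}
```

With this variant the request is served as /top-if.html rather than /bottom-if.html.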

Now the other parts of your config

perusio's picture

if (!-e $request_filename) {
     rewrite  ^/(.*)$   /index.php?q=$1  last;
     break;
}

This means that any file (or dir) that is not found results in a rewrite to /index.php?q=$1, and the last flag restarts the location search for the new URI.
Since there's no location for that URI, it ends up in the catch-all location /, where you do a proxy_pass.

This capture doesn't make sense to me. If you're proxying to a backend, let the backend worry about clean URLs.

location / {
    proxy_pass http://backend;
}

Your config has a lot of baggage from the Apache way of doing things. With nginx your config could be so much simpler:
location / {
 
   location ~* \.(?:jpe*g|css|js|gif|png|bmp|flv|pdf|ps|doc|mp3|wmv|wma|wav|ogg|mpe*(?:g|4)*|htm|zip|bz2|rar|xls|docx|avi|djvu|rtf|ico)$ {
      expires max;
      add_header Vary Accept-Encoding;
   }
 
   location ~* \.(?:(?:ht|x)?ml)$ {
     add_header Cache-Control no-cache,no-store,must-validate;
   }

   ## Let apache handle the query string args capture.
   try_files $uri @proxy;
}

location @proxy {
   proxy_pass http://backend;
}

Thank you very much for

stanislaw's picture

Thank you very much for making this nginxy way of doing things clear to me.

I've just tried your conf. Everything works, with one exception: imagecache avatar pictures are not displayed properly. The same set of last-logged-in user avatars is displayed only partially with your conf, and properly with mine. I can't work out why.

One more question (maybe the last I have):

Is there a difference between your config and its simpler variation below (for a case as simple as mine)?
Can I just remove the wrapping "location /"? In my config it is implied anyway, right?


    location ~* \.(?:jpeg|css|js|gif|png|bmp|flv|pdf|ps|doc|mp3|wmv|wma|wav|ogg|mpe(?:g|4)|htm|zip|bz2|rar|xls|docx|avi|djvu|rtf|ico)$ {
        expires max;
        add_header Vary Accept-Encoding;
    }

    location ~ \.(?:(?:ht|x)?ml)$ {
        add_header Cache-Control no-cache,no-store,must-validate;
    }

    ## Let apache handle the query string args capture.
    try_files $uri @proxy;

    location @proxy {
        proxy_pass http://backend;
    }

Yes you can

perusio's picture

it's just that it is considered good practice to wrap all regex-based locations inside another location. The reason is that regex-based locations are tested sequentially, so you might end up with an unwanted side effect when processing a request.

As for imagecache you can add a location:

location ~* /imagecache/ {
   try_files $uri @proxy;
}

before the regex based location for static files you have.

Also you should add the following directives to the @proxy location:

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;

and also use the rpaf module if you're using Apache. Otherwise you'll always get the IP of the reverse proxy as the client IP.
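
Putting those pieces together, a sketch under assumptions might look like the following. The shortened extension list is only for illustration, and backend is the upstream from the original config. The imagecache location comes first because regex locations are tried in the order they appear.

```nginx
location / {
    # Listed before the static-file regex so that missing imagecache
    # derivatives are passed to Apache/Drupal to be generated.
    location ~* /imagecache/ {
        try_files $uri @proxy;
    }

    # Illustrative, shortened list of static file extensions.
    location ~* \.(?:jpe?g|gif|png|css|js|ico)$ {
        expires max;
        add_header Vary Accept-Encoding;
    }

    try_files $uri @proxy;
}

location @proxy {
    # Pass the real client address and the requested host to Apache.
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_pass http://backend;
}
```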

Hi, I'm glad to see that my

jonvk's picture

Hi, I'm glad to see that my post on cloud deployment is still being used. I've significantly changed the nginx configuration. Mainly, it uses try_files rather than if to check for files, which is the recommended way. It also has more custom handling of expires.

Finally, unlike the suggested boost config, it serves cached boost pages even if the user passes a query string, with the cache uri including this query string. I haven't tested this exact config since it's a slightly pared down version of what I actually use.

The "if is evil" page in the nginx documentation is a good read, as well as the following reference to help understand how configuration parsing and "if" works in nginx. http://agentzh.blogspot.com/2011/03/how-nginx-location-if-works.html

server {
  listen   80 default;

  root /var/www/my-site/d7;
  #proxy ip's
  proxy_set_header X-Real-IP  $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header Host $http_host;

  access_log  /var/log/nginx/mysite.log;

  gzip  on;
  gzip_static on;

  gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
  #charset koi8-r;

  # If a file is not found, proxy to apache / Drupal
  error_page 404 = @drupal;

  # deny access to .ht files, .git, etc
  location ~ ^\. {
    deny all;
  }

  location ~* \.(engine|inc|info|install|module|profile|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)$|^(code-style\.pl|Entries.*|Repository|Root|Tag|Template)$ {
    deny all;
  }

  location ~ boost-crawler {
    deny all;
  }

  # try and serve static content. Set expires max to encourage browser caching.
  location ~* \.(txt|jpg|jpeg|gif|png|bmp|flv|pdf|ps|doc|mp3|wmv|wma|wav|ogg|mpg|mpeg|mpg4|htm|zip|bz2|tar|tgz|rar|xls|docx|avi|djvu|mp4|rtf|ico)(\.gz)?$ {
    expires max;
    try_files $request_uri @drupal;
    break;
  }

  # try and serve static content or their boost cache
  location ~* \.(css|js)(\.gz)?$ {
    expires max;
    try_files $request_uri /cache/perm/$http_host$request_uri$boost_query.$1$2 @drupal;
    break;
  }

  # try and serve xml/html files or proxy to drupal. Do not let browser cache (dynamically generated).
  location ~* \.(html|xml)$ {
    add_header Cache-Control no-cache,no-store,must-validate;
    try_files $request_uri @drupal;
    break;
  }

  set $boost_query "_";

  # Serve the boost html page if it exists or goto drupal. Note: unlike js/css,
  # we do not want to cache in browser.
  # Note: caching turned on with get arguments; cached by argument string.
  location / {
    # Do not serve cached pages if they are not GET requests
    if ( $request_method != GET ) {
      return 404;
      break;
    }
    # Do not serve cached pages if user is logged in
    if ($http_cookie ~ "DRUPAL_UID") {
      return 404;
      break;
    }
    add_header Cache-Control no-cache,no-store,must-validate;
    try_files /cms/cache/normal/$http_host$request_uri$boost_query$args.html @drupal;
  }

  # Proxy to drupal
  location @drupal {
    access_log  /var/log/nginx/proxy.mysite.access.log;
    proxy_pass http://127.0.0.1:8000;
  }
}

Your config suffers

perusio's picture

from what seems to be a confusion between rewrite phase directives and content handlers or output filters, or special handlers (error_page).

location ~* \.(txt|jpg|jpeg|gif|png|bmp|flv|pdf|ps|doc|mp3|wmv|wma|wav|ogg|mpg|mpeg|mpg4|htm|zip|bz2|tar|tgz|rar|xls|docx|avi|djvu|mp4|rtf|ico)(\.gz)?$ {
    expires max;
    try_files $request_uri @drupal;
    break;
}

As explained above the .gz part of the config is superfluous if you're using gzip_static on.

The break makes no sense. You have no rewrite phase directives on this location. There's nothing to break from.

The $request_uri variable also includes the args, so there's no need to use it. You should use $uri, since nginx always passes the args along for a given location.
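
A sketch of that location with the three points applied, assuming gzip_static on is set at the server level and that @drupal is the named proxy location from the config under discussion. The extension list is shortened for illustration.

```nginx
location ~* \.(?:txt|jpe?g|gif|png|pdf|zip|ico)$ {
    expires max;
    # $uri is the normalized path without the query string, while
    # $request_uri is the original request URI including "?args".
    # No break is needed: there are no rewrite-phase directives here.
    try_files $uri @drupal;
}
```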

    # Do not serve cached pages if user is logged in
    if ($http_cookie ~ "DRUPAL_UID") {
      return 404;
      break;
    }

Another spurious break. The return 404 already finishes all rewrite phase directives, so there's no need for a break.

There are other issues like using error_page instead of try_files on the catch all location ' /', for example.

There are other issue like using error_page...

stanislaw's picture

You say: "There are other issues like using error_page instead of try_files on the catch all location ' /', for example..."

How else could we write ' if ($http_cookie ~ "DRUPAL_UID") { return 404 } ' ?

As I understand it, try_files cannot be used here, because it requires at least two arguments: what to try, and what to do if it doesn't find it. Right?

And here we only have return 404... We don't have the "first arg": what to try.

How should it be written then?

Thank you!

Well

perusio's picture

the config has been edited. Glad to see it improved. The error_page inside / is gone. Great :) The superfluous breaks after return are still there though :(

You should use another status code, not 404. You're relaying all Not Found situations to the backend. Imagine that I request an image that doesn't exist. If you have imagecache, you should let it try to generate the image if applicable. If not, then you should signal the 404 without any intervention of the backend, unless you use something like the search404 module.

One of the issues that Drupal has is a thing called fast 404s. This relates to the unnecessary bootstrapping of Drupal just to signal a 404. You're paying a hefty price performance-wise for a meager 404, and not extracting all the benefits of using Nginx for static file serving.
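
As a sketch of what answering 404s from nginx could look like, assuming the @drupal named location from the config above and an illustrative, shortened extension list:

```nginx
# Missing imagecache derivatives still go to Drupal so that it can
# generate them; this regex location must come first.
location ~* /imagecache/ {
    try_files $uri @drupal;
}

# Any other missing static file gets a 404 straight from nginx,
# without bootstrapping Drupal.
location ~* \.(?:jpe?g|gif|png|css|js|ico)$ {
    expires max;
    try_files $uri =404;
}
```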

I suggest you use 405 instead. It's also the usual thing. I also suggest you define a named location @cache and constrain the error_page 405 = @drupal directive to that location.

Check the configs indicated at the top of the group for examples of such.
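
A rough, untested sketch of that suggestion. The cache path follows the Boost layout used earlier in the thread, and the backend address comes from the config under discussion; everything else is an assumption to adapt. Braces are used in ${uri}_${args} so that nginx does not read "$uri_" as a variable name.

```nginx
location / {
    # Real files are served directly; everything else tries the cache.
    try_files $uri @cache;
}

location @cache {
    # 405 is only a signal here: it hands the request to Drupal, and
    # the handler is scoped to this location alone.
    error_page 405 = @drupal;

    # Do not serve cached pages for non-GET requests.
    if ($request_method != GET) {
        return 405;
    }
    # Do not serve cached pages to logged-in users.
    if ($http_cookie ~ "DRUPAL_UID") {
        return 405;
    }

    # Serve the Boost page if it exists, otherwise fall back to Drupal.
    try_files /cache/normal/$http_host${uri}_${args}.html @drupal;
}

location @drupal {
    proxy_pass http://127.0.0.1:8000;
}
```

Because the error_page target is a named location, the request method and URI are left unchanged when the request is handed over, so a POST still reaches Drupal as a POST.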

Thanks

jonvk's picture

Thanks for the comments, I'll rework the config. I'm due to look at the nginx config parsing source code.
