Image styles when imported from apache not creating

Anonymous's picture

I have a bunch of image files that came from an Apache staging server, where the names were converted, e.g. a space became %20, so the imported images are named like my%20image.jpg. These are not viewable, and the subsequent image styles are not being created. If I upload images with spaces directly to nginx, there is no problem. I am using Perusio's config, which seems to be set up to resolve escape characters; however, it doesn't like files with % signs in them. Any ideas how to resolve this?

Comments

Related core patch

mikeytown2's picture

https://www.drupal.org/node/2267639

Related but not the same issue as you

But nginx specific...

olomouc's picture

This is not an issue on our staging server with apache but it is with nginx. Same code.

Nginx + files with % signs in them

Peter Bowey's picture

You may need new variations on the "nginx" use of $uri vs $request_uri

Explain:

1) The nginx $uri variable contains any %20 substituted with its url-decoded value, "a space".

2) The nginx $request_uri variable holds the original request URI as issued by the client, with the %20 sequence.

The built-in variable $uri provided by ngx_http_core is used to fetch the (decoded) URI of the current request, excluding any query string arguments.

The built-in variable $request_uri, also provided by ngx_http_core, is used to fetch the raw, non-decoded form of the URI, including any query string.
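To make the difference concrete, here is a minimal Python sketch that approximates the two views of one request. It is not how nginx is implemented; the function name nginx_like_views and the example path and query string are made up for illustration, and the path normalization nginx also applies to $uri (merging slashes, resolving "..") is left out.

```python
from urllib.parse import unquote, urlsplit


def nginx_like_views(request_target: str) -> dict:
    """Approximate $request_uri and $uri for a raw request target.

    Hypothetical helper: it only models the decoding and the query
    string handling described above, nothing else nginx does.
    """
    path = urlsplit(request_target).path
    return {
        "request_uri": request_target,  # raw form, query string kept
        "uri": unquote(path),           # decoded form, query string dropped
    }


# Made-up example request for an image style derivative.
views = nginx_like_views("/files/styles/thumb/my%20image.jpg?itok=abc")
print(views["request_uri"])
print(views["uri"])
```

The first printed value keeps the %20 and the query string; the second has a literal space and no query string.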

--
Linux: Web Developer
Peter Bowey Computer Solutions
Australia: GMT+9:30
(¯`·..·[ Peter ]·..·´¯)

Thanks Peter

olomouc's picture

In weeks of headbanging, you have come the closest to an explanation! So this might work:

location ~* /files/styles/ {
    access_log off;
    expires 30d;
    try_files $request_uri @drupal;
}

instead of

location ~* /files/styles/ {
    access_log off;
    expires 30d;
    try_files $uri @drupal;
}
Peter Bowey's picture

@olomouc

Yes, I had to do a similar thing for the URIs generated via D6 Imagecache, where I was using proxy_pass to hand URIs with spaces in the path (file name) to Node.js.


Well..

olomouc's picture

that didn't work. Peter, do you still have the config stanza, and could you share it with me?

Peter Bowey's picture

@olomouc

The nginx 'config' is sensitive to the placement of the required $request_uri. It needs to be at the correct (usually last) URI 'check point' for nginx handling.

Sharing a portion of mine here:

    # Proxy pass to actual back end server: Get static data direct
    location @uncached_redirect {
        root                        /var/www/virtual/$cdn_requester;
        try_files                   $uri @uncached_fallback_to_php;
        proxy_next_upstream         error timeout http_500 http_502 http_503 http_504;  # The cases on which a request should be passed to the next server
        proxy_pass                  http://unix:/run/lactate/lactate.socket:/$cdn_requester$request_uri;  #was = $uri
        proxy_redirect              off;
        proxy_connect_timeout       30ms;
        proxy_read_timeout          3s;
        proxy_send_timeout          3s;
        proxy_http_version          1.1;
        proxy_set_header            X-Real-IP $remote_addr;
        proxy_set_header            Connection "";
        proxy_set_header            Host $host;
        proxy_buffers               16 640k;
        proxy_buffer_size           640k;
        proxy_busy_buffers_size     640k;
        proxy_temp_file_write_size  640k;
        proxy_method                GET;
        proxy_store                 /var/www/cache/$host$uri;
        proxy_store_access          user:rw group:rw all:r;
        proxy_temp_path             /var/www/tmp 1 2;
        proxy_set_header            Accept-Encoding  "";
        if_modified_since           off;
        gzip                        off;  # Dynamic Gzip is disabled
        open_file_cache_errors      off;
        open_file_cache             off;
        limit_conn                  limit_per_ip 200;
        proxy_ignore_client_abort   on;
    }

The above nginx block @uncached_redirect is referenced by this:

    # imagecache and imagecache_external support
    location ~* /(?:external|system|files/imagecache|files/styles)/ {
        tcp_nodelay         off;
        expires             max;
        # Unset unnecessary headers
        if_modified_since   off;
        add_header          Pragma "";
        add_header          Last-Modified "";
        add_header          Cache-Control "public, must-revalidate, proxy-revalidate";
        add_header          X-Header "IC Generator 1.0";
        add_header          X-Frame-Options SAMEORIGIN;
        add_header          "X-UA-Compatible" "IE=Edge,chrome=1";
        try_files /$host$uri @uncached_redirect;
    }

Note that I had the same issues as you have outlined, but only in response to imagecache paths with spaces as %20 being 'broken' by the normal use of nginx $uri.

Note how I use the $request_uri 'later' than you show, i.e. at "the last URI handler point" before handing the URI path over to a node listener.

Notes: Without the change to using $request_uri, paths with spaces (my%20image.jpg) were not being sent to the node listen socket. Imagecache files without the "%20" in the file ($uri) were OK. Hence, I studied the strangeness of the nginx types.

In fact, a clone of nginx known as 'tengine' allows the use of a $raw_uri entry in the config. The result is similar to the use of $request_uri, but without the query string args.

I hope that I have not confused you with my own code and use variances :)


Hopefully, some help -

Peter Bowey's picture

Hopefully, some help - explaining the logic:

$request_uri
This variable is equal to the original request URI as received from the client including the args. It cannot be modified. Look at $uri for the post-rewrite/altered URI. Does not include host name. Example: "/foo/bar.php?arg=baz"

$uri
This variable is the current request URI, without any arguments (see $args for those). This variable will reflect any modifications done so far by internal redirects or the index module. Note this may be different from $request_uri, as $request_uri is what was originally sent by the browser before any such modifications. Does not include the protocol or host name. Example: /foo/bar.html
The $uri variable is an nginx one (see the http core module documentation).

Example: If I visit http://example.com/foobar/hello%20world, the nginx $uri variable contains /foobar/hello world (the %20 has been substituted with its url-decoded value, a space). And then, nginx returns http status code 400 (bad request), or 502 before executing the proxy_pass line (and the backend is not contacted).

The variable $request_uri, which holds the original request URI as issued by the client, would hold the correct (raw) value, with the %20 sequence.

Basically the nginx $uri is about url-decoding :)


nginx escapes

perusio's picture

all URIs. Which means that you need to unescape it to make it work.

You need Lua to make it work. Read this:
https://github.com/perusio/drupal-with-nginx/blob/D7/README.md#escaped-uris.

You need to pass the $unescaped_uri variable.

Also you should use the transliteration module to avoid such issues.

It seems that I have a mixture

olomouc's picture

I am not sure this is possible to resolve on the existing file names.

This is the original image name

Polenta Cakes Red%20Pepper%20istock_0_0.jpg

This is what the browser tells me it's looking for when Drupal attempts to build the image style on page load:

/styles/220_golden_ratio/public/images/recipe/Polenta%2520Cakes%2520Red%2520Pepper%2520istock_0_0.jpg

So this seems to be a mixture of escaped and unescaped, unless I am misunderstanding it.
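One way to check what the requested path really contains is to decode it step by step. The sketch below does that in Python with the path quoted above; it only inspects the string and says nothing about how nginx or Drupal handle it.

```python
from urllib.parse import unquote

# Path as reported by the browser in the post above.
requested = (
    "/styles/220_golden_ratio/public/images/recipe/"
    "Polenta%2520Cakes%2520Red%2520Pepper%2520istock_0_0.jpg"
)

once = unquote(requested)   # %2520 -> %20
twice = unquote(once)       # %20   -> space

print(once)
print(twice)
```

Decoding once gives a name with a literal %20 between every word, and decoding twice gives plain spaces. That suggests the URL is a single, consistent encoding of a stored name that contains literal "%20" sequences, rather than a mix of escaped and unescaped parts, though this is an inference from the string alone.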

I have installed Lua and pointed to drupal_escaped.conf, which didn't change anything. Then I re-read what you said and tried:

set_by_lua $unescaped_uri 'return ngx.unescape_uri(ngx.var.uri)';

which failed when running nginx -t, because it doesn't recognize the variable. Then I tried:

set_by_lua $escaped_uri 'return ngx.unescape_uri(ngx.var.uri)';

which didn't fail but didn't make any difference.

Peter Bowey's picture

Hi @

It is possible to resolve, for I had the same issue as you; meaning:

/styles/220_golden_ratio/public/images/recipe/Polenta%2520Cakes%2520Red%2520Pepper%2520istock_0_0.jpg

was presented to the browser (= fail)!

Please re-read the nginx notes I supplied earlier (tested on the same issue).

Your presented issue has little bearing on the use of "lua", and it (likely) complicates the nginx logic you presented originally.

I am so sorry to be the "pain", but did you miss the nginx $uri logic?

** The variable $request_uri, which holds the original request URI as issued by the client, would hold the correct (raw) value, with the %20 sequence.

You need to use $request_uri at the last point where nginx may still translate $uri (rewrites, etc.)!


Peter Bowey's picture

My full stanza :: nginx: (linked direct from my server)

nginx.conf

If you are, perchance, using nginx proxy_pass, note this:

If using nginx proxy_pass with a URI component (note the trailing "/"), nginx decodes the URI, substitutes the part matched by the location ("/") with the URI component in your proxy_pass (another "/"), and then re-encodes the URI.

I have not seen your "Perusio" nginx config (yet). It would help to do so :)


Well

perusio's picture

the only difference between $request_uri and $uri is that the first has the arguments in it. See http://nginx.org/en/docs/http/ngx_http_core_module.html#var_request_uri.

As for your problem @olomouc, please get a debug log and issue a request for an image and see what comes out in the logs. You can post the log in a Gist.

I have to see what try_files does.

Check this thread also please: https://groups.drupal.org/node/229973.

nginx $request_uri -vs- $uri

Peter Bowey's picture

@perusio

Quote: "the only difference between $request_uri and $uri is that the first has the arguments in it"

That statement is to be classed as "incomplete" and stands some need for a good study :) !

English people know how hard it is to read "Russian" docs for nginx.

1) The nginx $uri variable contains /foobar/hello world (the %20 has been substituted with its url-decoded value, a space).

2) The nginx variable $request_uri holds the original request URI as issued by the client, i.e. the original (raw) value, with the %20 sequence.


thanks for your help

olomouc's picture

I haven't had the chance to digest all this, but I really appreciate the feedback and help.

nginx $request_uri -vs- $uri

Peter Bowey's picture

@olomouc

Sorry about the "too much information" scenario, however, the nginx concepts (as outlined above), have been tested at my end of a long-term project, which had a similar issue - as you have well described.

The facts are, that the English API (and use) part of nginx is still very prone to a lot of guess-work, along with some very old and outdated guides :)


@MichelDelcourt "More simple"

Peter Bowey's picture

@MichelDelcourt

"More simple" = rewrite assets like [my%20image.jpg] to be [my_image.jpg] :)
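A rough Python sketch of that renaming idea follows. The helper names clean_name and plan_renames are made up for illustration, and the function only plans the renames (a dry run). It does not touch the records Drupal keeps for managed files, which would also need updating for this to work on a real site.

```python
import re
from pathlib import Path


def clean_name(filename: str) -> str:
    """Replace literal "%20" sequences and whitespace runs with "_"."""
    return re.sub(r"(?:%20|\s)+", "_", filename)


def plan_renames(directory: str) -> list[tuple[Path, Path]]:
    """Return (old, new) pairs for files whose names would change.

    Dry run only: nothing is renamed on disk here.
    """
    plans = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            target = path.with_name(clean_name(path.name))
            if target != path:
                plans.append((path, target))
    return plans


print(clean_name("my%20image.jpg"))
print(clean_name("Polenta Cakes Red%20Pepper%20istock_0_0.jpg"))
```

Reviewing the output of plan_renames before renaming anything makes it easier to spot collisions, for example two files that would end up with the same cleaned name.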

Otherwise, trace the [nginx] URI responses to files like [my%20image.jpg]..

In the request phase, the "filename" is passed from the address bar (if a type-in) or from a link on a page into the browser, where it is URL-encoded in compliance with HTTP requirements. It is then sent to the network as a URI. In each active network node (e.g. proxy) through which this request passes, it is possible that the URI will be re-URL-encoded.

The encoding rules will differ based on whether the "filename" is passed directly (as the "GET" or "POST" URI) or whether it is passed as a query string appended to the URL-path.

Spaces are not valid characters in URIs and therefore must be encoded as "%20". But "%" must also be encoded, so you end up with a double-encoded string, "%2520", as soon as this request passes through any other HTTP/1.x-compliant "Web agent" such as a proxy. If you piped the request through yet another agent, you'd end up with "%252520", etc. See RFC 3986.

On Apache and Litespeed (which uses Apache's rewrite rules) there is a rewrite flag, [NE] (noescape), that prevents URL-encoding of the rewritten output.
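The encoding chain described here can be reproduced with a few lines of Python. This only models an agent that naively re-encodes whatever it receives; whether a given proxy actually does that depends on the proxy, so treat it as an illustration of the string transformation, not of any particular server.

```python
from urllib.parse import quote

name = "my image.jpg"
stages = []

# Each pass stands in for one agent that percent-encodes the name again.
for hop in range(3):
    name = quote(name)
    stages.append(name)

for hop, value in enumerate(stages, start=1):
    print(hop, value)
```

After one pass the space is %20, after two it is %2520, and after three it is %252520, matching the sequence in the text.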


nginx urlencode/urldecode functions

Peter Bowey's picture

Nginx: According to Maxim Dounin [Quote]:

"It is believed that correct solution would be to implement some
urlencode/urldecode functions, but there is no consensus on
desired syntax yet. There are patches for $urlencode_* /
$urldecode_* variables by Kirill Korinskiy floating around, but
they were explicitly rejected by Igor.

2. When doing proxy_pass nginx do escape characters which aren't
valid in URI, but it doesn't to escape some chars which aren't
(like "<", ">", <">). That's why you see space escaped, but not
<">."

Link: http://forum.nginx.org/read.php?2,75231,75566#msg-75566

