Safari + Cache-Control with Vary Bug

rjbrown99

Hopefully I can save someone else a bit of time.

Safari seems to have a bug related to Cache-Control and Vary headers. It's relevant to Drupal in this case since I was reproducing the behavior with Pressflow 6 with the "external cache" setting, which I'm sure many of you are also using. Also this seems to be a Safari-only bug - Chrome is not impacted.

The official Webkit bug report: https://bugs.webkit.org/show_bug.cgi?id=71509

1) Visit a webpage, /myurl. Returned headers are as follows:
Cache-Control: public, max-age=3600
Expires: Sun, 11 Mar 1984 12:00:00 GMT
Vary: Cookie,Accept-Encoding

Note that there are no cookies presented to the client at this time.

2) Authenticate to the site. A single session cookie is created for the client browser (using the default Drupal session management). During the authentication process, the user is redirected to a different URL. In my case I'm doing the redirect via a form alter that sets a query option of destination=mypage, because I want authenticated users to be redirected upon login.

During this step a POST is made to the same URL, and it receives a 302 Found to redirect to a different URL. I have verified at this point that the session cookie has been properly created.

3) Navigate back to the original page, /myurl. Even though a cookie is present in the browser and the URL had Vary:Cookie set during step #1 above, the browser serves up a cached webpage. I have run a network sniffer and verified that it does not make a connection back to the server for a new page.

I did verify that if I modified the original Cache-Control header to no-cache, this does not happen. So it definitely has to do with Cache-Control and Vary.
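As a rough illustration of the behaviour one would expect here, the following Python sketch models a cache that honours Vary. It is a toy model, not Safari's or any real browser's implementation: the names `VaryAwareCache` and `StoredResponse` and the cookie value are hypothetical, and freshness (max-age) is ignored so that only the Vary comparison is shown.

```python
from dataclasses import dataclass


@dataclass
class StoredResponse:
    body: str
    vary: tuple        # lower-cased header names listed in the Vary header
    vary_values: dict  # request header values seen when the response was stored


class VaryAwareCache:
    """Toy cache: one stored response per URL, reused only if the Vary headers match."""

    def __init__(self):
        self._store = {}

    def put(self, url, request_headers, vary_header, body):
        names = tuple(n.strip().lower() for n in vary_header.split(",") if n.strip())
        req = {k.lower(): v for k, v in request_headers.items()}
        # Remember the value (or absence) of every header named in Vary.
        self._store[url] = StoredResponse(body, names, {n: req.get(n) for n in names})

    def get(self, url, request_headers):
        entry = self._store.get(url)
        if entry is None:
            return None
        req = {k.lower(): v for k, v in request_headers.items()}
        for name in entry.vary:
            if req.get(name) != entry.vary_values[name]:
                return None  # a varied header changed: go back to the server
        return entry.body


if __name__ == "__main__":
    cache = VaryAwareCache()
    anonymous = {"Accept-Encoding": "gzip, deflate"}
    cache.put("/myurl", anonymous, "Cookie,Accept-Encoding", "anonymous page")

    logged_in = dict(anonymous, Cookie="SESSabc=123")  # hypothetical session cookie
    print(cache.get("/myurl", anonymous))  # anonymous page
    print(cache.get("/myurl", logged_in))  # None
```

In this model the request in step 3 carries a Cookie header that was absent in step 1, so the stored copy is not reused. The behaviour described above corresponds to the cache returning the stored page anyway.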

My temp Varnish solution:

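sub vcl_fetch {
  # Chrome's User-Agent also contains "Safari", so exclude it explicitly.
  if (req.http.User-Agent ~ "Safari" && req.http.User-Agent !~ "Chrome" && req.url == "/myurl") {
    # Override the Cache-Control header set by Pressflow for this one page.
    set beresp.http.Cache-Control = "no-cache";
  }
}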

I only really care about that one page not being cached, hence the req.url. Basically I use Varnish to rewrite the Pressflow-set cache control header for Safari users on that one page.
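One caveat when matching on the User-Agent: Chrome's string also contains the word "Safari", so a plain substring match catches Chrome as well. The Python sketch below shows a narrower check and the resulting header decision. The function names and the default header value are assumptions for illustration, the Chrome string is an illustrative example of that era's format, and the Safari string is the one quoted later in this thread.

```python
# Safari 5.1.2 User-Agent, as quoted in a later comment in this thread.
SAFARI_UA = ("Mozilla/5.0 (Windows NT 6.0) AppleWebKit/534.52.7 "
             "(KHTML, like Gecko) Version/5.1.2 Safari/534.52.7")
# Illustrative Chrome User-Agent (an assumption, not taken from the thread).
CHROME_UA = ("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.7 "
             "(KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7")


def is_safari(user_agent):
    """Return True for Safari, but not for Chrome, whose UA also contains 'Safari'."""
    ua = user_agent or ""
    return "Safari" in ua and "Chrome" not in ua and "Chromium" not in ua


def cache_control_for(user_agent, url, default="public, max-age=3600"):
    """Mirror the VCL rule: send no-cache to Safari for the one affected page."""
    if is_safari(user_agent) and url == "/myurl":
        return "no-cache"
    return default


if __name__ == "__main__":
    print(cache_control_for(SAFARI_UA, "/myurl"))  # no-cache
    print(cache_control_for(CHROME_UA, "/myurl"))  # public, max-age=3600
```

User-Agent sniffing is fragile in general, so this is best treated as a stopgap for the one page that matters rather than a site-wide policy.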

There is little to no information about this which is why I'm chiming in here, just looking for a place to document it since (at least in my case) it has a relation to Drupal.

Comments

Relevant Issue

Thanks

rjbrown99

Thanks, saw that one. Tons of info in that thread, but I consider this a bit different as:

1) There's a Cache-Control: public, max-age=300 setting with Vary: Cookie
2) A cookie is generated
3) No further requests are made - at all - from the browser.

The browser isn't Varying when it has a cookie. That's the kicker for me, which means that really no matter what I do on the Drupal side it isn't going to change the behavior. The browser is relying upon its local cached copy and not making a new HTTP request. This ONLY happens on Safari - all other browsers (including Chrome) are properly respecting the Cache-Control+Vary combination.

This is probably also relevant info for that thread.

Probably the same problem, just presenting differently

David4514

I am probably experiencing the same issue, as pointed out by a user during testing, but presented in a slightly different way.

I have page caching turned on (this causes Cache-Control: public, max-age=86400 to be used for the anonymous user on initial page loads).

Steps to reproduce on two of my test sites (both Drupal 7.10).

  1. Open Safari Browser
  2. Navigate to home page of site. (Safari issues a GET for this, loads and caches the home page.)
  3. Navigate to an alternate page on the site.
  4. Login
  5. Navigate back to the home page. Safari serves up the page from step 2, formatted for the anonymous user, from its own cache. No request is sent to the website (confirmed with a running Wireshark trace), even though a session cookie from the login was available. If the user refreshes the page (F5), it loads correctly. This only seems to happen on Safari (both Windows and Mac).

If page caching is disabled, Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0 is used instead on the initial page loads.

Unfortunately, this means that I cannot enable page caching on the Drupal website.

Pressflow?

mikeytown2

Are you using Pressflow? Do not set the minimum cache lifetime.

No pressflow used

David4514

No Pressflow is involved. I don't know of any special server side caching outside of Drupal. This problem occurs on two servers, one is at a shared server provided by an ISP, the other is a small Ubuntu 10.04 LTS server I use at home over which I have full control. I have File Cache turned on and thought at first it could be related. The problem persisted with File Cache uninstalled, so it does not seem to be related.

The initial load of the home page has the following headers (from Wireshark; request followed by response). As long as I first navigate to a different public page and log in from it, Safari does not interact with the website again when navigating back to the home page unless I force a browser refresh. Safari instead serves up the home page from its cache, which represents what the page looked like to an anonymous user. To the person testing, it appeared as if they were not logged in.

EDIT: Setting Minimum Cache Lifetime to none did not change the behaviour.

GET / HTTP/1.1
Host: taiwv79.dgolson.net
User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/534.52.7 (KHTML, like Gecko) Version/5.1.2 Safari/534.52.7
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US
Accept-Encoding: gzip, deflate
If-Modified-Since: Tue, 20 Dec 2011 22:53:43 GMT
If-None-Match: "1324421623"
Cookie: Drupal.toolbar.collapsed=0
Connection: keep-alive

HTTP/1.1 200 OK
Date: Wed, 21 Dec 2011 04:44:47 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.11
X-Drupal-Cache: HIT
Etag: "1324421583-0"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=86400
Last-Modified: Tue, 20 Dec 2011 22:53:03 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
Content-Length: 6995
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8
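The response headers above show the combination that seems to trigger the problem: a positive max-age together with Vary: Cookie. A small Python sketch for flagging that combination in a captured response follows. The function name `safari_may_serve_stale` is hypothetical, and the rule it encodes is only a heuristic drawn from the reports in this thread, not a statement about Safari's internals.

```python
import re


def safari_may_serve_stale(headers):
    """Heuristic: True if a response is locally cacheable and varies on Cookie."""
    h = {k.lower(): v for k, v in headers.items()}
    cache_control = h.get("cache-control", "").lower()
    directives = [d.strip() for d in cache_control.split(",")]
    if "no-cache" in directives or "no-store" in directives:
        return False  # the browser has to go back to the server anyway
    match = re.search(r"max-age=(\d+)", cache_control)
    fresh = match is not None and int(match.group(1)) > 0
    vary_names = [v.strip().lower() for v in h.get("vary", "").split(",")]
    return fresh and "cookie" in vary_names


if __name__ == "__main__":
    with_page_cache = {
        "Cache-Control": "public, max-age=86400",
        "Vary": "Cookie,Accept-Encoding",
    }
    without_page_cache = {
        "Cache-Control": "no-cache, must-revalidate, post-check=0, pre-check=0",
        "Vary": "Cookie,Accept-Encoding",
    }
    print(safari_may_serve_stale(with_page_cache))     # True
    print(safari_may_serve_stale(without_page_cache))  # False
```

The two sample header sets correspond to the values reported in this thread with page caching enabled and disabled.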

Varnish

rjbrown99

I believe this is still an issue in current Safari, and it is present in older versions too, as I tested a number of older copies. My solution is Varnish VCL, specifically what was suggested in comment #4 of the official Webkit issue linked in my initial post. Basically I'm not caching for Safari because of this.

Drupal 7 has lot of what pressflow does

bcmiller0

Looking at your headers, I believe Drupal 7 is serving the page from its cache. D7 has a lot of the same capabilities that Pressflow had built in.

You can see: X-Drupal-Cache: HIT

in your header data, so it looks like a cache hit from D7 occurred.

Thanks

David4514

I'm not using Varnish, Pressflow, or any other special caching mechanism so that approach will not work for me.

I am guessing that, when page caching is enabled, the function drupal_serve_page_from_cache(stdClass $cache) in bootstrap.inc is what sets the Cache-Control: public, max-age... values. Other than by hacking this core file, I do not know how to specify that no-cache should be used only for the Safari browser.

Core

rjbrown99

I'm guessing you would need to hack core in that case, or perhaps use something like mod_headers if you are using Apache.

We fixed it this

Soul88

We fixed it this way:

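<?php
/**
 * Implements hook_init().
 *
 * Sends no-cache headers to Safari so it does not reuse its local copy.
 */
function mymodulename_init() {
  $user_agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
  // Chrome's User-Agent also contains "Safari", so exclude it explicitly.
  if (strpos($user_agent, 'Safari') !== FALSE && strpos($user_agent, 'Chrome') === FALSE) {
    $default_headers = array(
      'Expires' => 'Sun, 19 Nov 1978 05:00:00 GMT',
      'Last-Modified' => gmdate(DATE_RFC1123, REQUEST_TIME),
      'Cache-Control' => 'no-cache, must-revalidate, post-check=0, pre-check=0',
      'ETag' => '"' . REQUEST_TIME . '"',
    );
    foreach ($default_headers as $name => $value) {
      drupal_add_http_header($name, $value);
    }
  }
}
?>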

Not an ideal solution, but it's much better than turning the cache off.

Reported in drupal.org issue queue

David_Rothstein

I have come up with a minimal set of steps to reproduce this using Drupal core alone and have filed an issue about it in the core queue:
http://drupal.org/node/1910178

I'm not sure whether there is anything Drupal can do about it, but I figured having the issue open (with an easy set of steps to reproduce) would be helpful in case someone comes up with a good patch.

https://groups.drupal.org/nod

Only Safari not respecting Vary Header?

Cheviot

I have the same situation as described at the start of this discussion.
If someone navigates to 'home' and 'about' as an anonymous user and then logs in, browsing back to 'home' or 'about' shows the cached page with 'login', even though the user is logged in. Going to any page other than 'home' or 'about' shows the correct 'my account' (the sample is a Kickstart site).

Indeed, Safari does not respect the Vary header. As mentioned by David4514 on December 21, 2011 at 4:59am, it seems to be a Drupal issue, because other software platforms do not show this behaviour (otherwise even the Apple website would not work properly!).

There should always be an expiry check. A cache without any expiry check is not useful. Authenticated users should get an 'expired' signal the moment they revisit a cached page (i.e. the cached 'home' and 'about' in my sample). By the way, when people view cached pages without contacting the server (effectively offline pages), it is hard to tell how they navigate, and they may (I could be wrong) be invisible to site behaviour tracking.

Currently, because of Safari, I can't use expiration of cached pages, but I need it in order to set a browser expiration time.

Surely there is a way to always include an extra element in any cached page that checks for expiration signals? This should be solved in core, of course, not via patches.

This caught my eye because

JoshRickert

This caught my eye because I've actually had similar issues with Safari not respecting my cache headers on a WordPress site I run. I never did find a good solution though - I just ended up blacklisting the most critical pages from the cache to restore my site's functionality.

My temporary solution

Cheviot

It is clear that this needs to be resolved; it goes against the grain of how pages have to be managed for anonymous and authenticated users. All it needs is a kind of mandatory 'call home'. I wonder who can be contacted to get this implemented in core.

Commenting the relevant issue

tripper54

Commenting on the relevant issue on d.o and submitting a patch, if possible, would help get things moving.
https://www.drupal.org/node/1910178