Pressflow / Varnish - Removed Google Analytics cookies still stopping response header 'age'?

Elder Brother's picture

We are currently running Drupal 6 / Pressflow on Apache with Google Analytics installed. Varnish 2.1.5 runs on port 80 in front of Apache.

In the attached Varnish VCL (based on Lullabot's example - http://www.lullabot.com/articles/varnish-multiple-web-servers-drupal), sub vcl_recv contains a (probably redundant) regex that filters out the Google Analytics cookies, as well as a general 'remove all cookies except sessions' rule.

When Google Analytics isn't running on the site (we use the Drupal GA module to add it to all pages, and it can be disabled), anonymous users get pages served from Varnish with response headers as below -

Age: 12
Cache-Control: public, max-age=21600
Connection: keep-alive
Content-Length: 17029
Content-Type: text/html; charset=utf-8
Date: Tue, 20 May 2014 13:26:17 GMT
Etag: "1400592363"
Expires: Sun, 11 Mar 1984 12:00:00 GMT
Last-Modified: Tue, 20 May 2014 13:26:03 +0000
Server: Apache/2.2.15 (CentOS)
Vary: Cookie
Via: 1.1 varnish
X-Powered-By: PHP/5.3.3
X-Varnish: 175643970 175643967
X-Varnish-Cache: HIT Varnish (1)
X-Varnish-Debug-Age: 12
X-Varnish-Debug-Hits: 1

But when GA is active and the UTM... cookies are sent to the browser, I get the following headers -

Cache-Control: public, max-age=21600
Connection: keep-alive
Date: Thu, 22 May 2014 08:03:45 GMT
Etag: "1400745705"
Expires: Sun, 11 Mar 1984 12:00:00 GMT
Vary: Cookie
Via: 1.1 varnish
X-Varnish: 175644814
X-Varnish-Cache: HIT Varnish (2)
X-Varnish-Debug-Hits: 2

I need to check whether Varnish is abiding by the max-age (set in Drupal and communicated to Varnish by the Drupal Varnish module). Varnish still appears to cache and serve the page successfully, as it should given the rules in the VCL. However, I have no idea why I lose the Age counter, the Content-Length, X-Varnish-Debug-Age and the second value in the X-Varnish field, or whether this suggests any change in the caching that is happening. I have also noticed that the response time is marginally quicker in the second case, which is strange too. Can anyone shed light on this?
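For anyone wanting to repeat the max-age check by script rather than by eye, here is a minimal Python sketch. It assumes the response headers have already been collected into a plain dict; the function names `max_age` and `is_fresh` are hypothetical and are not part of Varnish or Drupal. It only compares the Age header against max-age and says nothing about why a header might be missing.

```python
import re


def max_age(cache_control):
    """Return the max-age value (in seconds) from a Cache-Control string, or None."""
    match = re.search(r"(?:^|[\s,])max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def is_fresh(headers):
    """Compare Age against max-age.

    Returns True if Age <= max-age, False if Age exceeds it,
    and None if either value is missing from the response.
    """
    age = headers.get("Age")
    limit = max_age(headers.get("Cache-Control", ""))
    if age is None or limit is None:
        return None
    return int(age) <= limit


# The two responses quoted above, reduced to the relevant headers.
without_ga = {"Age": "12", "Cache-Control": "public, max-age=21600"}
with_ga = {"Cache-Control": "public, max-age=21600"}

print(is_fresh(without_ga))  # True: 12 seconds is well inside 21600
print(is_fresh(with_ga))     # None: no Age header, so nothing to compare
```

With the second response the check cannot be made at all, which is exactly the problem described here.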

Attachment: Example VCL (7.34 KB)

Comments

It sounds like you still have

Beanjammin's picture

It sounds like you still have other cookies getting through.

Rather than stripping out the cookies that you don't want, I'd suggest using their second approach (a little further down the same page - http://www.lullabot.com/blog/article/configuring-varnish-high-availabili...), which is to strip all cookies except for the ones that you do want (typically only the session and no_cache cookies).

It's much easier to track the cookies you do want than the ones you don't, especially if other people are implementing JavaScript-based services that require cookies.
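To illustrate the whitelist idea outside of VCL, here is a rough Python sketch of the same logic. It is not the Lullabot VCL and not what Varnish itself runs. It assumes the cookies worth keeping are Drupal session cookies (names starting with SESS) and NO_CACHE, and the function name `strip_cookies` is made up for this example.

```python
import re

# Assumption: only Drupal session cookies (SESS...) and NO_CACHE should survive.
KEEP = re.compile(r"^(SESS[a-z0-9]+|NO_CACHE)$")


def strip_cookies(cookie_header):
    """Return the Cookie header with every non-whitelisted cookie removed."""
    kept = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name = pair.split("=", 1)[0]
        if KEEP.match(name):
            kept.append(pair)
    return "; ".join(kept)


# Anonymous visitor carrying only analytics-style cookies: nothing survives,
# so the request no longer carries anything that should vary the cache.
print(repr(strip_cookies("__utma=1.2.3; __utmz=abc; has_js=1")))  # ''

# Logged-in visitor: the session cookie is kept.
print(strip_cookies("__utma=1.2.3; SESSabc123=xyz"))  # SESSabc123=xyz
```

The point is that a new JavaScript service adding its own cookie needs no change here, whereas a blacklist would need a new rule each time.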

Thanks for the reply

Elder Brother's picture

Thanks for the reply Beanjammin.

Looks like the example VCL file didn't attach the first time - rats. Now attached to the original post.

As you can see in the VCL, I've included both approaches. In testing I can confirm that, with a clean browser with its history wiped and visiting the site without being logged in, the only cookies present are for Google Analytics - according to Firebug and other Firefox plugins at least.

Hence the mystery. The header information seems to suggest that, with the GA cookies present, requests aren't missing Varnish entirely (I'm still getting hits), but the responses come back without the Age header and without the second token in the X-Varnish field.

Thanks.