Cache Router: Edge condition 404s valid pages?

mshmsh5000's picture

My company recently launched a channel on msn.com -- yes, MSN has gone (partly) Drupal (a little bit)!

http://fitbie.msn.com/

This is a LAMP app running Pressflow (PF) 6.19 with Cache Router (6.x-1.0-rc1) and multiple memcached bins, sitting behind Akamai. The home page is constructed in Panels (6.x-3.7).

We've had troubling recurrences of 404 home page responses. The symptoms look like this:

  1. We get monitoring alarms that the home page is serving up 404 responses.
  2. Apache httpd logs indicate 404s.
  3. Watchdog indicates nothing.
  4. Loading from our origin server looks good.
  5. Hitting Akamai from various places in the world intermittently pulls a 404.

We also experienced problems flushing the page cache on demand, and couldn't tell whether these were somehow related.

The flushing issue is a documented bug in the memcache implementation. You can either use the dev branch, which has the fix, or roll your own, making sure that $set_expire always has a valid (and meaningful) value:

http://drupal.org/node/765518
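
For anyone rolling their own fix, here is a rough sketch of what "always a valid value" could mean. It is written in Python for illustration only and is not the patch from that issue. The helper name normalize_expire and the 180-second fallback for temporary entries are my assumptions. The two constants mirror Drupal 6's CACHE_PERMANENT and CACHE_TEMPORARY, and the 30-day cutoff is memcached's rule for telling relative TTLs from absolute timestamps.

```python
import time

CACHE_PERMANENT = 0    # Drupal 6: keep until explicitly cleared
CACHE_TEMPORARY = -1   # Drupal 6: remove at the next general cache wipe
THIRTY_DAYS = 60 * 60 * 24 * 30  # memcached reads larger values as absolute Unix timestamps


def normalize_expire(expire, now=None, temporary_ttl=180):
    """Map a Drupal-style expire value to one memcached accepts.

    Hypothetical helper; temporary_ttl is an assumed fallback, not a
    value taken from Cache Router.
    """
    now = int(time.time()) if now is None else int(now)
    if expire is None or expire == CACHE_PERMANENT:
        return 0  # memcached: 0 means "never expire"
    expire = int(expire)
    if expire < 0:
        # CACHE_TEMPORARY (or any other negative value): use a short relative TTL
        return temporary_ttl
    if expire > THIRTY_DAYS:
        # Already an absolute timestamp; memcached expires it at that time
        # (immediately, if the timestamp is not after `now`).
        return expire
    return expire  # small positive value: relative seconds
```

The point is only that every branch returns an integer memcached can interpret, so a missing or negative value never reaches the store call.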

The nastier 404 issue appears to be related to this:

http://drupal.org/node/934846

Since the issue reporter is using file caching, and we're using memcache, I gather that the issue lies in Cache Router's fast_cache logic rather than in any one cache backend. As I commented in that issue, there seems to be an edge condition where requests can come in for a page that may not yet exist in cache, and the fast_cache response throws a 404. Since that response happens early in the bootstrap, there's no call to drupal_not_found(), and thus no watchdog error.
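
To make the suspected edge condition concrete, here is a toy model in Python. This is not Cache Router's code. The names fast_cache_response, handle_request and render_page are all hypothetical, and the cache is a plain dict. It shows the behavior I would expect from an early cache handler: a miss falls through to the full page build instead of answering with a 404.

```python
def fast_cache_response(cache, path):
    """Hypothetical early-bootstrap lookup: return a response or None."""
    entry = cache.get(path)
    if entry is None:
        # Miss: say nothing and let the full bootstrap decide.
        # Returning a 404 here is the suspected bug.
        return None
    return (200, entry)


def handle_request(cache, path, render_page):
    """Serve from the fast path if possible, else build the page."""
    early = fast_cache_response(cache, path)
    if early is not None:
        return early
    body = render_page(path)  # stands in for the full bootstrap
    if body is None:
        # Only the full bootstrap may conclude the page does not exist
        # (this is where drupal_not_found() and watchdog would come in).
        return (404, "Not Found")
    cache[path] = body
    return (200, body)
```

If the real fast path answers a miss itself, the 404 would never reach watchdog, which matches what we see.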

Disabling the fast_cache option has solved the problem so far -- no 404 home page responses in almost 24 hours, whereas before we saw 1-50 per hour, depending on Akamai traffic.
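
For reference, per-hour counts like these can be pulled from the httpd logs with something along these lines. It is a sketch in Python that assumes Apache's common/combined log format. The function name count_404s_per_hour is mine, and a custom LogFormat would need a different pattern.

```python
import re
from collections import Counter

# Matches the timestamp, request line and status of a common/combined log entry.
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def count_404s_per_hour(lines, path="/"):
    """Count 404 responses for one exact path, bucketed by day and hour."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m.group("path") == path and m.group("status") == "404":
            counts[f'{m.group("day")} {m.group("hour")}:00'] += 1
    return counts
```

Pass it an open log file (any iterable of lines works); lines that don't match the pattern are skipped.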

Has anyone seen anything like this with Cache Router, or any other cache implementation? It's difficult to trace, since we haven't been able to reproduce the behavior anywhere but in production.

Comments

Re: Cache Router: Edge condition 404s valid pages?

Daniel Norton's picture
  1. We get monitoring alarms that the home page is serving up 404 responses.
  2. Apache httpd logs indicate 404s.

Is #2 on the root URL ("GET /")?

Re: Cache Router: Edge condition 404s valid pages?

Daniel Norton's picture

The 404s show up for all browsers? (I just now saw it using MSIE8.)

Not browser-dependent

mshmsh5000's picture

I see many different user agents, and have replicated the 404 myself by wget.
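
If anyone wants to try the same thing in a loop rather than with one-off wget calls, a minimal Python sketch follows. The helper names fetch_status and tally_statuses are hypothetical. It uses only the standard library and treats an HTTP error response as a status code to count instead of an exception.

```python
from collections import Counter
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for one GET of url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx arrive as exceptions; keep the code


def tally_statuses(url, attempts, fetch=fetch_status):
    """Request url `attempts` times and count the status codes seen."""
    return Counter(fetch(url) for _ in range(attempts))
```

Calling tally_statuses("http://fitbie.msn.com/", 100) would give a rough 200-to-404 ratio from one vantage point. Since the 404s are intermittent and vary by Akamai edge, it would need to run from several locations to mean much.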

Update: We just got a new round of 404 home page responses with fast_cache off, so I'm back to square one with this analysis.