Varnish errors on loaded sites ... Error 503 Service unavailable... Guru Meditation

liorkesos's picture

In certain instances I see the following error (which I vaguely remember seeing on drupal.org.il):
Error 503 Service Unavailable
Service Unavailable
Guru Meditation:
XID: 1525473736
From googling, I understand that this is Apache not responding to Varnish. Is this correct?

How do I adjust the time Varnish gives Apache to respond?
Best regards,
Lior

Comments

Sky's the limit with VCL

joshk's picture

There's a world of possibility in the Varnish Control Language:

http://varnish.projects.linpro.no/wiki/VCL

For instance:

backend www {
   .host = "www.example.com";
   .port = "http";
   .connect_timeout = 1s;
   .first_byte_timeout = 5s;
   .between_bytes_timeout = 2s;
}

I will review these settings in the base install and look at making the defaults more forgiving for apache.

503 errors

gchaix's picture

I've seen 503 errors when my system is under load. We've reduced those greatly by increasing the timeouts in the vcl conf file:

backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 600s;
    .first_byte_timeout = 600s;
    .between_bytes_timeout = 600s;
}

600-second timeouts are likely excessively high, but it's worked for us. We've also tweaked the varnishd startup options (/etc/conf.d/varnishd):

VARNISHD_OPTS="-a *:80 \
-T 127.0.0.1:8181 \
-f /etc/varnish/default.vcl \
-p thread_pools=4 \
-p thread_pool_max=1500 \
-p listen_depth=2048 \
-p lru_interval=1800 \
-h classic,169313 \
-p obj_workspace=4096 \
-p connect_timeout=600 \
-p max_restarts=6 \
-s malloc,2G"

We increased connect_timeout and the thread pools, and set max_restarts. This combination has virtually eliminated the 503 errors for us. We're seeing 30k hits/day on this config without any load issues (Varnish and Apache are on the same 12 GB/4-core box).

Rad

joshk's picture

Thanks for posting this Greg. I'm going to include these settings in the beta release for sure.

malloc storage

gchaix's picture

Note that we chose to tell it to only use RAM for storage by setting it to malloc. Varnish defaults to file-based storage under the philosophy that the kernel will know better how to manage what needs to be buffered in RAM and what can be dropped to disk. In reading the Varnish docs, I expect this is likely not a recommended config in the eyes of the Varnish devs.

Slightly different location in mercury/debian/ubuntu...

liorkesos's picture

Great advice!
For people looking for this on a Debian/Mercury box, the fixes go in /etc/default/varnish instead of /etc/conf.d/varnishd (I think that's RHEL?).
Also, it appears Debian likes DAEMON_OPTS instead of VARNISHD_OPTS.
So my config looks like this:

DAEMON_OPTS="-a :80 \
                -T localhost:6082 \
                -f /etc/varnish/default.vcl \
                -p thread_pools=4 \
                -p thread_pool_max=1500 \
                -p listen_depth=2048 \
                -p lru_interval=1800 \
                -h classic,169313 \
                -p obj_workspace=4096 \
                -p connect_timeout=600 \
                -s malloc,2G"

Linnovate - Community Infrastructure Care
Drupal Services in Israel
http://www.linnovate.net

Gentoo, actually :-)

gchaix's picture

Sorry for the confusion. These configs are running on Gentoo, not a Debian variant.

Thanks

joshk's picture

Another thorn in my side is figuring out if/how I can get varnish to listen on multiple ports. The way the opts are configured, adding the correct syntax seems to break the start script. I can do it directly from the command line, but that's no good.

Why multiple ports? Well, because we want to put Varnish over solr too... :)

VCL

rjbrown99's picture

I'm testing the following VCL changes for this.
http://www.varnish-cache.org/lists/pipermail/varnish-misc/2011-March/005...

This happens to me, very similar to that thread - almost exclusively on POSTs.

Check your apache/php logs

joshk's picture

If you get the guru message that means the back-end (aka Drupal) had an unrecoverable sad. Debugging this means figuring out the error.

You can also address the cosmetic side in production by making sure that PHP returns a WSOD without a 503 response code if it fails. You may also want to add your own vcl_error() function to deliver a more awesome fail page. ;)
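
A minimal sketch of what such a custom vcl_error() could look like, assuming Varnish 2.1/3.0-era VCL (later versions split this into vcl_synth and vcl_backend_error). The page text is purely illustrative, and the body is kept static so it does not depend on version-specific string concatenation:

```vcl
# Replace the default "Guru Meditation" page with a friendlier one.
# Assumes Varnish 2.1/3.0 syntax; the HTML below is a placeholder.
sub vcl_error {
    set obj.http.Content-Type = "text/html; charset=utf-8";
    # Hint to clients that retrying shortly is reasonable.
    set obj.http.Retry-After = "5";
    synthetic {"
<!DOCTYPE html>
<html>
  <head><title>Temporarily unavailable</title></head>
  <body>
    <h1>We'll be right back</h1>
    <p>The site is temporarily unavailable. Please try again in a moment.</p>
  </body>
</html>
"};
    return (deliver);
}
```

This only changes what visitors see. The response status is still 503, and the underlying backend error still needs to be tracked down in the Apache/PHP logs.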

Thanks

rjbrown99's picture

Thanks Josh, the backend looks quite healthy and this is basically a high powered and unused machine at the moment. It only happens on POST requests, and only very intermittently - perhaps 1 out of every 100 or more.

I implemented the change from the link above, but the following is what ultimately worked around it: I'm just sending POST requests via pipe.
http://www.varnish-cache.org/trac/ticket/849
http://www.varnish-cache.org/trac/wiki/VCLExamplePipe
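
For readers who want to try the same workaround, a rough sketch of piping POSTs, assuming Varnish 2.1/3.0 syntax (req.request became req.method in later versions). This is not necessarily the exact VCL used above, only the general shape described on the linked wiki page:

```vcl
sub vcl_recv {
    # Hand POST requests straight to the backend instead of
    # running them through the normal fetch path.
    if (req.request == "POST") {
        return (pipe);
    }
}

sub vcl_pipe {
    # Close the backend connection after the piped request so that
    # later requests on the same client connection are not piped too.
    set bereq.http.connection = "close";
}
```

Piped requests bypass caching and most VCL processing, which is usually acceptable for POSTs since they are not cached anyway.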

hello i am windows server and

beto_beto's picture

Hello,

I am on a Windows server and using D6.

I get this error when using simple newsletter:

Unable to send e-mail. Please contact the site administrator if the problem persists.
warning: mail() [function.mail]: SMTP server response: 503 This mail server requires authentication when attempting to send to a non-local e-mail address. Please check your mail client settings or contact your administrator to verify that the domain or address is defined for this server. in C:\inetpub\vhosts\example.com\httpdocs\includes\mail.inc on line 192.

Any ideas?

Thank you

Error in server

LeeBin's picture

Dear Sir,
I'm LeeBin and I am using one server,

but I get the same error:

Error 503 Service Unavailable

Service Unavailable
Guru Meditation:

XID: 342045646

Varnish cache server

I have been trying to fix it for 2 days, but it is still the same for all the other sites on the server :(
Please, if anyone can help me :(
I really need help.
If anyone can help, please contact me on Yahoo: leebin_nguyen

seconds or minutes?

karl_sg's picture

Hi All,
just a small (and silly) question...
On my Varnish I have:
backend Server1 {
.host = "172.XXX.XXX.XXX";
.port = "80";
.connect_timeout = 2s;
.first_byte_timeout = 1m;
.between_bytes_timeout = 8s;
}

Is it right that first_byte_timeout is configured in minutes? In all the Varnish documentation I found, this value is given in seconds, so I'm just wondering if it can influence the 503 errors I am facing...
Cheers,
Carlo Alberto
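
On the units question: as far as I know, VCL durations are written as a number followed by a unit suffix (ms, s, m, h, d, w), so 1m should be read as one minute, the same as 60s. A sketch of the same backend with every timeout written in seconds, keeping the masked address from the post as-is:

```vcl
backend Server1 {
    .host = "172.XXX.XXX.XXX";       # masked in the original post
    .port = "80";
    .connect_timeout = 2s;
    .first_byte_timeout = 60s;       # equivalent to 1m
    .between_bytes_timeout = 8s;
}
```

If that reading is right, the unit alone should not change behaviour. Whether 60 seconds is long enough for the slowest backend responses is a separate question.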

I doubt it's a timeout issue

gchaix's picture

I doubt it's a timeout issue if you're getting 503 errors. 503 generally indicates an error on the web server, usually a PHP error. I'd check your backend server logs for fatal PHP errors.

Thanks for your reply, but I

karl_sg's picture

Thanks for your reply,
but I already checked the IIS logs and no errors are shown; moreover, if I refresh, the page is displayed correctly.
I ran several tests by pointing directly at one Varnish server of my pool (using the hosts file) and I wasn't able to reproduce the 503 twice on the same page (static xml/html pages), so it is not related to the backend servers... any idea where to investigate?

Thanks,
Carlo Alberto
