503: Service Unavailable caused by Segmentation Fault

hitfactory

I am seeing the 503: Service Unavailable page for just a single page on my site. All other pages load fine.

When I access the page I see this in the logs:

[Tue Oct 05 13:57:27 2010] [notice] child pid 5229 exit signal Segmentation fault (11)

My server has been set up according to http://library.linode.com/development/frameworks/php/project-mercury/ubuntu-9.10-karmic#configure_varnish

Can anyone provide any advice on how to troubleshoot this?

Comments

Similar issues here

vacilando

Similar issues here.

I am getting more and more of these errors like the following:

[Mon Nov 01 13:36:51 2010] [notice] child pid 3785 exit signal Bus error (7)
[Mon Nov 01 13:38:48 2010] [notice] child pid 4204 exit signal Bus error (7)
[Mon Nov 01 13:48:17 2010] [notice] child pid 4561 exit signal Bus error (7)
[Mon Nov 01 13:48:35 2010] [notice] child pid 3843 exit signal Bus error (7)
[Mon Nov 01 13:49:39 2010] [notice] child pid 2869 exit signal Bus error (7)
[Mon Nov 01 13:49:50 2010] [notice] child pid 4552 exit signal Bus error (7)
[Mon Nov 01 13:50:29 2010] [notice] child pid 4509 exit signal Bus error (7)
[Mon Nov 01 13:59:06 2010] [notice] child pid 3980 exit signal Bus error (7)
[Mon Nov 01 14:00:45 2010] [notice] child pid 4904 exit signal Bus error (7)

The problem is that I can find no indication of the source. Varnish simply returns its 503 error page, sometimes a split second after a page is requested. It tends to happen with the heavier pages, like the list of modules, but it's hard to find any regularity. MaxClients is not reached. The server is not particularly overloaded (load of, say, 1.5). It is a pretty vanilla Pantheon 1.1 beta (on Lucid), with all the updates as specified by Pantheon.

I've spent an insane amount of time trying to debug this and did a lot of optimizations, but the problem still occurs quite frequently. Is there anything I can do to find out what causes this, or to get more information from the apache2 error.log? I'll be thankful for any and all ideas.


---
Tomáš J. Fülöpp
http://twitter.com/vacilandois
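
One way to get a little more out of the error.log is to tally the crash notices by signal and by hour, which at least shows whether the crashes cluster in time. The following is a minimal Python sketch, assuming the log lines look like the ones quoted above; the function name `summarize_crashes` is made up for illustration and is not part of Apache, Drupal or Pantheon.

```python
import re
from collections import Counter
from datetime import datetime

# Matches notices in the format quoted in this thread, for example:
# [Mon Nov 01 13:36:51 2010] [notice] child pid 3785 exit signal Bus error (7)
CRASH_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[notice\] child pid (?P<pid>\d+) "
    r"exit signal (?P<name>.+?) \((?P<num>\d+)\)"
)


def summarize_crashes(lines):
    """Count child-exit notices per signal and per hour.

    `lines` is any iterable of log lines (a list or an open file).
    Lines that do not match the assumed format are skipped.
    """
    by_signal = Counter()
    by_hour = Counter()
    for line in lines:
        match = CRASH_RE.match(line.strip())
        if not match:
            continue
        # Timestamp layout assumed from the samples: "Mon Nov 01 13:36:51 2010"
        when = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
        by_signal[(match.group("name"), int(match.group("num")))] += 1
        by_hour[when.strftime("%Y-%m-%d %H:00")] += 1
    return by_signal, by_hour
```

It could be run as `summarize_crashes(open("/var/log/apache2/error.log"))`; that path is an assumption based on a default Ubuntu install. The hourly counts can then be compared with the access log to see which requests were in flight around each crash.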

Check any custom code

hitfactory

I traced this to incorrect use of a Views hook in a custom module; the Preview function in the Views UI had also stopped working for one particular view. Try checking any custom code that runs on the page that fails to load.

Is that Varnish crashing?

vacilando

@kidrobot - thanks for the idea. I accept it can be caused by an error in the code, but it happens on too many diverse pages and on several sites, so I am sure it is something more generic. I wonder what the underlying cause is.

One idea, and I would love to hear from the Varnish ninjas out there, is that it may be Varnish crashing. I have a vague memory of somebody at DrupalCon in Copenhagen saying that Varnish restarts automatically, so people often don't notice it was down.

Well, at /admin/reports/varnish I've noticed that "Client uptime" is rather low most of the time: around 100 - 2000 seconds. Is that normal, or is it a sign of Varnish crashing? And if it is crashing, what would be the best next step for finding the reason?


---
Tomáš J. Fülöpp
http://twitter.com/vacilandois
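
To check whether the low "Client uptime" really means repeated restarts, the counter can be sampled at intervals and the resets counted. The following is a rough Python sketch, assuming `varnishstat -1` prints a line that starts with `uptime` (or `MAIN.uptime` on newer versions) followed by the number of seconds; the helper names are hypothetical. A reset only shows that the cache process restarted, not why.

```python
import re
import subprocess

# Assumed line shape in `varnishstat -1` output: "uptime  <seconds>  ..."
# Newer Varnish versions prefix counters with "MAIN.", so both are accepted.
UPTIME_RE = re.compile(r"^\s*(?:MAIN\.)?uptime\s+(\d+)", re.MULTILINE)


def parse_uptime(varnishstat_output):
    """Return the uptime in seconds from varnishstat text, or None if absent."""
    match = UPTIME_RE.search(varnishstat_output)
    return int(match.group(1)) if match else None


def read_uptime():
    """Run `varnishstat -1` once and return the parsed uptime."""
    result = subprocess.run(
        ["varnishstat", "-1"], capture_output=True, text=True, check=True
    )
    return parse_uptime(result.stdout)


def count_restarts(samples):
    """Count how often the uptime dropped between consecutive samples.

    Uptime only grows while the process stays alive, so any decrease
    means at least one restart happened between those two samples.
    """
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur < prev)
```

Calling `read_uptime()` from cron every minute and passing the collected values to `count_restarts` gives a lower bound on the number of restarts, which can then be lined up against the timestamps of the Apache crash notices.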

re: Is that Varnish crashing?

Greg Coit

After some googling, I'm inclined to say this is not Varnish crashing. In fact, I think the short uptime on Varnish is a reaction to something in PHP or Apache crashing. With "Saint Mode" on, Varnish will restart if it receives bad data (see http://www.varnish-cache.org/docs/2.1/tutorial/handling_misbehaving_serv...).

Hope this helps,

Greg

--
Greg Coit
Systems Administrator
http://www.chapterthree.com
