Posted by fullerja on January 22, 2014 at 8:42pm
We have a large multisite install with Varnish in front of it. Traffic has increased over the last week, and we are now getting a lot of segmentation faults from one of the two web servers.
Our basic architecture is two Varnish servers, two web servers, and two DB servers (master and slave).
Any ideas where to go to start debugging this stuff?

Comments
Could be anything ...
A segmentation fault means the binary in question tried to access memory it is not allowed to access.
http://en.wikipedia.org/wiki/Segmentation_fault
First find out which binary is causing this (Apache, PHP, Memcache, etc.).
Then see if you are running the latest binaries for your distribution. If you are running something that is locally compiled, try running whatever version is in your distro's repository and see if that solves the issue. Distro packages normally get more testing because of their large user base, so issues tend to be fixed faster than in locally compiled builds.
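For the first step, here is a rough Python sketch that tallies kernel segfault messages by process name and faulting object. It assumes the usual Linux kernel log wording (`name[pid]: segfault at ... in object[...]`), which can vary between kernel versions; the function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed kernel log wording, for example:
#   apache2[4321]: segfault at 0 ip ... sp ... error 4 in libphp5.so[7f2b1bd00000+8a1000]
SEGFAULT_RE = re.compile(
    r"(?P<proc>[^\s\[\]]+)\[(?P<pid>\d+)\]: segfault at .*? in (?P<obj>[^\s\[]+)\["
)


def count_segfaults(lines):
    """Count segfault messages per (process name, faulting object) pair."""
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            counts[(match.group("proc"), match.group("obj"))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; in practice feed in the output of dmesg
    # or the contents of your system log.
    sample = [
        "kernel: [100.1] apache2[4321]: segfault at 0 ip 00007f2b1c0a1234 "
        "sp 00007fff5a1b2c30 error 4 in libphp5.so[7f2b1bd00000+8a1000]",
        "kernel: [100.2] apache2[4322]: segfault at 0 ip 00007f2b1c0a1234 "
        "sp 00007fff5a1b2c30 error 4 in libphp5.so[7f2b1bd00000+8a1000]",
    ]
    for (proc, obj), n in count_segfaults(sample).most_common():
        print(f"{n:6d}  process={proc}  faulting object={obj}")
```

If nearly all hits name the same object (say, a PHP library rather than the Apache binary itself), that is the package to compare against the distro version.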
Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.
Infinite loops are typical culprits
That's the easiest way to segfault (in PHP it is usually runaway recursion that overflows the stack). If you can, strace a running PHP process and see what it does before it dies.
Seems to be a PHP child process
They seem to be failing as soon as they are spawned, maybe 10+ per second, which makes sense given the data we are getting from Varnish (roughly 20 requests per second, split between the web servers).
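To put a firmer number on "10+ per second", a small Python sketch like the one below could count crashed children per second from the Apache error log. It assumes the dying processes are Apache children and that the log uses the stock "child pid N exit signal Segmentation fault (11)" message with the default timestamp prefix; the function name and sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumed Apache error log lines, for example:
#   [Wed Jan 22 20:42:01 2014] [notice] child pid 4321 exit signal Segmentation fault (11)
# The optional fractional seconds cover the Apache 2.4 default timestamp.
CHILD_SEGV_RE = re.compile(
    r"^\[(?P<stamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2})(?:\.\d+)? (?P<year>\d{4})\]"
    r".*exit signal Segmentation fault \(11\)"
)


def segfaults_per_second(lines):
    """Count segfaulted children per one-second timestamp bucket."""
    per_second = Counter()
    for line in lines:
        match = CHILD_SEGV_RE.search(line)
        if match:
            per_second[f"{match.group('stamp')} {match.group('year')}"] += 1
    return per_second


if __name__ == "__main__":
    # Hypothetical sample lines; in practice iterate over your error log file.
    sample = [
        "[Wed Jan 22 20:42:01 2014] [notice] child pid 4321 exit signal Segmentation fault (11)",
        "[Wed Jan 22 20:42:01 2014] [notice] child pid 4322 exit signal Segmentation fault (11)",
        "[Wed Jan 22 20:42:02 2014] [notice] child pid 4323 exit signal Segmentation fault (11)",
    ]
    for second, n in sorted(segfaults_per_second(sample).items()):
        print(f"{second}  {n} segfault(s)")
```

Comparing these per-second counts with the request rate Varnish reports would show whether every request is crashing or only some of them.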
We had Apache segfaulting all over the place a while back, and I noted down some instructions on how to debug the crashes using gdb. I've pushed the notes up to my blog: http://www.tsphethean.co.uk/blog/2014/01/24/Debugging-segfaults/
It's all a bit rough and ready, and there are no guarantees that it's Apache that is segfaulting, but hopefully this will get you on the right track. Hope it helps!
Thanks for this, I appreciate the write-up!
Is your issue (the segmentation faults) fixed now?