Load increase troubleshooting help

mattman's picture

Hey there BOA users.

I'm looking for a smarter sysadmin than myself to help me troubleshoot a load increase on my BOA box.

I'm running on Linode (Xen), and the system is locked down as best I know how: no SSH root login, no cleartext passwords, only key-based logins. Everything has been running great for years.

In the past few weeks, I've had a spike in CPU use and I/O. I don't think this is traffic related, but I need to learn a bit more about troubleshooting to find the cause.

I am running some non-BOA stuff, primarily an instance of Java-based Confluence. I also run Solr (jetty8) and some VERY low-traffic non-Drupal sites. Starting in mid-to-late December, I began seeing a lot of lfd alerts about blocked sshd attempts and distributed sshd attempts.

I'm familiar with top, htop, netstat, lsof and other tools for looking at connections, IPs, ports, etc. However, I still feel like I don't know enough to determine what is happening.
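One thing top and htop summarize but do not always make obvious is how the CPU time splits between user, system, iowait and steal. The following is a minimal Python sketch, assuming a Linux guest that exposes the standard aggregate `cpu` line in `/proc/stat`. The function names are made up for illustration. On a Xen guest, a high steal share may point at contention on the host rather than at anything running inside the VM, though that is only one possible reading.

```python
import time

# Column order of the aggregate "cpu" line in /proc/stat (values are in jiffies).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels expose fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Return each field's share (in percent) of CPU time between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_breakdown(interval=5.0, path="/proc/stat"):
    """Take two readings of /proc/stat 'interval' seconds apart and compare them."""
    with open(path) as handle:
        first = parse_cpu_line(handle.readline())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_cpu_line(handle.readline())
    return cpu_breakdown(first, second)
```

Calling `sample_breakdown()` on the box and printing the result every few minutes gives a rough picture of whether the extra load is user time (an application), iowait (disk) or steal (the hypervisor).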

Attached is an image of my latest CPU usage graphs from Linode.

Can anyone provide any insight or pointers for looking into this type of stuff?

Attachment | Size
cpu_load.png | 78.62 KB
linode2.jpg | 386.08 KB

Comments

zkrebs's picture

Is there a pattern in the time of day that the CPU/I/O spikes occur?
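A rough way to check for a time-of-day pattern is to group load samples by hour and compare the averages. This is only a sketch: it assumes you already have (timestamp, load) pairs from some collector such as a cron job, with ISO 8601 timestamps, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def mean_load_by_hour(samples):
    """Average load per hour of day.

    'samples' is an iterable of (iso_timestamp, load) pairs, for example
    ("2015-01-12T03:00:00", 0.5). Returns {hour: mean_load}, sorted by hour.
    """
    buckets = defaultdict(list)
    for stamp, load in samples:
        hour = datetime.fromisoformat(stamp).hour
        buckets[hour].append(load)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}
```

If the averages come out roughly flat across all 24 hours, the load is probably not tied to cron jobs, backups or traffic peaks.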

mattman's picture

Hey @slavojzizek thanks for the reply.

No spikes; it was a sudden increase that then stayed there. There was no corresponding increase in inbound or outbound network traffic, just the CPU increase. Yesterday, BOA ran a self-update and installed a new curl, which I mistook for a hack attempt.

I shut the VM down, and after the restart CPU use dropped back down. I'm watching it now as I set up a fallback server for redundancy, and I'm going to keep watching things. It's just a pretty big, sudden jump in CPU use.

same thing

jimsmith's picture

I have been experiencing this and reported it here: https://groups.drupal.org/node/306518. Unfortunately, I never could find the cause. Linode was not very responsive in helping to solve the problem.

omega8cc's picture

We have received an almost identical report related to BOA hosted on Linode.

Your CPU usage stats:
[image: linode1]

The other BOA server on Linode stats:
[image: linode2]

The problem is that we haven't seen such a pattern on our own servers or on other non-Linode servers. This could suggest that it is a Linode-specific issue.

Reported also at:
https://github.com/omega8cc/boa/issues/558#issuecomment-69737298

marko42's picture

I'm seeing the same thing on my Linode instance.

Add me to the list as well

leelive's picture

I've been seeing this for some time on my Linode as well. I'm only running BOA, on a 64-bit Debian kernel (3.18.1).

I think this is a linode issue

ddols's picture

I have been having the same issue since about December 15th. I think this happened to my setup a bit over a year ago. Linode moved me to a different server and it went away. Has anyone had any luck dealing with Linode on this?

This needs further testing

omega8cc's picture

We have changed the task scheduler's default speed back to the previous default to see if this helps on Linode-based systems. It is not yet clear what exactly causes the problem.

marko42's picture

Load on my Linode instance dropped from a steady 80% to a steady 30% at 10 pm ET. Will keep an eye on it.

could lfd have anything to do with it?

ddols's picture

Mine dropped from 80% to 30% a little over 24 hours ago too.

That probably negates my other observation. Quite a while ago the LFD emails stopped. I checked the logs, and LFD was still running and doing its thing. I thought a default for the emails might have been changed during an update, since there are so many of them and they could be adding extra load to the server. I skimmed through the changelogs and didn't see any mention of such a change. About the time the jump to 80% CPU load started, the LFD email alerts started again. Coincidence?
