Posted by j0k3z on July 23, 2010 at 7:54pm
My Mercury EC2 site has been running very slowly for the past few days. I enabled CloudWatch for a bit and it is showing high CPU usage. My site typically doesn't have more than 5-10 authenticated users online at once.
Here is a graph http://img708.imageshack.us/img708/8926/picture1ptf.png
What is the best way to track down what is slowing everything down?
Also, the "Online Users" block doesn't show how many guests are online, because of the caching I guess. Is there any way to fix that, or to get a semi-accurate number of how many users are connected but not logged in?
Comments
Although the CPU is spiking,
Although the CPU is spiking, it's also staying relatively high usage throughout. Standard Linux tools like top should help you diagnose exactly what's taking up the large amount of CPU. SSH in and run 'top' and you'll see a list of processes sorted by amount of CPU usage. Keep an eye on it for a while to see what (if anything) keeps jumping up and down with occasional high CPU usage, and also what's staying at relatively high CPU usage.
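If you'd rather log this over time than watch top by hand, here is a minimal Python sketch of the same idea. It assumes a Linux box with the standard procps `ps` and Python 3.7 or later; the function names are made up for illustration. Note that the `pcpu` column from `ps` is averaged over each process's lifetime, so it is a rougher signal than the live numbers top shows.

```python
import subprocess


def parse_ps(output, limit=5):
    """Return the `limit` busiest processes as (cpu_percent, pid, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, cpu, command = parts
        rows.append((float(cpu), int(pid), command))
    rows.sort(reverse=True)  # highest CPU percentage first
    return rows[:limit]


def sample_top_processes(limit=5):
    """Run ps once and return the processes using the most CPU."""
    output = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_ps(output, limit)
```

Calling `sample_top_processes()` every few seconds and printing the result should show which processes stay near the top and which only jump up occasionally.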
My response times have gone
My response times have gone from 300ms to closer to 2000ms. Is there any way to track down what might be causing that?
re: My response times have gone
Regarding the CPU issue, we're tracking a bug (https://bugs.launchpad.net/pantheon/+bug/574910) that appears to be affecting Lucid servers running on EC2 and other VPSs. At first this bug appeared to be a reporting error, but indications are now that it's also affecting performance. If you're using Jaunty or Karmic (which aren't affected by this bug) you might use htop, which is more accurate than top in reporting CPU and memory usage.
This bug could be affecting response times too but another thing worth looking at is your memory usage. Swapping to disk can decrease performance a great deal.
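As a rough way to check the memory side, the sketch below reads /proc/meminfo and reports how much swap is in use. It assumes a Linux server; the function names are hypothetical. Swap being in use doesn't by itself prove the box is actively swapping (the si/so columns of vmstat show that), but a large and growing number is a hint that memory settings need tuning.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(values):
    """Swap currently in use, in kB, from a parsed meminfo dict."""
    return values.get("SwapTotal", 0) - values.get("SwapFree", 0)


def read_swap_used_kb(path="/proc/meminfo"):
    """Read the live meminfo file and return swap in use, in kB."""
    with open(path) as handle:
        return swap_used_kb(parse_meminfo(handle.read()))
```

Running `read_swap_used_kb()` while the site is slow, and again when it is idle, gives two numbers to compare.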
Hope this helps,
Greg
--
Greg Coit
Systems Administrator
http://www.chapterthree.com
With just 7-8 logged in users
With just 7-8 logged in users I am seeing very high cpu load averages and the site becomes very slow.
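For anyone comparing numbers, a load average only means much relative to the number of cores on the instance. A small sketch, assuming Python is available on the server (the function names are just for illustration):

```python
import os


def load_per_core(load_averages, cores):
    """Divide the 1, 5 and 15 minute load averages by the core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(round(value / cores, 2) for value in load_averages)


def current_load_per_core():
    """Load averages of this machine, normalised per core."""
    return load_per_core(os.getloadavg(), os.cpu_count() or 1)
```

Values that stay well above 1.0 per core suggest more runnable work than the CPUs can keep up with, though with the reporting bug discussed in this thread the figures may not be trustworthy on affected kernels.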
re: With just 7-8 logged in users
This behavior matches what people have been describing with the bug I linked to above. Unfortunately, this is an upstream bug in the Ubuntu Lucid kernel.
Greg
Bummer. I moved to Mercury +
Bummer. I moved to Mercury + EC2 to increase performance and now my site is crawling with 10 users. Hopefully this bug gets worked out soon.
Are there any VPSs where this bug isn't affecting performance? Or do I need to switch to a different kernel? Or just wait patiently?
re: Bummer. I moved to Mercury +
At this point, there doesn't appear to be a Lucid kernel that doesn't have this issue, and it appears to be across all VPSs (but it was first reported for AWS).
Update 8/4/10: Reading over https://bugs.launchpad.net/pantheon/+bug/574910 again and looking at some test servers we have running on Rackspace, I'm not convinced that this issue appears outside of the EC2 kernels.
The good news is this bug is receiving lots of attention by Canonical and Ubuntu users.
Greg
I haven't seen anything about
I haven't seen anything about this on Linode's paravirt kernels. They use Xen just like AWS does.
I am actually using the 1.0
I am actually using the 1.0 AMI so I shouldn't be affected by this bug. Something is seriously wrong with my server though; it's completely crashing with just 10 authenticated users.
Is anyone available to help me troubleshoot / fix / tune this? I need help ASAP, so please email me with your rate if you are interested.
re: I am actually using the 1.0
You might look at the values we have in Mercury 1.0 (etc/mercury/config_mem.sh/config_mem.sh) and see how we've changed them in 1.1 (http://groups.drupal.org/node/70258). I think you'll find these are more sane numbers.
Hope this helps,
Greg