Hi,
I have installed Pantheon on Ubuntu (Lucid). Please note that I downgraded PHP to 5.2.10 (the Karmic version), since the site was giving some errors with Drupal 6.13.
Server software
Ubuntu Lucid
Apache/2.2.14
PHP 5.2.13 (cli)
Mysql 5.1.41-3ubuntu12.3
Varnish-2.1
Mercury 1.1
The site works well, but from time to time the load increases tremendously and the server just hangs. It seems like PHP connections are not getting closed.
This is from error.log
[Mon Jul 12 20:47:16 2010] [error] child died with signal 9
[Mon Jul 12 20:47:24 2010] [error] child died with signal 9
[Mon Jul 12 20:47:24 2010] [error] child died with signal 9
[Mon Jul 12 20:47:36 2010] [error] child died with signal 9
My server tuneables
/etc/apache2/apache2.conf
export APACHE_MAXCLIENTS="10"
/etc/apparmor.d/usr.sbin.mysqld
export APPARMOR_MYSQLD=""
/etc/default/tomcat6
export TOMCAT_MEMORY="128"
/etc/default/varnish
export VARNISH_MEMORY="64"
/etc/memcached.conf
export MEMCACHED_MEMORY="128"
/etc/mysql/my.cnf
export INNODB_BUFFER_POOL_SIZE="64"
in bytes (ie, 1Gb = 1073741824 bytes)
export INNODB_LOG_FILE_SIZE="1073741824"
export KEY_BUFFER_SIZE="8"
export MYSQL_MAX_CONNECTIONS="20"
/etc/php5/apache2/php.ini
export PHP_MEMORY="96"
/etc/php5/conf.d/apc.ini
export APC_MEMORY="128"
Could anyone tell me how I can make sure client connections are closed correctly? Because of this server load, my site and even my server are going down. I think this happens after around 3-4 hours of running.
Regards
Sreyas
Comments
Update
Hi,
Well, I think this has something to do with the OS itself. I am attaching some more errors from /var/log/messages:
Jul 16 12:04:31 srv kernel: [53987.519079] 247206 pages non-shared
Jul 16 12:16:55 srv kernel: [54736.388249] apache2 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
Jul 16 12:16:55 srv kernel: [54736.388257] Pid: 5376, comm: apache2 Not tainted 2.6.33.5-rscloud #2
Jul 16 12:16:55 srv kernel: [54736.388261] Call Trace:
Jul 16 12:16:55 srv kernel: [54736.388275] [] ? T.429+0x4f/0x145
Jul 16 12:16:55 srv kernel: [54736.388284] [] ? _raw_spin_unlock_irqrestore+0xf/0x10
Jul 16 12:16:55 srv kernel: [54736.388291] [] ? ___ratelimit+0xe2/0xfc
Jul 16 12:16:55 srv kernel: [54736.388296] [] ? T.428+0x37/0xfe
Jul 16 12:16:55 srv kernel: [54736.388301] [] ? __out_of_memory+0x140/0x157
Jul 16 12:16:55 srv kernel: [54736.388306] [] ? out_of_memory+0x82/0xac
Jul 16 12:16:55 srv kernel: [54736.388312] [] ? __alloc_pages_nodemask+0x489/0x57c
Jul 16 12:16:55 srv kernel: [54736.388319] [] ? read_swap_cache_async+0x54/0xe9
Jul 16 12:16:57 srv kernel: [54736.388324] [] ? swapin_readahead+0x57/0x98
Jul 16 12:16:57 srv kernel: [54736.388330] [] ? __raw_callee_save_xen_pte_val+0x11/0x1e
Jul 16 12:16:57 srv kernel: [54736.388336] [] ? handle_mm_fault+0x39a/0x6e0
Jul 16 12:16:57 srv kernel: [54736.388341] [] ? xen_force_evtchn_callback+0x9/0xa
Jul 16 12:16:57 srv kernel: [54736.388347] [] ? check_events+0x12/0x20
Jul 16 12:16:57 srv kernel: [54736.388351] [] ? check_events+0x12/0x20
Jul 16 12:16:57 srv kernel: [54736.388357] [] ? do_page_fault+0x277/0x293
Jul 16 12:16:57 srv kernel: [54736.388363] [] ? page_fault+0x25/0x30
Jul 16 12:16:57 srv kernel: [54736.388367] Mem-Info:
Jul 16 12:16:58 srv kernel: [54736.388369] DMA per-cpu:
Jul 16 12:16:58 srv kernel: [54736.388372] CPU 0: hi: 0, btch: 1 usd: 0
Jul 16 12:16:58 srv kernel: [54736.388375] CPU 1: hi: 0, btch: 1 usd: 0
Jul 16 12:16:58 srv kernel: [54736.388378] CPU 2: hi: 0, btch: 1 usd: 0
Jul 16 12:16:58 srv kernel: [54736.388381] CPU 3: hi: 0, btch: 1 usd: 0
Jul 16 12:16:58 srv kernel: [54736.388384] DMA32 per-cpu:
Jul 16 12:16:58 srv kernel: [54736.388387] CPU 0: hi: 186, btch: 31 usd: 58
Jul 16 12:16:58 srv kernel: [54736.388390] CPU 1: hi: 186, btch: 31 usd: 108
Jul 16 12:16:58 srv kernel: [54736.388393] CPU 2: hi: 186, btch: 31 usd: 116
Jul 16 12:16:58 srv kernel: [54736.388396] CPU 3: hi: 186, btch: 31 usd: 28
Jul 16 12:16:58 srv kernel: [54736.388402] active_anon:114370 inactive_anon:115510 isolated_anon:69
Jul 16 12:16:58 srv kernel: [54736.388403] active_file:123 inactive_file:2128 isolated_file:1
Jul 16 12:16:58 srv kernel: [54736.388405] unevictable:0 dirty:0 writeback:14 unstable:0
Jul 16 12:16:58 srv kernel: [54736.388407] free:1946 slab_reclaimable:2171 slab_unreclaimable:3364
Jul 16 12:16:58 srv kernel: [54736.388408] mapped:4531 shmem:7643 pagetables:12401 bounce:0
Jul 16 12:16:58 srv kernel: [54736.388416] DMA free:4020kB min:52kB low:64kB high:76kB active_anon:4708kB inactive_anon:4928kB active_file:24kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13812kB mlocked:0kB dirty:0kB writeback:0kB mapped:148kB shmem:252kB slab_reclaimable:16kB slab_unreclaimable:228kB kernel_stack:104kB pagetables:12kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 16 12:16:58 srv kernel: [54736.388425] lowmem_reserve[]: 0 994 994 994
Jul 16 12:16:58 srv kernel: [54736.388437] DMA32 free:3764kB min:4004kB low:5004kB high:6004kB active_anon:452772kB inactive_anon:457112kB active_file:468kB inactive_file:8512kB unevictable:0kB isolated(anon):276kB isolated(file):4kB present:1018080kB mlocked:0kB dirty:0kB writeback:56kB mapped:17976kB shmem:30320kB slab_reclaimable:8668kB slab_unreclaimable:13228kB kernel_stack:2272kB pagetables:49592kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1586 all_unreclaimable? yes
Jul 16 12:16:58 srv kernel: [54736.388447] lowmem_reserve[]: 0 0 0 0
Jul 16 12:16:58 srv kernel: [54736.388454] DMA: 1*4kB 4*8kB 3*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4020kB
Jul 16 12:16:58 srv kernel: [54736.388472] DMA32: 913*4kB 2*8kB 6*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3764kB
Jul 16 12:16:58 srv kernel: [54736.388490] 49539 total pagecache pages
Jul 16 12:16:58 srv kernel: [54736.388492] 39596 pages in swap cache
Jul 16 12:16:58 srv kernel: [54736.388495] Swap cache stats: add 14019230, delete 13979634, find 2273103/3914685
Jul 16 12:16:58 srv kernel: [54736.388499] Free swap = 648kB
Jul 16 12:16:58 srv kernel: [54736.388501] Total swap = 2097144kB
Jul 16 12:16:58 srv kernel: [54736.395069] 262144 pages RAM
Jul 16 12:16:58 srv kernel: [54736.395072] 6476 pages reserved
Jul 16 12:16:58 srv kernel: [54736.395075] 71705 pages shared
Jul 16 12:16:58 srv kernel: [54736.395077] 247738 pages non-shared
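To put the numbers in that dump into more familiar units, here is a small sketch that converts a few of them. It assumes 4 kB pages (consistent with the "913*4kB" order-0 blocks in the log); the variable names are only illustrative.

```python
# Assumption: 4 kB pages, matching the "913*4kB" entries in the buddy list above.
PAGE_KB = 4


def pages_to_mb(pages, page_kb=PAGE_KB):
    """Convert a page count from the kernel dump to megabytes."""
    return pages * page_kb / 1024


# Figures copied from the Mem-Info section of the log.
ram_mb = pages_to_mb(262144)             # "262144 pages RAM"
anon_mb = pages_to_mb(114370 + 115510)   # active_anon + inactive_anon

swap_total_kb = 2097144                  # "Total swap"
swap_free_kb = 648                       # "Free swap"
swap_used_pct = 100 * (swap_total_kb - swap_free_kb) / swap_total_kb

print(f"RAM:              {ram_mb:.0f} MB")
print(f"Anonymous memory: {anon_mb:.0f} MB")
print(f"Swap used:        {swap_used_pct:.2f} %")
```

Read this way, the box has 1024 MB of RAM, most of it held by anonymous (process) memory, and the 2 GB of swap is essentially full at the moment the OOM killer fires.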
Regards
Sreyas
Downgraded apache to apache
I downgraded Apache to 2.2.10, but the server load still increases beyond control. Looking for other options!! :(
Have you tried to profile
Have you tried to profile Drupal to understand what takes so long?
Swap
It looks like you're going into a swap spiral. You've also got a non-standard stack. You should profile your application to figure out what is happening with it, but this looks like a problem with Drupal and your traffic.
https://pantheon.io | http://www.chapterthree.com | https://www.outlandishjosh.com
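A rough way to check the swap-spiral reading against the tunables posted at the top of the thread is to add up what they allow in the worst case. This is only a sketch under loud assumptions: that the tunable values are in megabytes (the post states units only for INNODB_LOG_FILE_SIZE), that every Apache child can grow to the PHP memory limit, and that pages are 4 kB. It ignores the OS itself and MySQL per-connection buffers, and Varnish storage may be file-backed rather than resident.

```python
# Values copied from the tunables in the original post.
# Assumption: all are megabytes. INNODB_LOG_FILE_SIZE is a disk file, so it is left out.
shared_mb = {
    "tomcat": 128,
    "varnish": 64,
    "memcached": 128,
    "innodb_buffer_pool": 64,
    "mysql_key_buffer": 8,
    "apc": 128,
}
apache_max_clients = 10
php_memory_limit_mb = 96

# Assumption: each Apache child may grow up to the PHP memory limit.
apache_worst_case_mb = apache_max_clients * php_memory_limit_mb

total_mb = sum(shared_mb.values()) + apache_worst_case_mb

# "262144 pages RAM" from the kernel log, assuming 4 kB pages.
ram_mb = 262144 * 4 // 1024

print(f"Configured worst case: {total_mb} MB")
print(f"Physical RAM:          {ram_mb} MB")
print(f"Over RAM by:           {total_mb - ram_mb} MB")
```

If those assumptions hold, the configured ceiling is well above physical RAM, which fits a server that runs fine for a few hours and then starts swapping as the caches and Apache children fill up.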
Hi josh, Actually I tried
Hi josh,
Actually, I tried this same website on Karmic as well. There, my server configuration is:
Ubuntu Karmic
Apache/2.2.12 (Ubuntu)
PHP Version 5.2.10-2ubuntu6.4
Mysql 5.0.83-0ubuntu3 (Ubuntu)
varnish-2.0.4
Mercury 1.0
I followed these installation instructions: http://groups.drupal.org/node/50408
Earlier I had a perfectly working server for another site, so I think this one is due to Drupal, but I am not sure what is wrong, as there is not much information in dblog either.
Regards
Sreyas
Some more information. This
Some more information.
This is a live site, and I have been moving it between multiple servers to get the best performance. When testing the website on a newly installed server, everything works perfectly. That is, there is practically no load (server load is always less than 1). I even tested it using JMeter.
I tested with 1000 users continuously for 4 hours, and the load never went above 1. The JMeter report also shows only 0.4% errors (which is normal, I think).
Earlier, the site was residing on a low-resource VPS with bare Drupal; I have now moved it to Rackspace Cloud with Pantheon and the Mercury profile. So it should definitely be giving a boost to the site's performance. Instead, the site goes down every 3-4 hours due to server load.
So testing is not working out here, as I do not have any problem with the site in the test environment. Server load increases only when the site is brought live.
Regards
Sreyas
xdebug report
Showing the 20 most costly calls sorted by 'memory-own'.
                                                Inclusive           Own
function                                #calls     time     memory     time     memory
--------------------------------------------------------------------------------------
MemcachePool->get                           88   0.2388   18694216   0.2388   18694216
module_list                                298   0.0718    9739040   0.0388    9304408
views_plugin_display->option_definition    288   0.0374    9317408   0.0321    9295864
ob_start                                   168   0.0040    7017848   0.0040    7017848
array_keys                                2200   0.0424    5640824   0.0424    5640824
func_get_args                             2284   0.0456    5168312   0.0456    5168312
str_replace                               5019   0.0886    5168008   0.0886    5168008
array_merge                                506   0.0130    4716592   0.0130    4716592
views_handler_field->option_definition     250   0.0197    3983032   0.0154    3904872
drupal_load                                218   0.0624    4112664   0.0329    3890432
module_hook                              10694   0.5985    3513888   0.4189    3513888
t                                         2926   0.3052    3711784   0.1357    3484616
date_part_extract                          537   0.0274    3321384   0.0193    3291096
unserialize                                301   0.0192    2948712   0.0192    2948712
explode                                   1449   0.0237    2667432   0.0237    2667432
variable_get                              4294   0.0896    2350280   0.0896    2350280
preg_replace                              3938   0.0864    1945864   0.0864    1945864
mysql_fetch_object                         915   0.0239    1840104   0.0239    1840104
_db_query_callback                        3037   0.1914    1867120   0.1152    1757128
url                                        651   5.0089    2557576   0.0876    1601952
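Because the report is sorted by own memory, the time columns are easy to overlook. One way to read them is to subtract own time from inclusive time, which gives the time spent in whatever each function calls. The sketch below does that for a few rows copied from the report; it assumes the time columns are in seconds, and the helper names are made up for illustration.

```python
# A few rows copied from the report above:
# name, #calls, inclusive time, inclusive memory, own time, own memory
ROWS = """\
MemcachePool->get 88 0.2388 18694216 0.2388 18694216
module_hook 10694 0.5985 3513888 0.4189 3513888
t 2926 0.3052 3711784 0.1357 3484616
_db_query_callback 3037 0.1914 1867120 0.1152 1757128
url 651 5.0089 2557576 0.0876 1601952
"""


def parse_row(line):
    """Split one whitespace-separated report row into a dict."""
    name, calls, incl_time, incl_mem, own_time, own_mem = line.split()
    return {
        "name": name,
        "calls": int(calls),
        "incl_time": float(incl_time),
        "own_time": float(own_time),
    }


parsed = [parse_row(line) for line in ROWS.splitlines()]

# Time spent in callees = inclusive time minus the function's own time.
for row in parsed:
    row["callee_time"] = row["incl_time"] - row["own_time"]

worst = max(parsed, key=lambda row: row["callee_time"])
print(f"{worst['name']}: {worst['callee_time']:.4f} in callees "
      f"over {worst['calls']} calls")
```

On these rows, url() stands out: almost all of its inclusive time is spent in functions it calls, not in url() itself. The report does not show which callees those are, so a report sorted by inclusive or own time would be the next thing to look at.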
subscribing
subscribing