I have an issue with site performance after running update_mercury.sh. I think it has something to do with APC caching only some .php and .inc files (leaving out many, and not caching any .module files).
Here's what I've done:
I installed Mercury 1.1 according to: http://groups.drupal.org/node/70268
I installed my site's db over the pantheon db and copied the sites/ directory over (except for the settings.php file)
It seemed to work nicely, page load times and memory usage were slightly improved over my local machine, running plain drupal with authcache, cacherouter with apc and boost.
I then installed xhprof and the latest apc 3.1.6 (using: pecl install apc) and modified some of the server_tunables:
APC_MEMORY="64M", VARNISH_VCL_RECV="[some code to remove additional cookies]", TOMCAT_MEMORY="88", PHP_MEMORY="80M"
I wanted to test that the changes would be preserved so I ran update_mercury.sh
At this point page generation times more than doubled, as did memory usage (from around 8-12 MB and ~250 ms to 25-35 MB and ~650 ms).
xhprof indicates that everything is taking longer and using more memory, but especially the Drupal bootstrap. So I checked the APC cache (using the apc.php that comes with the package source) and saw that only a small fraction of the files were being cached (about 4 MB) compared to my local machine (about 30 MB). Both figures were taken after a server restart and 1 page load, on my linode and on my local machine respectively.
I think pressflow uses memcached for key/value storage, hence apc's 'cached variables' not being used. But I don't understand why so few files are being cached and wonder if this is the cause of the increased load time and memory usage.
So far I've tried disabling bits of the system to locate the issue. I disabled varnish (and adjusted the ports.conf and 000-default apache config files to listen on port 80 again). No change to the above. I then disabled memcache (and adjusted settings.php). No change (except for slightly increased memory usage). So I installed authcache and cache router, and the memory usage went down to the 30 MB mark again. I've tried server restarts, clearing the caches (drush cc all) and tweaking the apc.ini file, all to no avail. I have no errors in apache's error.log. Running swapon -s on my linode shows that 13 MB of swap is being used, but that's in line with my local machine. free -m shows between 50 MB and 5 MB free when I view various pages (not load testing) on the site from different browsers. So now I'm at a loss.
I'd really appreciate some clues about what might be causing this problem and how to fix it!

Comments
still not figured this
still not figured this out.
running top typically gives me this:
top - 04:11:04 up 2 days, 2:33, 2 users, load average: 0.03, 0.03, 0.00
Tasks: 118 total, 1 running, 117 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 510652k total, 440552k used, 70100k free, 8360k buffers
Swap: 262136k total, 147156k used, 114980k free, 83488k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9799 www-data 20 0 136m 53m 7720 S 0 10.7 0:24.02 apache2
9801 www-data 20 0 134m 52m 9528 S 0 10.5 0:47.56 apache2
9796 www-data 20 0 129m 45m 7184 S 0 9.2 0:28.56 apache2
9800 www-data 20 0 125m 41m 7008 S 0 8.3 0:20.12 apache2
2135 hudson 20 0 420m 40m 4488 S 0 8.2 0:18.05 java
9798 www-data 20 0 124m 40m 6968 S 0 8.0 0:18.02 apache2
2371 tomcat6 20 0 246m 29m 4736 S 0 5.9 0:13.11 java
2040 mysql 20 0 360m 27m 3492 S 0 5.5 0:26.20 mysqld
2137 nobody 20 0 60208 15m 784 S 0 3.2 0:00.22 memcached
9793 root 20 0 97376 9044 5292 S 0 1.8 0:00.20 apache2
23153 root 20 0 7276 5356 1732 S 0 1.0 0:00.68 munin-node
2105 root 20 0 84428 3692 1212 S 0 0.7 0:09.71 bcfg2-server
3827 joe 20 0 8016 2876 2060 S 0 0.6 0:00.01 mutt
29722 root 20 0 8352 2752 2180 S 0 0.5 0:00.03 sshd
21121 root 20 0 10268 2652 1940 S 0 0.5 0:01.06 dovecot-auth
3299 postfix 20 0 6260 2620 1868 S 0 0.5 0:00.01 tlsmgr
21120 root 20 0 10076 2496 1860 S 0 0.5 0:00.00 dovecot-auth
2274 nobody 20 0 186m 2380 1700 S 0 0.5 0:02.99 varnishd
When I view a page anonymously and it gets served 'via varnish' (according to the HTTP headers for the content), one of the apache processes still jumps to 17% CPU, but the memory doesn't change. When I view a page as an authenticated user, the CPU goes to 90-100% and the memory usage goes up to around 15% for one of the apache processes.
I cannot find any errors in the logs for db, memcache, varnish, tomcat, syslog or apache (except for '[error] server reached MaxClients setting, consider raising the MaxClients setting', but only immediately after I restart the server) and the site status is OK. Any tips for fixing this? It just seems ridiculous that on my humble local machine the same site, on standard drupal with a bunch of caching modules, loads in about 1/3 to 1/2 the time and uses 1/3 to 1/2 the memory per page, compared to pantheon on a linode. I know it can do much better, I just can't figure out what went wrong with the setup.
re: still not figured this
What does your /etc/php5/conf.d/apc.ini look like after the upgrade of APC? It should be similar to this:
extension=apc.so
apc.shm_size=96
apc.include_once_override = 1
apc.stat = 1
apc.num_files_hint = 1000
I don't think the APC_MEMORY variable in the server_tunables file takes an M.
Hope this helps,
Greg
--
Greg Coit
Systems Administrator
http://www.chapterthree.com
Hi Greg, thanks for your
Hi Greg, thanks for your response. I gave up for a while, but recently came back to it, once again determined to sort it out.
I believe the M after the apc memory size is necessary in recent versions of APC. As I said above, I installed 3.1.6.
My apc.ini file does look like the above, except I set the memory to 56M, not 96M (56 is more than enough for my site at the moment). include_once_override was set to 0 (I don't remember why) but I set it back to 1. It seems to have made no difference after an apache restart: same high memory usage and page generation times as previously stated. Viewing the apc.php utility page, I see apc is still only using 3-4 MB tops of its allocated 56M, and is still caching very few files (no module files, as far as I can see, only some of the .inc files). It would be great to have some clues about how to fix that, or at least where to look.
On a positive note, I figured out what was up with Varnish. I'd set varnish_vcl_recv tunable to:
"
set req.http.Cookie = regsuball(req.http.Cookie, \"has_js=[^;]*\", \"\");
if(req.url !~ \"cart\"){
set req.http.Cookie = regsub(req.http.Cookie, \"hascart=[^;]*\", \"\");
}
"
(I wanted it to work with regular Drupal too, and with some code that controls the display of shopping carts.)
It turns out this code was causing Varnish to always miss. On a hunch, I moved the above code to just under the expression that removes the Google Analytics cookies in the /etc/varnish/default.vcl file, and suddenly varnish was getting cache hits again. Hurray! My guess is that the above code was setting req.http.Cookie to a blank value when it had previously been unset. So if I want to add such code via the server_tunables file, I need to test whether req.http.Cookie is already unset.
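A minimal sketch of that guard, written as plain VCL for vcl_recv (unescaped, unlike the server_tunables version above). It assumes Varnish 2.x syntax, which is not confirmed anywhere in this thread. The final check that unsets an emptied Cookie header is an extra assumption, added so that a request whose only cookies were has_js/hascart looks the same as one that sent no cookies at all.

```vcl
# Only touch the header when the client actually sent cookies,
# so a missing Cookie header is never turned into a blank one.
if (req.http.Cookie) {
  set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]*", "");
  if (req.url !~ "cart") {
    set req.http.Cookie = regsub(req.http.Cookie, "hascart=[^;]*", "");
  }
  # Assumption: if nothing but separators is left, drop the header entirely.
  if (req.http.Cookie ~ "^[; ]*$") {
    unset req.http.Cookie;
  }
}
```

Whether this alone restores cache hits when the snippet is injected through the tunable has not been tested here; it only illustrates the "check before set" idea described above.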
Since most of my site usage will be anonymous traffic I can relax about it now with Varnish in full effect, but it would be great to have the Drupal cache also working as well as I know it can. Any tips to fix the above issue are most welcome!
Thanks.
Joy! turns out the caching
Joy! It turns out the caching issue was with apc 3.1.6 (even though it doesn't show the same problem on my local install). I upgraded to 3.1.8 (svn) and all was well again. So glad to finally have this fixed, thanks to the info at: http://groups.drupal.org/node/113594