Help figuring out what modules are actually in use.. to reduce memory footprint

gateway69's picture

Hi, I wanted to know if anyone out there knows of a few ways to track down what's using up the most memory on a site. We have been developing a beer site part time for the last year or so, and we're getting to the point where the site's functionality and look are kicking butt.

But I'm a bit worried about how much memory the site is using up, and I need a way to figure out which modules are just enabled but not actually used.

I set up an Ubuntu 10.04 LTS VM guest with Pressflow, the Mercury profile, Varnish, Solr, and memcache, and then ported over my site and modules. Right now, with the site idle, top prints out this:

top - 10:32:57 up 13:28,  1 user,  load average: 0.00, 0.03, 0.00
Tasks: 100 total,   1 running,  99 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2054340k total,  2001376k used,    52964k free,   120692k buffers
Swap:   577528k total,        0k used,   577528k free,   525140k cached

My APC cache is set to 256 MB, and Varnish has a 135 MB bin file.

Any pointers for tracking down what's using the most memory, what's not being used, etc.?

thanks

Comments

You get no help from Drupal

moshe weitzman's picture

You get no help from Drupal in answering this sort of question. You could consider looking at traces from xhprof.

Moshe, im a bit confused,

gateway69's picture

Moshe, I'm a bit confused. Aren't you supposed to post in groups for help? This deals with high performance, since I'm running all the various things such as Pressflow, Varnish, APC, memcache, etc. I just figured some pointers from the guys who deal with this sort of thing wouldn't be out of the question.

Sorry, I worded that poorly.

moshe weitzman's picture

Sorry, I worded that poorly. I meant to say that Drupal has no audit features to tell you about code that is enabled but not in use. You have to look to outside tools such as xhprof.

He didn't mean you wouldn't

brianmercer's picture

He didn't mean you wouldn't get help from the Drupal community.

The "free -m" is a good first step to understand the amount of memory being used by the file system cache and buffers.

You should also install htop to show you which processes are using up the memory (i.e. Varnish, memcache, or apache2), how much RESident memory your apache2/php processes are using, and whether they are effectively using SHaRed memory for the APC cache, depending on your apache2 configuration.

Then, when you want to check memory usage at the module level, you will want to install the devel module and xhprof as a PHP extension. You can use my repo for easy Ubuntu installation of the xhprof extension, as described here: http://groups.drupal.org/node/82889 (though you probably won't use the nginx bits).

And if you're using APC you

brianmercer's picture

And if you're using APC, you need to use apc.php to verify that you're not hitting the shared memory limit.
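
For example, here's a minimal sketch of checking the same thing from code, assuming the apc extension is loaded (apc.php, which ships with APC, shows the same numbers plus fragmentation graphs):

<?php
// Report APC shared memory usage so you can see how close you are to apc.shm_size.
$sma   = apc_sma_info();
$total = $sma['num_seg'] * $sma['seg_size'];
$used  = $total - $sma['avail_mem'];
printf("APC shm: %.1f MB used of %.1f MB (%.1f%% full)\n",
  $used / 1048576, $total / 1048576, $used / $total * 100);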

Thanks Brian, I was pointed

gateway69's picture

Thanks Brian, I was pointed to that thread at the bottom also; however, I was unable to get your package to install on Ubuntu 10.04 (something about add-apt-repository missing). Eventually, after googling around, I found some people who installed it manually, and let's just say it wasn't that simple, and people forget to mention x, y, z in their blogs :)

I followed this guide a bit: http://techportal.ibuildings.com/2009/12/01/profiling-with-xhprof/
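
In case it helps anyone else, the core of what that guide has you add boils down to roughly this (the include paths and the "beersite" namespace are just placeholders; adjust for your install):

<?php
// At the very top of index.php (or via auto_prepend_file): start collecting
// CPU and memory data for every function call in the request.
xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);

// ... Drupal bootstraps and builds the page as normal ...

// At the very end of the request: stop profiling and save the run so the
// bundled xhprof_html UI can display it.
$xhprof_data = xhprof_disable();
include_once '/usr/share/php/xhprof_lib/utils/xhprof_lib.php';   // path is a guess
include_once '/usr/share/php/xhprof_lib/utils/xhprof_runs.php';
$runs   = new XHProfRuns_Default();
$run_id = $runs->save_run($xhprof_data, 'beersite');
// Then browse to xhprof_html/index.php?run=<run id>&source=beersite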

Now it's on to figuring out what's what and what I should look for. Of course, these stand out:

Number of Calls
Memory Usage
Peak Memory Usage
CPU time (i.e. CPU time in both kernel and user space)
Wall time (i.e. elapsed time: if you perform a network call, that's the CPU time to call the service and parse the response, plus the time spent waiting for the response itself and other resources)

Here is an example of one of my pages. It loads quickly (it's on my local network), but the poor VM guest image is blowing up and crashing most of the time.

https://skitch.com/gateway69/gmbgg/xhprof-hierarchical-profiler-report

Holy Graphviz output, Batman!

https://skitch.com/gateway69/gmbg5/callgraph.php-7087x7760

I guess I follow the dark line...

Anyhow, thanks guys. It's gonna take me a bit to completely understand this and what I should really be looking for.

If you have high memory

catch's picture

If you have high memory usage, the first place to start is the xhprof detail view of cache_get(). You can then see the amount of memory used by caches like views defaults, the theme registry, schema, variables, cck, field API, etc. On many sites those are the single largest use of memory.
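
If it helps, a rough way to see which cache entries are biggest is to check their serialized size in the database; run something like this via "drush php-eval" or a scratch script (table names assume Views and CCK are installed, and serialized size isn't exactly the in-memory size xhprof reports, but the big offenders usually line up):

<?php
// List the five largest entries in each cache table by serialized size.
foreach (array('cache', 'cache_views', 'cache_content') as $table) {
  $result = db_query("SELECT cid, LENGTH(data) AS bytes FROM {" . $table . "} ORDER BY bytes DESC LIMIT 5");
  while ($row = db_fetch_object($result)) {
    printf("%-15s %-40s %8.1f KB\n", $table, $row->cid, $row->bytes / 1024);
  }
}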

I just found that out on one of my sites ....

davea's picture

I just attended DrupalCamp Austin and heard the presentation on xhprof. Good stuff!

Then I started looking at things using that tool and found out exactly what Catch is saying.

Now to dig deeper into those views....

Yea, cache_get is eating up 66 MB with a peak at 71

gateway69's picture

Profile image snapshot: http://grab.by/bivx

Ugg. OK, so at first glance it looks like cache_get is the biggest memory hog, with

theme_load_registry
variable_init
content_type_info
views_cache_get

seeming to be the top contenders making up most of this.

Any idea where to start or what I can do to whittle this down? Sorry guys, I'm just trying to get a handle on this as a first-time performance/memory tweaking person. I have been working with Drupal for a few years and feel pretty confident on that side, but when it comes to this, I'm still a noob.

Here is the htop output:
http://grab.by/bivz

Looks like:

mysql = 193 MB
varnish = 432 MB (if I'm reading this right)
tomcat = 612 MB ???

Ideally, I think I could go in the direction of having separate VM boxes for Varnish and Tomcat (Solr). I'm thinking of going to the web with either AWS, Rackspace, or Linode.

Anyhow, thanks for the posts on this. I went to BadCamp a while back and didn't see too much technical info or a class on how to deal with memory and/or performance. I do have a trial at Pantheon to test their services out, but I want to get this nailed down a bit more here locally on my VM before moving forward.

patches

catch's picture

I've been doing a lot of work on this over the past few months, and there are various patches at http://drupal.org/project/issues/search?projects=&issue_tags=memory

I'd recommend:

  • Apply one patch.

  • Profile before and after (make sure caches are fully primed both times; some of these change the caching strategy).

  • If it works for you and the patch isn't committed yet, update the issue with screenshots to show the difference on your site.

Then keep repeating that for each one.

Some of the Drupal 6 core patches are never going to be applied to D6 (for example, ones with PHP 5.2 dependencies). To keep track of these, there is also a Pressflow fork at https://github.com/tag1consulting/pressflow6/branches (each branch roughly corresponds to one patch).

Your variable and theme registry caches look way too big compared to usual. You should also do this in cache_get() somewhere:

// Dump each cache entry to a file named after its cache ID, so you can see which ones are huge.
file_put_contents('/tmp/cache_get_' . $cid . '.txt', print_r($cached->data, 1), FILE_APPEND);

Then check the contents of the largest cache entries to see what's in there. For example lightbox2 had an issue with hook_theme() that could cause the theme registry to grow extremely large. You may also have a module creating an unreasonable number of variables.
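
For the variables specifically, a quick sanity check against the {variable} table (Drupal 6 schema assumed) looks something like:

<?php
// How many variables there are and how big the serialized data is in total;
// this is roughly what ends up loaded into memory on every request.
$count = db_result(db_query("SELECT COUNT(*) FROM {variable}"));
$bytes = db_result(db_query("SELECT SUM(LENGTH(value)) FROM {variable}"));
printf("%d variables, %.1f KB serialized\n", $count, $bytes / 1024);

// The ten biggest individual variables, to spot a misbehaving module.
$result = db_query("SELECT name, LENGTH(value) AS bytes FROM {variable} ORDER BY bytes DESC LIMIT 10");
while ($row = db_fetch_object($result)) {
  printf("%-50s %8.1f KB\n", $row->name, $row->bytes / 1024);
}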

Regarding htop...hit F2 for

brianmercer's picture

Regarding htop: hit F2 for Setup, go down to Display Options, check off "Hide userland threads", then F10 for Done. Then hit F6 for Sort by and choose MEM%.

Your web server and PHP must be at the bottom of the screen in that screengrab.

You shouldn't pay much attention to the VIRTual column. That's sort of a theoretical maximum. You're more interested in the RESident column.

It looks like 503 MB used out of 2000 MB. Memcache and Varnish are probably not being used, since their memory use is minimal. Tomcat is about 97 MB, which is normal depending on your data set. MySQL is about 55 MB, which is fine for a server that's just been restarted. With a 2 GB machine you'll probably allocate a significant portion to MySQL caches, depending on your data size and the nature of your queries and traffic.

thanks.. ok did this and got

gateway69's picture

Thanks. OK, I did this and got it sorted, but now, while I have been working on the site this weekend, every now and then I get a white page or a server error page. PHP crashing?

Here is the latest memory: http://grab.by/bj1d

While I feel like I'm making a step forward, it feels like going backwards.

Also, just a heads up: I'm running Pressflow 6.22 with only one patch to core, and that's for the XML parser issue.

What patches does anyone recommend? I can test them out and report back, as I can just make a quick clone of my VM images before anything blows up :)

Patches Wiki

mikeytown2's picture

Here's a list of patches that might be helpful for speed. Some might make your memory usage worse.
http://groups.drupal.org/node/187209

Thanks for the feedback. I

brianmercer's picture

Thanks for the feedback. I added the line:

aptitude install python-software-properties

so people won't have that problem again.

Possible module

darrell_ulm's picture

Could this be a possible future module?

Might be tricky, because it would need to check the actual usage of all modules, even ones that are merely enabled, perhaps by looking at created content. Interesting.

There is one important thing

Schnitzel's picture

There is one important thing to know about Linux:
The kernel uses free memory to cache hard disk blocks, which actually makes a lot of sense (using free memory for useful things), but it makes memory usage look higher than it really is.

So if you use:
free -m
it shows you a "cached" column, which is the amount of memory used for this block cache. You should subtract this amount from the "used" number to know how much memory is actually used by processes.
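
For example, with the top output at the start of this thread: 2001376k used, minus 525140k cached and 120692k buffers, leaves roughly 1.3 GB actually used by processes, not the nearly 2 GB that "used" suggests.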

thanks.. all this time I have

gateway69's picture

Thanks. All this time I have been using Unix and never knew about this. Cheers.

gateway@ubuntu:~$ free -m
total used free shared buffers cached
Mem: 2006 1950 55 0 114 491
-/+ buffers/cache: 1344 661
Swap: 563 0 563

I'm also going to be looking at the APC cache. I'm surprised there is no way to really tell which modules eat up the most and backtrack from there, like looking for big files on your hard disk :)

I know it's not that simple. I could use Features to export all my content types, since it tries to pull together most things that are connected, but I'm not sure it will catch everything, since Features is still in dev and modules need to support it.

I've always thought that a

Jamie Holly's picture

I've always thought that a profiling system for Drupal hooks, similar to the database query log, would be helpful, but that would require changes to core. Moshe's right though, XHProf is the way to go. You get much more detail out of it than you will ever get from within PHP. The Devel module even has support for it. It's a quick install via PECL and provides unbelievable insight into what is happening behind the scenes. You can even install it so it gives you a nice graph representation of what's going on:

http://static.larsmichelsen.com/wp-content/uploads/nagvis-xhprof-test.png


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Cheers, yea thats on my

gateway69's picture

Cheers, yea that's on my list. Is there a nice setup guide for Drupal for this?

This link that Brian Mercer

Jamie Holly's picture

This link that Brian Mercer posted:

http://groups.drupal.org/node/82889


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Indeed

perusio's picture

To get the real memory used (used minus cached), you can do:

free -m | sed -n '2p' | awk '{print $3 - $NF}'

NewRelic

rjbrown99's picture

NewRelic does exactly this. It will enumerate all of your call time across both Drupal modules and PHP function calls.

I suggest against using the Drupal modules feature in prod, as it does have some overhead, but you can leave the PHP function and database call tracking on, and they are soooo valuable. I can't begin to explain how useful NewRelic has been for performance tuning.

yea saw that a at BadCamp..

gateway69's picture

Yea, I saw that at BadCamp. Right now I'll see about setting up the trial, but they only give 14 days, and I would want to use it when I'm ready.

thanks for the tip!

Email them

rjbrown99's picture

Email them and ask for an extended trial; in my experience they are generally good about it, and they sometimes give out extended trials at conferences and whatnot. Either way, they have a free option which is still excellent.