Getting to the point, I have found that Drupal performance for authenticated users is pretty sluggish with pages generally being generated somewhere between 600ms and 1+ seconds on a fast, idle server.
This great question about baseline Drupal performance expectations was a real eye opener to me because I had, wrongly it appears, assumed that Drupal was simply slow for authenticated users. Clearly this is not the case, and it appears it's the modules which are causing the problem. (Obviously, I realised Drupal would be faster with fewer modules, I just didn't realise by how much!)
I'm wondering what is considered reasonable for a more feature rich (ie realistic) site and what I might be missing in terms of Drupal performance optimisation.
I've built somewhere in the region of 15 Drupal sites over the last 2 years that have each used in the region of 100 modules. The site I'm currently building uses 145 modules.
Drupal tuning advice that I've found can be roughly summarised as follows:
- Reduce the number of modules
- Check your MySQL query log for slow queries and tweak code / add indexes as necessary.
- Caching (and hoping most of your users are not authenticated)
I get the impression that 100 modules is considered excessive and 145 modules simply ludicrous?
I would imagine that by the time I have what I want (think enterprise b2b e-commerce), I will be using something like 200+ non-trivial and essential modules. Furthermore, I would like to use "lazy registration" techniques so that every visitor to my site is an authenticated user and potentially receives personalised content.
Bear in mind, I don't currently care so much about massive scalability. I'm generally building corporate marketing and e-commerce sites that get maybe around 500 to 5000 visitors per day. Each visitor is a potential or existing customer and therefore high value, so personalised content is important and fast page generation time is critical. I would consider any page generation time over 100ms to be sub-optimal, 200ms concerning and any response times even semi-regularly over 500ms to be commercial suicide.
My server is a reasonably fast quad-core, 8 GB RAM, SSD running Apache / MySQL (InnoDB) / APC.
Having done a fair amount of research into this issue (I ported the Authcache module to D7) I have the following observations:
- With my current 140 modules each request uses in the region of 22Mb of RAM (previously 60Mb).
- Page generation times vary between 350ms and 650ms (previously 550ms and 1000ms), with db queries < 20ms. Putting this in perspective, on the same machine a fresh Ubuntu install boots in ~3 seconds!
- Roughly 33% of requests, a dblog_watchdog insert query will take ~200 ms (66% the insert takes ~5ms) ?
- XHProf indicated that ~250ms was taken by module_load_all; as a result of proper APC configuration, this is down to 36ms.
- Many modules had surprisingly large per request memory footprints (webform, entity_cache, user, date to name but a few); again, proper APC configuration has all but eliminated the memory footprint of modules.
- The Filecache module is about 10% faster than memcache.
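To put the dblog_watchdog observation in numbers, here is a rough sketch in Python (used purely for illustration, it is not Drupal code). It computes the average per-request cost of the insert, assuming the quoted 33% / 66% split and the ~200 ms / ~5 ms timings are representative.

```python
def expected_cost_ms(cases):
    """Weighted average cost in ms for (share_of_requests, cost_ms) pairs.

    Shares are normalised, so 0.33 + 0.66 not summing to 1 is tolerated.
    """
    total_share = sum(share for share, _ in cases)
    return sum(share * cost for share, cost in cases) / total_share


# Figures quoted in the observation above (approximate measurements).
watchdog_cases = [(0.33, 200.0), (0.66, 5.0)]
average_ms = expected_cost_ms(watchdog_cases)
print(f"average dblog insert cost: {average_ms:.0f} ms per request")
```

On those figures the insert averages roughly 70 ms per request, which is most of a 100 ms page budget before anything else has run.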
So, my questions:
- Is what I want (ie full set of modules and 100ms per request) realistic with Drupal?
- Am I missing something (like maybe nginx being sooo much faster than Apache)?
- Has anyone developed a way to speed up module_load_all (I'm wondering if it's worth experimenting with “compiling” all enabled modules into a single file or something similar) ?
- How are people who need truly dynamic personalised content handling this with Drupal. Authcache + javascript personalisation? Brute force, ie a massive server farm ? Something else ?
As per Robert's excellent question about baseline performance, it would be useful if people could post details of their real world sites and performance. Eg:
Server: Q6600 quad core, 8 GB RAM, SSD running Apache / mySQL (innoDB) / APC
Number of Modules : 145
Performance improvements: Filecache, Entity_cache
Authenticated request time: 350ms - 650ms (previously 550ms - 1000ms)
Max authenticated requests per second: 3.5
If you're using Authcache, for this purpose that's considered "cheating" because I'm interested in dynamic personalised content generation.
I'm not so much looking for specific performance improvement suggestions (although they would not be unwelcome) . It's a case of wanting to know if I'm being realistic, missing something obvious and what other people have achieved so that I know what I'm aiming at.
Many thanks in advance for any input.

Comments
168 modules - 493 ms
Devel: Executed 195 queries in 61.76 ms. Queries exceeding 5 ms are highlighted. Page execution time was 493.72 ms. Memory used at: devel_boot()=0.86 MB, devel_shutdown()=16.59 MB, PHP peak=17 MB.
Authenticated user. 168 modules installed. The actual page requested had only ~8 blocks rendered. One included a View that is fairly complex. Several menus.
Specs: Windows 7 - 2 Core @ 2.5 GHz - 8 GB RAM. 64-bit OS. This is a development laptop with many apps open, very little free RAM, and not much scratch disk space.
I will post perf results from a dedicated Windows 2008 server on Monday.
NOTE: This is using WinCache, but otherwise a stock php.ini. The MySQL config file is also stock, but using innodb. So very little tuning.
Pre-configured 'real-world' Drupal benchmark site
Maybe we should be running tests against a specific page of a pre-configured 'real-world' Drupal benchmark site? Could use Devel to generate the content. Otherwise, I don't know how comparable/useful the numbers will be.
Good idea
Thanks Roger. That's a very good idea although for my purposes right now "ball park" figures and "sanity check" type suggestions would be good enough.
I'm very interested in the idea that is suggested in many Drupal performance tuning articles I've read that it's important to keep the number of modules to a minimum. ie you can have functionality OR performance but not both. If that's actually considered to be true then that's a major problem that we need to find a solution to.
The fewer modules will always
Fewer modules will always help. That's just basic CS. With fewer modules you don't have as much processing to do (fewer CPU cycles) and you use less memory. If you don't have any sort of opcode caching (very common on shared hosting) it really makes a difference.
But it's also a very vanilla statement. A lot depends on what the module does, how it's engineered and how the end user sets it up. That's where being a good site builder comes into play. The builder needs to be able to identify what works well together and what doesn't. They also need to be able to performance tune, especially when using things like Views, where a generated query might end up being a server killer. A lot of times those can be fixed with some index altering, but other times they require manually writing the query in a module.
I try to keep a baseline of what impact common modules have on a Drupal install: things like processing and memory footprint. When I do a site for a client, I figure out if that module is really needed or if I can do it better with something custom. For example, if a client has only one or two custom views and won't need to change anything around, instead of throwing Views into the mix, I will just make a custom module. A lot of times it can be done almost as quickly, if not quicker (even more so in Drupal 7 with some of the templating additions). Not only that, but if they want some variation of that data, I can easily throw a few lines of code in the module and push it out to their site via Git, instead of having to go the whole features route or export/import.
So basically it's best to take a few minutes and benchmark along the way when you are building. Get an idea of what doesn't have that big of a footprint and what makes a footprint the size of Godzilla. A good example is crooksandliars.com (D6). We have close to 100 modules there, with about 75% of them being custom stuff. Even with that much running, here's our devel output:
Page execution time was 95.11 ms.
Memory usage:
Memory used at: devel_init()=0.91 MB, devel_shutdown()=4.1 MB.
That's with Zend optimizer enabled on my devel machine. Without it:
Page execution time was 188.35 ms.
Memory usage:
Memory used at: devel_init()=2.3 MB, devel_shutdown()=15.94 MB.
Some of the custom modules are pretty big too. We have a custom media module, multisite manager and user management module. Those are dinosaurs, but I wrote them with performance in mind to keep things as happy as possible. On a base dedicated server from Voxel, we handled over 60,000 page requests in one hour right before the 2008 elections and our load never went over 5. On caching, we just run memcache and APC - no Boost or Varnish. I did spend a lot of time tuning that server too, so that helped out a lot.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
Many thanks for your reply
Many thanks for your reply Jamie.
Are these really your figures for authenticated requests !?!?
A very large part of my last 3 years with Drupal has been sweating blood trying to do things in what I thought was "the drupal way" ie using contributed modules rather than my own bespoke code.
I can write sql statements in my sleep but I've spent many "happy" hours working out how to bend Views to my will and learning how to write Views handlers when I could have easily written a more flexible sql statement in 5 minutes :(.
I totally agree with you that with many, many things it would be quicker, easier and more flexible to write your own code, but I assumed the purpose and major strength of Drupal was to enable code re-use and sharing through modules?
What you're saying (and I totally respect your viewpoint) is that modules such as Views, Webform, Panels, Display Suite, Rules etc are fine for non-developers but if you can write code then you're probably better off doing it yourself if performance is important to you (and when isn't it!?)
Now that I look in more detail, I notice that for example "Backup and Migrate" adds 350Kb to each and every request! Views bulk operations, another 380Kb. Google Analytics, 170Kb !!!!!!!! The list of useful but infrequently used modules that add up to a performance killing 60Mb goes on and on.
The thing is though, Drupal isn't really a tool for someone like me to build websites, I would just use some simple php and/or text files and maybe a database to build whatever I like. To me, Drupal's greatest strength is that it's a tool for "end users" (ie my clients) to be able to build and manage their own websites via a friendly user interface without having to "bother" (ie pay) me so much.
Many thanks again for your perspective, if yours is the only response I get, it will still have been worth asking my question. But I need to find a way to keep all the modules AND have the performance (if at all possible).
Update: After writing all this, I spotted that my APC was configured with the default shm_size of 32MB and had a cache hit rate of about 2.5%. Increasing this to 128MB got my 600 -1000ms page generation time down to 300-500 ms and memory usage down to 22Mb. A long way to go, but a good step forward :). Sadly, there are still other, not that complex pages taking in the region of 1.2 seconds :/
As I look through the XHProf output, it's now a totally different story so far as memory usage is concerned. Webform for example is down from 1.3Mb per request to 13Kb per request !!!! Backup and Migrate 2.7K !!! etc etc.
That's for a full admin
That's for a full admin account logged in. It is running on Pressflow, which also helps some.
Deciding to go with a contrib or custom module takes some figuring out. Take Webform. I have taken over a few sites from other shops in the past that used Webform just to make a glorified contact form. That's a really large module, and while powerful it is overkill for something like that. Instead I just wrote a quick custom module, so that instead of a 136kb .module file loading on every request, a file of about 5kb loads and pulls in the other pages/components only when needed. Now if that client wants a form they can change at any point without needing me, or even add more forms, then I go ahead and put/keep Webform in there.
The Google analytics module is another great example. On Crooks and Liars we do some custom tracking, but most of that I can actually do inside of Analytics through filters. On the custom event stuff, I just got some small Javascript triggering that. Instead of a 21kb module file loading on every request, the code is injected through hook_footer in a catch all miscellaneous module I put together.
I'm not trying to put these modules down or anything. They are great and can make site building a breeze, but when you want to squeeze out all the performance you can, custom is a lot of times much better. You find out you don't need all the UI stuff and a majority of the options available in these modules.
It just boils down to these modules are built to handle a wide range of different scenarios for different sites, when a decent coder can come up with something tailored to a single site much more efficiently.
Of course development time available also plays a big part in all this.
A couple things: Many
A couple things:
Those modules create rather large objects in memory. PHP does an amazing job at variable storage, but those large objects do take up memory and you can see that grow with certain modules. Here's a good article on how PHP handles variables and garbage collection:
http://phpmaster.com/better-understanding-phps-garbage-collection
Memcache requires a TCP connection, which makes it slower. If you're running your web head on a single server, then it's best to go with Filecache or APC. That depends upon the server specs, but generally APC is the best route. With Filecache, if you get a lot of files written, a cache clear can become rather slow, whereas with APC it's almost instant. If you've got the RAM, go with APC. If not, go with Filecache.
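As a rough illustration of why a file-backed cache behaves this way, here is a minimal sketch in Python. It is not the Filecache module's code; the class name and the one-file-per-entry layout are assumptions made for the example.

```python
import hashlib
import os
import pickle
import tempfile


class FileCache:
    """One file per cache entry; the OS page cache keeps hot files in RAM."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".cache")

    def set(self, key, value):
        # Write to a temporary file, then rename, so readers never see a
        # partially written entry.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(value, handle)
        os.replace(tmp_path, self._path(key))

    def get(self, key, default=None):
        try:
            with open(self._path(key), "rb") as handle:
                return pickle.load(handle)
        except FileNotFoundError:
            return default

    def clear(self):
        # Cost grows with the number of entries: every file is unlinked in turn.
        removed = 0
        for name in os.listdir(self.directory):
            if name.endswith(".cache"):
                os.remove(os.path.join(self.directory, name))
                removed += 1
        return removed
```

Reads are a single open/read on a file the kernel has probably already cached, while clear() has to touch every entry, which is the slow cache clear described above.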
thanks
interesting article about php garbage collection, and also good point about APC cache clearing.
thanks :)
Memcache and APC are totally separate things
What gets cached in APC and in memcache are completely different. I haven't looked into Filecache, and it may be that it's a useful substitute for memcache, for storing PHP data.
APC is pretty much a separate concern. It's used for storing compiled PHP code. Where you have a server set up to run a single site, or a small handful of sites, using APC generally saves quite a lot of memory, because more memory is shared between the different php processes. If you're running a shared hosting environment with hundreds of sites, I presume that's not true, but I'm no expert on that, as I tend to deal with larger sites.
Looks like the OP is on target with APC, which would have been my first advice. ie Install the APC.php tool and make sure that the cache hit rate is at or very near 100%.
Profiling is the essential tool. If we're to help further, we need more info on what's taking the time. Sounds like you're already doing the profiling. What's at the top of the list?
Also, I'm curious if anyone can give a comparison between XHProf and XDebug?
As I understand it, it's
As I understand it, it's possible to use APC as a data cache in the same way as Memcache. Filecache is IMO "better" (ie faster) than memcache because of the way file systems automatically cache frequently requested files in RAM - whilst at the same time having effectively unlimited space.
In terms of what's taking the time I don't see one clearly significant culprit: PDOStatement::execute (96 calls), unserialize (461 calls) and t (2770 calls), serialize (99 calls) are the 4 biggest in terms of exclusive wall time, but they're not dramatically bigger than the many, many other functions called.
I'd be happy to upload my xhprof output somewhere if you think that would help however, it's worth re-iterating that I'm not really asking for specific performance troubleshooting advice (though that isn't un-welcome). What I'm trying to understand is whether it's possible to have a site that makes full use of the available modules and still has reasonable performance.
My moderately complex site with 145 modules has ~600 ms authenticated page generation time and that's a problem (for me at least).
Does anyone else have a similar issue? Does anybody else agree that this is a problem? How are other people handling this issue?
One suggestion by Jamie (and commonly echoed in many articles) is to reduce the usage of modules. Well, sure, that's pragmatic advice, but if that really is the only solution then that's a critical problem. If to get reasonable performance for authenticated users, I need to code all (or most of) my functionality by hand, might I be better off just building my own CMS ?
What would be most useful is re-assurance that either:
a) my results are atypical and probably indicative of some technical issue
b) er, yes, the drupal community is aware there is a performance problem and discussions / work is afoot to rectify the situation.
c) some confirmation or otherwise that unfortunately 140 modules (let alone 200) really isn't a good idea if you want a performant personalised site
My worry at this point (3 years invested into mastering Drupal) is that a sophisticated site built with Drupal and associated modules is simply going to be impractically slow. There is soooo much to love about Drupal, please tell me I'm wrong :)
upload the cachegrind
It's all a big guessing game until we see the actual cachegrind. Use dropbox or something else like rapidshare to share the file with us.
FileCache vs memcached and swap
I'm dubious of your logic there, but basically the proof is in the pudding. Do you have performance test data?
File caching and swap use pretty much the same mmap mechanism underneath. So if memcached is on a dedicated system and you give it more virtual memory than real memory, using swap to store what doesn't fit in physical memory is much the same as relying on file caching, but with perhaps less file system overhead.
[This is not a recommendation]
Connecting to memcache is
Connecting to memcache is really inter-process communication: the PHP process calls the kernel, which calls the memcache process. This double calling means that data is moved twice too, and when there's a lot of data it can make a perceivable difference.
With filecache the data is only moved once, between the PHP process and the kernel buffer cache. If mmap is used, even that could be avoided. With APC there's no moving data between isolated virtual address spaces.
This is only one aspect of comparison between three, but it's important to know IMHO.
interesting points, thanks :)
Connecting to an APC
Connecting to an APC data-cache uses SHM (i.e. an IPC method) but can use file-backed mmap; connecting to memcache always uses TCP, even if the memcache server is running on the same host. You can check the code for the memcache/memcached extensions for confirmation.
--
Marcus Deglos
Founder / Technical Architect @ Techito.
IPC
SHM is IPC too, yes, but it usually has an extremely low cost: at most a one-time setup at fork/exec of the PHP process, after which SHM is used like the process's own memory without any speed penalty; the same goes for mmap()ed shared memory. In contrast, regular open()/socket()/bind()/read()/write()/etc. system calls switch CPU context (registers, address space) and (for read/write) move data between the kernel and the user process. Looking only at this aspect, filecache works with data in kernel buffers via system calls, while all communication with memcache first moves data to kernel buffers via system calls and then the other process fetches the data from those kernel buffers via further system calls. This may make a measurable difference in larger cache bins like cache_form and cache_page.
For best performance with filecache, a memory filesystem like /dev/shm can be used for small cache bins, and noatime can be used for the filesystem in use. Both of these options are generally not available on shared hosting though.
never say always
http://drupal.org/node/538426#comment-5172206
You can connect to memcache using unix sockets (IPC).
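For anyone wanting to measure the difference on their own box rather than argue about it, here is a small Python sketch that times round trips against a local echo server over TCP loopback and over a Unix domain socket. It does not talk to memcache at all; the echo server, payload size and iteration count are arbitrary assumptions, and the numbers will vary by kernel and hardware.

```python
import os
import socket
import tempfile
import threading
import time


def _echo_once(server):
    """Accept one client and echo its bytes back until it disconnects."""
    conn, _ = server.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def time_round_trips(family, address, payload, count):
    """Return seconds taken for `count` send/receive round trips."""
    server = socket.socket(family, socket.SOCK_STREAM)
    server.bind(address)
    server.listen(1)
    worker = threading.Thread(target=_echo_once, args=(server,), daemon=True)
    worker.start()

    client = socket.socket(family, socket.SOCK_STREAM)
    client.connect(server.getsockname())
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(payload)
        remaining = len(payload)
        while remaining:
            chunk = client.recv(remaining)
            if not chunk:
                raise ConnectionError("echo server closed the connection")
            remaining -= len(chunk)
    elapsed = time.perf_counter() - start

    client.close()
    worker.join()
    server.close()
    return elapsed


if __name__ == "__main__":
    payload = b"x" * 4096  # small enough to fit in the socket buffers
    tcp = time_round_trips(socket.AF_INET, ("127.0.0.1", 0), payload, 2000)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "echo.sock")
        unix = time_round_trips(socket.AF_UNIX, path, payload, 2000)
    print(f"TCP loopback: {tcp:.3f}s  Unix socket: {unix:.3f}s")
```

This only isolates the transport cost; a real comparison would also need the cache server's own work and the PHP client's serialisation in the loop.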
interesting point about
Interesting point about memcache using virtual memory, I hadn't considered that. There is a danger of getting sidetracked here, but I have done benchmarks between filecache and memcache and found (my tweaked version of) filecache to give a 10% increase in throughput compared to memcache (running memcache on the same machine). I don't have benchmark data to hand, but I'll come back to this :)
Link to a zip containing
Link to a zip containing cachegrind and xhprof outputs (for the same request).
http://www.holisticsystems.co.uk/drupal_performance_242798.zip
Worth noting that enabling xdebug created a 300% increase in page generation time and almost doubled memory usage.
So, for good measure I've included in the zip, a "web page complete" save of the page in question containing the devel output from the site / page in question (with xdebug / xhprof disabled). As I mentioned above, generation time for this particular page varies between about 350ms and 650ms. The html output is from one of the slow ones.
cachegrind
The easiest route for speed in your case is to look at various views caching modules/methods.
You could develop the url() cache patch and knock off about 150ms:
http://drupal.org/node/1327720
The biggest slowdown is coming from Views: views_plugin_style->render_grouping_sets. It's taking about 100ms to render each node, about 800ms in total. Breaking this down:
Display Suite's ds_entity_variables() is a fairly slow function.
Core's field_attach_view() is a fairly slow function.
If you could cache these 2 functions, your views render time would take about 100ms.
Thanks Mikey, will give this
Thanks Mikey,
will give this some thought.
Currently, I'm looking into HHVM (the HipHop Virtual Machine for PHP).
Try it out
Turn on views core cache. On the view go to other, and under cache select "Time-based". Disable the query cache and set the render cache to 30 minutes. This is a lot easier to do in comparison to getting hiphop running :)
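The idea behind a time-based render cache can be sketched in a few lines of Python. This is not how Views implements it; the class and method names are made up for the example, and the 30 minute lifetime simply mirrors the setting suggested above.

```python
import time


class TimeBasedCache:
    """Keep rendered output for a fixed lifetime, then render again."""

    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self.lifetime = lifetime_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._entries = {}

    def get_or_render(self, key, render):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.lifetime:
            return entry[1]  # still fresh: skip the expensive render
        output = render()
        self._entries[key] = (now, output)
        return output


if __name__ == "__main__":
    cache = TimeBasedCache(lifetime_seconds=30 * 60)
    html = cache.get_or_render("view:products", lambda: "<ul>...</ul>")
    print(html)
```

For personalised output the key would have to include whatever the output varies by (user, role, language), which is where hit rates for authenticated users tend to drop.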
caching isn't going to cut it
caching isn't going to cut it I'm afraid...
taking a really simple example, a forum (assuming a few bells and whistles as provided by my 140 or so modules) which should be absolute bread and butter to Drupal ? The choices seem to be:
If my target market requires personalisation and fast page loads then it's beginning to look like Drupal may not be the right choice (at least until my prayers for 3. are answered) ?
In terms of one of your
In terms of one of your original concerns:
"Roughly 33% of requests, a dblog_watchdog insert query will take ~200 ms. (66% the insert takes ~5ms)"
a good general recommendation would be to do your very best to disable dblog. There are solutions which provide much better performance such as the Gelf module which uses UDP to "fire and forget" watchdog entries to a package such as Graylog (via a hook_watchdog implementation) and so will eliminate much of the time delay caused by writing watchdog entries to the database. You will also be able to setup access logs, php error logs etc and potentially even MySQL error/slow query logs to use Graylog too and so have a consolidated place to review all the logs on your platform.
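To show what "fire and forget" means in practice, here is a minimal Python sketch that sends a log record as a single UDP datagram and returns without waiting for any reply. It is not the Gelf module's code; the field names follow my understanding of GELF 1.1, and the host name and port 12201 are assumptions for the example.

```python
import json
import socket
import time


def build_record(message, level=6, host="web1"):
    """Build a GELF-style JSON record (field names assumed from GELF 1.1)."""
    return {
        "version": "1.1",
        "host": host,  # hypothetical host name
        "short_message": message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 = informational
    }


def send_log(sock, address, message, **fields):
    """Send one datagram and return immediately; no reply is awaited."""
    payload = json.dumps(build_record(message, **fields)).encode("utf-8")
    return sock.sendto(payload, address)


if __name__ == "__main__":
    # 12201 is assumed here as the GELF UDP input port; adjust as needed.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        send_log(sender, ("127.0.0.1", 12201), "user login")
```

The trade-off is that a dropped datagram is a lost log entry, which is usually acceptable for watchdog noise but not for audit trails.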
With regard to caching, authenticated users and personalised content, you need to be thinking about implementing reverse proxy caching (i.e. Varnish) and Edge-Side Includes. ESI allows different portions of the page (i.e. each block) to have their own caching rules and cache lifetimes, or even to not be cached at all, whilst the majority of the static page content is served from cache. Obviously this is a balancing act: each uncached ESI block makes its own request to the web server, so if you go too far down that route you're actually making the problem worse.
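A toy Python sketch of what the proxy does with ESI: the cached page carries include tags, and each tag is swapped for a fragment that is fetched (or cached) separately. This only mimics the assembly step; it is not Varnish configuration, and the /block/cart URL and the fetch function are hypothetical.

```python
import re

# Matches self-closing include tags such as <esi:include src="/block/cart"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(cached_page, fetch_fragment):
    """Replace each ESI include tag with the fragment returned for its src."""
    return ESI_INCLUDE.sub(
        lambda match: fetch_fragment(match.group(1)), cached_page
    )


if __name__ == "__main__":
    backend_requests = []

    def fetch_fragment(src):
        # Stand-in for a request to the web server for one uncached block.
        backend_requests.append(src)
        return "<div>3 items in basket</div>"

    page = '<h1>Shop</h1><esi:include src="/block/cart"/><p>Footer</p>'
    print(assemble(page, fetch_fragment))
    print("backend requests:", len(backend_requests))
```

Every include that cannot be served from cache becomes one backend request, which is the balancing act: a page with many uncached fragments can cost more than rendering it in one go.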
You could also look at implementing personalisation via Apache SOLR so you're serving the dynamic bits of the content from optimised SOLR indexes rather than hits to the DB.
Without having read the specifics in your xhprof output, I would say it's definitely worth looking at where your storage for form and session data is. If you're storing that in your primary DB then you will see it as an additional hit on database performance. You will gain significant benefits just by putting form and session state into memcache, or even better into an alternative backend which persists data better than memcache. This could be as simple as a separate MySQL instance just for storing form and session data, or a NoSQL backend like CouchDB, MongoDB, etc.
I guess in answer to your main question: no, Drupal isn't lightning fast "on its own" with relatively high complexity functionality and high levels of personalisation. Drupal's strength is in being flexible enough to be relatively easily paired with one or more of the additional technologies mentioned throughout this thread to provide excellent performance at the same time as the customisation that is required. Caching isn't just beneficial for speeding up individual requests though; it will also protect the DB in times of peak load (or DoS) and make sure the site has a better chance of staying up through these peaks.
I tried the Hiphop Virtual
I tried the Hiphop Virtual machine (HHVM) and whilst I found that I couldn't get Drupal running under HHVM, I did try some other benchmarks which seem to indicate that it isn't likely to give an (easy) dramatic speed improvement at least in the short term. (I was hoping that running "theme" for example, as compiled code might give a big speed increase).
As indicated by this article, Hiphop doesn't optimise well for string concatenation, and that was reflected in my benchmarks too.
Benchmark1 was a very simple loop incrementing a counter to 100,000,000. Benchmark 2 was the same loop but concatenating an "x" onto the end of a string (only 100,000 times though)