Project Mercury: Pre-configured Drupal+Varnish AMI

joshk's picture

Inspired by my own work over the past year with Amazon EC2 and this great post from Eric Hammond at Alestic on how to bundle public AMIs, today I released my first public machine image. I call the project "mercury," and the goal is to combine the power of Varnish and Pressflow Drupal in one easy-to-run package.

Why is this important? Because Varnish fills the same role as the Boost module, except it can handle thousands of requests per second. At that rate, your constrained resources are going to be network and bandwidth. Getting it working well takes a bit of doing, but thanks to Pressflow and support from davidstrauss and DamZ, I've gotten a vanilla system working.

The project is currently very much alpha, but I encourage people interested in this kind of system to check it out to learn how the pieces fit together. Feedback and suggestions are also welcome!

At the moment the AMI includes:

The public AMI id is ami-119c7d78. You can also find it pretty easily by searching for "chapter3" in the AMI list. Since the install comes pre-configured and I didn't have time to do this as a profile, you'll need to use the user #1 credentials I set up. Login: root. Pass: drupal. Change this immediately if you spin one up.

Comments

Any test results?

Amazon's picture

Hi Josh, this is very cool. Did you settle on this configuration through experience, or did you do some testing and conclude this was the best configuration to deliver a particular workload? If you did testing, have you got some numbers?

Thanks,
Kieran

Drupal community adventure guide, Acquia Inc.
Drupal events, Drupal.org redesign

It's an alpha

joshk's picture

It's an alpha, so there's still lots to be done, but my preliminary results suggest that Varnish will saturate the EC2 network connection with page requests while giving a small instance a total load of 0.11. That's for non-logged-in users only, but it's better than I've seen a robust segmented server architecture do when working with Boost+Apache.

Basically, there's no good reason to bother Apache with a non-drupal request. If it's a cached page, it should not even touch the application stack.
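
To make the idea concrete, here is a toy model (not Varnish itself, and not the configuration shipped in the AMI) of a cache sitting in front of the application stack. The class name, the backend function and the rule that a cookie starting with "SESS" marks a logged-in user are all assumptions made for illustration.

```python
# Toy model of a reverse proxy cache in front of an application server.
# ToyProxy, backend() and the "SESS" cookie rule are illustrative assumptions.

class ToyProxy:
    def __init__(self, backend):
        self.backend = backend   # callable standing in for Apache+PHP+Drupal
        self.cache = {}          # url -> cached response body
        self.backend_hits = 0    # how often the application stack was touched

    def handle(self, url, cookies=None):
        cookies = cookies or {}
        # Assumption: a cookie whose name starts with "SESS" means "logged in".
        logged_in = any(name.startswith("SESS") for name in cookies)
        if not logged_in and url in self.cache:
            # Cached page: answered without touching the backend at all.
            return self.cache[url]
        self.backend_hits += 1
        body = self.backend(url)
        if not logged_in:
            self.cache[url] = body
        return body


def backend(url):
    # Stand-in for an expensive Drupal page build.
    return "rendered " + url


proxy = ToyProxy(backend)
for _ in range(1000):
    proxy.handle("/node/1")                          # anonymous traffic
proxy.handle("/node/1", cookies={"SESSabc": "xyz"})  # one logged-in request
print(proxy.backend_hits)  # 2: one cache fill plus the logged-in request
```

In this sketch, 1000 anonymous requests cost the backend a single page build; only the logged-in request goes through to the application again.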

I have some jmeter results and top screenshots that I will post as part of a blog soon. I just wanted to get this out early for feedback, and so people could play with the setup and do their own testing.

http://www.chapterthree.com | http://www.outlandishjosh.com

Interesting

omega8cc's picture

I wonder how it would perform in comparison to Pressflow Drupal + Boost + Nginx + Ncache.
Definitely something to benchmark.

Thanks for sharing.

~Grace

Not sure about Nginx

joshk's picture

I'm not an expert in configuring Nginx, but I can definitely confirm that Varnish is a better solution than Boost. If you've got examples of good Nginx config, I'd be happy to do a side-by-side.

You want to be able to treat apache+php+drupal as an application. Requests it serves are going to be resource intensive, but you can reduce the number of threads/processes per server so that "overloading" is no longer possible. Having a reverse proxy layer allows this, because the web/php server no longer has to handle requests for static files or cached pages.

http://www.chapterthree.com | http://www.outlandishjosh.com

Boost: Poor-man's Squid / Poor-man's Varnish

mikeytown2's picture

I consider Boost a module that you use on small to medium/large sites. Its main draw IMHO is the fact that you can get $5 hosting to perform like a dedicated server. Squid or Varnish can be tweaked to handle very high loads since the pages are kept in RAM, vs. Boost, which uses the file system. 99% of sites will never reach that saturation point (disk or webserver threads), so it's only the select few who actually need these more exotic solutions. That being said, thanks for lowering the degree of difficulty for the exotic setups! Your work will help people use Drupal in new and exciting ways.

One thing that could be useful, for any kind of caching, is the work I'm doing on AJAX Statistics (basic stats work now). Here is the latest Boost road-map; any help on any of these issues would be greatly appreciated. Once I get stats done I will probably release it as its own module so it can be used with other caching schemes.

Amen ...

kbahey's picture

Agreed.

Boost has lots of advantages even on dedicated servers. Using boost, page generation times are less than 10 ms vs. 20 to 40 using the normal page cache.

Compared to Squid/Varnish, these are:

  • simplicity of setting it up, just change .htaccess and you are pretty much done.
  • no need to patch core to get proper HTTP cache headers. If you use Squid you need to patch Drupal 6 for it, as per the patch here, which made it to Pressflow, which is what joshk used for the Amazon image.

I have made some extensions to Boost to make it two-tiered (some content will expire after 1 day, some after 1 hour). But it seems that the latest commits to 2.x-dev already have this feature planned.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.


Excellent

joshk's picture

I will have to check out these new developments. I have not used Boost much since 5.x, when it was much more basic. :)

http://www.chapterthree.com | http://www.outlandishjosh.com

Different Cache Expirations

mikeytown2's picture

This is in the latest dev. You can do it per page or by content type. Granted, I haven't made the GUI for it, but if you set it in the DB, it will work. To do it per page, edit the lifetime column in the boost_cache table. To do it per content type, add an entry to the boost_cache_settings table. For example, for a view called testimonials:

INSERT INTO boost_cache_settings (
csid,
page_callback,
page_arguments,
lifetime,
push
)
VALUES (
'1',
'views_page',
'testimonials',
'300',
'-1'
);

This will make all views of the type testimonials expire in 5 minutes (a lifetime of 300 seconds). The -1 in the push column means use the default.

Although

joshk's picture

Doing a little doc reading, it appears that Ncache is a Varnish-like HTTP accelerator. Same idea of taking Squid and doing it better.

My concern overall here is that Nginx has a very small developer community compared to Apache. I'm not sure how the project health for Ncache stacks up against Varnish, though.

http://www.chapterthree.com | http://www.outlandishjosh.com

With Nginx you can forget about Apache,

omega8cc's picture

and it just starts here. Every idea about reducing load by enhancing Drupal core (Pressflow) or by caching for logged-in users (Authcache, Cacherouter, Memcache) starts with problem #1, and the name of this problem is... Apache. OK, it's not that simple, but if we can avoid using Apache, then we save a lot of resources (RAM/CPU) even if we don't use caching outside of Drupal.

I agree Varnish is much more mature than Ncache itself, but this is not the case with Nginx, which is mature, and who knows how it could work together? I will test both of them (Varnish and Ncache). But again - for anonymous traffic using Boost with Nginx gives you (I suppose) similar results to Varnish as a front-end for heavy and slow Apache based PHP/web server.

I agree also with the idea about removing sources of performance problems and I would really prefer to have all Pressflow enhancements in vanilla Drupal core. I can't believe there are still some performance killers like all those LOWER() on every request (so indexes/cache in MySQL can't be used) etc.

It all should start with making Drupal core right. But it seems that is still in the future. So back to caching, proxying etc.

Not true

kbahey's picture

The main problem is not Apache at all. It is complex sites using a lot of PHP CPU, memory and SQL queries. That has little to do with the web server, and is in a different domain.

When people talk about Apache negatively, it is mostly when it is using mod_php with complex sites (100+ modules) and the full weight of that is carried into each Apache process. Lots of memory is wasted that way.

However, Apache can run Drupal in a very nimble way: using MPM Worker + fcgid. Significant memory savings can be obtained that way. Apache is no longer tens of megabytes, but as little as 6 megabytes per process, each with many threads. Serving static files is much less resource intensive.

We have several large clients running Apache and PHP that way and the effect is very marked. See the graphs at Apache with fcgid: acceptable performance and better resource utilization for details.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.


@Khalid

omega8cc's picture

You are right: the problem with Apache is not Apache itself but the way it is/was used, with mod_php.

Thanks for the link to mod_fcgid comparison before/after stats. Good example. And BTW this is how Nginx + php-fpm works. There is no need to repeat it all, I see it was discussed before: http://groups.drupal.org/node/20813 (and good last point about Nginx from Josh there).

Well, my daily/work experience with Apache stopped years ago, maybe it's time to test it again and compare with Pound/Nginx/php-fpm.

Thanks,
~Grace

As usual, you're ahead of me :)

joshk's picture

Khalid, as usual I find myself following in your footprints! I've been working with some of these fcgi configurations, but this post is a great one. I will see if I can roll this configuration for Apache into the next iteration of my AMI.

http://www.chapterthree.com | http://www.outlandishjosh.com

Hmmm

joshk's picture

Although, I can see this fcgi method will be incompatible with my use of APC for local caching.

Seems my next step is to get a good benchmarking system set up so that I can test things out more easily.

http://www.chapterthree.com | http://www.outlandishjosh.com

True

kbahey's picture

That is true. APC in FastCGI is per-process rather than shared across processes.

In these configurations I use memcache as the cache, and APC is just an op-code cache.
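
A small simulation may help show why per-process data caching is a poor fit here. This is only a sketch: the process count, the cache keys and the round-robin request distribution are assumptions, and it models neither APC nor memcache in any detail.

```python
# Toy comparison of per-process caches vs. one shared cache.
# Process count, keys and round-robin dispatch are illustrative assumptions.

def count_cache_builds(num_processes, keys, shared):
    """Count how many times a value has to be built because of a cache miss."""
    shared_cache = {}
    local_caches = [{} for _ in range(num_processes)]
    builds = 0
    for i, key in enumerate(keys):
        # Requests are dealt out to worker processes round-robin.
        cache = shared_cache if shared else local_caches[i % num_processes]
        if key not in cache:
            cache[key] = "value for " + key
            builds += 1
    return builds


requests = ["menu", "variables", "views"] * 40  # 120 lookups of 3 keys
per_process = count_cache_builds(4, requests, shared=False)
shared = count_cache_builds(4, requests, shared=True)
print(per_process, shared)  # 12 3
```

With four processes, each one has to warm its own copy of every key, so the same three values are built twelve times instead of three, and each copy takes up memory in its own process.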

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.


If you do use that ...

kbahey's picture

If you do go that route, here is a sample configuration from a large site. Works very well.

<IfModule mod_fcgid.c>
  AddHandler fcgid-script .fcgi .php

  # Where to look for the php.ini file?
  DefaultInitEnv PHPRC  "/etc/php5/cgi"

  # Where is the PHP executable
  FCGIWrapper /usr/bin/php-cgi .php

  # Maximum requests a process should handle before it is terminated
  MaxRequestsPerProcess 1500

  # Maximum number of PHP processes.
  MaxProcessCount       35

  # Number of seconds of idle time before a php-cgi process is terminated
  IPCCommTimeout        240
  IdleTimeout           240
</IfModule>

<IfModule mpm_worker_module>
  ServerLimit           600
  StartServers           10
  ThreadsPerChild        10
  MaxClients            600
  MinSpareThreads        30
  MaxSpareThreads        50
  MaxRequestsPerChild  3000
</IfModule>

The above should go into /etc/apache2/conf.d/php-fcgid.conf.

You need to install the following Ubuntu/Debian packages:

apache2-mpm-worker
libapache2-mod-fcgid
php5-cgi
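
As a sanity check on the worker MPM numbers above: MaxClients caps the total number of threads, so the number of child processes is MaxClients divided by ThreadsPerChild, and that must not exceed ServerLimit. The helper below is a hypothetical sketch for checking such values, not part of Apache or of the configuration itself.

```python
# Hypothetical helper for checking worker MPM settings; not an Apache tool.

def worker_mpm_summary(server_limit, threads_per_child, max_clients):
    """Derive the child process count implied by worker MPM settings."""
    if max_clients % threads_per_child != 0:
        raise ValueError("MaxClients should be a multiple of ThreadsPerChild")
    child_processes = max_clients // threads_per_child
    if child_processes > server_limit:
        raise ValueError("MaxClients needs more processes than ServerLimit allows")
    return {
        "child_processes": child_processes,
        "total_threads": max_clients,
        "unused_server_slots": server_limit - child_processes,
    }


# Values taken from the sample configuration above.
summary = worker_mpm_summary(server_limit=600, threads_per_child=10, max_clients=600)
print(summary["child_processes"])  # 60
```

So the sample allows up to 60 Apache processes of 10 threads each, while PHP is capped separately at 35 processes by MaxProcessCount.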

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.


Excellent!

joshk's picture

Thanks! I'm currently working out some lower-level issues to make sure I can roll-out production-capable machine instances (e.g. at a minimum everything needs to be set up in /mnt, not on the boot volume) but I will definitely refer to this when I make my next pass on the performance front.

http://www.chapterthree.com | http://www.outlandishjosh.com

We have used Lighty before Nginx

omega8cc's picture

Here is my example of configuration for Nginx with Boost:

http://groups.drupal.org/node/23907

It's for 5.x, however. We switched from Lighty because there was no -f test available in its rewrite rules (required by the Boost config) and there was still no option to reload without a stop/start, while Nginx has it all (graceful restart, and even graceful restart for upgrades, with no downtime at all).

For FastCGI I prefer this great solution: http://php-fpm.org

HTH,
~Grace

New release coming

joshk's picture

Thanks to much of the great feedback I've gotten here, a new release should be coming in the next day or so. I've worked out what I believe to be a fine (if simple) roll-out scripting system and have beefed up the installation with some additional tools, notably an updated libevent, memcached 1.4 and the new PHP libmemcached wrapper (support from cacherouter hopefully pending here).

http://www.chapterthree.com | http://www.outlandishjosh.com

Can't find the AMI image

davidseth's picture

Hello,

Thanks for your great work. I have searched the Amazon AMI site high and low and cannot find your image, no matter what I search for. Can you post the direct link?

Cheers,

David

The id is there

joshk's picture

The current release ID is ami-0722c36e. You can find it in your console by searching for "chapter3".

http://www.chapterthree.com | http://www.outlandishjosh.com

Choice of database

davidseth's picture

What DB are you using? MySQL? If so, are you using InnoDB or XtraDB?

Thanks,

David

MySQL MyISAM

joshk's picture

The vanilla default for now. Will look at adding InnoDB/nolocks in future releases.

http://www.chapterthree.com | http://www.outlandishjosh.com

MySQL engine and FastCGI

David Strauss's picture

@David Seth:
I'd recommend XtraDB first, InnoDB second.

@Others:
The non-importance of FastCGI for this setup is kind of complex:

Typically, a web server like Apache that runs Drupal handles two types of workloads: static files (images, CSS, JS) and dynamic pages (PHP). With mod_php, every Apache process is a memory hog: it has all of PHP running despite only using PHP to process some pages. Memory isn't efficiently used. FastCGI is useful here because it decouples the PHP threads from the Apache threads, allowing a relatively lightweight (no PHP) Apache thread to deliver static content. Apache only connects to the FastCGI back-end to process dynamic pages. This allows, say, 50 Apache threads and 30 FastCGI PHP threads, providing a memory requirement reduction of approximately 20 PHP threads. Mixed workload delivery can be further enhanced by using a lighter-weight HTTP server like nginx or lighttpd.

But with a reverse proxy cache in front, almost every request that reaches Apache involves a dynamic page. (99%+ of static content requests are cached.) This leads to matching every Apache thread with a FastCGI PHP one, removing the intended benefit of the decoupling and possibly negating the FastCGI benefit because of the additional Apache to FastCGI communication.
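
The arithmetic behind both paragraphs can be sketched as follows. The thread counts (50 and 30) come from the example above; the per-thread memory figures are hypothetical placeholders, not measurements.

```python
# Memory arithmetic for mod_php vs. FastCGI.
# APACHE_MB and PHP_MB are hypothetical placeholders, not measured values.

APACHE_MB = 6    # assumed cost of a bare Apache thread
PHP_MB = 40      # assumed cost of one PHP interpreter


def mod_php_memory(apache_threads):
    # With mod_php every Apache thread carries a PHP interpreter.
    return apache_threads * (APACHE_MB + PHP_MB)


def fastcgi_memory(apache_threads, php_threads):
    # With FastCGI, Apache threads stay light and PHP is sized separately.
    return apache_threads * APACHE_MB + php_threads * PHP_MB


# Mixed workload: 50 Apache threads, only 30 of which need PHP at once.
mixed_saving = mod_php_memory(50) - fastcgi_memory(50, 30)
print(mixed_saving // PHP_MB)   # 20 PHP interpreters' worth of memory saved

# Behind a reverse proxy cache: every Apache thread is serving PHP anyway.
proxied_saving = mod_php_memory(30) - fastcgi_memory(30, 30)
print(proxied_saving)           # 0
```

Whatever per-thread figures you plug in, the saving is always (Apache threads minus PHP threads) times the PHP cost, which drops to zero once the two counts match.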

New release

joshk's picture

I've made some updates and changes and posted a new alpha3 release. Announcement here:

http://www.chapterthree.com/blog/josh_koenig/project_mercury_preconfigur...

The changes were mostly minor compared to the alpha2 posted here. I will do a new g.d.o. post in advance of my next (alpha4) release which should include a better testing framework so I can start benchmarking various configurations.

http://www.chapterthree.com | http://www.outlandishjosh.com

Great work, can't wait to

SeanBannister's picture

Great work, can't wait to try it out.
If I wanted to move an existing Drupal site to Pressflow, is it just a matter of copying the sites directory and importing the database? I couldn't find any documentation on this.

Correct. Pressflow makes no

David Strauss's picture

Correct. Pressflow makes no schema changes.

You say varnish fills the

Flying Drupalist's picture

A few questions:

You say varnish fills the same role as boost. Is varnish useful if the site is used entirely by authenticated users?

Why cacherouter rather than any other caching module such as authcache?

Thanks for providing something like this!

Still Useful

joshk's picture

Varnish is still useful in that it keeps requests for everything except Drupal pages off your apache/PHP application server's plate. This will let you tune Apache to be an application server w/your system's resources, rather than having to keep a lot of spare threads around for serving css, js and images. In this way, you can ensure that your application layer is both doing the best it can possibly be, and also won't suffer a "meltdown" under heavy loads (e.g. too many php threads exhausting memory, server dips into swap, game over) since you can safely tune-down variables like MaxClients, etc.
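
As a rough illustration of tuning down MaxClients so the box cannot be pushed into swap, the sketch below sizes it from available memory. The function name and all of the figures are hypothetical; measure your own per-process footprint before relying on numbers like these.

```python
# Rough sizing of MaxClients from available memory.
# The function and every figure below are hypothetical examples.

def safe_max_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Largest number of Apache+PHP processes that fits in RAM without swap."""
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0 or per_process_mb <= 0:
        raise ValueError("no memory left for the application server")
    return usable_mb // per_process_mb


# Assumed: 1700 MB of RAM, 500 MB kept back for the OS, MySQL and Varnish,
# and 60 MB per Apache+PHP process.
print(safe_max_clients(1700, 500, 60))  # 20
```

Because the proxy absorbs the static and cached requests, a cap this low is workable: the remaining processes only ever serve real Drupal page builds.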

As for the choice of caching modules, I use cacherouter as it provides a good range of options for Drupal core's caching system in the standard manner, and I wanted to use APC for the backend so we didn't have to worry about managing another daemon for memcached.

Authcache is... something more/different. It looks cool but is still under development and carries the caveat that "enabling authenticated user caching will require modifying how your user-customized content displays on your pages." I don't have much experience here, but it seems like taking full advantage really impacts your Drupal development process.

Thank you very much, are

Flying Drupalist's picture

Thank you very much, are there benchmarks for authenticated users on mercury?

@Flying Drupalist

omega8cc's picture

The shortest answer is: NO. If you want to speed up a site for authenticated users, you need Pressflow core (always recommended) and modules like cache/cacherouter with memcache/redis and APC/eAccelerator enabled.

HTH ~Grace -- Turnkey Drupal Hosting on Steroids -- http://omega8.cc
