Squid on top of Apache on same machine

Jax's picture

During a recent talk, Mr. Tournoud said that it is beneficial to run squid on the same machine as apache2, since lingering connections keep a thread and a DB connection busy. Putting squid on port 80 and apache2 on port 8080 would also let you serve files directly from squid, leaving apache2 with fewer connections.
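A rough way to reason about the "lingering connections" claim is Little's law: the average number of busy workers equals the request rate times how long each request holds a worker. The sketch below only does that arithmetic. The function name and every number in it are made-up assumptions for illustration, not measurements from any real site.

```python
def busy_workers(requests_per_second, seconds_held_per_request):
    """Little's law: average busy workers = arrival rate * time a request holds a worker.

    Assumes a steady state; both inputs are hypothetical, not measured values.
    """
    if requests_per_second < 0 or seconds_held_per_request < 0:
        raise ValueError("inputs must be non-negative")
    return requests_per_second * seconds_held_per_request


# Illustrative assumption: slow clients keep an apache2 worker busy for 2 s
# per request, while a proxy in front hands the response over in 0.1 s.
direct = busy_workers(20, 2.0)
proxied = busy_workers(20, 0.1)
print(f"direct: ~{direct:.0f} workers busy, behind a proxy: ~{proxied:.0f}")
```

Whether the real hold times differ that much on a single box is exactly what a benchmark would have to show.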

I'm a bit skeptical about the benefits and wanted to benchmark it, but that seems to be more complex than just running "ab" a couple of times to get an idea. It seems that one should set up an average site with images and everything and test on that. Since I don't have that infrastructure ready at the moment, I was wondering whether anyone else has benchmarked this specific configuration. I can't seem to find much about it on the net.

Comments

Squid = good AND can be bad

Slurpee's picture

I highly suggest squid. It is a great piece of software.

That said, squid is not perfect. I recommend running it on cloned sandboxes of your site before going into production.

While squid can greatly help the performance of a server, it may serve a cached copy of a website that is not the newest. Working at an ISP, we continually had clients complaining about websites: an old version would show up, the website would not be available, the website would be blank, or the website would show our squid error message. The squid error message is configurable, and it is advisable to put instructions and your contact details on that page.

It is worth noting that squid has all sorts of filter rules that can be set up, such as allowing an IP or IP subnets to bypass squid. Also, when working with websites such as youtube you can run into all sorts of fun problems. If you decide to use squid, join the mailing list.

markus_petrux's picture

And you can use a pretty cheap box for squid, while your Apache box is bigger.

It's also easy to scale. You can add more Apache boxes, and tell squid to load balance them using round-robin. Also, you can add more squids, and use round-robin in the DNS records to load balance the squids.
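To make the round-robin idea concrete, here is a minimal Python sketch of the rotation itself. It is not squid configuration and not how squid implements it. The backend addresses and function names are hypothetical.

```python
from itertools import cycle

# Hypothetical Apache backends; these addresses are not from this thread.
APACHE_BACKENDS = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]


def assign_round_robin(requests, backends):
    """Pair each request with the next backend in a fixed rotation."""
    if not backends:
        raise ValueError("need at least one backend")
    rotation = cycle(backends)  # endless iterator: b0, b1, ..., b0, b1, ...
    return [(request, next(rotation)) for request in requests]


if __name__ == "__main__":
    for path, backend in assign_round_robin(["/", "/node/1", "/user", "/rss.xml"], APACHE_BACKENDS):
        print(f"{path} -> {backend}")
```

Round-robin DNS for the squid boxes follows the same pattern, with the rotation happening over the address records for a hostname instead of over a list of backends.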

Yes, that's all true. I'm

Jax's picture

Yes, that's all true. I'm still interested in seeing how having squid on the same box as apache affects the performance.

Check my AMI

joshk's picture

The AMI I provide in the post below this one does exactly this, except using Varnish instead of Squid. The principle is the same, and the performance benefits are enormous.

Also, if you start with this infrastructure on one box, it's relatively easy later to segment it out into multiple layers, as markus_petrux suggests above.

http://www.chapterthree.com | http://www.outlandishjosh.com

Well, if the performance

Jax's picture

Well, if the performance benefits are enormous I really want to see some numbers. I guess I'll need to do a benchmark myself.

Like I said

joshk's picture

With both boost and varnish (I haven't benchmarked squid), serving cached pages runs you out of network I/O first. However, my experience looking at apache under these circumstances (albeit in the prefork-mpm configuration) is that serving static files to the max will cause the load to go up to 0.5 or so; it's not bad at all, but it's more than 0.01. ;)

Also, with varnish, not only is the load nonexistent, but page response times are noticeably faster since its turnaround time is lower. If you take even 100ms out of the http response time, your users will notice.


Quick numbers

joshk's picture

I realize you're asking for a more complex test run -- something that really depends on your use-case -- but here are some quick numbers from ab that were easy for me to cook up with my AMI.

Running 1000 requests at a concurrency level of 10.

Varnish

Percentage of the requests served within a certain time (ms)
  50%     15
  66%     15
  75%     15
  80%     15
  90%     16
  95%     16
  98%     16
  99%     69
 100%     70 (longest request)

During the varnish test, the load on the small ec2 instance topped out at 0.1.

Standard Drupal/Apache w/Aggressive Caching

Percentage of the requests served within a certain time (ms)
  50%     60
  66%     80
  75%     87
  80%     91
  90%    103
  95%    117
  98%    127
  99%    180
 100%    394 (longest request)

During the apache test, the load on the small ec2 instance topped out at 0.7.

The results here would be significantly more pronounced with the addition of images, complex css, etc., and a full browser (rather than ab) test, as each pageload would generate many http requests. Under those circumstances, I would expect apache's performance to degrade further, while varnish would pretty much perform at the same rate.
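For anyone collecting timings from a fuller test (a browser run, for instance) rather than from ab, a table like the ones above can be built from raw response times. This is a minimal sketch using the nearest-rank percentile method, which is an assumption on my part and may differ slightly from how ab computes its own table. The function name is hypothetical.

```python
# Percent levels matching the rows ab prints in the tables above.
AB_PERCENTS = (50, 66, 75, 80, 90, 95, 98, 99, 100)


def percentile_table(times_ms, percents=AB_PERCENTS):
    """Return (percent, time) pairs using the nearest-rank method."""
    ordered = sorted(times_ms)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one timing")
    table = []
    for p in percents:
        # Nearest rank: smallest rank r such that r/n >= p/100 (integer ceiling).
        rank = max(1, (p * n + 99) // 100)
        table.append((p, ordered[rank - 1]))
    return table


if __name__ == "__main__":
    sample = [15, 16, 15, 70, 15, 16, 69, 15, 15, 16]  # made-up timings in ms
    for percent, ms in percentile_table(sample):
        print(f"{percent:>4}% {ms:>6}")
```

Feeding it the per-request times from two runs gives two tables that can be compared row by row, the same way as the varnish and apache numbers above.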


This is for anonymous users.

Jax's picture

This is for anonymous users. I was aware of the performance increase for those, but thanks for the numbers. I was actually wondering if it was worthwhile to put squid on your web box to improve the experience for logged-in users.

Indirectly, yes

joshk's picture

A reverse-proxy will help with logged-in users in the following ways:

  • It will take the work from anonymous visitors off apache.
  • It will take the work of serving static files off apache.

This will let Apache+PHP+Drupal perform (and be tuned) as a true application server rather than as a general-purpose webserver. This is important to get good results.

However, if you are seeing bad general performance under not very much load, you need to increase resources or implement a better backend for drupal's caching system (e.g. memcached or APC). You may also be experiencing slow pageviews because of poorly-performing queries or other sub-optimal code.
