My name is Greg Coit, sysadmin for Chapter 3 and I've been helping with Mercury development and testing.
We wanted to get a quick idea of how hard we could push Mercury under more "real world" circumstances, so I combined siege and ab to generate a broad spectrum of hits. ab (short for Apache Benchmark, part of the apache2-utils package) lets you generate a very large number of hits on a single URL, while siege (which comes in a self-titled debian/ubuntu package) lets you spread the hits across many URLs, most of which won't be cached. This mixed load is a much more nuanced and accurate way of looking at performance than peak throughput on a single URL.
For this test we used 2 types of target servers:
1) An Amazon Web Services (AWS) small 32-bit instance running Mercury Alpha 0.6. These come with roughly 2GB (1.7GB) of RAM.
2) To test a tighter resource environment, we also used a Slicehost.com VPS with 512MB of RAM. This Ubuntu 9.04 32-bit server was set up exactly like Mercury Alpha 0.6.
Both target servers had 2000 nodes created using the devel module, with each node having up to 5 comments. After Solr had had a chance to index all 2000 nodes (and after turning on performance logging), memory usage looked like this:
the 2GB RAM target server:
             total       used       free     shared    buffers     cached
Mem:       1747764     357164    1390600          0       9352     221924
-/+ buffers/cache:      125888    1621876
Swap:       917496          0     917496
the 512MB RAM target server:
             total       used       free     shared    buffers     cached
Mem:        524508     395948     128560          0       6052     205684
-/+ buffers/cache:      184212     340296
Swap:      1048568          0    1048568
We booted up a second (source) server on each network to run the tests from (all tests were run against the internal network IP to reduce the effects of network lag). We generated a list of URLs for siege to use:
#!/bin/bash
for ((a=1; a<=2000; a++))
do
  echo "http://internal_url/node/$a"
done
and redirected the output to urls.txt. We then ran the following command:
siege -c 32 -i -t 5m -d 5 -f urls.txt
This creates 32 concurrent users hitting any of the 2000 nodes at random, with a random delay of up to 5 seconds per user, for 5 minutes.
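As a sanity check before starting the run, the generated list can be verified (gen_urls.sh is a hypothetical name for the script above):

```shell
# Save the loop above as gen_urls.sh (hypothetical name) and build the list
bash gen_urls.sh > urls.txt

# Verify the list before handing it to siege
wc -l < urls.txt      # expect 2000
head -n 1 urls.txt    # expect http://internal_url/node/1
```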
While that was running, we ran the following command from the same server:
ab -c 100 -n 50000 http://internal_url/
This command generates 100 concurrent users hitting the front page, up to a maximum of 50,000 hits.
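Putting the two tools together, the combined run from the source server can be sketched as follows (internal_url stands in for the target's internal IP; siege.log and ab.log are arbitrary file names):

```shell
# Start siege in the background so its mixed load overlaps with ab,
# as in the test above
siege -c 32 -i -t 5m -d 5 -f urls.txt > siege.log 2>&1 &
SIEGE_PID=$!

# Hammer the front page while siege is running
ab -c 100 -n 50000 http://internal_url/ > ab.log 2>&1

# Wait for the 5-minute siege to finish before reading the logs
wait "$SIEGE_PID"
```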
ab produces the following result against the 2GB target server:
Concurrency Level:      100
Time taken for tests:   38.985 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Total transferred:      827324795 bytes
HTML transferred:       802179441 bytes
Requests per second:    1282.55 [#/sec] (mean)
Time per request:       77.970 [ms] (mean)
Time per request:       0.780 [ms] (mean, across all concurrent requests)
Transfer rate:          20724.37 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   20   86.6      9    3048
Processing:     2   58   34.9     56     576
Waiting:        0   22   25.6     11     563
Total:          3   78   92.0     71    3103

Percentage of the requests served within a certain time (ms)
  50%     71
  66%     91
  75%     98
  80%    103
  90%    121
  95%    134
  98%    152
  99%    165
 100%   3103 (longest request)
The results of siege are:
Transactions:                   3645 hits
Availability:                  99.64 %
Elapsed time:                 300.39 secs
Data transferred:               3.08 MB
Response time:                  0.16 secs
Transaction rate:              12.13 trans/sec
Throughput:                     0.01 MB/sec
Concurrency:                    1.95
Successful transactions:           0
Failed transactions:              13
Longest transaction:            6.63
Shortest transaction:           0.00
The load average briefly hit 10, but available RAM never went below 1GB - no swap was used by the target server in this test. Two things are clear: we could push much harder (though multiple source servers would be needed - it's too easy to overwhelm the network socket of the source server), and an AWS small instance running Mercury can handle a huge spike in traffic.
ab generates the following result against the 512MB target server:
Concurrency Level:      100
Time taken for tests:   258.123 seconds
Complete requests:      50000
Failed requests:        744
   (Connect: 0, Receive: 0, Length: 742, Exceptions: 2)
Write errors:           0
Total transferred:      787021946 bytes
HTML transferred:       762116931 bytes
Requests per second:    193.71 [#/sec] (mean)
Time per request:       516.245 [ms] (mean)
Time per request:       5.162 [ms] (mean, across all concurrent requests)
Transfer rate:          2977.56 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  163  552.0     60    9041
Processing:     0  349  709.0    240    9249
Waiting:        0  103  391.4     60    8480
Total:         20  513  900.0    300    9361

Percentage of the requests served within a certain time (ms)
  50%    300
  66%    340
  75%    364
  80%    380
  90%    440
  95%   3196
  98%   4200
  99%   5100
 100%   9361 (longest request)
and the results of siege are:
Transactions:                   1692 hits
Availability:                  98.89 %
Elapsed time:                 299.92 secs
Data transferred:               4.09 MB
Response time:                  2.84 secs
Transaction rate:               5.64 trans/sec
Throughput:                     0.01 MB/sec
Concurrency:                   16.01
Successful transactions:        1692
Failed transactions:              19
Longest transaction:           30.78
Shortest transaction:           0.05
The 512MB target server used 10MB of swap in this test, and had a peak load average of 2.1. I suspect that the internal network on this VPS kept the number of hits (and the peak load average) much lower than in the first test, but it's still many times faster and carries much more data than you can get from an external network.
These tests indicate we are going in the right direction. Next up (after our beta release of Mercury) is adapting Jacob Singh's great JMeter test suite, which will allow us to do much more in-depth (and real-world-like) testing of Mercury.
Comments
Nice work Greg, this is really kicking ass. I can't wait to benchmark this on larger instance sizes, that new 68gb instance would make an awesome DB server.
Awesome work. I'm gonna copy it for my nginx tests.
A couple of questions. I've read that Slicehost doesn't currently offer 32-bit kernels, but that you can run a 32-bit chroot on your 64-bit kernel to save memory. Would a 64-bit kernel account for the almost 50% greater memory use on Slicehost (184212 vs. 125888)?
I'd be interested to see the results of a longer siege. Varnish must have been passing through the first request to a particular node while caching it, and then occasionally serving from cache on the second random request to that node. In those 5 minutes, siege made 3645 requests. Maybe a statistician could tell us, out of 3645 requests to 2000 random nodes, how many were Varnish cache hits and how many were misses - or maybe someone could pull a hit/miss stat from the Varnish log.
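The statistics question has a quick back-of-the-envelope answer: with R uniform random requests over N nodes, the expected number of distinct nodes requested - and therefore of Varnish misses, assuming every node stays cached once fetched - is N(1 - (1 - 1/N)^R). A sketch:

```shell
# Expected misses = expected number of distinct nodes among the requests,
# assuming each fetched node stays in the Varnish cache for the whole run
awk 'BEGIN {
  N = 2000; R = 3645
  misses = N * (1 - (1 - 1/N)^R)
  printf "expected misses: %.0f, expected hits: %.0f\n", misses, R - misses
}'
```

Under those assumptions this works out to roughly 1677 misses and 1968 hits; an actual hit/miss count from varnishstat on the target would confirm or refute it.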
It suggests another test for fully static sites. In that case you could precrawl all 2000 nodes to get them into the Varnish cache first and then measure performance. Or come up with a ratio of static to dynamic pages like 4:1 and tell Varnish not to cache a certain 20% of the site. Then do a random siege test where you know that over a long enough siege that about 80% of the requests would be coming from the Varnish cache and the other 20% would be dynamically generated.
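The pre-crawl idea is simple to sketch with the urls.txt list from the article (assuming curl is installed on the source server):

```shell
# Warm the Varnish cache by requesting every node once
# before starting the timed run
while read -r url; do
  curl -s -o /dev/null "$url"
done < urls.txt
```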
Great work. Thanks.
Varnish is varnish
That's a great idea in terms of pre-warming the cache (and actually there's a way to do this from the back side via the Varnish control channel), but once the cache is set the performance is going to once again fall out on the network interface. There are ways you can squeeze more out of this end, and there are commercial products that can maximize static file serving even further, but really once you're up to thousands of requests per second, your problems are likely solved for those pages.
The really critical question for speed is what to do to maximize the cases where the whole page isn't cached in Varnish. That breaks down to two things, I think:
1) Implementing more next-gen features to push caching up the stack: ESI is the Next Big Thing here.
2) Making Drupal itself faster faster faster with better system tuning, PHP configuration, backend caching, and database speed. The frontier here is rockier, as a lot of this depends on how you build your Drupal site. However, there are some exciting things on the horizon like Drizzle, which could lead to much better database performance, and next-gen PHP optimization methods like Quercus.
https://pantheon.io | http://www.chapterthree.com | https://www.outlandishjosh.com