Hi,
Two years ago we began deploying a large community website on a single machine that combined the DB and the web server. We have experienced continuous growth and our system has evolved significantly; successive rounds of performance tuning have led to the following setup (so far):
Machine A - httpd
Machine B - httpd
Machine C - httpd + Varnishd (incoming traffic, round robin load balancing between httpd on A, B and C)
Machine D - several memcached instances
Machine E - several memcached instances
Machine F - mysqld + NFS
All machines run on similar hardware (2.5 GHz Intel dual quad-cores with 8-16 GB of RAM). We use Pressflow 5 with APC, and Amazon CloudFront as a CDN to reduce the load from static content (mainly JPGs and MP3s) and to reduce latency for users around the globe. Sessions are handled via memcache. /files is served via NFS from Machine F (the DB machine) to the 3 web servers (A, B, C). Code updates are replicated via SVN and rsync.
So far, the system comfortably serves around 15 million page views per month.
Always looking ahead, we expect to get much more traffic and would like to exchange some best practices with all you experts out there.
The following questions arise:
- Would it make sense to have 3 identical web servers running httpd AND varnish both on the same machine and load balance them using DNS Round Robin?
- Could you imagine a different, more scalable and fault tolerant structure with this kind of hardware?
Any best practices you can share would be greatly appreciated. Could you also describe your server topology so that we can learn from your experience?
Thanks!

Comments
One box, one role when possible
At this scale, I would say things are easier to manage when each box has one role.
For example, I would separate the proxy cache from the web servers. That way all web servers can be exact clones of each other. You can have more than one proxy, with DNS round robin in front of them, and have them load balance across the web servers as well. The proxies can also be clones. These machines can be a lot lighter than the web server boxes.
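To make the "proxies load balance the cloned web servers" idea concrete, here is a minimal sketch of round-robin backend selection that skips a web server marked as down. It is only an illustration in Python, not Varnish configuration; the class name and the backend names "web-a", "web-b" and "web-c" are made up for the example.

```python
class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def pick(self):
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend not in self.down:
                return backend
        raise RuntimeError("no healthy backends available")


# Hypothetical backend names for three cloned web servers.
balancer = RoundRobinBalancer(["web-a", "web-b", "web-c"])
first_four = [balancer.pick() for _ in range(4)]
```

Because every web server is a clone, the balancer needs no knowledge of which box is which; adding capacity is just one more entry in the list.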
Then, I would separate MySQL as well. You could run master/slave, or a cluster, if you need to scale the DB layer. The NFS machine, depending on what you do with it, can be a lot lighter. With MySQL and NFS on separate machines, each is easier to tune. For NFS you can probably go with a pretty small box, and maybe you can also create clones here using some kind of rsync or a cluster-aware file system.
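As a rough sketch of what a master/slave setup means for the application, the snippet below sends SELECT statements to a randomly chosen slave and everything else to the master. It is a simplification under stated assumptions: the class and host names are hypothetical, it is not how Pressflow itself routes queries, and it ignores replication lag (a read issued right after a write may need to go to the master).

```python
import random


class ReplicatedRouter:
    """Chooses a database host for a statement: reads to slaves, writes to master."""

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self._rng = rng or random.Random()

    def server_for(self, sql):
        # Look only at the first keyword of the statement.
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self.slaves:
            return self._rng.choice(self.slaves)
        # Writes, and reads when no slave is configured, go to the master.
        return self.master


# Hypothetical host names.
router = ReplicatedRouter("db-master", ["db-slave-1", "db-slave-2"])
write_host = router.server_for("UPDATE node SET status = 1 WHERE nid = 42")
read_host = router.server_for("SELECT title FROM node WHERE nid = 42")
```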
I think this method allows you to work with cheaper boxes, and you can add more of them as the site grows, with very few architectural changes.
I remember reading MySQL +
I remember reading that MySQL + NFS = slow. I remember seeing it in this group, but you can also Google it (that is, if your tables are stored on the NFS share).
Wouldn't a CDN be better than NFS?
It seems to me that using a CDN would deliver better performance and reliability than using NFS on a single server. We actually use Amazon's S3 + CloudFront to deliver all our media and image files. But I've never set up NFS nor benchmarked it.
I'm assuming that MySQL over
I'm assuming that MySQL over NFS is not what the OP is describing (you're right that it would be dreadfully slow). Rather, that machine is both a MySQL server and an NFS server. It serves the /files directory over NFS to the web heads. Yes, this can be a point of failure: if NFS goes down, the web servers will likely lock up and need to be restarted. Here is what you can do to mitigate this issue:
http://groups.drupal.org/node/1648#comment-90877
Also, GlusterFS might be worth looking into as a more fault-tolerant solution. I haven't seen a recent performance comparison though.
As for CDNs, I'm a fan of pull CDNs. They offer 90% of the performance of a push CDN without the hassle of managing the moving of files. Just let the CDN automatically pull files in as it needs them. The CDN module can do this, or an even lighter-weight alternative is the Parallel module. You can be up and running in 15 minutes.
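On the site side, a pull CDN mostly comes down to rewriting static file URLs to point at a CDN hostname; the CDN then fetches each file from the origin the first time it is requested. Here is a minimal sketch of that rewriting step. It is not the code of the CDN or Parallel module; the function name, the extension list and the example.com hostnames are assumptions for illustration.

```python
import zlib
from urllib.parse import quote

# Assumed list of file types worth serving from the CDN.
STATIC_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".mp3", ".css", ".js")


def cdn_url(path, cdn_hosts, origin_host):
    """Return the URL to use for `path`: a CDN host for static files, else the origin."""
    if not cdn_hosts or not path.lower().endswith(STATIC_EXTENSIONS):
        return "http://%s%s" % (origin_host, quote(path))
    # Hash the path so a given file always maps to the same CDN hostname,
    # which keeps browser and edge caches effective.
    index = zlib.crc32(path.encode("utf-8")) % len(cdn_hosts)
    return "http://%s%s" % (cdn_hosts[index], quote(path))


# Hypothetical hostnames.
hosts = ["cdn1.example.com", "cdn2.example.com"]
image_url = cdn_url("/files/photo.jpg", hosts, "www.example.com")
page_url = cdn_url("/node/1", hosts, "www.example.com")
```

Nothing has to be uploaded ahead of time: the files stay where they are, and only the hostname in the generated URLs changes.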
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his