I have a client running a relatively new Drupal 6 Ubercart based site with approx 50 items. It presently runs on Rackspace Cloud Sites. UPS shipping is enabled and the Auth.net SIM method is used for credit card checkout. No other external web services are in play that I am aware of.
I have completed a High Traffic Event form with Rackspace. I am awaiting further instructions, but they have already suggested that all static assets be moved to their CDN (Akamai).
The site will be featured on Good Morning America and I am being told to prepare for up to 20,000 simultaneous users. They will offer a coupon code for a particular item so that particular node might be taking the brunt of the hits. I am looking at implementing Ubercart Discount Coupons to handle the coupon code.
Assuming for the moment that the hardware and network side of things are accounted for by Rackspace, what are the chances that this site would survive this barrage, and is there additional low-hanging fruit I could put in place?
I have about a week to make this so.

Comments
Do you have memcached and
Do you have memcached and varnish covered?
Varnish is great for serving static HTML (or "half-static" with ESI). You should definitely consider it if you expect a large percentage of users who are not logged in (and not using SSL).
Memcached is great for reducing load on MySQL. You can search for a fuller description of memcached and what it does, but you should probably have it anyway.
There are many articles about scaling Drupal for high availability, for example this one: http://www.lullabot.com/articles/varnish-multiple-web-servers-drupal?utm...
RS Cloud Sites
Cloud sites run on clustered Rackspace servers. All caching outside of basic Drupal db cache is handled by Rackspace.
looks like CloudSites does
looks like CloudSites does not provide Memcache, so check this out: http://www.rackspace.com/cloud/blog/2009/07/30/setting-up-memcached-on-c...
Memcache will help you survive the load ;)
memcached
thx, I stand corrected, reading up on it now.
what are the chances that
There's really no way to know without doing load testing. I've used loadstorm.com in the past. You probably want to set up a testing domain on your production gear and a testing payment processor. Then, if you have something like NewRelic in place, you can review what happened during the load test and see where you can improve.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
True, NewRelic is great. Also
True, NewRelic is great. Also, you might want to try out BrowserMob.
Testing can also be done a bit more cheaply (though with less realistic results) with tools like ApacheBench (ab).
But you're not going to be able
But you're not going to be able to use ab to simulate thousands of people doing a checkout.
I would not suggest
I would not suggest ApacheBench for this case: ab cannot generate load like real users will, because each of them will have a session.
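The point about sessions can be illustrated with a small script. This is only a rough sketch in Python (standard library only), not a replacement for JMeter or a hosted load-testing service: each simulated visitor gets its own cookie jar, so the server sees a separate session per visitor, which is what ab does not do. The base URL and `/node/1` are placeholders, and `/cart` and `/cart/checkout` are assumed to be the cart and checkout paths on the site. It only issues GET requests, so it does not submit the checkout forms, and threads on a single machine will not get anywhere near 20,000 users.

```python
import http.cookiejar
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical page sequence for one visitor; adjust to the real site.
CHECKOUT_PATHS = ["/", "/node/1", "/cart", "/cart/checkout"]


def make_session_fetch():
    """Return a fetch(url) function that keeps its own cookies (one session)."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    def fetch(url):
        # urllib raises HTTPError for 4xx/5xx responses.
        with opener.open(url, timeout=30) as resp:
            resp.read()
            return resp.status

    return fetch


def simulate_user(base_url, paths, fetch):
    """Walk one visitor through the pages, timing each request."""
    timings = []
    for path in paths:
        start = time.perf_counter()
        status = fetch(base_url + path)
        timings.append((path, status, time.perf_counter() - start))
    return timings


def run_load(base_url, paths, users, fetch_factory=make_session_fetch):
    """Run `users` visitors at once, each with its own session."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(simulate_user, base_url, paths, fetch_factory())
            for _ in range(users)
        ]
        return [f.result() for f in futures]


# Example (placeholder host, run against a test domain only):
# results = run_load("http://test.example.com", CHECKOUT_PATHS, users=50)
```

The `fetch_factory` parameter exists so the request function can be swapped out, for example to try the flow without touching a real server.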
btw, are you using Pressflow
btw, are you using Pressflow as a Drupal 6 core replacement?
This will give you better caching in combination with Varnish.
Pressflow
No. I had no reason (that I knew of) to choose Pressflow prior to this.
Pressflow
Use it; be sure to fix PHP notices or hack core to set the error reporting level back to normal.
Also, I've found these two core patches always help if you're using something like memcache for the cache backend; if not, you need to test them first.
http://drupal.org/node/557542#comment-5104544 - Cache module_implements() - Running this in production right now with good results.
http://drupal.org/node/1327720#comment-5186246 - Cache url() - More experimental; should be good on a smaller site (by node count); I'm worried about large sites running out of RAM. We've been running this on our dev box for the last 2 weeks.
If you are using memcache, be sure to use the replacements for cache.inc, lock.inc, and session.inc. Also enable the path alias module.
If you find other interesting bottlenecks, be sure to add the solution to this wiki.
http://groups.drupal.org/node/187209
Certainly move all static
Certainly move all static assets to [any] good GLOBAL localized POP CDN;
be it Akamai or Edgecast. The difference is [simply] staggering!
http://blog.cloudharmony.com/2010/02/cloud-speed-test-results.html
http://blog.stackoverflow.com/2011/05/the-speed-of-light-sucks/
http://chrismeller.com/2009/10/amazon-cloudfront-vs-rackspace-cloudfiles...
http://serverfault.com/questions/234511/question-regarding-uptime-strate...
--
Linux: Web Developer
Peter Bowey Computer Solutions
Australia: GMT+9:30
(¯`·..·[ Peter ]·..·´¯)
Cloud files
Wow! Looks like RS kicks Amazon's butt on the chrismeller.com link.
Has to be a "dedicated server"
Being told it has to be a "dedicated server". I have been unable to sufficiently explain that cloud is not the same as shared. :(
Awaiting a call on a managed dedicated server solution from RS that can handle 20,000 sim users.
just make sure they will put
just make sure they will put Varnish in front and configure the servers for Drupal
Varnish requires drupal 7 or
Varnish requires drupal 7 or pressflow.
https://www.varnish-cache.org/trac/wiki/VarnishAndDrupal
Pressflow for Memcached?
Is Pressflow required to use Memcached with Drupal 6? This site is already built without it and I have less than a week to do this and have not used Pressflow previously. I found http://drupal.org/project/memcache but the doc example specifically states it's for a Pressflow install. http://drupal.org/node/1181968
no Pressflow is not required
No, Pressflow is not required for Memcache.
But the installation guide you pointed to (http://drupal.org/node/1181968) will not work on RS Cloud, because it uses local sockets, so memcached would need to run on the same server as Apache, which is bad.
I would suggest using this one: http://drupal.org/node/1131468
We use it on all our servers and it works perfectly.
Dedicated server
I have to move the site to a dedicated server for the event.
so you host the whole site on
so you host the whole site on one single Server?
20'000 simultaneous users?
Oh have fun…
I would suggest:
1 loadbalancer
2 varnish
2 apache
2 mysql (master/slave)
2 memcache
And build the Varnish configuration so that you can easily add more Apache servers if the traffic goes really high...
Perhaps an Alternative to Pressflow
I had an existing D6 (not Pressflow) site that I didn't get a chance to upgrade to D7 along with my other D7 sites.
For some consistency in the caching used between the two versions of Drupal, I downloaded and enabled the Cache Backport module (http://ftp.drupal.org/files/projects/cache_backport-6.x-1.0-rc1.tar.gz). I was then able to download the APC (http://ftp.drupal.org/files/projects/apc-7.x-1.0-beta3.tar.gz) and Memcache (http://ftp.drupal.org/files/projects/memcache-7.x-0.2.tar.gz) modules. I use the exact same files in my D7 installations.
Read the notes with the module. The memcache module needs to be patched. You do not enable the APC and Memcache modules.
It has been working for me since the beginning of the year.
Help
If anyone reading this can offer a "dedicated" high performance managed server (memcached, varnish etc) Drupal optimized solution and have it up by Monday or Tuesday next week email me at ryan@graham-group.com asap.
The client only needs it between Nov 14 and Nov 22 but paying for an entire month is fine. I would pull the necessary backups from Rackspace Cloud and assist wherever necessary/possible. After Nov 22 I plan to move the site back to Rackspace Cloud.
I have a quote from Rackspace in hand but they will not configure memcached, Drupal or anything really beyond the stack.
Also if someone would be willing to provide support to configure memcached, Varnish etc on the Rackspace dedicated solution that could also work.
Thanks in advance.
Sounds like a job for Pantheon
Have you talked to the guys at Pantheon (https://getpantheon.com/)? They focus on exactly that high performance stack - Drupal, Varnish, memcache, etc.
Clustered offering
You might also want to contact these guys:
https://www.getcadre.com/clustered
They specialize in Drupal hosting and their performance pack add-on includes memcache, varnish, and apc:
https://www.getcadre.com/performance_pack
I could help you with the
I could help you with the support to configure memcached, varnish, etc.
But as the other people already said, I would first check with Pantheon, for example. They do this every day and can probably help you better.
Price is no object
Price is no object at this point btw :)
Also these guys are good and
Also these guys are good and have some core drupal developers on staff
http://www.acquia.com/
Good advice, but not very relevant
You got a lot of good advice, but much of it is not very relevant to the issue at hand.
In your situation, things like Varnish, Memcache, CDN, ...etc. will help only so much, with stuff that is not the main issue. Yes, they may save some server resources, but not to the extent you want.
The main issue is 20,000 users doing stuff other than retrieving pages out of a cache. This means that they will have to have sessions (meaning Pressflow's caching will be bypassed) in order to checkout.
Your course of action should be to benchmark the application using a tool that can simulate sessions (e.g. JMeter), record a checkout session, then repeat it with 20,000 simulated users.
Before you reach that point, you will find that both PHP execution and MySQL queries are overburdened, maybe at only 2,000 users, if not fewer, doing what they are supposed to do.
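To put rough numbers on this, here is a back-of-the-envelope sketch in Python using the standard closed-system relation: throughput = users / (think time + response time). The 29-second think time and 1-second response time are assumptions chosen for illustration, not measurements from this site.

```python
def required_requests_per_second(concurrent_users, think_time_s, response_time_s):
    """Requests per second the stack must sustain to keep every user served."""
    return concurrent_users / (think_time_s + response_time_s)


for users in (2000, 20000):
    # Assumed: each user clicks roughly every 29 s and waits 1 s per page.
    rps = required_requests_per_second(users, think_time_s=29, response_time_s=1)
    print(f"{users} users -> about {rps:.0f} uncached requests/second")
```

Under these assumptions, 2,000 users already means about 67 uncached page builds per second and 20,000 users about 667, all of them hitting PHP and MySQL.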
Start by benchmarking the site with this line in settings.php:
<?php
$conf['cache'] = 0; // disable Drupal's page cache for the benchmark
?>
This simulates logged-in users, which is basically what these users will be like. See where your bottlenecks are.
I am guessing that even one dedicated server or two will not take that kind of load.
Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.
OTOH using memcache to store
OTOH using memcache to store sessions would probably offer a lot of help here, as long as you have enough server and memory to keep them all going.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
Thanks!!
I never expected this much advice. It makes me even more proud to be a part of the Drupal Community. I am speechless.
With that time frame
With that time frame definitely go to a vendor that has lots of experience in setting something like this up:
First: Switch to pressflow immediately.
Managed High Performance Hosting
Get a provider that can scale your site up based on demand.
getcadre.com - does a great job and can set up a dedicated DB server, several webheads for Apache / PHP, a dedicated Memcached server, and Varnish in front of it. Given enough money they can (hopefully) set up as many webheads as you need. They also do complete performance audits, but probably not on the timeframe you give here ...
http://www.acquia.com/products-services/managed-cloud - They have nginx (which can handle a lot more concurrent users than Apache), Varnish as load balancer with hot spares, and can again give you one or more dedicated DB servers (master / slave) and as many webheads as you might need. Even during the event, they can add new webheads (as far as I've understood). They also have New Relic built in, which should give you some pointers in terms of performance.
Their docs say: "Fully managed elastic response to site demand", which is probably exactly what you need.
Don't do that alone!
Get paid help from any expert who might be available!
I was once on a server that did go down for a moment because, well, "boost should be enough". It was one of the creepiest moments of my life. We had a hot spare and it was up and running very soon, but being on that server and watching the load increase and increase, you can barely type anymore, you try to stop Apache and it freezes ...
Provide more information
It would also help if you gave us a link to the site. (PM is fine)
Some obvious problems can be seen directly:
Also, if your (uncached) page load times are already 700-1500 ms at 90 MB per request, you might have a problem. If they are more like 50-300 ms and 20 MB, things might work a lot more smoothly. We can tell you that just by looking at your site from the outside.
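Why those two numbers matter can be seen with some rough arithmetic. This is a minimal Python sketch, assuming a web server with 8 GB of RAM available to PHP (an assumption, not a figure from this thread) and taking 1000 ms and 200 ms as representative values from the two ranges above. It ignores CPU, MySQL, and everything else, so treat the results as upper bounds.

```python
def max_php_workers(ram_mb, mb_per_request):
    """How many PHP requests fit in memory at the same time."""
    return ram_mb // mb_per_request


def max_requests_per_second(workers, ms_per_request):
    """Upper bound on throughput if every worker is always busy."""
    return workers * 1000 / ms_per_request


RAM_MB = 8192  # assumed: 8 GB available to PHP

for label, mb, ms in (("heavy", 90, 1000), ("light", 20, 200)):
    workers = max_php_workers(RAM_MB, mb)
    print(label, workers, "workers,", max_requests_per_second(workers, ms), "req/s")
```

Under these assumptions the heavy profile tops out at about 91 requests per second and the light one at about 2,045, a difference of more than 20x on the same hardware.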
Some more tips
Disable as many modules as you can.
Make content as static as you can and have caching set to the max.
Analyze, analyze, analyze with any tools that are available (xdebug, newrelic, etc.)
Use fast_404 module, make sure there can be no 404s - they can kill your site.
Make sure that the Google Promotion Links are filtered out either in Varnish or at least in Drupal / .htaccess / settings.php via rewrite rules.
An example of such a rule can be found in my Boosted Varnish configuration here: http://www.trellon.com/sites/default/files/boosted-varnish.vcl_.txt
### START .htaccess rewrite rules
# Strip out Google Analytics campaign variables. They are only needed
# by the javascript running on the page
# utm_source, utm_medium, utm_campaign, gclid
if (req.url ~ "(\?|&)(gclid|utm_[a-z]+)=") {
  set req.url = regsuball(req.url, "(gclid|utm_[a-z]+)=[^\&]+&?", "");
  set req.url = regsub(req.url, "(\?|&)$", "");
}
### END .htaccess rewrite rules
I saw a system having a really hard time, because /link was cached, but /link?utm_campaign_code=3424342, etc. was not ...
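To see what the rule above does to a URL, here is a rough Python port of the same three regular expressions. It is a sketch for experimenting with URLs, not something to deploy, and the function name is made up for this example.

```python
import re

# Same patterns as the VCL rule, written as Python raw strings.
HAS_CAMPAIGN_PARAM = re.compile(r"(\?|&)(gclid|utm_[a-z]+)=")
CAMPAIGN_PARAM = re.compile(r"(gclid|utm_[a-z]+)=[^&]+&?")
TRAILING_SEPARATOR = re.compile(r"(\?|&)$")


def strip_campaign_params(url):
    """Remove gclid/utm_* query parameters so cached URLs collapse to one key."""
    if HAS_CAMPAIGN_PARAM.search(url):
        url = CAMPAIGN_PARAM.sub("", url)          # like regsuball
        url = TRAILING_SEPARATOR.sub("", url, 1)   # like regsub
    return url


for example in (
    "/link?utm_source=tv",
    "/link?page=2&utm_source=tv&utm_medium=gma",
    "/link?gclid=abc&page=2",
    "/link?page=2",
):
    print(example, "->", strip_campaign_params(example))
```

The first example collapses to `/link` and the next two to `/link?page=2`, so all campaign variants share one cache entry. Note that `[a-z]+` does not match underscores or digits, so a parameter whose name has a second underscore (such as `utm_campaign_code`) passes through untouched; widen the character class if the campaign uses names like that.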
You can also change them to #utm_ with some setting in google_analytics.module to accept those utm IDs.
Make sure not to have superfluous disk accesses. A CDN for your assets in sites/*/files is a must and also helps with 404s - if those are cached properly.
Time your views and set them to cache for a long time, use block caching; again: Remove all non-needed functionality and replace it with static HTML assets (if it needs "some special module").
Make clear to the client that content updates during this time are a bad idea.
Even if they are not totally relevant, you should have the basics down.
I hope that also helps a little.
Best Wishes,
Fabian
switch to nginx before your
Switch to nginx before your load hits the fan! I have an old Drupal 5 site getting 650,000 pageviews a month with about 60 objects on each page (lots of thumbnails and galleries), and the difference has been amazing. I dare say it would even survive a slashdotting! I should upgrade that site, but for now it works and generates good revenue.
I found nginx to be a lot more stable under load than apache2.
http://dgtlmoon.com
Facing the same - what solution did you choose?
Bayousoft - any chance you could please post which option you chose?
One of my clients might be featured on Ellen DeGeneres' 12 Days of Giveaways. This thread is fantastic for preparing - many great options. Like your client's situation, this would be a one time event. We have very little time to prepare. Not a lot of money for making it happen though, as this is a non-profit. The site is Drupal 6 on shared hosting right now. I set up the site, but am not qualified to handle advanced development needs like this.
Any insight would be much appreciated!
Followup: High Traffic Event
I communicated with several companies via email and telephone. Matt at Acquia was a great resource. Our traffic from Good Morning America was estimated (by Good Morning America) to be as much as 20,000 concurrent users. That never happened. I saw an increase of 10,000 unique visitors over the entire morning.
I host most sites presently with Rackspace Cloud Sites. RS said 20,000 concurrent users would not be an issue, but I got different advice from several other people. In the end, the 10,000 visits were of course not an issue.
I did move all static assets to the RS CDN which I probably should have done anyway.
So, for me it was a learning process for sure.
As for your situation, I would think shared hosting is probably not going to cut it.
Makes sense... and thank you!
Thanks so much for responding so quickly. Reading through your thread and checking the links, Acquia and Rackspace were the two most promising from what I could tell. We'll pursue those resources first.
This is why this forum is so great - your being willing to share your experience saves others time & mistakes! Much obliged!
I think you'll need to
I think you'll need to prepare mostly for anonymous traffic then, which is much easier.
At minimum I would do:
For the event itself:
That should all still be doable even with low budget.
Best Regards,
Fabian
Yes, traffic is entirely anonymous...
There's minimal user account creation, so traffic is primarily anonymous. Your suggestions will help a lot in preparing, as you've given guidance on the steps and the time it will take. Anytime a client need is urgent, there's a risk that outside vendor cost will be inflated - not entirely unreasonable or unfair. For non-profits though, it puts the help out of reach. Thanks for your guidance!
I would contact the hosting
I would contact the hosting company and tell them about this. Since the site is an NPO, they might be willing to help you out on it. If anything offer a small badge on the site mentioning them as the host, kind of an incentive.
Shared hosting will most likely not cut it (I would lay money on this), so you'll want to get something going soon. A lot is going to depend on the functionality of your site. If you have a lot of users that log in regularly then things are going to be tougher. If it's mostly anonymous users then there's a really simple solution:
Not all CDNs are created equal. I know this works great on Voxel's CDN, since I had to do this for a client last year with only a couple of days' notice. You can even write a quick module that notifies Voxel via its API any time the content changes, and the CDN will regenerate the page within 15 minutes. Alternatively, set up the CDN to refresh the front page every 15 minutes.
For this to work right, you need to make sure all the URLs in your site (internal links, images, CSS, JS, etc.) are relative, or else you'll end up with things that link back to the subdomain people use to log in.
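One way to check this before the switch is to scan the rendered HTML for absolute links that point at the site's own host. This is a minimal Python sketch using only the standard library; the host names are placeholders and the class and function names are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AbsoluteLinkFinder(HTMLParser):
    """Collect href/src/action values that name one of the site's own hosts."""

    def __init__(self, site_hosts):
        super().__init__()
        self.site_hosts = set(site_hosts)
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                # Relative URLs have no hostname, so they are skipped.
                if urlparse(value).hostname in self.site_hosts:
                    self.found.append((tag, value))


def find_absolute_links(html, site_hosts):
    """Return (tag, url) pairs for absolute links back to the site itself."""
    finder = AbsoluteLinkFinder(site_hosts)
    finder.feed(html)
    return finder.found


# Example with placeholder hosts:
# page = open("front-page.html").read()
# for tag, url in find_absolute_links(page, ["www.example.org", "admin.example.org"]):
#     print(tag, url)
```

Links to other sites are ignored on purpose; only absolute URLs that name the site's own hosts are reported, since those are the ones that would bypass the CDN.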
The best part about this is that it's extremely simple to set up. As long as your URLs are all relative, it's basically just a couple of quick DNS changes and setting up the CDN. It can be done in about 10 minutes, but make sure you do it a couple of days in advance so that the DNS will propagate. Once everything calms down, simply change your DNS back and you'll be back where you were.
You might be able to use another service like Amazon's CloudFront for this. I know it worked for me on Voxel, which charges by usage like CloudFront and the price is pretty comparable to them.
If you have a lot of regular users logging in and content changing a lot, then you are pretty much stuck with either dedicated (recommended) or cloud hosting. Switching over to Pressflow and using Varnish would be the quickest and easiest way to get it up and running. You might even be able to get away with Boost, which would be easier. The problem with situations like this is that you never know what kind of traffic increase you will see.
No matter which way you go, make sure you get it set up a few days in advance, since you will have some DNS modifications and those need time to propagate.
Cadre and Voxel look promising
Whether they are invited to appear or not (we'll know early next week), these are excellent next steps you've outlined. The organization is getting more exposure and it's only a matter of time before their site experiences more frequent big spikes. CDN is necessary, clearly.
Content is mostly static and fairly image heavy. There are a number of forms - pledge and contact forms for various types of partners. There's a fair number of links to static files - educational resources. A fairly small shopping section to browse products - but no Ubercart or other cart/payment within their domain.
There are almost no users logging in and changing content beyond the occasional commenter who sets up a user account rather than posting anonymously. I'm the only one doing admin. One of the founders adds blog posts monthly and "green tips" weekly or bi-weekly. They sell reminder products and re-usable bags using a shopping cart and payment processing service from GoEMerchant - all handled by GoEM's servers. (We'll be notifying them if they are invited to appear on the show.) Most of their social community is on their Facebook page.
Does this still sound like a dedicated server is the best solution? VoxCLOUD for a month, then moving back down to a shared plan looks promising and affordable. Cadre's options are slightly more expensive for a month of dedicated server hosting, but they have no setup fee and they have migration assistance included - again, big help there, as this is new development terrain for me. Ideally, I think setting up with a host where they can move up for anticipated spikes would work best for their time & budget constraints.
What is the reasoning behind the admin subdomain? (My apologies if that should have been obvious to me!)
I still need to explore your Pressflow and Varnish recommendation.
Thank you so much for taking the time to lay out so many suggestions, so clearly!
With the commenting
With the commenting, putting the whole site on a CDN wouldn't work out that well, since cookies would get in the way. I would go the dedicated server route with either Varnish or Boost. The CDN would still help for the static content you have on the site, and it can be easily set up with the CDN module.
Commenting can take a big hit on the site. Generally what I do is use a 3rd party service, like Disqus, for comments if the site doesn't have any other user integration stuff involved. That way people don't have to login to have their "identity" and page caches aren't invalidated every time a comment is posted.
Honestly you can probably get away with cloud if you moved the comments to Disqus and used a CDN for all the static stuff.
On the admin subdomain: that's because on the normal site you want cookies ignored and everything coming off the CDN. The admin subdomain takes the site staff directly to the Drupal installation, bypassing the CDN entirely. You also don't want to risk some admin view of a page getting stuck in the CDN's cache, and this makes sure that doesn't happen.
Glad I mentioned comments
That narrows down the possibilities. Excellent point about the comments. I've used comment services in the past for WordPress sites for the reasons you mention. Once past the twin constraints of time and budget (sounding like a broken record here!) we intend to move away from Drupal blog services entirely - perhaps to Tumblr - and set it up as a sub-domain. In development there are always the gotchas - like cookie issues you mention - and one could spend endless time finding them out one by one without forums and experienced help like yours and others on this thread!
Thanks for the admin subdomain details - makes sense.