High Traffic Event

bayousoft's picture

I have a client running a relatively new Drupal 6 Ubercart based site with approx 50 items. It presently runs on Rackspace Cloud Sites. UPS shipping is enabled and the Auth.net SIM method is used for credit card checkout. No other external web services are in play that I am aware of.

I have completed a High Traffic Event form with Rackspace. I am awaiting further instructions, but they have already suggested moving all static assets to their CDN (Akamai).

The site will be featured on Good Morning America and I am being told to prepare for up to 20,000 simultaneous users. They will offer a coupon code for a particular item so that particular node might be taking the brunt of the hits. I am looking at implementing Ubercart Discount Coupons to handle the coupon code.

Assuming for the moment that Rackspace has the hardware and network side covered, what are the chances that this site would survive this barrage, and is there additional low-hanging fruit I could put in place?

I have about a week to make this so.

Comments

Do you have memcached and

exlin's picture

Do you have memcached and varnish covered?

Varnish is great for serving static HTML (or "half-static" with ESI). You should definitely think about it if you assume that a large percentage of users will not log in (or use SSL).

Memcache is great for reducing load on MySQL. You can google for a better definition of memcached and what it does, but you probably should have it anyway.

There are many articles about scaling Drupal for high availability, for example this one: http://www.lullabot.com/articles/varnish-multiple-web-servers-drupal?utm...

RS Cloud Sites

bayousoft's picture

Cloud sites run on clustered Rackspace servers. All caching outside of basic Drupal db cache is handled by Rackspace.

looks like CloudSites does

schnitzel's picture

It looks like Cloud Sites does not provide Memcache, so check this out: http://www.rackspace.com/cloud/blog/2009/07/30/setting-up-memcached-on-c...

Memcache will help you survive the load ;)

memcached

bayousoft's picture

thx, I stand corrected, reading up on it now.

what are the chances that

dalin's picture

what are the chances that this site would survive this barrage

There's really no way to know without doing load testing. I've used loadstorm.com in the past. You probably want to set up a testing domain on your production gear and set up a testing payment processor. Then, if you have something like NewRelic in place, you can review what happened during the load test and see where you can improve.

--


Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his

True, NewRelic is great. Also

exlin's picture

True, NewRelic is great. Also, you might want to try out BrowserMob.

Testing can also be done a bit more cheaply (though less accurately) with tools like ApacheBench (ab).

But your not going to be able

dalin's picture

But you're not going to be able to use ab to simulate thousands of people doing a checkout.

I would not suggest

schnitzel's picture

I would not suggest ApacheBench for this case, because ab cannot generate the kind of load real users will: each of them will have a session.

btw, are you using Pressflow

schnitzel's picture

By the way, are you using Pressflow as a Drupal 6 core replacement?
It will give you better caching in combination with Varnish.

Pressflow

bayousoft's picture

No. I had no reason (that I knew of) to choose Pressflow prior to this.

Pressflow

mikeytown2's picture

Use it; be sure to fix PHP notices or hack core to set the error reporting level back to normal.
Also, I've found that these 2 patches for core always help if you're using something like memcache for the cache backend; if not, you will need to test them.
http://drupal.org/node/557542#comment-5104544 - Cache module_implements() - Running this in production right now with good results.
http://drupal.org/node/1327720#comment-5186246 - Cache url() - More experimental; should be good on a smaller site (node count); worried about large sites running out of ram. We've been running this on our dev box for the last 2 weeks.
If you are using memcache be sure to use the replacements for cache.inc, lock.inc, session.inc. Also enable the path alias module.

If you find other interesting bottlenecks, be sure to add the solution to this wiki.
http://groups.drupal.org/node/187209

Certainly move all static

--
Linux: Web Developer
Peter Bowey Computer Solutions
Australia: GMT+9:30
(¯`·..·[ Peter ]·..·´¯)

Cloud files

bayousoft's picture

Wow! Looks like RS kicks Amazon's butt on the chrismeller.com link.

Has to be a "dedicated server"

bayousoft's picture

Being told it has to be a "dedicated server". I have been unable to sufficiently explain that cloud is not the same as shared. :(

Awaiting a call on a managed dedicated server solution from RS that can handle 20,000 sim users.

just make sure they will but

schnitzel's picture

Just make sure they put Varnish in front and configure the servers for Drupal.

Varnish requires drupal 7 or

exlin's picture

Varnish requires Drupal 7 or Pressflow.

https://www.varnish-cache.org/trac/wiki/VarnishAndDrupal

Pressflow for Memcached?

bayousoft's picture

Is Pressflow required to use Memcached with Drupal 6? This site is already built without it and I have less than a week to do this and have not used Pressflow previously. I found http://drupal.org/project/memcache but the doc example specifically states it's for a Pressflow install. http://drupal.org/node/1181968

no Pressflow is not required

schnitzel's picture

No, Pressflow is not required for Memcache.

But the installation guide you pointed to (http://drupal.org/node/1181968) will not work in the RS Cloud, because it uses local sockets, so memcached would need to run on the same server as Apache, which is bad.

I would suggest using this one: http://drupal.org/node/1131468
We use it on all our servers and it works perfectly.

Dedicated server

bayousoft's picture

I have to move the site to a dedicated server for the event.

so you host the whole site on

schnitzel's picture

so you host the whole site on one single Server?
20'000 simultaneous users?
Oh have fun…

I would suggest:

1 loadbalancer
2 varnish
2 apache
2 mysql (master/slave)
2 memcache

And build the Varnish configuration so that you can easily add more Apache servers if the traffic goes really high...

Perhaps an Alternative to Pressflow

waverate's picture

I had an existing D6 (not Pressflow) site that I didn't get a chance to upgrade to D7 along with my other D7 sites.

For some consistency in the caching used between the two versions of Drupal, I downloaded and enabled the Cache Backport module (http://ftp.drupal.org/files/projects/cache_backport-6.x-1.0-rc1.tar.gz). I was then able to download the APC (http://ftp.drupal.org/files/projects/apc-7.x-1.0-beta3.tar.gz) and Memcache (http://ftp.drupal.org/files/projects/memcache-7.x-0.2.tar.gz) modules. I use the exact same files in my D7 installations.

Read the notes with the module. The memcache module needs to be patched. You do not enable APC and Memcache modules.

It has been working for me since the beginning of the year.

Help

bayousoft's picture

If anyone reading this can offer a "dedicated", high-performance, managed, Drupal-optimized server solution (memcached, Varnish, etc.) and have it up by Monday or Tuesday next week, email me at ryan@graham-group.com asap.

The client only needs it between Nov 14 and Nov 22, but paying for an entire month is fine. I would pull the necessary backups from Rackspace Cloud and assist wherever necessary/possible. After Nov 22 I plan to move the site back to Rackspace Cloud.

I have a quote from Rackspace in hand but they will not configure memcached, Drupal or anything really beyond the stack.

Also, if someone would be willing to provide support to configure memcached, Varnish, etc. on the Rackspace dedicated solution, that could also work.

Thanks in advance.

Sounds like a job for Pantheon

gchaix's picture

Have you talked to the guys at Pantheon (https://getpantheon.com/)? They focus on exactly that high performance stack - Drupal, Varnish, memcache, etc.

Clustered offering

afear's picture

You might also want to contact these guys:
https://www.getcadre.com/clustered

They specialize in Drupal hosting and their performance pack add-on includes memcache, varnish, and apc:
https://www.getcadre.com/performance_pack

I could help you with the

schnitzel's picture

I could help you with the support to configure memcached, Varnish, etc.
But as the other people already said, I would first check with Pantheon, for example; they do this every day and can probably help you better.

Price is no object

bayousoft's picture

Price is no object at this point btw :)

Also these guys are good and

gateway69's picture

Also, these guys are good and have some core Drupal developers on staff:

http://www.acquia.com/

Good advice, but not very relevant

kbahey's picture

You got a lot of good advice, but much of it is not very relevant to the issue at hand.

In your situation, things like Varnish, Memcache, CDN, ...etc. will help only so much, with stuff that is not the main issue. Yes, they may save some server resources, but not to the extent you want.

The main issue is 20,000 users doing stuff other than retrieving pages out of a cache. This means that they will have to have sessions (meaning Pressflow's caching will be bypassed) in order to checkout.

Your course of action should be to benchmark the application using a tool that can simulate sessions (e.g. JMeter): record a checkout session, then replay it with 20,000 simulated users.
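To illustrate what "simulating sessions" means in practice, here is a minimal Python sketch of a single simulated visitor who keeps cookies between requests. It is not a replacement for JMeter, only an illustration of why each simulated user needs their own cookie jar. The node path and host name are hypothetical; /cart and /cart/checkout are assumed to be the site's Ubercart cart and checkout paths.

```python
import http.cookiejar
import urllib.request

# Hypothetical walk through the site: product page, cart, checkout.
CHECKOUT_STEPS = ["/node/1", "/cart", "/cart/checkout"]


def make_session_opener():
    """Return an opener with its own cookie jar, i.e. one simulated visitor."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def run_checkout(opener, base_url, steps):
    """Request each path in order with the same opener (same session cookie).

    Returns a list of (path, HTTP status) pairs.
    """
    results = []
    for path in steps:
        with opener.open(base_url + path) as response:
            results.append((path, response.status))
    return results
```

Against a test domain this would be called as run_checkout(make_session_opener(), "http://test.example.com", CHECKOUT_STEPS), once per simulated user. Because every user carries a session cookie, none of these requests can be answered from the anonymous page cache, which is exactly the load that ab does not reproduce.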

Before you reach that point, you will find that both PHP execution and MySQL queries are overburdened, maybe at only 2,000 users, if not fewer, doing what they are supposed to do.

Start by benchmarking the site with this line in settings.php:

<?php
$conf['cache'] = 0;
?>

This will simulate logged-in users, which is basically what these users will be like. See where your bottlenecks are.

I am guessing that even one dedicated server or two will not take that kind of load.

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc.
Personal blog: Baheyeldin.com.

OTOH using memcache to store

Jamie Holly's picture

OTOH using memcache to store sessions would probably offer a lot of help here, as long as you have enough server and memory to keep them all going.


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Thanks!!

bayousoft's picture

I never expected this much advice. It makes me even more proud to be a part of the Drupal Community. I am speechless.

With that time frame

fabianx's picture

With that time frame definitely go to a vendor that has lots of experience in setting something like this up:

First: Switch to Pressflow immediately.

Managed High Performance Hosting

Get a provider that can scale your site up based on demand.

  • getcadre.com - does a great job and can set up a dedicated DB server, several webheads for Apache / PHP, a dedicated Memcached server and Varnish in front of it. Given enough money they can (hopefully) set up as many webheads as you need. They also do complete performance audits, but probably not in the timeframe you give here ...

  • http://www.acquia.com/products-services/managed-cloud - They have nginx (which can handle a lot more concurrent users than Apache), Varnish as loadbalancer with hot spares, and can again give you one or more dedicated DB servers (master / slave), and as many webheads as you might need. Even during the event, they can add new webheads (as far as I've understood). They also have New Relic built in, which should give you some pointers in terms of performance.

Their docs say: "Fully managed elastic response to site demand", which is probably exactly what you need.

  • getpantheon.com are great performance guys, but I don't know how their high performance setup is configured.

Don't do that alone!

Get paid help by any expert that might be available!

I was once on a server that did go down for a moment, because, well, "Boost should be enough". It was one of the creepiest moments in my life. We had a hot spare and it was up and running very soon, but being on that server and watching the load increase and increase, you can barely type anymore, you try to stop Apache, and it freezes ...

Provide more information

It would also help if you gave us a link to the site. (PM is fine)

Some obvious problems can be seen directly:

Also, if your site's uncached loading times are already 700-1500 ms at 90 MB per request, you might have a problem. If they are more like 50-300 ms and 20 MB, things might work a lot more smoothly. We can tell you that just by looking at your site from the outside.
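To make the difference between those two profiles concrete, here is a rough back-of-the-envelope sketch in Python. The pool of 50 PHP workers is purely an assumption for illustration, and the model ignores MySQL, network and think time; it only shows how per-request time and memory multiply out.

```python
def capacity(workers, ms_per_request, mb_per_request):
    """Return (requests per second, RAM in MB) for a pool of busy PHP workers."""
    requests_per_second = workers * 1000.0 / ms_per_request
    ram_mb = workers * mb_per_request
    return requests_per_second, ram_mb


WORKERS = 50  # hypothetical number of concurrent PHP processes

# Slow end of each range quoted above.
heavy = capacity(WORKERS, ms_per_request=1500, mb_per_request=90)
light = capacity(WORKERS, ms_per_request=300, mb_per_request=20)

print("heavy: %.1f req/s, %d MB RAM" % heavy)  # about 33 req/s, 4500 MB
print("light: %.1f req/s, %d MB RAM" % light)  # about 167 req/s, 1000 MB
```

Under these assumptions the lighter site serves about five times as many uncached requests per second while using less than a quarter of the memory, on the same number of workers.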

Some more tips

  • Disable as many modules as you can.

  • Make content as static as you can and have caching set to the max.

  • Analyze, analyze, analyze with any tools that are available (xdebug, newrelic, etc.)

  • Use the fast_404 module and make sure there can be no 404s - they can kill your site.

  • Make sure that the Google Promotion Links are filtered out either in Varnish or at least in Drupal / .htaccess / settings.php via rewrite rules.

An example of such a rule can be found in my Boosted Varnish configuration here: http://www.trellon.com/sites/default/files/boosted-varnish.vcl_.txt

### START .htaccess rewrite rules

# Strip out Google Analytics campaign variables. They are only needed
# by the javascript running on the page
# utm_source, utm_medium, utm_campaign, gclid
  if(req.url ~ "(\?|&)(gclid|utm_[a-z]+)=") {
    set req.url = regsuball(req.url, "(gclid|utm_[a-z]+)=[^\&]+&?", "");
    set req.url = regsub(req.url, "(\?|&)$", "");
  }

### END .htaccess rewrite rules

I saw a system having a really hard time, because /link was cached, but /link?utm_campaign_code=3424342, etc. was not ...

You can also change them to #utm_ with some setting in google_analytics.module to accept those utm IDs.

  • Make sure not to have superfluous disk accesses. A CDN for your assets in sites/*/files is a must and also helps with 404s - if those are cached properly.

  • Time your views and set them to cache for a long time, use block caching; again: Remove all non-needed functionality and replace it with static HTML assets (if it needs "some special module").

  • Make clear to the client that content updates during this time are a bad idea.
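The Google Analytics stripping rule from the Varnish snippet above can be tried out offline. The following Python sketch mirrors the same three regular expressions so you can check which URLs end up sharing one cache entry; the function name cache_key is made up for this illustration and is not part of Varnish or Drupal.

```python
import re

# Same patterns as in the VCL snippet: guard, removal, trailing cleanup.
_HAS_TRACKING = re.compile(r"(\?|&)(gclid|utm_[a-z]+)=")
_TRACKING = re.compile(r"(gclid|utm_[a-z]+)=[^&]+&?")
_TRAILING = re.compile(r"(\?|&)$")


def cache_key(url):
    """Return the URL as the cache would see it after stripping campaign variables."""
    if _HAS_TRACKING.search(url):
        url = _TRACKING.sub("", url)
        url = _TRAILING.sub("", url)
    return url


for url in ("/link",
            "/link?utm_campaign=3424342",
            "/link?page=2&utm_source=tv&gclid=abc"):
    print(url, "->", cache_key(url))
```

With this rule, /link and /link?utm_campaign=3424342 map to the same key, so the second request is served from the cache instead of hitting Drupal.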

Even while not totally relevant, you should have the basics down

  • APC, Pressflow, Memcache (also for sessions and locking(!!!)), Varnish as Loadbalancer, dedicated DB Server with fast disks, slave servers for DB read access, multiple webheads with the possibility to add more and a CDN (cdn.module) are in my opinion a must for 20'000 (!) concurrent users ...

I hope that helps a little, too.

Best Wishes,

Fabian

switch to nginx before your

dgtlmoon's picture

Switch to nginx before your load hits the fan! I have an old Drupal 5 site getting 650,000 pageviews a month with about 60 objects on each page (lots of thumbnails and galleries). The difference has been amazing; I dare say it would even survive a slashdotting! I should upgrade that site, but for now it works and generates good revenue.

nginx is a lot more stable under load than apache2 (I found).

Facing the same - what solution did you choose?

LARoss's picture

Bayousoft - any chance you could please post which option you chose?

One of my clients might be featured on Ellen DeGeneres' 12 Days of Giveaways. This thread is fantastic for preparing - many great options. Like your client's situation, this would be a one time event. We have very little time to prepare. Not a lot of money for making it happen though, as this is a non-profit. The site is Drupal 6 on shared hosting right now. I set up the site, but am not qualified to handle advanced development needs like this.

Any insight would be much appreciated!

Followup: High Traffic Event

bayousoft's picture

I communicated with several companies via email and telephone. Matt at Acquia was a great resource. Our traffic from Good Morning America was estimated (by Good Morning America) to be as much as 20,000 concurrent users. That never happened. I saw an increase of 10,000 unique visitors over the entire morning.

I host most sites presently with Rackspace Cloud Sites. RS said 20,000 concurrent users would not be an issue, but I got different advice from several other people. In the end, the 10,000 visits were of course not an issue.

I did move all static assets to the RS CDN which I probably should have done anyway.

So, for me it was a learning process for sure.

As far as your situation, I would think shared is probably not going to cut it.

Makes sense... and thank you!

LARoss's picture

Thanks so much for responding so quickly. Reading through your thread and checking the links, Acquia and Rackspace were the two most promising from what I could tell. We'll pursue those resources first.

This is why this forum is so great - your being willing to share your experience saves others time & mistakes! Much obliged!

I think you'll need to

fabianx's picture

I think you'll need to prepare mostly for anonymous traffic then, which is much easier.

At minimum I would do:

  • Use some dedicated server (do not use DevCloud or Pantheon if you plan to use custom Varnish configuration)
  • Add Boost and Memcache and make sure APC is properly sized; make sure it is all running well. (That can be done in around 3 hours)
  • Add Varnish to it via Boosted Varnish (unless you have the time to check that all is running properly with switching to Pressflow, no anon sessions, etc.) (another 2 hours)
  • Check list for 404s and/or add fast_404 module (now mostly in D 7.9 core!) (around 1-2 hours)

For the event itself:

  • Prime the Varnish cache before your big event by manually scraping the most visited pages, and set your max-age to last until the end of the event, so that they are always served by Varnish. Or at least set the Boost generation to 12h or so ... (around 4-6 hours, with attendance during the event for monitoring)

That should all still be doable even on a low budget.
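Priming the cache can be as simple as requesting each important page once shortly before the event. Here is a minimal Python sketch; the host name and the list of paths are placeholders, and it assumes the cache in front of Drupal (Varnish or Boost) stores a page on its first anonymous request.

```python
import urllib.request


def _http_status(url):
    """Fetch a URL anonymously (no cookies) and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.status


def prime_cache(base_url, paths, fetch=_http_status):
    """Request every path once so the cache holds a copy before the event.

    Returns a dict mapping each path to its status code or an error message.
    """
    results = {}
    for path in paths:
        try:
            results[path] = fetch(base_url + path)
        except OSError as error:  # urllib's URLError and HTTPError are OSErrors
            results[path] = "failed: %s" % error
    return results


# Hypothetical list of the most visited pages.
MOST_VISITED = ["/", "/about", "/products"]
```

Running prime_cache("http://www.example.org", MOST_VISITED) just before the broadcast, and checking that nothing comes back as "failed", also doubles as a quick check for unexpected 404s.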

Best Regards,

Fabian

Yes, traffic is entirely anonymous...

LARoss's picture

There's minimal user account creation, so traffic is primarily anonymous. Your suggestions will help a lot in preparing, as you've given guidance on the steps and the time it will take. Anytime a client need is urgent, there's a risk that outside vendor cost will be inflated - not entirely unreasonable or unfair. For non-profits though, it puts the help out of reach. Thanks for your guidance!

I would contact the hosting

Jamie Holly's picture

I would contact the hosting company and tell them about this. Since the site is an NPO, they might be willing to help you out on it. If anything, offer a small badge on the site mentioning them as the host, as a kind of incentive.

Shared hosting will most likely not cut it (I would lay money on this), so you'll want to get something going soon. A lot is going to depend on the functionality of your site. If you have a lot of users that log in regularly then things are going to be tougher. If it's mostly anonymous users then there's a really simple solution:

  • Get a pull CDN to host the site.
  • Set up a subdomain for people who update the site, such as admin.mydomain.com. This points to the actual server.
  • Serve the regular www.mydomain.com from the CDN

Not all CDNs are created equal. I know this works great with Voxel's CDN, since I had to do this for a client last year with only a couple of days' notice. You can even write a quick module that notifies Voxel via its API any time content changes, and the CDN will regenerate the page within 15 minutes. Alternatively, set up the CDN to refresh the front page every 15 minutes.

For this to work right, you need to make sure all the URLs in your site (internal links, images, CSS, JS, etc.) are relative, or else you'll end up with things that link back to the subdomain that people use to log in.

The best part about this is that it's extremely simple to set up. As long as your URLs are all relative, it's basically just a couple of quick DNS changes and setting up the CDN. It can be done in about 10 minutes, but make sure you do it a couple of days in advance so that the DNS will propagate. Once everything calms down, simply change your DNS back and you'll be back where you were.
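One way to check for non-relative URLs is to scan the rendered HTML of a few pages. The Python sketch below is only an illustration (the class and function names are made up): it flags every href or src that carries a host name, which is broader than just links to the admin subdomain, so expect legitimate external links in the output too.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AbsoluteLinkFinder(HTMLParser):
    """Collect href/src values that include a host name."""

    def __init__(self):
        super().__init__()
        self.absolute = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and urlparse(value).netloc:
                self.absolute.append(value)


def find_absolute_urls(html):
    """Return all absolute (or protocol-relative) URLs found in the markup."""
    finder = AbsoluteLinkFinder()
    finder.feed(html)
    return finder.absolute


sample = ('<a href="/about">About</a>'
          '<img src="http://admin.example.org/files/logo.png">'
          '<link href="style.css">')
print(find_absolute_urls(sample))
```

Anything in the output that points at the admin subdomain needs to be made relative before the CDN goes live.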

You might be able to use another service like Amazon's CloudFront for this. I know it worked for me on Voxel, which charges by usage like CloudFront and the price is pretty comparable to them.

If you have a lot of regular users logging in and content changing a lot, then you are pretty much stuck with either dedicated (recommended) or cloud hosting. Switching over to Pressflow and using Varnish would be the quickest and easiest way to get it up and running. You might even be able to get away with Boost, which would be easier. The problem with situations like this is that you never know what kind of traffic increase you will see.

No matter which way you are going, make sure you get it set up a few days in advance, since you will have some DNS modifications and those need time to propagate.



Cadre and Voxel look promising

LARoss's picture

Whether they are invited to appear or not (we'll know early next week), these are excellent next steps you've outlined. The organization is getting more exposure and it's only a matter of time before their site experiences more frequent big spikes. CDN is necessary, clearly.

Content is mostly static and fairly image heavy. There are a number of forms - pledge and contact forms for various types of partners. There's a fair number of links to static files - educational resources. A fairly small shopping section to browse products - but no Ubercart or other cart/payment within their domain.

There are almost no users logging in and changing content beyond the occasional commenter who sets up a user account rather than posting anonymously. I'm the only one doing admin. One of the founders adds blog posts monthly and "green tips" weekly or bi-weekly. They sell reminder products and re-usable bags using a shopping cart and payment processing service from GoEMerchant - all handled by GoEM's servers. (We'll be notifying them if they are invited to appear on the show.) Most of their social community is on their Facebook page.

Does this still sound like a dedicated server is the best solution? VoxCLOUD for a month, then moving back down to a shared plan looks promising and affordable. Cadre's options are slightly more expensive for a month of dedicated server hosting, but they have no setup fee and they have migration assistance included - again, big help there, as this is new development terrain for me. Ideally, I think setting up with a host where they can move up for anticipated spikes would work best for their time & budget constraints.

What is the reasoning behind the admin subdomain? (My apologies if that should have been obvious to me!)

I still need to explore your Pressflow and Varnish recommendation.

Thank you so much for taking the time to lay out so many suggestions, so clearly!

With the commenting and that

Jamie Holly's picture

With commenting, putting the whole site on a CDN wouldn't work out that well, since cookies would mess with you. I would go the dedicated server route with either Varnish or Boost. The CDN would still help for the static content you have on the site, though, and can easily be set up with the CDN module.

Commenting can take a big hit on the site. Generally what I do is use a 3rd party service, like Disqus, for comments if the site doesn't have any other user integration stuff involved. That way people don't have to login to have their "identity" and page caches aren't invalidated every time a comment is posted.

Honestly you can probably get away with cloud if you moved the comments to Disqus and used a CDN for all the static stuff.

On the admin subdomain: on the normal site you want cookies ignored and everything coming off the CDN. The admin subdomain takes the site staff directly to the Drupal installation, bypassing the CDN entirely. You also don't want to risk some admin view of a page getting stuck in the CDN's cache, and this makes sure that doesn't happen.



Glad I mentioned comments

LARoss's picture

That narrows down the possibilities. Excellent point about the comments. I've used comment services in the past for WordPress sites for the reasons you mention. Once past the twin constraints of time and budget (sounding like a broken record here!) we intend to move away from Drupal blog services entirely - perhaps to Tumblr - and set it up as a sub-domain. In development there are always the gotchas - like cookie issues you mention - and one could spend endless time finding them out one by one without forums and experienced help like yours and others on this thread!

Thanks for the admin subdomain details - makes sense.
