How Much Can Drupal Handle?

johnvsc's picture

Hello everyone.

A close friend of mine is a VP for a major media company... heavy hitters. Anyway, he wanted to know: "Are you familiar with the load scalability of Drupal, i.e., can it handle extreme traffic, like major traffic during a worldwide ticket sale?"

He mentioned (off the record) another superstar's site (done with Drupal) "and the performance is terrible - like very very terrible..." That site wasn't one that they put up, but he wanted to know what could be done about it. Was there any testing or performance data on it? He liked what he saw with Drupal but needed more info before he was willing to launch a site with it.

I said, to be honest, I really don't know. "A lot of it could just be the Flash audio sucking up too much processor.
Anyway, in short, I'm looking for examples of Drupal sites that perform under load: ~100k unique visitors a day, probably in the millions of page impressions a month."

Does anyone have any info on this? Any suggestions?

Comments

It's all in the engineering

ryan_courtnage's picture

Drupal can scale, and there are several very busy sites that use it (e.g., theonion.com is in the top 2000).

However, nothing comes for free - it's all in the engineering. Use CDNs where they make sense. Engineer your systems for fault tolerance and high availability from the ground up.

A lot, if your server and database can handle it too

ipwa's picture

Hello John,

Someone who has done a lot of work on performance is Khalid Baheyeldin from 2bits. I always read the articles on their site, and they frequently cover performance. I recommend this one in particular: http://2bits.com/articles/can-a-drupal-web-site-handle-a-million-page-vi... I can also recommend this excellent article by Wim Leers: http://wimleers.com/article/improving-drupals-page-loading-performance . It helped me get from a low D to a high B with YSlow, and the steps were simple to follow, although I use the Javascript aggregator module instead of the patch mentioned in the article.

Big sites like band or record label sites probably run on a cluster of servers or something similar, so whether the site can handle your visitors is not entirely up to Drupal. Among the most important factors are your server setup and your MySQL configuration. I've used modules like advanced cache and block cache, and they greatly reduce the number of MySQL queries per Drupal page. Advanced cache needs some work right now, though; the patches are not working properly. I used the path patch, and it seriously reduced the number of queries because drupal_lookup_path results were getting cached, and most menu items in Drupal are really a query. I wouldn't use it on a production site, though, because some of the hunks failed when I applied the patch.

http://nic.ipwa.net

Every Major Label Uses Drupal

Alex UA's picture

I would suggest looking at Dries' site for a list of high profile Drupal sites: http://buytaert.net/tag/drupal-sites . As Dries has noted, all three of the major labels now use Drupal (two for their main sites, Sony for their Music Box site, and all for band sites).

Drupal can handle just about any amount of traffic you can throw at it, BUT, you will need to hire some skilled Drupal Ninjas to test and performance tune your site(s).

Alex Urevick-Ackelsberg
ZivTech: Illuminating Technology


Universal

patPrzybilla's picture

The company I am working for maintains the Universal Music Group site...

There are a couple of million page views a day and everything is working fine.

It depends on the server and database, like the guys above me said. For my private projects I am using Lighttpd, which is recommended by Dries...

Drupal Rocks !!!


Performance

tom_o_t's picture

Probably worth posting to http://groups.drupal.org/high-performance and reading over the posts there if you've not already.

Absolutely.

DaveNotik's picture

Hi.

It absolutely can scale.

There are folks who specialize in performance and scalability, and we partner with them. Folks like Tag1, who have also made available a book chock full of performance and scalability techniques: http://books.tag1consulting.com/scalability.

We've built high-profile sites like NASCAR's www.roushfenway.com, Bono's www.data.org and more.

I'd be happy to help!

Best,

--D

--
http://www.digital202.com
http://www.wovenlabs.com

Yes it can.

strudeau's picture

Drupal can certainly scale pretty easily in many scenarios if you get your architecture right. The hardest type of site to scale with Drupal is when every page view is unique per session, since tweaking your caching architecture to avoid hitting the database a gazillion times on every page request is more difficult when you can't cache full pages. You can tack a lot on to Drupal (e.g., memcached) that can improve things significantly, though that requires careful engineering like it would with any web app. Drupal can take you pretty far, especially if you're building an app that doesn't fall into the aforementioned categories (many page views, highly unique per session).
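To make the full-page caching point concrete, here is a minimal Python sketch of the cache-aside pattern. It is only an illustration, not Drupal's or memcached's actual API: `TTLCache` is an in-process stand-in for a memcached-style store, and `render_page_from_db`, `get_page`, and `DB_CALLS` are hypothetical names.

```python
import time

# Counts how often the (simulated) database is hit.
DB_CALLS = {"count": 0}


class TTLCache:
    """In-process stand-in for a memcached-style key/value store (assumption)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def render_page_from_db(path):
    """Hypothetical expensive page build that would run many queries."""
    DB_CALLS["count"] += 1
    return f"<html>content for {path}</html>"


def get_page(cache, path, session_id=None, ttl_seconds=60):
    """Serve anonymous requests from the cache; build per-session pages fresh."""
    if session_id is not None:
        # Pages that are unique per session cannot share one cached copy.
        return render_page_from_db(path)
    page = cache.get(path)
    if page is None:
        page = render_page_from_db(path)
        cache.set(path, page, ttl_seconds)
    return page
```

In this sketch, three anonymous requests for the same path cost one database build, while every request carrying a session costs one each, which is why per-session sites are the harder case.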

There's no doubt that Drupal

ixlr8's picture

There's no doubt that Drupal can scale pretty big. Drupal itself isn't the problem. I used to work for Sony BMG where I managed their multisite install, and now I work for Lifetime. The issue is all in the implementation of hardware. Your bottlenecks are going to be in how much your webserver(s) can handle, how dependent you are on your database (memcache makes this much easier), and how much your database server can handle. The biggest limitation that you may run into is how much MySQL can handle. When you start dealing with sites that have a LOT of content, you're going to lose a lot of performance just in searching the node and user tables. Memcache can help, but you only have so much room in memcache...
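The "only so much room" point can be sketched with a small least-recently-used cache in Python. memcached evicts on an LRU basis when it fills up, but this class is only a toy model of that behavior; `LRUCache` and its capacity counted in items (rather than bytes) are assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fixed-size cache that evicts the least recently used item."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        # Reading an item marks it as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict the oldest (least recently used) entry.
            self._items.popitem(last=False)
```

Once the working set is larger than the cache, rarely read items keep falling out and their lookups go back to the database.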

If you'd like an extra set of hands, from a consultant experienced with large scale sites, drop me a line, and I'd be happy to help you and your friend design a server architecture that will handle whatever traffic you throw at it.

Mike

Mike, Are going to be at the

johnvsc's picture

Mike,

Are you going to be at the next Drupal meeting?

Let's talk then!

thanks

Everyone, thanks....

johnvsc's picture

for your comments.

I will point my friend to this page so that he can read the thread!

Yes, Drupal Rocks!

Scaling drupal...

jims's picture

Scaling Drupal fundamentally comes down to proper design at all levels. Poorly developed sites will be slow even on beefy hardware. Conversely, properly designed sites on overloaded hardware will be slow. If you link to an ad server that has a slow response, things will be slow. If a site is on a hosted system that is overloaded, the site will be slow. If Apache and MySQL are fighting for system resources, the site will be slow.

To design a site to scale, a few of the things you need to factor in:

  1. What functionality do you need?
    • If using third party modules, what impact do those have on overall resource usage/page load times?
    • If going with custom, do you understand how things interact with the system?
  2. How much content are you serving up on a given page?
    • Do you have sufficient bandwidth? Or should things be offloaded to a CDN?
    • Have you optimized the size of said content?
  3. What do database access patterns and load look like?
    • Do you need to set up a more advanced configuration -- master / slave, multiple slaves, etc.?
    • What is the network bandwidth like between boxes?
    • Does the DB machine have sufficient RAM and disk such that they are not the bottlenecks?
  4. Look at stack alternatives:
    • Load balancing between multiple machines
    • lighttpd/Nginx instead of Apache
  5. Proper usage of caching
    • Memcached
    • APC / XCache, etc

This list is by no means exhaustive, but it covers a few of the things one should look at. Fundamentally, things come down to researching the proper design and implementation of your site with an understanding of whatever constraints you may have -- hardware, manpower, cash, etc.

-jim spring

Lots involved, but sure, Drupal can scale

akalsey's picture

First, let me clear up some things from a few of the comments above.

MySQL isn't going to be your limiting factor. MySQL can scale to massive traffic. See Yahoo and Craigslist for examples of that. The issue is that in stock Drupal, you're tied to a single database server, limiting your growth to the performance of that single server. You can add patches that will allow you to split reads and writes across different DB servers, sending all writes to a master server and all the reads (which should outnumber writes 20 to 1) to a cluster of slave servers. These are the patches Drupal.org uses.
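The read/write split can be sketched roughly as follows in Python. This is not the code from the Drupal patches; `ReadWriteRouter` and the server names are hypothetical, and a real setup also has to deal with replication lag (a read sent to a slave right after a write may not see that write yet).

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs to replicas in round-robin order, everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Cycle endlessly through the replicas; fall back to the master if none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # Writes (and anything unrecognized) go to the master to stay safe.
        return self.master
```

For example, with a master and two replicas, consecutive SELECT statements alternate between the replicas while an INSERT or UPDATE always lands on the master.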

Simply loading records from the node or user tables won't impact performance, no matter how large your data is. The tables are well designed, with intelligent indexes. It's possible that some modules might try and read data from them in ways that hurt performance, but you can quickly find those and either tune the query or the database. Where you're going to hit issues with large data sets is in search. You could possibly run into some lock contention problems as well.

Drupal's built in search is fairly intensive on the DB. You can overcome this by adding a third-party search server, either by crawling your site as a user would or by implementing something like Lucene, Solr, or Sphinx that reads right from the database. You've got to think about user access to content with this approach - these search indexes don't know anything about Drupal's access control.
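One way to handle the access-control gap is to filter the search server's hits before showing them. The Python sketch below assumes a simplified model where each node ID maps to a set of roles allowed to view it; `filter_by_access`, the role names, and that mapping are all hypothetical and much simpler than Drupal's real node access system.

```python
def filter_by_access(hit_node_ids, user_roles, node_access):
    """Keep only the search hits the user is allowed to view.

    node_access maps node ID -> set of roles that may view it (assumed model).
    Nodes with no access record are hidden (deny by default).
    """
    visible = []
    for nid in hit_node_ids:
        allowed_roles = node_access.get(nid, set())
        if allowed_roles & set(user_roles):
            visible.append(nid)
    return visible
```

Post-filtering like this keeps private content out of results, at the cost of result pages that may come back shorter than requested.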

Most sites have their MySQL databases running MyISAM, which can become an issue for tables that have even moderate numbers of writes. Every write locks the whole table, preventing reads from happening. Moving to InnoDB can help there. InnoDB isn't the right choice for every table, though. Tables that are rarely written to will perform better as MyISAM. MyISAM tables are also smaller. So you need to think over your usage and plan things out.

Your traffic profile is going to seriously affect how you do things. For instance, if you have mainly anonymous usage you can do a lot to reduce the load on Drupal and the DB. Reverse proxies can offload that traffic from Drupal entirely. And the memcache patch can ensure that most of your content comes from local memory instead of the DB.
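The decision a reverse proxy makes for anonymous traffic can be sketched like this in Python. It assumes logged-in users are recognizable by a session cookie whose name starts with "SESS" (the prefix Drupal uses for its session cookies); `is_cacheable` is a hypothetical helper, not part of any proxy's configuration language.

```python
def is_cacheable(method, cookie_header):
    """Return True if a request may be answered from the proxy's page cache."""
    if method.upper() not in ("GET", "HEAD"):
        # Form posts and other writes must always reach Drupal.
        return False
    for part in cookie_header.split(";"):
        name = part.strip().split("=", 1)[0]
        if name.startswith("SESS"):
            # Session cookie present: treat the visitor as logged in.
            return False
    return True
```

Requests that pass this check never need to touch PHP or MySQL once the page is in the proxy's cache.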

There's so much involved in scaling Drupal (or any other web site) that it's hard to hit all the high points in a forum post....

  • Reduce the number of modules you use, paying particular attention to write-heavy modules like Statistics.

  • Cache, cache, cache. Built in cache, CSS aggregation, Block Cache, Memcached

  • Use Opcode caches to improve PHP performance.

  • Tune the stack -- make Apache perform better, replace Apache with Nginx, make sure your DB server is tuned for the traffic and data size

  • Optimize your code -- watch the query usage and performance characteristics of third party modules

  • Push static content to CDNs or external web servers

We've accomplished some massive scaling on Drupal. We got Macworld Expo to handle 10+ million page views in a single day. We helped a major drug company run tens of thousands of concurrent users. So, yes Drupal can scale.
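For a sense of scale, daily page views can be turned into requests per second with some back-of-the-envelope Python. The function names and the peak factor are assumptions for illustration; real traffic is bursty, and each page view usually triggers extra requests for images, CSS, and JavaScript that are not counted here.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(page_views_per_day):
    """Average page requests per second if traffic were spread evenly."""
    return page_views_per_day / SECONDS_PER_DAY


def peak_rps(page_views_per_day, peak_factor):
    """Rough peak estimate; peak_factor is a guess you must supply yourself."""
    return average_rps(page_views_per_day) * peak_factor


# 10 million page views in a day averages out to roughly 116 per second.
print(round(average_rps(10_000_000), 1))
```

A ticket sale concentrates traffic into a short window, so the peak figure, not the daily average, is what the servers have to be sized for.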

-- Adam Kalsey, WorkHabit.

Ok, Another question

johnvsc's picture

First of all, I had to remove the particulars from the thread above (in my posts) ... thanks to those who did the same...

Here is an IM with another question from my friend:

hey - when you do get back - I have this question for you, and perhaps the thread... so... there is a lot of support for "yes, it can be done" - but who would I go to if I say: I have a GIANT project that has to be live in 4 weeks and we want it on Drupal - who would I go to that can guarantee me that it would be solid? I don't really have the time to "see" if it will be ok... so, let's have this conversation over a beer!

Giant project done in 4

ixlr8's picture

Giant project done in 4 weeks? I'd tell him no. It's as simple as that. There are 3 ways to do things. The cheap way, the fast way, and the right way. You can have 2 at most at any given time. The problem with this scenario is that 4 weeks is not a lot of time to get a site spec'ed out, themed, developed, qa'd and redeveloped for all the bugs, and then pushed live. Hell, I doubt a hosting company can even get a server farm prepped in that amount of time, let alone deployed.

You won't find many developers who would be willing to take on such a hit-you-over-the-head-with-a-brick obvious death march project. And I wouldn't trust those that would, because they obviously aren't going to really care about the results.

It also is important to understand what he means by "GIANT." I've had "GIANT" projects that really were just a couple of pages. It's all a matter of your client's frame of reference.

A lot of people think their internet presence is like this:

[image not preserved]

But really, it's more like this:

[image not preserved]

First things first. Get a more realistic schedule out of your guy, and also find out what he wants out of a consulting firm. Does he want people onsite? Does he only want people who speak English, and/or are relatively local (i.e., same time zone ± 3 hours)? Also, what's his budget? Can he afford a firm like Lullabot, or is he expecting to pay $15-50 an hour? Does he have in-house developers? Do they need to be trained? The list of questions goes on and on.

There are a lot of firms around, with a lot of talented developers. Each has their own "flavor" I suppose. I had a client who wanted me to take on a project that I couldn't take because I have a day job. So I referred him to a bunch of different shops that might be able to help him better.

If you have more questions, feel free to drop me a line.

Mike