Drupal newspaper site

shyamala's picture

We are launching a newspaper website in the next couple of days. The site is built in Drupal. Can anyone comment on the ini settings? Is the server configuration enough for the load shown?

The site has to serve:
Number of visits: 35,720
Pages: 205,129
Hits: 1,929,027

The server configuration:
Processor: Quad-core 2.66 GHz * 2
RAM: 8 GB
Hard disk: 146 GB * 5

My.ini configuration:
[mysqld]
datadir=/Drive1/database
socket=/var/lib/mysql/mysql.sock
set-variable = max_allowed_packet=64M
set-variable=max_connections=1000
log-bin = /Drive1/database/mysql-bin.log
binlog-do-db=cab
server-id=1
old_passwords=1

[mysql.server]
user=mysql
basedir=/var/lib

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

Comments

I would suggest bumping

cfuller12's picture

I would suggest bumping max_allowed_packet up to 128M. 64M might be sufficient, but this is definitely one area where you can run into issues.

It's all ball bearings these days...

Opcode caching, etc

joshk's picture

Assuming your traffic numbers are daily, you (should) be good. A lot of it will depend on how many modules you have, how they're configured, and what your traffic mix is in terms of logged in/out users.
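To put the posted numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes, as above, that the figures are daily totals. The peak factor of 5 is purely an illustrative assumption (newspaper traffic is rarely spread evenly over 24 hours), not something measured on this site.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(daily_count: int) -> float:
    """Average requests per second if daily_count were spread evenly over a day."""
    return daily_count / SECONDS_PER_DAY


def peak_rate(daily_count: int, peak_factor: float) -> float:
    """Estimated busy-hour rate: the average scaled by an assumed peak factor."""
    return average_rate(daily_count) * peak_factor


# Figures from the original post, assumed to be per day.
PAGES_PER_DAY = 205_129
HITS_PER_DAY = 1_929_027

# Assumption for illustration only: busy periods run at 5x the daily average.
PEAK_FACTOR = 5.0

print(f"pages/s: average {average_rate(PAGES_PER_DAY):.2f}, "
      f"assumed peak {peak_rate(PAGES_PER_DAY, PEAK_FACTOR):.2f}")
print(f"hits/s:  average {average_rate(HITS_PER_DAY):.2f}, "
      f"assumed peak {peak_rate(HITS_PER_DAY, PEAK_FACTOR):.2f}")
```

On average that works out to roughly 2.4 Drupal pages and 22 total hits per second. Most hits are static files (images, CSS, JS) that never touch PHP or MySQL, so the page rate is the number that matters most for tuning.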

You may already be doing this, but definitely get APC set up, and be sure it has enough memory allocated to cache your entire stack. Kahled Khalid has a great article about this:

http://2bits.com/articles/importance-tuning-apc-sites-high-number-drupal...

Also, if most of your traffic is readers, you might consider looking into a more advanced caching system (e.g. Boost, Memcache, Cacherouter, etc) as these can improve performance for anonymous visitors.

http://www.chapterthree.com | http://www.outlandishjosh.com

Kahled?

kbahey's picture

Kahled?

Who is that?

:-)

Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.


Gah!

I would be very careful with

slantview's picture

I would be very careful with your max connections at 1000. That is a quick way to kill your database server, and especially your web server (or both, assuming they are one and the same box).

We do anywhere from 400k to 1m uniques per day, and never have more than about 50 - 75 open database connections. If your site serves mostly anonymous traffic, be sure to check out Boost for static page caching. If you are using a single server, check out Cache Router with APC, and if you are using multiple servers, check out Cache Router with Memcache.

I would suggest that using APC is a must for a site with decent traffic, and just be careful with your Apache configuration. If you give it too many available connections, it will eat up the memory, crash the box and thrash the disk. Try using tools like ab and pound to make sure you can push the kind of traffic you think you can.
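As a rough illustration of the memory point, here is a minimal sketch of how one might estimate a safe ceiling for Apache's connection limit (MaxClients in the prefork MPM). The 8 GB total comes from the original post. The amount reserved for MySQL, the OS and APC, and the per-process size, are hypothetical placeholders; measure your own Apache processes under load before relying on any number.

```python
def max_apache_clients(total_ram_mb: int, reserved_mb: int, per_process_mb: int) -> int:
    """Upper bound on concurrent Apache processes that fit in RAM without swapping.

    total_ram_mb   -- physical memory on the box
    reserved_mb    -- memory set aside for MySQL, the OS, APC, etc.
    per_process_mb -- resident size of one Apache/mod_php process under load
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    return max(usable_mb // per_process_mb, 0)


# 8 GB box from the original post; the other two values are assumptions.
TOTAL_RAM_MB = 8 * 1024
RESERVED_MB = 4 * 1024      # hypothetical: half the box for MySQL and the OS
PER_PROCESS_MB = 40         # hypothetical: typical mod_php process size

print(max_apache_clients(TOTAL_RAM_MB, RESERVED_MB, PER_PROCESS_MB))
```

With these placeholder values the ceiling lands at about a hundred processes, far below 1000. Setting the limit higher than memory allows is what leads to swapping and the disk thrashing described above.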

Best,

Steve Rude

Query Cache

roger.gregory's picture

Don't underestimate the value of a well tuned MySQL query cache; even a small cache can provide tremendous benefits in terms of overall load on your webservers.
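One common way to check whether the query cache is earning its keep is to compute its hit rate from MySQL's status counters. The sketch below assumes you have read Qcache_hits and Com_select from SHOW GLOBAL STATUS (Com_select does not count queries answered from the cache); the sample values are made up for illustration.

```python
def query_cache_hit_rate(qcache_hits: int, com_select: int) -> float:
    """Fraction of SELECTs served from the query cache.

    qcache_hits -- value of the Qcache_hits status counter
    com_select  -- value of the Com_select status counter (cache misses
                   and uncacheable SELECTs that were actually executed)
    """
    total_selects = qcache_hits + com_select
    if total_selects == 0:
        return 0.0
    return qcache_hits / total_selects


# Hypothetical counter values, not taken from any real server.
rate = query_cache_hit_rate(qcache_hits=750_000, com_select=250_000)
print(f"query cache hit rate: {rate:.0%}")
```

A low rate on a read-heavy site may mean the cache is too small or that frequent writes keep invalidating it; Qcache_lowmem_prunes is the usual counter to check for the former.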

Drupal performance checklist

Amazon's picture

Before you launch, I recommend you go through this performance checklist.

http://tag1consulting.com/performance_checklist

Kieran

Drupal community adventure guide, Acquia Inc.
Drupal events, Drupal.org redesign

static cache for high visibility pages

chipk's picture

Hi Shyamala,

I work on a newspaper site (http://www.gazettenet.com) and have worked out a static caching technique you might find helpful, assuming similarities in traffic patterns across newspaper sites.

We found that around 40% of all pages served by the system were from a relatively small number of URL targets, namely the home page (25%) and a handful of landing pages (local news, obituaries, sports, etc.). In addition, because landing pages are typically content-rich, they are also often the most expensive to serve from a performance standpoint. In our case, this same small collection of landing pages accounted for more than 85% of server load.

With that understanding in hand, we saw a huge performance boost opportunity if just those few pages could be served from a static cache directly by Apache - i.e. no call to mod_php, drupal, or mysql.

The technique we devised has three pieces:

  1. a cron job on the local server using 'wget' to build the cache for each of the target URLs
  2. Apache .htaccess code to differentiate between the calls from #1 above and all other calls
  3. client-side JS that calls back to the server for user login status

The basic idea is that the 'wget' calls from #1 are passed through to Drupal, which builds and returns the pages for the target URLs. Those returned pages are stored on disk as the static cache. The .htaccess code in #2 can differentiate between the wget calls (i.e. coming directly from the localhost IP) and calls from users, and so serves the pre-built pages from the static cache for all users. The JS in #3 is needed in our case so that the static pages can be updated to reflect the user's login status.

Parts #1 and #2 are very easy to implement. Part #3 is fairly complex, but could be eliminated if you don't need to display any user-specific data or status on the cached pages.

Overall, the system is very reliable and resulted in dropping server load by 85%.
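For readers who want a starting point, part #1 could be sketched roughly as follows. This is not the actual script used on the site described here (that uses 'wget' from cron); it is a Python stand-in for the same idea. The target paths, file names and base URL are hypothetical, and it assumes the site answers on 127.0.0.1 without needing a specific Host header.

```python
import os
import tempfile
import urllib.request
from typing import Callable, Dict, List

# Hypothetical mapping of high-traffic URL paths to static cache file names.
TARGETS: Dict[str, str] = {
    "/": "index.html",
    "/local-news": "local-news.html",
    "/obituaries": "obituaries.html",
}


def fetch_from_localhost(path: str, base_url: str = "http://127.0.0.1") -> bytes:
    """Request a page from the local web server so Drupal builds it fresh."""
    with urllib.request.urlopen(base_url + path, timeout=30) as response:
        return response.read()


def rebuild_cache(
    cache_dir: str,
    targets: Dict[str, str],
    fetch: Callable[[str], bytes] = fetch_from_localhost,
) -> List[str]:
    """Fetch each target page and write it into cache_dir; return files written."""
    os.makedirs(cache_dir, exist_ok=True)
    written = []
    for path, filename in targets.items():
        try:
            body = fetch(path)
        except OSError:
            # Network or HTTP failure: keep the previous cached copy.
            continue
        if not body:
            # Never replace a good cached page with an empty one.
            continue
        # Write to a temp file, then rename, so Apache never serves a
        # half-written page.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(body)
        os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; Apache must read it
        os.replace(tmp_path, os.path.join(cache_dir, filename))
        written.append(filename)
    return written
```

A cron entry would then call something like rebuild_cache("/path/to/static-cache", TARGETS) every few minutes, with the directory being whatever location the .htaccess rules in part #2 serve from. Skipping failed or empty fetches means a Drupal or MySQL hiccup leaves the last good copy in place rather than caching an error.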

One other suggestion, if you aren't already configured this way, is to consider separating your front-end and DB servers. We run on a single box very similar to yours, but have it split into 2 virtual boxes, with 2 cores and half the memory assigned to each. This allows us to better tune the DB server - always a good thing for performance.

Also, I think roger.gregory's advice about a well-tuned DB query cache is good, and a must for any high-traffic site.

best - Chip

High performance
