Pressflow for logged in users + anonymous?

mrwhizkid's picture

Hi everyone,

I decided to try Pressflow out today. It seems to work pretty well now that I have replaced Drupal core with it, but so far I haven't gotten Varnish to work with it. I did install Varnish, but I've had some trouble getting it to work on my server. Here are a few questions I'm hoping someone could help me with:

1). Does it make sense (or is it even possible) to run Varnish on a single server, that is, the same server that is serving up your webpages? I am confused about how I can have Apache listening on 8080 without seriously disrupting my sites. And I have multiple sites, but I only want to run Varnish on one.

2). About 90% of my users are logged-in members. Yes, I am running memcache with APC, and I am also using block and views caching. Is there any advantage to using Pressflow if 90% of my users are logged in? I had heard that there might be some PHP 5 optimizations?

3). As my site gets bigger, might it make sense for me to have a master and slave database since my site is very read/write intensive? Lots of posting, updating statuses, and uploading pictures.

Thanks for your help!

Comments

Yes, Pressflow will help a

omega8cc's picture

Yes, Pressflow will help a lot for logged-in users too, because it greatly optimizes some heavy database queries (for example, Pressflow does not use LOWER(), which kills MySQL caching/indexes in vanilla Drupal 6), and does much more to support high-performance sites/setups.

You could try Barracuda + Octopus. It is a high performance LEMP stack, based on Nginx, Pressflow, MariaDB, Memcache + Redis chained caches and also static caching for logged in users (static cache per user but better than Authcache).

This stack is almost as fast as Pressflow + Varnish + Memcache, but offers the highest speed/performance also for logged in users, out of the box.

The system is easy to install and upgrade, even for less experienced users, since only very basic sysadmin skills are required: just the ability to run bash scripts/installers on the command line and wait with some coffee/tea/beer until it completes the work for you.

The dual core installer is available on GitHub: https://github.com/omega8cc/nginx-for-drupal

Grace ~ http://omega8.cc

Seems great

konrad1811's picture

Hi Grace!

What you are writing about seems like something great, maybe exactly what I'm looking for, but I'm quite new to performance issues...
I recently changed my D6 core to Pressflow :)
But I feel lost in the PF wiki. The documentation is not like D6's, and it is hard to figure out what to do to get great performance...

Could you please help me with some questions?

I'm preparing some "small" social site based on Organic Groups + Facebook Connect. I'd like to be ready for a bigger traffic than 10 views/sec. I hope to have like 100/sec [at least this is my dream :P ]

I'd like to know whether I need some additional modules like AuthCache, Memcache, BlockAlter Cache, or Views Cache [should I enable it on my views]?
Or are the scripts you've linked [GitHub] enough, with no need to play with additional modules for authenticated users?

Will this work with dynamic content and allow the content to refresh while users are creating some?

And finally... basically, when I go to https://github.com/omega8cc/nginx-for-drupal I just have to download it [https://github.com/omega8cc/nginx-for-drupal.git] to my server and run some commands?

Sorry for the dumb questions. I'm more of a Drupal admin than an IT pro :/ I've also had some problems with non-Drupal add-ons like APC or Memcache. It's easy for me to install the Drupal handlers for them, since those are regular modules, but I feel lost in server software.

However I got a root server so I think I can do everything there.

Yes

rjbrown99's picture

1) Yes, varnish would be on port 80 and Apache/whatever on a different port like 8080. If you have multiple sites on the same server you are going to have to work the config appropriately in that you may need multiple IPs where you bind varnish to one IP:80 and httpd to a different IP:80 for the sites that you don't want to send to Varnish. Or just configure Varnish as a passthrough for your other sites. Either way, if it's one box with one IP and you put varnish on port 80, everything is going to go through that for all of your sites.

2) IMO if you are a PHP5+MySQL Drupal setup - especially on a VPC or site where you have a level of control over the server - there is ALWAYS a benefit to using Pressflow.

3) It might, but only if you knew exactly where your pain points were in terms of time spent in varnish/httpd/memcache/mysql, and mysql is becoming a problem for you. I wrote lots of stuff in the following thread about NewRelic and how it can help you narrow down exactly which part of the stack is creating a bottleneck. It may be helpful.

http://groups.drupal.org/node/138889

@rjbrown99 dang, I step away

muriqui's picture

@rjbrown99 dang, I step away from the keyboard for a few minutes, and you beat me to all my answers. :)

Answering 1 and 3

muriqui's picture

I am confused about how I can have Apache listening on 8080 without seriously disrupting my sites.

If you need to run Varnish on the same host, you could just set it to ignore the sites you don't want to cache. Technically, Varnish will still be in front of everything served by Apache since you're giving it port 80, but the other sites will just bypass it immediately, so it should be invisible to your users.
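A minimal sketch of that bypass, assuming Varnish 2.1/3.0-era VCL syntax and a single Apache instance moved to port 8080 on the same machine. The backend name, the 127.0.0.1 address, and the hostname in the regex are placeholders, not values from anyone's real setup:

```vcl
# Assumed: Apache now listens on 8080 on this same host.
backend apache {
  .host = "127.0.0.1";
  .port = "8080";
}

sub vcl_recv {
  set req.backend = apache;

  # Placeholder hostname for a site that should NOT be cached.
  if (req.http.host ~ "^(www\.)?notpressflowsite\.com$") {
    # Skip the cache lookup and hand the request straight to Apache.
    return (pass);
  }
  # Requests for any other host fall through to Varnish's default logic.
}
```

With something like this the bypass lives entirely in the Varnish config. Apache still has to answer on 8080 for every site, including the ones that are passed through.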

might it make sense for me to have a master and slave database

It might, but the only way to answer that for sure is to monitor your server stats and see if MySQL becomes your bottleneck. And then even if it does, you might get some mileage out of increasing your query cache (for read performance) or moving your data to faster storage (for reads and writes) before jumping to a master/slave setup.

Thanks to everyone for their

mrwhizkid's picture

Thanks to everyone for their help with this topic. This weekend I installed Varnish, but whenever I tried to run it (switch Apache to listen on 8080, have Varnish on 80), all of my sites, including the Pressflow one, would go offline.

Is there something that I need to do in my pressflow settings.php file to get this to work?

And when you talk about having my other sites bypass Varnish, is this something I need to configure in Varnish or in apache?

Thanks for your patience!

The magic is in varnish config

repoman's picture

We did this first before we decided to split things out due to load and performance. Here are the steps we used when we initially setup everything on a single host.

1) Change apache config for pressflow to listen on 8080
a) Verify that Apache is actually listening by hitting the server on that port. Try it on the local server first to make sure it works before adding anything else to the mix. If you are using host headers, make sure the hostname is in your hosts file or DNS. It may be obvious, but you will also need to schedule 'cron.php' to run at a regular interval to keep the site running smoothly.

2) Once that is working, it's time to play with Varnish. I never realized how powerful this little monster is. The configuration options are vast, so keep it simple at first before venturing down that road. The section of the Varnish config that you need to be concerned with is the 'backend', which has a 'port' setting: the port your port 80 requests will be forwarded to. Check out this: http://drupal.org/node/1054886

Here is a snippet from our configuration

backend svapp01 {
  .host = "xxx.xxx.xxx.xxx";
  .port = "80";
}

backend svapp02 {
  .host = "xxx.xxx.xxx.xxx";
  .port = "80";
}

# Client addresses that are rejected outright
acl unwanted {
  "69.60.116.97";
  "69.60.116.197";
}

sub vcl_recv {
  if (client.ip ~ unwanted) {
    error 410;
  }

  # Route each virtual host to its backend (dots escaped so they match literally)
  if (req.http.host ~ "^(www\.)?staging\.uft\.org$") {
    set req.backend = svapp02;
  } elsif (req.http.host ~ "^openx\.staging\.uft\.org$") {
    set req.backend = svapp02;
  } elsif (req.http.host ~ "^files\.uft\.org$") {
    set req.backend = svapp02;
  } elsif (req.http.host ~ "^secure\.staging\.uft\.org$") {
    set req.backend = svapp01;
  } elsif (req.http.host ~ "^test\.uft\.org$") {
    set req.backend = svapp02;
  } else {
    error 404 "Unknown virtual host";
  }
}

I hope this helps.

If I might add, You might

vegardx's picture

If I might add,

You might want to unset some cookies, as Drupal seems to be a bit of a cookie monster, at least when it comes to modules. Otherwise you'd probably see very few cache hits, I'm afraid.

But that's the black magic with Varnish: you have to carefully examine which cookies can be unset and which you need to pass to the backend, and this can differ a lot depending on what kind of visitors you're seeing. To find out which cookies are being set, you can check both varnishlog (varnishlog -o RxUrl http://someurl.ext/, to limit results) and the HTTP headers of your request.

If you look at the Four Kitchens VCL for Pressflow, it gives a few good pointers on what you need to unset for it to work properly.
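As a rough illustration of the kind of cookie stripping described above (this is not the Four Kitchens VCL, just a minimal sketch in Varnish 2.1/3.0-era syntax), the fragment below assumes the only cookies getting in the way are Drupal's has_js and Google Analytics' __utm* cookies, and that the listed file extensions are static assets. Both are assumptions to check against your own varnishlog output:

```vcl
sub vcl_recv {
  # Static assets do not need cookies (extension list is an assumption).
  if (req.url ~ "\.(png|gif|jpe?g|ico|css|js)(\?.*)?$") {
    unset req.http.Cookie;
  }

  if (req.http.Cookie) {
    # Drop cookies that only matter to client-side JavaScript:
    # has_js and the Google Analytics __utm* family (assumed present).
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
    # Tidy up a separator left at the front of the header.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    # If nothing is left, remove the header entirely so Varnish's
    # default logic can look the request up in the cache.
    if (req.http.Cookie ~ "^\s*$") {
      unset req.http.Cookie;
    }
  }
}
```

Session cookies are deliberately left alone here, so logged-in users still reach the backend.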

--
Vegard

Thanks very much for this

mrwhizkid's picture

Thanks very much for this code. This has been helpful. It seems like the default VCL file doesn't handle headers very well so this may be exactly what I need.

But after testing a bit...

Should the .host value be the actual IP of the host website? For example, for your staging.uft.org site, would you be using the IP address of staging.uft.org as the backend?

The biggest problem I seem to have is that I have a couple of other sites hosted on this server that are not Pressflow; two other Drupal sites and a couple of WP sites.

When I pull up my pressflow site (after switching apache to listen on 8080), I can access it...www.pressflowsite.com:8080 but not so with the others... www.notpressflowsite.com:8080 comes up with an error.

Will a non-pressflow site work at all WITH varnish installed on the server?

Again, I really appreciate your help here. I am learning so much messing around in SSH.

1) the .host value should be

gchaix's picture

1) the .host value should be the IP of the backend web server.
2) yes, you absolutely can run non-Pressflow sites behind Varnish. I'm running several hundred Moodle sites and a few WordPress sites all behind the same Varnish cache. You need to be careful about handling session cookies, but it's eminently doable. Check your Apache configuration and make sure the vhosts are all listening on 8080.
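One possible shape for that session-cookie handling, as a sketch only in Varnish 2.1/3.0-era syntax. The cookie name patterns are assumptions about what Drupal 6, WordPress, and Moodle typically set, so check what your own applications actually send before relying on them:

```vcl
sub vcl_recv {
  # Assumed session cookie names: SESS<hash> (Drupal 6),
  # wordpress_logged_in_<hash> (WordPress), MoodleSession (Moodle).
  if (req.http.Cookie ~ "(SESS[a-f0-9]+|wordpress_logged_in_|MoodleSession)") {
    # Logged-in traffic always goes to the backend, uncached.
    return (pass);
  }

  # No session cookie found: drop whatever cookies remain so the
  # request can be looked up in the cache by the default logic.
  unset req.http.Cookie;
}
```

Dropping every non-session cookie is a blunt choice. If a site depends on other cookies for anonymous visitors, those would need their own exceptions.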

High performance
