Authenticated traffic scale options

btopro

I'm wondering if people have any recommendations on scaling authenticated traffic sites?

Context: We've got about 2,500 users, of whom 50-100 are on at any one time throughout the day. Right now their interactions are with largely static pages, so we've gotten away with display_cache / entitycache / APC / Advagg for cache bin optimization and can deliver pages pretty quickly.

We're going to be shifting to closer to 5k users over the next year, as well as increasing the things they do while they're with us, so there'll be more users on the box at the same time doing more things (like submitting content instead of just ingesting it).

We're using Organic Groups / techniques like it to rewrite parts of the content contextually to the user; sometimes this aligns with user role, other times with the group they are in. For example, if they are a past student, we rewrite the media to not display. If they are in Section 1 vs Section 2, they'll potentially see a page with the dates specific to their group injected into the content via tokens.

I'm wondering if anyone has suggestions for other projects that could help hit higher scale through better resource management. We could get into a memcache / Redis server to host the cache bins, but I'm wondering if there's anything beyond shifting bins out of local RAM and onto a remote box that could push things further.
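
For concreteness, this is roughly what shifting the cache bins to Redis looks like in settings.php with the 7.x redis module (host, port, and module path below are assumptions for our kind of setup, adjust as needed):

// settings.php: point Drupal's cache bins at Redis via the 7.x redis module.
$conf['redis_client_interface'] = 'PhpRedis';
$conf['redis_client_host'] = '127.0.0.1';
$conf['redis_client_port'] = 6379;
$conf['cache_backends'][] = 'sites/all/modules/redis/redis.autoload.inc';
$conf['cache_default_class'] = 'Redis_Cache';
// Keep cache_form in the database; form state needs durable storage.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';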

We've done testing and can hit 150-200 authed concurrent users before we start having problems at our current resources of 2 gigs of RAM on a decent Dell blade server running RHEL 6 on VMware.

Looked into..

Authcache https://www.drupal.org/project/authcache - seems out because it targets largely static content; much of ours will start to be user submitted and differ per role as well as organic group.

Render cache https://www.drupal.org/project/render_cache - still experimental, dev stalled for a few months

I'd be happy to hear any suggestions / recommendations, or that we're simply at "nope, start throwing resources at the problem", which is also possible; we can jump up to 64 gigs of RAM on a Redis cache if needed, I'd just rather tune heavily at low resources for financial reasons and then scale via physical resources once we've done all we can.

Running patched Apache 2.2; I know Nginx is much faster, but unless it's thrown in front of Apache we can't move off Apache for the time being (we need to interpret user accounts via a system that currently doesn't have Nginx support, sadly).

Comments

For faster page response, I

pdrake

For faster page response, I have used authcache in the past. It works very well with content that can be cached per-role (or per group of roles) - user submitted or otherwise. It may be a challenge for you, however, because I believe it only works based on roles, not group membership. You could likely extend it to support group membership as well.

Do you have static assets being served from a reverse proxy already (e.g. Varnish)?

Are you resource constrained on RAM, CPU or both?

Static assets are served

btopro

Static assets are (largely) served from an asset management system; it's not running behind a reverse proxy yet as it really hasn't needed to (and we'd need to mess with Pound or something to decode HTTPS). We're not really resource constrained; I just try to get as much as I can out of almost nothing for those using our platform who are resource constrained (aka most of education).

Authcache with Organic Groups

weekbeforenext

I have used Authcache with Organic Groups. In a custom module, I used hook_authcache_key_properties_alter to add Organic Group roles from the user to the cache key.

For the user-specific content, I configured Authcache settings for those blocks, panels and views, essentially poking holes in the cached page where user-specific content should display. This allows an Ajax call to fetch those pieces of the page.

I think Authcache is a totally doable solution.

Sounds promising; would you

btopro

Sounds promising; would you be willing to share an example of your hook_authcache_key_properties_alter on dropbucket or a gist?

weekbeforenext

Below is a snippet copied from a sandbox module I created called OG Role Access. This was my first attempt at getting Authcache to work and is currently in use, but there is definitely room for improvement.

/**
 * Implements hook_authcache_key_properties_alter().
 *
 * Adds the user's Organic Groups memberships and roles to the Authcache key
 * so users in the same groups, with the same roles, share cached pages.
 */
function og_role_access_authcache_key_properties_alter(&$properties) {
  global $user;

  // Defaults so the key components always exist.
  $properties['og_all_roles'] = array();
  $properties['og_all_groups'] = array();
  $properties['og_gid'] = array();
  $properties['og_group_roles'] = array();

  //$properties['uid'] = $user->uid;

  // All OG roles the user holds across every group (custom helper).
  $properties['og_all_roles'] = og_role_access_user_active_og_roles($user->uid);

  // All node groups the user belongs to.
  $groups = og_get_groups_by_user();
  if (!empty($groups['node'])) {
    $properties['og_all_groups'] = array_keys($groups['node']);
  }

  // If an OG context is active, add that group and the user's roles in it.
  if (isset($_SESSION['og_context'])) {
    $og_gid = check_plain($_SESSION['og_context']['gid']);
    $og_type = check_plain($_SESSION['og_context']['group_type']);
    if (og_is_group($og_type, $og_gid)) {
      $properties['og_gid'] = $og_gid;

      $og_user_roles = og_get_user_roles($og_type, $og_gid, $user->uid);
      $properties['og_group_roles'] = array_keys($og_user_roles);
    }
  }
}
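
For context, og_role_access_user_active_og_roles() is a helper in the same sandbox module and isn't shown above. A rough sketch of what it does (collecting the user's OG role names across all of their groups) would look something like this - an approximation, not the actual sandbox code:

/**
 * Returns the names of all OG roles a user holds across their groups.
 *
 * Approximate sketch of the helper used above, not the actual sandbox code.
 */
function og_role_access_user_active_og_roles($uid) {
  $role_names = array();
  $account = user_load($uid);
  foreach (og_get_groups_by_user($account) as $group_type => $gids) {
    foreach ($gids as $gid) {
      // og_get_user_roles() returns an array of rid => role name.
      $role_names += og_get_user_roles($group_type, $gid, $uid);
    }
  }
  return array_values(array_unique($role_names));
}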

wow I didn't know it was that

btopro

Wow, I didn't know it was as easy as setting a key that's unique to the user, which it matches on (I assume). Thanks for the heads up. I have a function that effectively sets a section / session context that works with OG to show blocks, set content outlines and whatnot; now I can just make this a property and it should ensure people with that key see the same things (because they largely should) :)
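
Something like this is what I'm picturing - the module and session key names here are placeholders, the point is just that whatever value drives the contextual rendering also becomes part of the Authcache key:

/**
 * Implements hook_authcache_key_properties_alter().
 *
 * Placeholder sketch: expose our existing section / session context as part
 * of the Authcache key so everyone sharing that context shares cached pages.
 */
function mymodule_authcache_key_properties_alter(&$properties) {
  if (isset($_SESSION['mymodule_section_context'])) {
    $properties['section_context'] = check_plain($_SESSION['mymodule_section_context']);
  }
}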

Our setup

mikeytown2

TLDR: I've packaged up the solution into a module: Asynchronous Prefetch Database Query Cache. Install and fix all issues found on the status report page.

Due to the nature of what we're displaying (a user dashboard), caching is impossible because we pull in real-time data from an internal .NET JSON API (REST). Given these crazy constraints of being unable to cache anything, I've had to make sure the one weak point of Drupal, the database, doesn't go down.

If you haven't done so, make sure you're using the latest version of MySQL (5.6 as of this post). InnoDB, when tuned properly, is good enough. We stopped using memcache due to the issues we were having with keeping it fault tolerant in a cluster; by dropping memcache we got more 9's in terms of uptime. The biggest issues with MySQL thus far are deadlocks and metadata locks in the database cache; this is why memcache helps so much: no more DB locks. By eliminating the cache locking issues with MySQL, you can use the database for caching and not have to worry about the database locking up ever again, and not have a complicated setup with a memcache cluster.

I still have some work to do, but on most entity pages (node, user, etc.) APDQC will prefetch the entity you're on from the DB cache and do other prefetching using various Drupal hooks to anticipate what will be needed and get it in the background before Drupal requests it; thus most cache_get calls are instant since APDQC already got the data. In order to do prefetching, async connections to MySQL are a requirement.

You're not going to see crazy speed improvements with the APDQC module, though; the core cache is hard to beat under a read-only load when you're going over the network. What you will see is a vast improvement when writes and clears hit the cache. When data is changing in the database, MySQL will issue locks to make sure we don't lose any data; memcache, on the other hand, doesn't have these constraints and thus performs a lot better when caches are being flushed and data is being written. By using the correct transaction isolation level, "INSERT ... ON DUPLICATE KEY UPDATE" instead of db_merge(), and async writes, APDQC allows the DB cache to not lock like before. The prefetching is something extra I've added; it gives a small increase in speed, but it's not the main point of the module. The main benefit is to allow you to scale Drupal using just MySQL.
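
As an example, the transaction isolation piece can be handled from settings.php; something along these lines (a sketch only - APDQC's status report checks are the real guide for what your setup needs):

// settings.php, after the main $databases definition: run READ-COMMITTED
// on Drupal's MySQL connection so cache writes take less aggressive locks.
$databases['default']['default']['init_commands'] = array(
  'isolation' => "SET SESSION tx_isolation='READ-COMMITTED'",
);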

I'm not the only one who thinks MySQL is all you'll need now and in the future. This session at the LA DrupalCon will be talking about this for D8: https://events.drupal.org/losangeles2015/sessions/redis-and-memcached-ar...