I'm posting this as a wiki so others can add notes or links. It's based on Neil's presentation, which we attended. I'd suggest that conversation / disagreement / further discussion about the recommendations here happen elsewhere, i.e. in the comments, etc. :-)
My notes are now up -Neil
Notes on Scalability
Neil Drumm
SF Drupal User's Group
July 14, 2008
OVERVIEW
- Solve the easy problems first.
- E.g. setting the right Drupal configuration, CSS, etc.
- More below
- It's cheaper than programmers
- Hard problems = a few hours to a few days to solve
Scalability is not just fixing Drupal issues
- Whole LAMP stack
- Also CSS, HTML, JavaScript problems (what it takes to render a page)
A Complex Process
- Find slowest part
  - (Whether it's the database or JavaScript)
- Fix
- Repeat
(Ba-dum tshhhh....)
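A minimal sketch of the find / fix / repeat loop above, assuming each part of the stack can be wrapped in a callable and timed. The stage names and the `time.sleep` stand-ins are hypothetical placeholders, not measurements from the talk.

```python
import time


def time_stage(fn):
    """Return how many seconds fn() takes to run."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def slowest_stage(stages):
    """stages maps a name to a callable; return (name, seconds) of the slowest."""
    timings = {name: time_stage(fn) for name, fn in stages.items()}
    name = max(timings, key=timings.get)
    return name, timings[name]


# Hypothetical stand-ins for real work in each layer.
stages = {
    "database": lambda: time.sleep(0.05),
    "php": lambda: time.sleep(0.01),
    "javascript": lambda: time.sleep(0.02),
}

name, seconds = slowest_stage(stages)
print("fix this first:", name)
```

After fixing the reported stage, run the measurement again; the slowest part is usually somewhere else the second time.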
Scaling PHP is always the same
- You usually start w/ one server
  - (or Amazon Web Services, or a VPS "virtual private server")
- Split DB server and web server
  - Drupal DB likes to have a lot of RAM
  - Apache - take off of DB server
- Add more web servers
  - Round robin DNS
  - Load balancer
- Eventually DB clustering
  - When? Scale example = Drupal.org has DB clustering
  - More on clustering below
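To illustrate the "add more web servers" step, here is a rough sketch of round-robin selection, the idea behind both round robin DNS and the simplest load balancers. The class name and the server names are made up for illustration; a real setup would do this in DNS or in a dedicated load balancer, not in application code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out web servers in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one web server")
        self._rotation = cycle(list(servers))

    def next_server(self):
        return next(self._rotation)


# Hypothetical server names.
balancer = RoundRobinBalancer(["web1", "web2", "web3"])
picks = [balancer.next_server() for _ in range(5)]
print(picks)
```

Plain rotation ignores how busy each server is, which is one reason sites move from round robin DNS to a load balancer.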
EASY FIXES
- Turn on Drupal caching
  - This makes anon page requests = 1 DB query
  - Otherwise a typical Drupal page request is 60+ DB queries
- "Minimum cache lifetime" setting - make it longer
- Enable block cache (Drupal 6)
- Select "Optimize CSS" & "Optimize JavaScript" settings (JavaScript in Drupal 6)
  - Merges all the various modules' CSS and JavaScript into one file each (one CSS, one JavaScript)
- Watchdog slows down sites; Drupal 6 allows swapping it out for another logging mechanism
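A small sketch of why page caching plus a longer minimum cache lifetime helps: anonymous requests are answered from a stored copy, and the expensive page build only runs again once the lifetime has passed. This is an illustration of the idea, not Drupal's actual cache code; `PageCache`, `expensive_render` and the fake clock are all hypothetical.

```python
import time


class PageCache:
    """Serve anonymous requests from a cache; rebuild only after min_lifetime."""

    def __init__(self, render, min_lifetime, clock=time.monotonic):
        self.render = render              # the expensive page build
        self.min_lifetime = min_lifetime  # seconds a cached page stays valid
        self.clock = clock
        self.pages = {}                   # path -> (time cached, html)

    def get(self, path, anonymous=True):
        if not anonymous:
            # Logged-in users always get a freshly built page in this sketch.
            return self.render(path)
        now = self.clock()
        cached = self.pages.get(path)
        if cached is not None and now - cached[0] < self.min_lifetime:
            return cached[1]
        html = self.render(path)
        self.pages[path] = (now, html)
        return html


render_calls = []


def expensive_render(path):
    # Stand-in for the 60+ queries of an uncached page build.
    render_calls.append(path)
    return "<html>" + path + "</html>"


fake_now = [0.0]  # controllable clock so the example is deterministic
cache = PageCache(expensive_render, min_lifetime=300, clock=lambda: fake_now[0])
cache.get("/node/1")   # built
cache.get("/node/1")   # served from cache
fake_now[0] = 301.0
cache.get("/node/1")   # lifetime passed, built again
print(len(render_calls))
```

The longer the lifetime, the fewer rebuilds, at the cost of anonymous visitors seeing slightly older pages.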
"Database clustering is not too fun"
- What you can do instead
  - Optimizing MySQL is key
  - MySQL default is configured for a laptop (i.e. not for a server)!
- MySQL Report (from hackmysql.com)
  - Extensive report, extensive documentation at hackmysql.com
- mysqlsla
  - Checks the slow query log file
  - Run "EXPLAIN" in front of your query (on the command line)
    - Returns query plan from MySQL
    - Look for:
      - Key column always filled in
      - Rows should be low
    - Dev.mysql.com explains this "EXPLAIN" table
- Devel module can show queries
  - Shows how long they take
  - Can show queries that take longer than a set time threshold
  - Shows how many times each query was called
  - These queries will highlight:
    - What is taking a long time
    - What is getting queried all the time (i.e. not optimized)
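The EXPLAIN check above can be tried out without a MySQL server. The sketch below swaps in SQLite (bundled with Python) and its `EXPLAIN QUERY PLAN`, which is a different command with a different output format from MySQL's EXPLAIN table, but it shows the same thing: whether a query uses a key or scans every row. The `node` table here is a simplified, hypothetical schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, uid INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO node (uid, title) VALUES (?, ?)",
    [(i % 50, "post %d" % i) for i in range(1000)],
)


def query_plan(sql):
    """Return SQLite's plan for sql as one string (last column is the detail text)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


sql = "SELECT title FROM node WHERE uid = 7"

plan_before = query_plan(sql)  # no index on uid yet: a full table scan
conn.execute("CREATE INDEX node_uid ON node (uid)")
plan_after = query_plan(sql)   # now the plan names the index

print("before:", plan_before)
print("after: ", plan_after)
```

In MySQL the equivalent signs are an empty key column and a large rows value before the index is added.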
Watch out for:
- Views
- Anything executing too many queries
  - E.g. Views calling other views
  - Views usually perform 7-8 queries each
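As a rough picture of what a query log like Devel's lets you spot, the sketch below takes a list of (query, milliseconds) pairs and pulls out the two things to look for: queries over a time threshold and queries run again and again. The function name, the log entries and the threshold are invented for illustration; this is not the Devel module's code or output format.

```python
from collections import Counter


def summarize_queries(log, slow_threshold_ms):
    """log is a list of (sql, milliseconds). Return (slow, repeated)."""
    slow = [(sql, ms) for sql, ms in log if ms > slow_threshold_ms]
    counts = Counter(sql for sql, _ in log)
    # Most frequently repeated queries first; queries run once are left out.
    repeated = [(sql, n) for sql, n in counts.most_common() if n > 1]
    return slow, repeated


# Hypothetical log for one page request.
log = [
    ("SELECT * FROM users WHERE uid = 1", 0.4),
    ("SELECT * FROM node WHERE nid = 5", 0.6),
    ("SELECT * FROM users WHERE uid = 1", 0.5),
    ("SELECT COUNT(*) FROM comments", 42.0),
    ("SELECT * FROM users WHERE uid = 1", 0.4),
]

slow, repeated = summarize_queries(log, slow_threshold_ms=5)
print("slow:", slow)
print("repeated:", repeated)
```

Slow queries are candidates for EXPLAIN and indexes; repeated ones are candidates for caching.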
Side conversation:
Stored procedures are not used in standard Drupal dev (e.g. not in core) because they are implemented differently in different database systems (e.g. MS SQL vs. MySQL vs. PostgreSQL).
PHP / APACHE FIXES
- Install op-code cache
  - This is a PHP extension
  - Provides a good Apache speed improvement
  - E.g. APC or eAccelerator
  - Note: Can cause site faults
  - But can be configured to automate fixes (i.e. restart Apache)
- Optimize calling external web services
  - Set up proxy to cache these external services
  - E.g. Squid
  - Installing Squid can add a lot of complexity
OPTIMIZING FRONT-END
- Test w/ Firebug
  - Net tab: shows all requests used to build page
  - YSlow (Yahoo add-on):
    - Letter grading of various services
    - Used for large installations
  - Your aim is to reduce HTTP requests
  - JavaScript profiler (in Firebug) - more JS, slower page load
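To make "reduce HTTP requests" concrete, here is a toy version of what a CSS aggregation setting does: several stylesheets are joined, in their original order, into one file, so the browser makes one request instead of several. The file names and contents are hypothetical, and real aggregators also handle things like relative URLs and media types that this sketch ignores.

```python
def aggregate(stylesheets):
    """Join (name, css) pairs, in the given order, into one stylesheet."""
    parts = ["/* %s */\n%s" % (name, css.strip()) for name, css in stylesheets]
    return "\n".join(parts) + "\n"


# Hypothetical module stylesheets; order matters because later rules win.
stylesheets = [
    ("system.css", "body { margin: 0; }\n"),
    ("node.css", ".node { padding: 1em; }\n"),
    ("views.css", ".view { clear: both; }\n"),
]

combined = aggregate(stylesheets)
requests_before = len(stylesheets)
requests_after = 1  # one combined file
print(requests_before, "requests ->", requests_after)
print(combined)
```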
MORE COMPLEX FIXES
- Using MemCache
  - Caches whole "objects" (user / node object)
  - I.e. one cached object = the results of multiple queries in one
  - Used in addition to op-code cache
  - Run it as close to web server as possible
  - Requires code patch
  - Hard to debug
- DB Clustering
  - Structure:
    - 1st db server - all rights (read and write)
    - 2nd "slave" db server - only read rights
  - (DB clustering ability is built-in to Drupal 6)
  - Example: Drupal.org
    - 2 load balancers, w/ Squid
    - 3 web servers
    - DB master - read/write
    - Slaves - search / read only
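The master / slave structure above comes down to a routing rule: writes go to the one server with read and write rights, reads can go to any read-only copy. A minimal sketch of that rule follows; the class, the server names and the "starts with SELECT" test are simplifying assumptions, not how Drupal or Drupal.org implement it.

```python
import random


class ReplicatedDB:
    """Pick a server for each query: master for writes, a slave for reads."""

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self.rng = rng or random.Random()

    def pick(self, sql):
        # Crude read detection: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return self.rng.choice(self.slaves)
        return self.master


# Hypothetical server names.
db = ReplicatedDB("master", ["slave1", "slave2"])
print(db.pick("INSERT INTO node (title) VALUES ('hi')"))
print(db.pick("SELECT title FROM node WHERE nid = 1"))
```

The sketch ignores replication lag: a read sent to a slave right after a write may not see that write yet, which is part of why clustering is "not too fun".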
OTHER NOTES
- VPS "virtual private server" recommendations
  - Advomatic uses Voxel.net (all Xen machines)
  - Groups.drupal.org/highperformance (node/229)
  - "Anything with Shack in the name is a bad idea"
- Tag1Consulting.com/drupal
  - Drupal performance checklist