Step-by-step: Setting up Varnish, Apache, APC and Solr Project Mercury Style
By popular request, here are step-by-step instructions for building the Project Mercury setup on a fresh EC2 image. We have both 32- and 64-bit versions of this AMI available (see http://groups.drupal.org/amazon-web-services-s3-ec2).
This is a wiki page which I will try to keep up to date as the project evolves. Please feel free to comment, add notes, and correct any mistakes you see.
Updated on 9/20/09 with numerous fixes and the new pressflow BZR location!
Updated on 10/8/09 to include adding Apache Solr to the install
Updated on 10/22/09 with configuration file download instructions
Updated on 11/14/09 with restructured instructions
Updated on 11/19/09 for Mercury 0.8-Beta
Project Mercury Step By Step
1) Basic startup
a) Launch a new Ubuntu image. I use alestic ubuntu-9.04-jaunty-base.
b) Add the bzr/bzrtools ppa as per https://launchpad.net/~bzr/+archive/ppa
c) Get updates of installed packages and install needed packages:
apt-get update; apt-get upgrade; apt-get install apache2-mpm-prefork apache2-utils apache2.2-common autoconf automake automake1.4 autotools-dev bzr defoma fontconfig-config libapache2-mod-php5 libapache2-mod-rpaf libapr1 libaprutil1 libdbd-mysql-perl libdbi-perl libfontconfig1 libfreetype6 libgd2-xpm libhtml-template-perl libjpeg62 libltdl7 libltdl7-dev libmysqlclient15off libnet-daemon-perl libplrpc-perl libpng12-0 libpq5 libssl-dev libt1-5 libtool libvarnish1 libxpm4 m4 mysql-common mysql-server-core-5.0 mysql-server-5.0 mysql-client-5.0 php-apc php5 php5-cli php5-curl php5-common php5-dev php5-gd php5-mysql php5-xmlrpc php-pear postfix shtool ssl-cert subversion ttf-dejavu ttf-dejavu-core ttf-dejavu-extra varnish zlib1g-dev
Note: For our Mercury AMIs we leave the mysql root password blank (so the owner can choose it), but you should pick one during the install. Also we include postfix in the above instructions – replace this if you have a preference for another Mail Transport Agent (e.g. exim, sendmail).
2) Optional: Download config files:
If you choose, we have copies of all the config files that you'll need to edit available on launchpad.net, already edited! Once downloaded, simply copy the files to the appropriate path instead of modifying the original. Make backups of all original config files before modifying and/or copying over them. Also note that copying the config files out of order (or all in one step rather than at the appropriate time in the instructions) will cause errors. You can skip this step if you wish to edit the config files by hand. We've only uploaded config files for Ubuntu Jaunty so far; we'll be adding additional distributions soon.
bzr branch lp:~gregcoit/projectmercury/trunk/
3) Configure apache and varnish to work together:
a) Change port 80 to port 8080 in /etc/apache2/ports.conf
b) Change port 80 to port 8080 in /etc/apache2/sites-available/default
c) Make a pressflow directory for varnish ( mkdir -p /var/lib/varnish/pressflow; chown varnish.varnish /var/lib/varnish/pressflow)
d) Edit /etc/default/varnish and:
1) Change INSTANCE=pressflow
2) Change port 6081 to port 80
3) Make sure "Alternative 2, Configuration with VCL" is set
4) Change the 1G at the end of the DAEMON_OPTS line to the amount of RAM you want Varnish to take. Too much and your server crashes under a heavy load, too little and Varnish isn't really helping. Start with 25% of the server's RAM and go from there (an Amazon Web Services small instance has 2G of RAM, so we set ours to 512MB). Never set this number higher than the amount of RAM you have.
e) To allow Drupal's .htaccess file to function, edit /etc/apache2/sites-available/default and change "AllowOverride None" to "AllowOverride All" in the Directory /var/www/ block.
f) Enable mod-rewrite, restart apache and restart varnish (a2enmod rewrite; /etc/init.d/apache2 restart; /etc/init.d/varnish restart)
g) Load the public DNS of your instance. You should get the basic "It works!" page. This is coming via Varnish on port 80, pulled from Apache on the backend!
h) (Extra credit) install the Live HTTP headers firefox plugin and look for headers like
Via: 1.1 varnish
X-Varnish: 1429427693
Then try loading it with the port 8080 attached (e.g. http://public_AWS_dns:8080). Note the differences.
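The header check from step h) can also be done from a shell instead of a browser plugin. The following is a minimal sketch, not part of the Mercury setup itself: served_by_varnish is a hypothetical helper name, it looks only for the X-Varnish header shown above, and the live usage assumes curl is installed (it is not in the package list from step 1).

```shell
#!/bin/sh
# Hypothetical helper: reads HTTP response headers on stdin and succeeds
# if an X-Varnish header is present (case-insensitive match).
served_by_varnish() {
    grep -qi '^X-Varnish:'
}

# Demo on canned headers like the ones shown above.
if printf 'HTTP/1.1 200 OK\r\nVia: 1.1 varnish\r\nX-Varnish: 1429427693\r\n' | served_by_varnish; then
    echo "sample response came through Varnish"
fi

# Live usage (assumes curl is installed; replace public_AWS_dns with your hostname):
#   curl -sI http://public_AWS_dns/      | served_by_varnish && echo "port 80: Varnish"
#   curl -sI http://public_AWS_dns:8080/ | served_by_varnish || echo "port 8080: Apache directly"
```

Port 80 should report Varnish, while port 8080 talks to Apache directly and should not carry the header.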
To do step 3 using the config files we provided:
mv -i trunk/distro/jaunty/conf/ports.conf /etc/apache2/ports.conf
mv -i trunk/distro/jaunty/conf/default /etc/apache2/sites-available/default
mkdir -p /var/lib/varnish/pressflow
chown -R varnish.varnish /var/lib/varnish
mv -i trunk/distro/jaunty/conf/varnish.1 /etc/default/varnish
apache2ctl restart
a2enmod rewrite
/etc/init.d/apache2 restart
/etc/init.d/varnish restart
4) Set up APC:
a) Edit /etc/php5/conf.d/apc.ini and make sure it has the following:
extension=apc.so
apc.shm_size=128
apc.include_once_override = 1
b) If you're setting up a stable (rather than a development) server, consider adding the apc.stat = 0 option as well. This prevents APC from re-checking cached files for changes, which improves performance. However, if you're changing files frequently (e.g. the site is still under development), it's not a good idea. For more information, see this 2bits article.
c) Increase the max memory available to php by changing memory_limit = 16M to memory_limit = 64M in /etc/php5/apache2/php.ini.
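After editing by hand, it can help to confirm that the settings from steps a) to c) are actually present. This is a rough sketch under stated assumptions: ini_has_keys is a hypothetical helper (not part of PHP or APC), it only checks that each key is set on an uncommented line, and it does not validate the values.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE sets every key named after it.
# Whitespace around "=" is ignored; commented-out lines (";") do not count.
ini_has_keys() {
    file=$1; shift
    for key in "$@"; do
        # Escape dots so "apc.stat" does not match "apcXstat".
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file"; then
            echo "missing: $key" >&2
            return 1
        fi
    done
}

# Rehearse on a scratch file holding the settings from step a).
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
extension=apc.so
apc.shm_size=128
apc.include_once_override = 1
EOF
ini_has_keys "$scratch" extension apc.shm_size apc.include_once_override && echo "apc.ini looks complete"

# Live usage:
#   ini_has_keys /etc/php5/conf.d/apc.ini extension apc.shm_size apc.include_once_override
#   ini_has_keys /etc/php5/apache2/php.ini memory_limit
```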
To do step 4 using the config files we provided:
mv -i trunk/distro/jaunty/conf/apc.ini /etc/php5/conf.d/apc.ini
mv -i trunk/distro/jaunty/conf/php.ini /etc/php5/apache2/php.ini
5) Configure Mysql:
a) Create a pressflow database
b) Grant access to a non-root user
c) Optional: Change the default storage engine from MyISAM to InnoDB and restart Mysql (add default-storage-engine=innodb to /etc/mysql/my.cnf)
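If you are editing by hand, the optional step c) amounts to adding one line to /etc/mysql/my.cnf. Here is a sketch of one way to do that with awk, rehearsed on a scratch file. It assumes the setting belongs directly under the [mysqld] section header, and add_innodb_default is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with "default-storage-engine=innodb" added
# directly after the [mysqld] section header. The input file is not modified.
add_innodb_default() {
    awk '{ print } /^\[mysqld\] *$/ { print "default-storage-engine=innodb" }' "$1"
}

# Rehearse on a scratch file with a simplified my.cnf layout.
scratch=$(mktemp)
printf '[client]\nport = 3306\n[mysqld]\nuser = mysql\n' > "$scratch"
add_innodb_default "$scratch"

# Live usage, as root (back up first, then restart MySQL):
#   cp -p /etc/mysql/my.cnf /etc/mysql/my.cnf.orig
#   add_innodb_default /etc/mysql/my.cnf.orig > /etc/mysql/my.cnf
#   /etc/init.d/mysql restart
```

The database and user from steps a) and b) are created with the following commands: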
mysql -u root -p
mysql> create database pressflow;
mysql> grant all on pressflow.* to user@localhost identified by 'password';
mysql> flush privileges;
mysql> \q
#if you wish to change the default-storage-engine:
mv -i trunk/distro/jaunty/conf/my.cnf /etc/mysql/my.cnf
/etc/init.d/mysql restart
6) Download and configure Pressflow:
We're going to replace the "It works!" /var/www directory with a BZR checkout of pressflow (from its new home on Launchpad):
a) Delete the /var/www directory (rm -r /var/www)
b) Download pressflow into /var/www (bzr branch lp:pressflow /var/www)
c) Make files dir for Pressflow and set ownership and permissions (mkdir /var/www/sites/default/files; chown -R root:www-data /var/www/sites/default/; chmod -R 775 /var/www/sites/default/)
To recap step 6:
rm -r /var/www
bzr branch lp:pressflow /var/www
mkdir /var/www/sites/default/files
chown -R root:www-data /var/www/sites/default/
chmod -R 775 /var/www/sites/default/
7) Configure Varnish to work with Pressflow:
Edit /etc/varnish/default.vcl and add this, replacing the included "backend default" declaration:
backend default {
.host = "127.0.0.1";
.port = "8080";
.connect_timeout = 600s;
.first_byte_timeout = 600s;
.between_bytes_timeout = 600s;
}
sub vcl_recv {
// Remove has_js and Google Analytics cookies.
set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
// Remove a ";" prefix, if present.
set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
// Remove empty cookies.
if (req.http.Cookie ~ "^\s*$") {
unset req.http.Cookie;
}
// Cache all requests by default, overriding the
// standard Varnish behavior.
// if (req.request == "GET" || req.request == "HEAD") {
// return (lookup);
// }
}
sub vcl_hash {
if (req.http.Cookie) {
set req.hash += req.http.Cookie;
}
}
Then restart Varnish (/etc/init.d/varnish restart).
This gives you a basic configuration in which Varnish and Drupal work together to serve cached pages.
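To see what the cookie handling in vcl_recv does to a request, you can try the same two substitutions outside Varnish. This is only an approximation: it emulates the regsuball and regsub calls with sed -E (with [[:space:]] standing in for \s), and strip_cookies is a hypothetical helper name. The cookie values are made up.

```shell
#!/bin/sh
# Emulates the two substitutions from vcl_recv with sed -E, for experimenting
# with cookie strings outside Varnish. strip_cookies is a hypothetical name.
strip_cookies() {
    printf '%s\n' "$1" | sed -E \
        -e 's/(^|;[[:space:]]*)(__[a-z]+|has_js)=[^;]*//g' \
        -e 's/^;[[:space:]]*//'
}

# A session cookie survives; has_js and the Google Analytics cookie do not.
strip_cookies 'has_js=1; __utma=123.456; SESSabc=xyz'

# Only throwaway cookies: the result is an empty line, which is the case
# where vcl_recv unsets the Cookie header entirely.
strip_cookies 'has_js=1; __utmz=789'
```

Requests whose cookies reduce to nothing end up with no Cookie header at all, while a session cookie is kept and added to the hash in vcl_hash.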
To do step 7 using the config files we provided:
mv -i trunk/distro/jaunty/conf/default.vcl /etc/varnish/default.vcl
/etc/init.d/varnish restart
8) Install and configure Cacherouter:
a) Download Cacherouter from http://drupal.org/project/cacherouter, untar and move to Pressflow's module directory
b) Add the following to the bottom of /var/www/sites/default/settings.php
# Cacherouter: use APC for all local caching
$conf['cache_inc'] = './sites/all/modules/cacherouter/cacherouter.inc';
$conf['cacherouter'] = array(
'default' => array(
'engine' => 'apc',
'shared' => FALSE,
'prefix' => '',
'static' => FALSE,
'fast_cache' => TRUE,
),
);
While we are at it, let's set the reverse-proxy settings:
$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('127.0.0.1');
To do step 8 using the config file we provided:
wget http://ftp.drupal.org/files/projects/cacherouter-6.x-1.0-rc1.tar.gz
tar xvzf cacherouter-6.x-1.0-rc1.tar.gz
mkdir -p /var/www/sites/all/modules/
mv -i cacherouter /var/www/sites/all/modules/
mv -i trunk/distro/jaunty/conf/settings.php /var/www/sites/default/settings.php
9) Install and configure Apache Solr
a) Install Tomcat6, the Java servlet container for Solr (apt-get install tomcat6)
b) Change #TOMCAT6_SECURITY=yes to TOMCAT6_SECURITY=no in /etc/default/tomcat6 (note that we also uncommented this line)
c) Download and install Apache Solr. We recommend using the nightly builds as Solr development is still pretty rapid
wget http://people.apache.org/builds/lucene/solr/nightly/solr-`date +%Y-%m-%d`.tgz
tar xvzf solr-`date +%Y-%m-%d`.tgz
mv apache-solr-1.5-dev/example/solr /var/
mv apache-solr-1.5-dev/dist/apache-solr-1.5-dev.war /var/solr/solr.war
d) Insert the following text into /etc/tomcat6/Catalina/localhost/solr.xml (this tells Tomcat where to find Solr):
<Context docBase="/var/solr/solr.war" debug="0" privileged="true" allowLinking="true" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/var/solr" override="true" />
</Context>
e) Replace all instances of 8080 with 8180 in /etc/tomcat6/server.xml (since 8080 is being used by apache)
f) Download the Apache Solr module from http://drupal.org/project/apachesolr, untar and move to Pressflow's module directory
wget http://ftp.drupal.org/files/projects/apachesolr-6.x-1.0-rc3.tar.gz
tar xvzf apachesolr-6.x-1.0-rc3.tar.gz
mv -i apachesolr /var/www/sites/all/modules/
g) Download php files for the Apache Solr module:
svn checkout -r6 http://solr-php-client.googlecode.com/svn/trunk/ /var/www/sites/all/modules/apachesolr/SolrPhpClient
mv -i /var/www/sites/all/modules/apachesolr/schema.xml /var/solr/conf/
mv -i /var/www/sites/all/modules/apachesolr/solrconfig.xml /var/solr/conf/
h) Set permissions for Apache Solr dir (chown -R tomcat6:root /var/solr/)
i) Restart tomcat (/etc/init.d/tomcat6 restart)
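The port change in step e) is a plain search and replace, so it can be scripted. Below is a sketch that rehearses it on a scratch file first. replace_port is a hypothetical helper, and the Connector line is a simplified stand-in for what server.xml contains, not a copy of the real file.

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of one port number with
# another in a config file, keeping the original as FILE.orig.
# Usage: replace_port FILE OLD NEW
replace_port() {
    file=$1; old=$2; new=$3
    cp -p "$file" "$file.orig"
    sed "s/$old/$new/g" "$file.orig" > "$file"
}

# Rehearse on a scratch file with a simplified connector line.
scratch=$(mktemp)
printf '<Connector port="8080" protocol="HTTP/1.1" redirectPort="8443" />\n' > "$scratch"
replace_port "$scratch" 8080 8180
cat "$scratch"

# Live usage, as root, before restarting Tomcat:
#   replace_port /etc/tomcat6/server.xml 8080 8180
```

Keeping the .orig copy makes it easy to diff the result against the untouched file.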
To do step 9 using the config files we provided:
apt-get install tomcat6
mv -i trunk/distro/jaunty/conf/tomcat6 /etc/default/tomcat6
wget http://people.apache.org/builds/lucene/solr/nightly/solr-`date +%Y-%m-%d`.tgz
tar xvzf solr-`date +%Y-%m-%d`.tgz
mv -i apache-solr-1.5-dev/example/solr /var/
mv -i apache-solr-1.5-dev/dist/apache-solr-1.5-dev.war /var/solr/solr.war
mv -i trunk/distro/jaunty/conf/solr.xml /etc/tomcat6/Catalina/localhost/solr.xml
mv -i trunk/distro/jaunty/conf/server.xml /etc/tomcat6/server.xml
wget http://ftp.drupal.org/files/projects/apachesolr-6.x-1.0-rc3.tar.gz
tar xvzf apachesolr-6.x-1.0-rc3.tar.gz
mv -i apachesolr /var/www/sites/all/modules/
svn checkout -r6 http://solr-php-client.googlecode.com/svn/trunk/ /var/www/sites/all/modules/apachesolr/SolrPhpClient
mv -i /var/www/sites/all/modules/apachesolr/schema.xml /var/solr/conf/
mv -i /var/www/sites/all/modules/apachesolr/solrconfig.xml /var/solr/conf/
chown -R tomcat6:root /var/solr/
/etc/init.d/tomcat6 restart
10) Optional: Download ec2-metadata and configure Postfix:
a) We recommend installing the very handy ec2-metadata shellscript. It makes finding out info about your instance a breeze.
wget -q http://s3.amazonaws.com/ec2metadata/ec2-metadata
mv -i ec2-metadata /usr/local/bin/
chmod +x /usr/local/bin/ec2-metadata
/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //' > /etc/mailname
b) If you installed Postfix, configure it using the following:
postconf -e "myhostname = `/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //'`"
postconf -e "mydomain = `/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //'`"
postconf -e "mydestination = `/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //'`, localhost"
c) Restart Postfix (/etc/init.d/postfix restart)
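The commands above query the metadata script four times for the same value. As an alternative sketch, the hostname can be captured once and reused. public_hostname is a hypothetical helper name, and the sample hostname below is made up for the demo.

```shell
#!/bin/sh
# Hypothetical helper: strip the "public-hostname: " label from the output of
# `ec2-metadata -p` (read on stdin), leaving just the hostname.
public_hostname() {
    sed 's/^public-hostname: //'
}

# Demo on canned output; the hostname below is made up.
sample='public-hostname: ec2-203-0-113-25.compute-1.amazonaws.com'
host=$(printf '%s\n' "$sample" | public_hostname)
echo "$host"

# Live usage on the instance, as root, querying the metadata script once:
#   host=$(/usr/local/bin/ec2-metadata -p | public_hostname)
#   echo "$host" > /etc/mailname
#   postconf -e "myhostname = $host"
#   postconf -e "mydomain = $host"
#   postconf -e "mydestination = $host, localhost"
```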
To recap step 10:
wget -q http://s3.amazonaws.com/ec2metadata/ec2-metadata
mv -i ec2-metadata /usr/local/bin/
chmod +x /usr/local/bin/ec2-metadata
/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //' > /etc/mailname
postconf -e "myhostname = `/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //'`"
postconf -e "mydomain = `/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //'`"
postconf -e "mydestination = `/usr/local/bin/ec2-metadata -p | sed 's/public-hostname: //'`, localhost"
/etc/init.d/postfix restart
11) Install Pressflow:
a) Connect to your AMI using a web browser and install Pressflow
b) Go to Administer > Modules and enable Apache Solr framework, Apache Solr search, Search and CacheRouter via web interface
c) Go to Site configuration > Apache Solr and set:
Solr Port: 8180
Number of items to index per cron run: 50
Make Apache Solr Search the default: Enabled
Enable spellchecker and suggestions: checked
d) Go to Site configuration > Search Settings and set:
Number of items to index per cron run: 50
e) Go to Site configuration > Performance and set:
Caching mode: Aggressive
Minimum cache lifetime: 10 minutes
Page cache maximum age: 1 hour
Page compression: Enabled
Block cache: Enabled
Optimize CSS files: Enabled
Optimize JavaScript files: Enabled
f) Go to User Management > Permissions and set:
search content: anonymous user and authenticated user
use advanced search: anonymous user and authenticated user
g) Remove write access to the sites/default directory to protect settings.php (chmod 755 /var/www/sites/default)
12) Setup the /mnt directory for storage, enable cron and create a boot script
Since root storage on AMIs is limited, we recommend using the /mnt directory for storage of Mysql files and the Varnish cache. We will soon have optional instructions for using Amazon's Elastic Block Storage (EBS) for storage.
a) Change the Mysql datadir from /var/lib/mysql to /mnt/mysql in /etc/mysql/my.cnf.
b) Change the Mysql tmpdir from /tmp to /mnt/tmp in the same file.
c) Change /var/lib/varnish/$INSTANCE/varnish_storage.bin to /mnt/varnish/$INSTANCE/varnish_storage.bin in /etc/default/varnish
d) Make the temp dir for Mysql (mkdir /mnt/tmp && chmod 777 /mnt/tmp)
e) Create a file (/etc/cron.d/drupal) with the following lines (this automates updating the Apache Solr database and other Drupal maintenance):
# m h dom mon dow user command
0 * * * * root /usr/bin/wget -O - -q -t 1 http://localhost/cron.php
f) Set up a one-time boot script by adding the following to /etc/rc.local (keeping "exit 0" at the end of the file):
# Mercury init script; only runs once at first boot.
# Produces /etc/mercury/incep with start time and a log of all bootstrap actions in /etc/mercury/bootlog
/etc/mercury/init.sh
g) Create the directory for mercury (mkdir /etc/mercury), add the following to /etc/mercury/init.sh and make it executable (chmod +x /etc/mercury/init.sh):
#!/bin/bash
if [ -e /etc/mercury/incep ]; then
exit 0
fi
exec &> /etc/mercury/bootlog
cd /var/www/; bzr merge --force
# Move mysql and varnish to /mnt
# TODO support for EBS and RDS
/etc/init.d/mysql stop
/etc/init.d/varnish stop
mkdir -p /mnt/mysql/tmp
chown mysql:mysql /mnt/mysql/
chmod 777 /mnt/mysql/tmp
mkdir /mnt/varnish
mv /var/log/mysql /mnt/mysql/log
mv /var/lib/mysql /mnt/mysql/lib
mv /var/lib/varnish /mnt/varnish/lib
sed --in-place=.bak 's*/tmp*/mnt/mysql/tmp*' /etc/mysql/my.cnf
ln -s /mnt/mysql/log /var/log/mysql
ln -s /mnt/mysql/lib /var/lib/mysql
ln -s /mnt/varnish/lib /var/lib/varnish
/etc/init.d/mysql start
/etc/init.d/varnish start
# Update packages
apt-get update
apt-get -y upgrade
# Config Memory
#uncomment the following if you would like to use our script to automatically
#configure apc, php, tomcat and varnish based on the RAM available on your system
#/etc/mercury/config_mem.sh
# Mark incep date
echo `date` > /etc/mercury/incep
To do step 12 using the config files we provided (only do one of the my.cnf commands!):
mv -i trunk/distro/jaunty/conf/drupal /etc/cron.d/
mv -i trunk/distro/jaunty/conf/rc.local /etc/rc.local
mkdir /etc/mercury
mv -i trunk/vps/aws/init.sh /etc/mercury/init.sh
chmod +x /etc/mercury/init.sh
mv -i trunk/vps/config_memory.sh /etc/mercury/config_mem.sh
chmod +x /etc/mercury/config_mem.sh
13) Repackage Your AMI
Here's my basic AMI packaging script based on Eric Hammond's work. In order to make this run, you need to put your certs in /tmp. See Eric's great blog post for more details:
#!/usr/bin/php -q
<?php
//This is largely based on http://alestic.com/2009/06/ec2-ami-bundle
//Helpful script for bundling AMIs
//Accepts command line args for version-number and "public". If public, it will remove potentially sensitive data.
//Readline input function.
function read() {
$fp=fopen("/dev/stdin", "r");
$input=fgets($fp, 255);
fclose($fp);
return trim($input);
}
echo "Make sure to chmod +x /etc/init.d/ec2-ssh-host-key-gen Continue? (y/n)\n";
$confirm = read();
if (strtolower($confirm) != 'y') {
die("User cancelled script\n");
}
if (!isset($argv[1]) || $argv[1] == '') {
echo "Specify a version number! (e.g. 0.1)\n";
exit(1);
}
$bucket = 'chapter3-storage'; // update w/your bucket
$prefix = 'drupal-pressflow-mercury-jaunty32-'. $argv[1]; // change to start a new line
$exclude = '/mnt,/tmp,/root/.ssh';
$AWS_USER_ID = ''; // your user id (e.g. 5555-5555-5555)
$AWS_ACCESS_KEY_ID = ''; // your access key
$AWS_SECRET_ACCESS_KEY = ''; // secret key
if (isset($argv[2]) && $argv[2] == 'public') {
echo "Marking this as a public release!\n\n";
$exclude = '/mnt,/tmp,/root/.ssh'; // excluding my local user account
`sudo rm -f /root/.*hist*`;
`sudo rm -f /var/log/*.gz`;
$logs = explode("\n", trim(shell_exec('sudo find /var/log -name mysql -prune -o -type f -print')));
foreach($logs as $log) {
`sudo cp /dev/null $log`;
}
}
echo `rm /etc/mercury/incep`; // clear incep date
echo `rm -rf /var/lib/varnish/*`; // clear old varnish configs
echo "Bundling $prefix\n\n";
if (trim(`uname -m`) == 'x86_64') {
$arch = 'x86_64';
}
else {
$arch = 'i386';
}
passthru("
sudo -E ec2-bundle-vol \
-r $arch \
-d /mnt \
-p $prefix \
-u $AWS_USER_ID \
-k /tmp/ec2-keys/pk-*.pem \
-c /tmp/ec2-keys/cert-*.pem \
-s 10240 \
-e $exclude");
passthru("
ec2-upload-bundle \
-b $bucket \
-m /mnt/$prefix.manifest.xml \
-a $AWS_ACCESS_KEY_ID \
-s $AWS_SECRET_ACCESS_KEY");
echo "You can now register this AMI at\n";
echo "$bucket/$prefix.manifest.xml\n";
echo "Now chmod -x /etc/init.d/ec2-ssh-host-key-gen.\n";
?>
