What to do with slaves?

Posted by Crell on September 27, 2009 at 10:11pm

So this conversation hasn't happened yet, but it really really needs to. We have this nice and shiny new slave server support in Drupal 7, but we are not using it. That means we need to go through and flag which queries are "Slave safe", that is, are able to use a slave server if one is available without breaking stuff.

But wait! There's one trick involved. We also have a nifty feature where it's possible to temporarily disable a slave server for a single user for a few minutes. That is, after that user writes to the database (say, adding a new node) we can disable the slave server just for them, meaning that they will always get the most up to date data. Other users will get any changes as soon as the slave server catches up, which could be anywhere from a second to 5 minutes depending on the setup.

Of course, that then eliminates the number one case when you'd want to make queries non-slave-using, because they're dependent on up to the second data. So do we ever need to not use a slave server?

In earlier discussion I've gotten two possible answers here:

1) Keep the master server the default, and go through and flag slave-safe queries in core. We'd also need to publish guidelines for when to consider a query slave safe. Note that if most queries really are slave safe, then this could be a LOT of queries that get flagged for slave servers, a process that is, admittedly, not the prettiest code. The odds of contrib authors ever using slave servers is also then extremely low, since 99.99% of developers don't have a slave server on hand.

2) Make slave servers the default for select queries and for db_query(). Then we only need to mark those queries that STILL should not use a slave server, by some metric. Note that a potential problem here is that it is still possible, although not recommended, to pass INSERT, UPDATE, and DELETE queries through db_query(), as well as DDL queries. Those of course would all cause all sorts of ugly if run against a slave server. Also, queries that can't run against slave servers are going to be really really really hard for most contrib authors to figure out, since 99.99% of developers don't have a slave server on hand.

Either way, we need some guideline to determine what queries are slave-safe that we can publish in the database handbook section.

Anyone who has actually run a high-end database setup, please please please we want your input here. (If you haven't you can comment too, but I'm really looking for feedback from people with experience here, primarily.)

Comments

2. is not a problem

Posted by chx on September 27, 2009 at 11:09pm

If you run a slave which is not set to read only, sooner or later some dimwit will accidentally run a maintenance query against that from command line and then you can go ahead and rebuild the slave. So, feel free to pass in INSERT/UPDATE/DELETE/DDL, MySQL will happily throw an error. Case closed.

One issue with that

Posted by nnewton on September 30, 2009 at 4:29pm

An exception to that (running all slaves as read-only) is that some setups include two database servers configured as master-master, with a set of sites on them. Lets say I have 10 sites and that database setup. I could set db1 as the master for 5 of those sites and db2 as the master for the other 5. This isn't that uncommon for mass hosting.

Just thinking really quick

Posted by Jamie Holly on September 27, 2009 at 11:51pm

Just thinking really quick here, but a majority of select queries are slave safe. Ones that aren't:

last_insert_id()
found_rows()
sql_calc_found_rows()
row_count() (mysql 5+ - actually 5.0.1 is memory serves)
temporary tables (though replicated still can cause problems with lag)

(and I know I'm missing some but it's Sunday night)

Possible a regex could be put into play to figure out the none slave safe selects and flag those for master. Basically all selects are flagged for slave, unless there is a pattern match then they go to master.

Also we would have to take into consideration COUNT(). If a query writes to a master then does a SELECT COUNT() from the same table it would also have to go to the master. Perhaps keeping an array of table's that were written to and matching against that for any chance that a count is performed.

Something that might help developers in the handbook is creating a "how to" page for setting up replication, going through running two mysql instances on the same machine and how to get replication running.

I'm sure I'll think of more later, but wanted to throw these ideas out while they were fresh in my mind.

HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

No regex

Posted by Crell on September 28, 2009 at 12:23am

Regexing queries to detect master vs. slave is a no-go. Aside from the non-small performance cost of regexing over every query, while there are some "known unsafe" operations like you mention most slave-unsafe situations are contextual. We tried 2 years ago to come up with an automated way to detect slave-friendliness and couldn't come up with one. There's a thread about it somewhere that I think eventually got marked a duplicate of the DBTNG patch.

I was thinking more about

Posted by Jamie Holly on October 1, 2009 at 2:03pm

I was thinking more about this. How hard would it be to create a module, perhaps an extension of devel, that analyzes each query to determine if it's slave safe or not? I haven't dove into the new D7 stuff too much yet, but if queries are still saved in a global then perhaps a module that registers a shutdown function and parses $queries doing a regex. We would have to come up with a pattern to try and catch all/most items that would disqualify a query for a slave.

HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Possible?

Posted by Crell on October 1, 2009 at 6:40pm

They're not in a global anymore, but you can start any number of query loggers at any time to record all queries that get run. Getting the raw data is easy. A parser to analyze and suggest slavishness on the other hand would be hard. As I said, that's a highly contextual question so just seeing the query string on its own wouldn't really tell you anything.

For instance, are path alias lookups slave safe? The query is dead-simple and read-only, so it should be. But are we OK with the path alias taking a few seconds to propagate on really high-traffic servers? That's not a question that can be answered from just the query string itself.

I think situations like that

Posted by Jamie Holly on October 2, 2009 at 2:53pm

I think situations like that will be better evaluated on a per site basis. For things like path alias, I can think of scenarios where having the data always up to date is a must and others where it's no big deal if the data is stale.

I don't think we will be able to catch every single scenario out there for when stale data is acceptable. Drupal is too organic for that, given all the different types of sites that can be built upon it. There are certain givens, like any cache tables should never be given stale data since that would produce some ugly race conditions (true in sites looking into replication they should be using alternative caching methods, but I'm sure some won't). Another big one would be the variables table, since a lot of modules set timestamps in variables to mark the last run of certain items. Again any lag here could produce some nasty race conditions, but this is on a per module basis.

One thing that might help (again - not sure if this is there yet or not), but a way to specify which tables should always be queried from the master in settings.php. That way individual sites have more control over what is acceptable to be stale and what isn't.

HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Per site?

Posted by Crell on October 2, 2009 at 8:00pm

Well per-site becomes quite problematic, since then each site would have to hack core to flag different queries. That's very much not a good solution. :-) Even if it's not perfect for all sites we should have some sort of pattern in core that should work for all sites, even if not optimally.

Per-table doesn't help either. For built-queries we could theoretically detect which tables a given query uses, but we couldn't for static queries. Even then, by the time the query is executing it's already bound to a connection. It cannot be changed at that point, because by design you could in theory use different database drivers for different targets. (Weird but true. :-) )

not so fast

Posted by chx on September 28, 2009 at 1:14am

last_insert_id() we do not support this. We use the return value from an insert query.
found_rows() same, removed already.
sql_calc_found_rows() ?? Why is this not slave safe.
row_count() (mysql 5+ - actually 5.0.1 is memory serves) never heard of this.
temporary tables (though replicated still can cause problems with lag) Again, huh? Why are temp tables are a problem?

Temporary tables created on

Posted by Scott Reynolds on September 28, 2009 at 3:32pm

Temporary tables created on master do have to be replicated on the slave. Sure you can create temporary tables on the slave and then all queries referencing that table (including INSERT/UPDATE) must be run against the slave.

I haven't looked but from reading this thread, it seems that all INSERT/UPDATE's only goto Master? That will cause problems for developers INSERTing into Temporary tables created on Slaves.

Inserting?

Posted by Crell on September 28, 2009 at 10:02pm

I have to ask... Why are you inserting into a temp table? I've never used a temp table that was populated any way other than a single query on creation. Eg, "CREATE TEMPORARY TABLE SELECT ...".

I meant to replace

Posted by Jamie Holly on September 28, 2009 at 4:19pm

I meant to replace sql_calc_found_rows with row_count(). The reason for row_count is specified by MySQL:

Important

FOUND_ROWS() is not replicated reliably, and should not be used with databases that are to be replicated.

Starting in 5.1.23 it was properly replicated with row level replication.

Temporary tables are always problematic when written to a master. Considering the usual amount of data put into a temporary table you end up with increased replication lag, so the chances of going to a slave without that table being there is greatly increased. Also there are ways to prevent temporary tables from being replicated, using the --replicate-wild-ignore-table option (ie: naming all temp tables temptable_ then ignoring that pattern).

HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Possible standard

Posted by Crell on September 30, 2009 at 10:49pm

Right, so, since what we actually want here is a standard for what is a slave safe query, I will propose the following:

A query is "slave-safe" and can be set to run against a slave server if available if it is acceptable for its data to be stale for up to five minutes for users other than the user who instigated a change.

Thoughts? Additions? Subtractions? Multiplications?

I would say a little bit more

Posted by Jamie Holly on October 1, 2009 at 1:55pm

I would say a little bit more expansion on that:

Slave-safe queries will be performed against a slave server, if available. For a query to be marked as slave-safe the following conditions should be met:
- It is acceptable for the data to be stale for up to five minutes, except for the user who instigated the change.
- The query in no way alters data within the database.

Even though the second point is somewhat redundant, we need to keep in mind that there are contrib developers out there who most likely have no idea what replication really is.

HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

What about cache_ tables?

Posted by Jamie Holly on October 7, 2009 at 11:51pm

Just thinking more on this, I could see potential problems with cache tables. Sure if a site is at the point they need to run replication then they should be employing alternative caching methods like memcache, but that isn't always the case.

Now here is the scenario I am thinking of, and one that is rather common in contrib. I have a rather intensive query that runs. So that it doesn't have to run on each request, I decide to throw it into cache. If there is a large lag time in replication (say 5 minutes) then that query would run on each page until replication catches up.

I'm just wondering if there shouldn't be some mention to be on the look out for this type of scenario (maybe even a look at cache_get and adding a flag for it to pull from master - again problems arise getting people to utilize it properly). Maybe even just a mention of this type of scenario and making a best practice being to only run this update on cron so that you don't end up with this race condition.

I can just see people saying "my site is fine for 60 minutes then for 3-5 minutes it slows to a crawl and I don't know why".

HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Update?

Posted by sylv3st3r on November 19, 2010 at 10:12am

This talk have been death for a year now. Anyone have a better insight now with what should we do with this master/ slave? I've been looking at current drupal 7 release, beta3. And I still don't see any improvement on this. Can someone help with how should we do to enable a master - slave server?

Thanks

We're using Pressflow since

Posted by ronnbot on November 20, 2010 at 1:11am

We're using Pressflow since it has support for slave databases. But, the slave databases are severely under utilized since db_query_slave/db_query_range_slave/etc have to be used to send the query to the slaves. Unless you want to patch every module to use db_query_slave instead of db_query and so on, no more than 10% of query will ever go to the slaves.

So, what we did is patch the include file, 'database.mysqli.inc', specifically the function '_db_query()'.

The patch below basically overwrites the parameter $slave (if false) by checking the query if it is slave save by doing some logic:

it is a select
does not contain key terms such as 'last_insert_id' and etc.
querying tables we deemed slave safe.

The list of slave save tables is defined in the settings.php. For example:

$conf['slavable_tables'] = array(
  "blocks", "boxes", "cache", "contact",
  "content_group", "content_node", "date", 
  "imagecache", "languages", "locales", "menu", 
  "permission", "role", "system", "term_data", 
  "url_alias", "variables", "views", "vocabulary*"); 

Basically, we chose tables that are fairly 'static' in nature. Also, "*" is the wild card, so that blocks and blocks_roles tables are included.

Here is the patch:

--- database.mysqli.inc   2010-11-18 10:42:13.000000000 -0800
+++ database.mysqli.inc 2010-11-19 16:39:07.000000000 -0800
@@ -91,6 +91,18 @@ function db_connect($url) {
 function _db_query($query, $debug = 0, $slave = FALSE) {
   global $active_db, $active_slave_db, $queries, $user;
 
+  $slave = $slave ? $slave : _db_query_is_slave_safe($query); // send to to slave if 'safe'
+
+//  static $counter = 0, $a, $b;
+//  if (!$slave) {
+//    $a++;
+//    drupal_set_message("noslave $a: ".$query);
+//  } else {
+//    $b++;
+//    drupal_set_message("toslave $b: ".$query);
+//  }
+//  $counter++;
+  
   if (variable_get('dev_query', 0)) {
     list($usec, $sec) = explode(' ', microtime());
     $timer = (float)$usec + (float)$sec;
@@ -414,3 +426,90 @@ function db_column_exists($table, $colum
  * @} End of "ingroup database".
  */
 
+function _db_query_is_slave_safe($query) {
+  static $slave_safe_tables_pattern = false;
+
+
+  if ($slave_safe_tables_pattern == false) {
+    $slave_safe_tables = variable_get("slavable_tables", array());
+
+    if (count($slave_safe_tables)) {
+      $slave_safe_tables_pattern = '/^('. strtr(implode("|", $slave_safe_tables), array("*" => ".*")) .')$/';;
+//      drupal_set_message('$slave_safe_tables_pattern '.$slave_safe_tables_pattern);
+    }
+  }
+
+  if ($slave_safe_tables_pattern == false) { // still nothing, assume slave are not desirable
+    return false;
+  }
+
+  $query = strtolower($query);
+  if (strpos($query,"select ") !== false && !preg_match("/alter|insert|update|drop|delete|flush|lock|create|last_insert_id|found_rows|sql_calc_found_rows|row_count/", $query)) {
+    // a select and contains none of the unwanted key words
+
+    $tokenize_query = _db_query_tokenize($query);
+    $tables = array();
+    $table_count = 0;
+    for ($i = 0; $i < count($tokenize_query); $i++) {
+      $token = $tokenize_query[$i];
+      if (isset($tokenize_query[$i+1]) && in_array($token, array("join", "from"))) {
+        $table = $tokenize_query[$i+1];
+        $tables[$table] = $table;
+        $table_count++;
+      }
+    }
+
+    if ($table_count == 0) { // couldn't detect a table, must be a reeealy complicated query
+      return false;
+    }
+
+    foreach ($tables as $table) {
+      if (!preg_match($slave_safe_tables_pattern, $table)) { // table is not in the list
+        return false;
+      }
+    }
+
+    return true; // if we get here, it's all good to go to slave
+  }
+  return false;
+}
+
+function _db_query_tokenize($query, $cleanWhitespace = true) {
+
+  /
+   * Strip extra whitespace from the query
+   */
+  if($cleanWhitespace) {
+   $query = ltrim(preg_replace('/[\s]{2,}/',' ',$query));
+  }
+
+  /
+   * Regular expression based on SQL::Tokenizer's Tokenizer.pm by Igor Sutton Lopes
+   **/
+  $regex = '('; # begin group
+  $regex .= '(?:--|\#)[\ \t\S]*'; # inline comments
+  $regex .= '|(?:<>|<=>|>=|<=|==|=|!=|!|<<|>>|<|>|\|\||\||&&|&|-|\+|\*(?!\/)|\/(?!\*)|\%|~|\^|\?)'; # logical operators
+  $regex .= '|[\[\]\(\),;`]|\\'\\'(?!\\')|\"\"(?!\"")'; # empty single/double quotes
+  $regex .= '|".*?(?:(?:""){1,}"|(?<!["\\])"(?!")|\\"{2})|\'.*?(?:(?:\'\'){1,}\'|(?<![\'\\])\'(?!\')|\\\'{2})'; # quoted strings
+  $regex .= '|\/\*[\ \t\n\S]*?\*\/'; # c style comments
+  $regex .= '|(?:[\w:@]+(?:\.(?:\w+|\*)?)*)'; # words, placeholders, database.table.column strings
+  $regex .= '|[\t\ ]+';
+  $regex .= '|[.]'; #period
+  $regex .= '|[\s]'; #whitespace
+
+  $regex .= ')'; # end group
+
+  // get global match
+  preg_match_all( '/' . $regex . '/smx', $query, $result );
+
+  // return tokens
+  $arr = $result[0];
+  $new_arr = array();
+  foreach ($arr as $i => $val) {
+    $val = trim($val);
+    if ($val) {
+      $new_arr[] = $val;
+    }
+  }
+  return $new_arr;
+}

Hi ronn, thanks for the

Posted by sylv3st3r on November 23, 2010 at 11:30pm

Hi ronn, thanks for the reply. But I'm using D7, so I don't have access to pressflow. (Isn't this topic for D7? Or drupal in general?)
And D7 have a different query system. Trying to isolate which to hack is hard for me.
As the logic, well, we have the same idea, using the conf to determine which query goes to slave and master. I also implement cache system, so some queries will go there.
So yeah, once again, thanks. I'll try to hack this drupal core with the logic you posted, will report here. And still, every help will be appreciated~

UPDATE 1: Table block will always flagged as non-slave-safe, because of "lock" preg_match, so will every field consist the unwanted keywords. But I think it's still ok if we misses some query to master.

UPDATE 2: I've successfully hack the D7 database core to filter out the query. Which is slave safe and which is not. And change the default target to slave if it's slave-safe. Now the question is. Regular expression and string comparison is expensive, we add the query processing time by our auto-detection, so is it worth it? We are trading blows here.

Yes, I'm resurrecting an

Posted by firebird on September 20, 2012 at 11:03am

Yes, I'm resurrecting an ancient thread, but there seems to be a fairly easy solution to this now, check out the mysqlnd_ms PECL extension:

http://www.php.net/manual/en/book.mysqlnd-ms.php

The plugin automatically splits read and write queries to master and slave servers. Specifically, check out the configuration option master_on_write:

http://www.php.net/manual/en/mysqlnd-ms.plugin-ini-json.php

The setting may help with replication lag. If an application runs an INSERT the plugin will, by default, use the master to execute all following statements, including SELECT statements. This helps to avoid problems with reads from slaves which have not replicated the INSERT yet.

And if that's not good enough for you, you can implement your own selection rules with the "user" filter, by defining your own callback function to do your server selection for you. We have a project where we need exactly this, so we're going to try it out. I'll report back when we have something.

Interesting

Posted by Crell on September 25, 2012 at 4:16pm

Do keep us posted. I don't know if we can require mysqlnd for slave servers to work, though. Especially since we support non-MySQL servers, too.

Unfortunately, I'm no longer

Posted by sylv3st3r on September 27, 2012 at 8:35am

Unfortunately, I'm no longer developing the website that require the master-slave database, so I can't really test it now.

So currently you are our only test subject firebird. I might tried it with fresh Drupal installation, since I too want to try with noSQL database. I used mongoDB or mariaDB lately.

Keep us posted! And thanks.

Comments

2. is not a problem

One issue with that

Just thinking really quick

No regex

I was thinking more about

Possible?

I think situations like that

Per site?

not so fast

Temporary tables created on

Inserting?

I meant to replace

Possible standard

I would say a little bit more

What about cache_ tables?

Update?

We're using Pressflow since

Hi ronn, thanks for the

Yes, I'm resurrecting an

Interesting

Unfortunately, I'm no longer

Database

Group organizers

New groups

Group notifications

Hot content this week