Introducing "Materialized Views"

There have been quite a few posts to the Fields in Core group about the decision to put all fields in individual tables, despite the apparent scalability penalty of removing the per-content-type (per-"bundle" in the Drupal 7 Field API jargon) tables. The reason is that we have something better to replace it with, and we've decided to simplify the Field API code by removing now-unnecessary code related to per-content-type tables.

To gain an idea of the scope of simplification, the DB and CRUD components from CCK, which we trivially ported to Drupal 7, initially totaled 1439 lines. The working copies of DB and CRUD components for Field API now stand at 697 lines, and we haven't even stripped all of the per-content-type code yet. The Drupal 7 Field API will likely gain some code before going into core, but I'm not expecting anything close to 742 more lines. The Materialized View API currently stands at under 800 lines and provides superior capability to the existing per-content-type tables.

Despite being approximately line-neutral, our current code now allows dynamic consolidation and indexing of data from all over Drupal into these new "materialized" views. First off, these materialized views aren't database VIEW objects or related to Earl's Views module. They do not require triggers. They do not depend on massive and slow INSERT INTO ... SELECT operations. They do not give you performance at the cost of stale data. They store their data in standard database tables. And they can build the per-content-type tables several users are afraid of losing, in addition to doing much more.

The easiest way to explain the Materialized View API is to go through a quick example.

Modules define materialized views using hook_materialized_view(), which returns an array of MaterializedView objects. Modules instantiate these objects by calling $mview = new MaterializedView('my_materialized_view');, where "my_materialized_view" is the name of the table the module would like to store its MV in.
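
For instance, a module implementing the hook might look like this (a minimal sketch; the module name "mymodule" is hypothetical):

  function mymodule_materialized_view() {
    // "my_materialized_view" names the table the MV will be stored in.
    $mview = new MaterializedView('my_materialized_view');
    // Filters and sorts are attached here; see the walkthrough below.
    return array($mview);
  }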

You can embed several types of operations into materialized views:

  • Static filters: These apply hard-and-fast rules to the objects appearing in your MV. If objects pass these filters, they get indexed into the MV table. If they fail any one filter, they will not be indexed into the MV table.
  • Dynamic filters: These filters help you embed data in MV table columns and properly index it for fast application of WHERE criteria later.
  • Sorts: Sort columns are included in MV tables for potential indexing to accelerate sorting performance.

To continue the example with $mview, let's add a static filter for published status:

  $mview->addStaticFilter(new MVNodeStatusColumn(), new MVEqualityOperator(1));

This adds a static filter using the MVNodeStatusColumn data source and compares using MVEqualityOperator against the value "1". This filter restricts the contents of the MV table to published nodes.

At this point, the my_materialized_view table contains only the two default MV table columns: entity and entity_id, which identify the objects in the materialized view. The "entity_id" in this case is the node ID because this view will only contain node "entity" objects. Only by adding dynamic filters or sorts to an MV object will we get additional columns.

Let's say, at query time, we'd like to be able to filter by taxonomy term:

  $mview->addDynamicFilter(new MVNodeTermColumn(), new MVEqualityOperator());

This will add a term ID column to the materialized view. We now have a table with three columns: entity, entity_id, and tid. This table will index published nodes. Nodes with multiple associated terms will get a number of rows corresponding to the number of associated terms; this allows the database to fully index the terms and nodes. This table will have an index on tid.

Now, let's put sticky nodes on top:

  $mview->addSort(new MVNodeStickyColumn(), MV_SORT_DESCENDING);

We now have a table with columns: entity, entity_id, tid, and sticky. The index is now (tid, sticky).

As a secondary sort, let's use "created":

  $mview->addSort(new MVNodeCreatedColumn(), MV_SORT_DESCENDING);

We now have a table with columns: entity, entity_id, tid, sticky, and created. The index is now (tid, sticky, created). We've consolidated and indexed term and node data in a way we couldn't have before without significant manual work.
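
Assembled into the hook from earlier, the whole walkthrough reads as follows (the module name is hypothetical; the calls are exactly the ones shown above):

  function mymodule_materialized_view() {
    $mview = new MaterializedView('my_materialized_view');
    // Static filter: only published nodes enter the MV.
    $mview->addStaticFilter(new MVNodeStatusColumn(), new MVEqualityOperator(1));
    // Dynamic filter: adds an indexed tid column for WHERE criteria.
    $mview->addDynamicFilter(new MVNodeTermColumn(), new MVEqualityOperator());
    // Sorts: sticky nodes first, then newest first.
    $mview->addSort(new MVNodeStickyColumn(), MV_SORT_DESCENDING);
    $mview->addSort(new MVNodeCreatedColumn(), MV_SORT_DESCENDING);
    return array($mview);
  }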

We can now run the following fast query, which has the implicit filter of only including published nodes:

  SELECT entity_id FROM my_materialized_view WHERE tid = 3 ORDER BY sticky DESC, created DESC;
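
In module code, the same query might be run like this (a sketch using the Drupal 6-style database layer; the surrounding code is hypothetical):

  // Fetch published node IDs tagged with term 3, sticky nodes first,
  // newest first. The (tid, sticky, created) index covers this query.
  $nids = array();
  $result = db_query("SELECT entity_id FROM {my_materialized_view} WHERE tid = %d ORDER BY sticky DESC, created DESC", 3);
  while ($row = db_fetch_object($result)) {
    $nids[] = $row->entity_id;
  }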

Existing nodes will be indexed into this materialized view, starting with the highest node IDs. New and updated nodes will be added to, removed from, or updated in the MV dynamically, keeping the materialized view up-to-date with minimal action in the database.

While this example shows custom data sources, the Materialized View API natively supports Field API fields for use in MVs without any extra code.

As long as your module doesn't change its materialized views after installation, the Materialized View API will automatically add and index your module's materialized view tables. Thanks to Schema API and Field API, you don't have to provide Materialized View API with any information beyond your data sources (like MVNodeStickyColumn) for it to build your table.

If your module does dynamically change its MVs, the MVAPI allows you to call materialized_view_reconcile(), which causes the MVAPI to take a fresh look at your MVs, add/drop tables, and re-index as necessary.
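
For example, a module that lets an administrator reconfigure one of its MVs might reconcile right after saving the new settings (an illustrative sketch; the form, submit handler, and variable name are hypothetical):

  function mymodule_settings_form_submit($form, &$form_state) {
    // Persist the configuration change that alters the MV definition.
    variable_set('mymodule_mv_filter_value', $form_state['values']['filter_value']);
    // Have MVAPI compare the fresh MV definitions against the existing
    // tables, adding/dropping tables and re-indexing as necessary.
    materialized_view_reconcile();
  }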

The Materialized View API builds its indexes using the following order and rules (a sketch of the resulting schema follows this list):

  • Dynamic filters using the equality or LIKE (without any %) operators.
  • An indexable dynamic filter not covered in the first category. If such a filter exists, the index building stops here because MySQL cannot generally use further data from the index in the queries MVAPI expects on the table.
  • Sorts, until the direction (ascending or descending) changes. MySQL cannot use indexes to process sorts after the direction changes. MVAPI may be able to change the effective direction of numeric items through negation. (An ascending sort of x is the same as a descending sort of -x, and vice versa.)
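
Applied to the walkthrough's table, these rules conceptually produce something like the following Schema API definition (an illustrative sketch, not actual MVAPI output; the field types and primary key are assumptions):

  $schema['my_materialized_view'] = array(
    'fields' => array(
      'entity'    => array('type' => 'varchar', 'length' => 32, 'not null' => TRUE),
      'entity_id' => array('type' => 'int', 'unsigned' => TRUE, 'not null' => TRUE),
      'tid'       => array('type' => 'int', 'unsigned' => TRUE, 'not null' => TRUE),
      'sticky'    => array('type' => 'int', 'size' => 'tiny', 'not null' => TRUE),
      'created'   => array('type' => 'int', 'not null' => TRUE),
    ),
    // One row per (node, term) pair, as described above.
    'primary key' => array('entity', 'entity_id', 'tid'),
    // Rule 1 puts the equality dynamic filter (tid) first; rule 3 then
    // appends the sorts (sticky, created) while their direction is constant.
    'indexes' => array(
      'mv_query' => array('tid', 'sticky', 'created'),
    ),
  );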

You don't have to understand the index-building rules to use the Materialized View API; it is built to determine intelligent indexes for you automatically.

Aside from countering the scalability concerns with the table-per-field Field API architecture, MVAPI has some clear uses in core:

  • Tracker
  • Comment (node_comment_statistics)
  • Taxonomy (as shown in the example)

Comments

I think this could be a good idea, but it has many, many gotchas.

adrian:

This would lead to incredible data duplication. However, if you take it in the context that it is essentially an indexable cache subsystem, that's not the main problem.

What I do see being a problem is keeping all these duplicate items in sync, and the fact that direct modifications of the database (outside of the API) will become impossible. This is the kind of problem that database triggers are best suited for, but they aren't available in all the databases we support.

This is inviting sync problems on a level beyond having to clear your cache, and the performance improvements this presents will all but disappear if you need to regularly reconcile your materialized views. The business logic is not the correct place to implement something like this. If you need to make a change to 300 different nodes based on a query (i.e., update only certain nodes), this will drastically increase the complexity of that query, possibly even making it impossible without selecting and iterating through the nodes and only ever using the API to modify things.

Even if we do, we simply cannot ship with a default that performs poorly. At the very least, there needs to be an automatic fallback to the current per-bundle-like schema that all modules can depend on. For any module that runs queries against the tables, this will have implications.

Essentially, you will be moving complexity away from CCK and into Views and other similar modules. Views will have to understand which MVs are available and which ones it should use to get the results it needs in the most efficient way possible; in some cases, you would actually end up needing to tell Views about the back-end views and which one to use (which really complicates an already complex module). Views might even have to generate its own MVs.

This will drastically increase writes to the DB as well, if every change needs to be made in several different places.

--
The future is so Bryght, I have to wear shades.

I rescind this part of that comment

adrian:

> Even if we do, we simply cannot ship with a default that performs poorly. At the very least, there needs to be an automatic fallback to the current per-bundle-like schema that all modules can depend on.

I see there is a fallback.

--
The future is so Bryght, I have to wear shades.

David Strauss:

> What I do see being a problem is keeping all these duplicate items in sync, and the fact that direct modifications of the database (outside of the API) will become impossible.

If you do need to perform data modification outside Drupal, you have plenty of problems already. You'll have to invalidate multiple caches and manually manage any foreign key-related issues. If you were updating CCK data in Drupal 5 or 6, you'd have to read CCK's tables to find where to modify the data.

We've solved that last problem, which makes external modification of the canonical data much easier. When it comes to the MVs, you can trigger a full reindex by deleting the rows tracking the indexing of relevant MVs from the materialized_view_index table. It would also be possible to make a reindexing job queue available in the database with a contrib module. You'd just have to walk through the job queue table and ping the MVAPI to reindex those items. I don't think explicit support for external data modification should be a goal for core.
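
For instance, such a tool might force a full re-index of a single MV like this (a hedged sketch; the exact columns of materialized_view_index are an assumption):

  // Deleting the row that tracks indexing progress for this MV makes
  // MVAPI re-index it from scratch, starting with the highest node IDs.
  db_query("DELETE FROM {materialized_view_index} WHERE mvid = '%s'", 'my_materialized_view');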

> ...the performance improvements this presents will all but disappear if you need to regularly reconcile your materialized views

Reconciling your materialized views has no effect unless you've changed an MV, and even then the effects apply on a per-MV basis. Operations that would affect the MV set on your application should not be common, production-level operations, any more than modifying CCK fields and Views would be casual operations today on a production site.

> The business logic is not the correct place to implement something like this.

We don't really have a choice. The vast majority of Drupal sites run on MySQL, and no open-source relational database seems to provide indexed or materialized views. The only possible in-database method is with triggers, which require SUPER privileges to use on MySQL 5.0 and use different syntax on every database system.

> If you need to make a change to 300 different nodes based on a query (i.e., update only certain nodes), this will drastically increase the complexity of that query, possibly even making it impossible without selecting and iterating through the nodes and only ever using the API to modify things.

If you need to modify nodes in batches of hundreds at a time, you'd want to follow the same advice as for external data manipulation: reset the MV indexing state or develop a tool to manually queue a large number of individual updates.

> Views will have to understand which MVs are available and which ones it should use to get the results it needs in the most efficient way possible; in some cases, you would actually end up needing to tell Views about the back-end views and which one to use (which really complicates an already complex module). Views might even have to generate its own MVs.

The likely case is that Views would either use the "fallback" you acknowledge above or specify its own MVs. It's a rather straightforward conversion from Views criteria to materialized views.

> This will drastically increase writes to the DB as well, if every change needs to be made in several different places.

If objects fail to meet static filters, they won't get written any additional times. I'm not arguing it's unimportant to balance the advantages of MVs against the increased save time (which is imperceptible until you have quite a few MVs), but it's the same problem as deciding when to cache. If you cache more, you get a speedup on reads, but you also write data to more places. Despite the write penalty, caching has exploded in recent years in Drupal and enterprise CMS systems because the read benefit usually outweighs the additional write cost. And, by definition, any object could only qualify for one "fallback" MV based on the content-type-driven static filters, so there wouldn't be a significant cost to having MVs provide what we already have.

In some sense: nothing new

chx:

If you have a node body, you store a copy of it in the filter cache (under certain circumstances it can be the very same string), and then in the page cache (maybe as files, with Boost). The page cache will contain the blocks and the content lumped together. And then you might have Squid in front of your system. Scaling web applications has always been achieved by keeping multiple copies organized in ways that are best for performance. This system does the same: it takes data and lumps it together in a fashion that is well suited to SQL queries.

Maybe the cost of rebuilding MV?

markus_petrux:

All the caching techniques you have described generate individual copies of items, and you don't need to rebuild the cache to keep functionality.

However, if I understood the idea of MVs correctly, you'll have to rebuild every MV when its definition changes, and that may be a problem if there are thousands (millions?) of items involved.

David Strauss:

You've identified one significant difference between a materialized view and a standard cache. But I do need to make a correction: only the MV whose definition changes needs the rebuild. The other MVs are unaffected.

On Drupal.org, we use Tracker 2, which functions as two materialized views. We were able to build these at rates of 10,000 nodes per cron run, and our cron runs were not long. The rebuilds of MVs also start with the newest items, so things like forums, trackers, etc. become quite functional long before they complete indexing.

Integrity checking?

markus_petrux:

Since MVs mean duplicating data, maybe this needs some kind of method to ensure data integrity, and that may also be costly depending on the size of the site.

The thing that concerns me with this idea is that it seems it won't be optional, so you would have to live with it, with its pros and cons.

It seems to me that this kind of thing should be provided at a lower level, i.e., by the database engine, especially because there might be a huge amount of data to deal with.

Plus ça change, plus c'est la même chose.

bjaspan:

Yes, you have to rebuild an MV every time its definition changes. Rebuilding it means dropping the old table, creating a new one, and copying the data again from its original location.

You also have to rebuild a per-content-type table every time its definition changes. Sure, it looks like you are executing a simple ALTER TABLE query, but as David explains it, the DB system actually creates a new table, copies all the data over, drops the old one, and renames the new one. Not so different from rebuilding an MV.

Perhaps you'll say that you don't change your content types very often and thus do not incur that cost often enough to matter. Well, you will have to rebuild your MVs exactly as infrequently.

David Strauss:

That's correct, with a distinction I'd like to highlight:

When you change a per-content-type field in CCK right now, your site is down (in several critical ways) until CCK finishes the ALTER or table rebuild. Your canonical node data is in flight or locked from editing.

With the Materialized View API, content listings using MV data (like forum topics and tracker pages) may be temporarily incomplete, but nodes remain reliably viewable and editable. The incomplete listings will, eventually, become complete as data indexes. You can keep sites online while they index with no risk to your canonical data or the accuracy of the indexing. This is a vastly preferable model for enterprise Drupal sites that cannot simply decide to take their sites offline for hours to change field configurations.

markus_petrux:

> ...it looks like you are executing a simple ALTER TABLE query, but as David explains it, the DB system actually creates a new table, copies all the data over, drops the old one, and renames the new one. Not so different from rebuilding an MV.

Sure, but ALTER TABLE is supposed to be much faster than... what would you need to do to rebuild one or more dependent MVs?

When planning for an ALTER TABLE, you don't need to worry much about rebuilding caches, as that will be done on demand. On the other hand, with MVs you have to plan for the time required to rebuild them. If I understood correctly, features of the site depending on MVs will remain offline until the MVs are rebuilt.

There's also the problem I mentioned earlier about integrity checking. When things can go wrong, they will.

David Strauss:

ALTER TABLE blocks. DROP + CREATE + re-index does not. A slow but non-blocking operation is better.

A site depending on MVs is only "offline" if you'd rather have your site offline than have potentially visible gaps in data derived from MVs. For any site I've worked on, they'd rather disable access to the tracker for 4 hours, or leave it on and have it be incomplete, than turn the entire site off for 45 minutes.

markus_petrux:

When you change data definitions, regardless of "blocking", you have one or more features offline (inaccessible) for some time. The faster method is better here, IMHO. Besides, when you plan for such changes, you may still run the features serving cached content. You may force caching of what you need, for the time you need it, and return to "normal" operation when the changes are done. Using MVs is slower, and you have fewer options if they're a required feature for running the site. MVs would have to be optional, IMO.

David Strauss:

You're treating all features equally; I'm not. To me, complete site downtime is orders of magnitude worse than a few features being offline or less functional. The features that would be taken offline to support an MV rebuild also happen to be the easiest features to serve from cached data.

Will fields depend on MVs?

markus_petrux:

If so, the features of the site affected by changes in data definition will depend on how you use fields.

For example, if you need content from fields for blocks that are spread across several pages of the site, you can cache the blocks during the ALTER TABLEs. If you are forced to use MVs, then that window will be longer... there may be several MVs affected, and there may be thousands (or even millions) of records involved. If you need to minimize the impact on the accuracy and currency of the information the site offers, the time required is critical. If MVs are optional, you can choose. I mean, we don't have MVs now, and we have options for dealing with data structure changes without taking the whole site offline. It depends on the changes and the features affected, of course.

Aside from performance, storage requirements, and data integrity issues... there's also another thing here. You will probably still need some kind of caching mechanism (blocks, pages, CDN, ...), especially if you're serving millions of pages (where content caching is a must), hence you're duplicating efforts if you're forced to use MVs.

Sorry if all this doesn't sound constructive. It's just that we're running a huge site and I don't see the benefit of MVs compared to what we would do without them. I'm working to migrate meristation.com to D6, and I'm pretty confident that D6 + CCK2 + Views2 + caching will be just fine for us. And I know we'll need to change data definitions, because that's what has happened until now, and not because of bad initial design; requirements change because this is a competitive world, and you need to adapt as quickly as possible. Sorry for the off-topic.

I guess we'll have to see this running and try benchmarks with different scenarios and data volumes.

David Strauss:

There are several options:
(1) ALTER the table. Maintain caches for reads and a queue for indexing new writes. The site features have stale data while the ALTER continues. After the ALTER, it might still be necessary to index existing data into new or modified columns, which could take a long time. The new and modified columns in the MV may or may not be populated or correct for a given row while the re-indexing (if necessary) continues. Rows without correct data may cause errors for site features expecting to find the new data. So, the only safe approach is to continue keeping the site offline until re-indexing (if necessary) finishes. The method of determining how to run the ALTER is described in (1a) and (1b).

(1a) Because MVs are specified only as their desired end state and do not include a series of "update" operations as they change, using ALTER would also require diffing the schema to determine what to change. Such diffs may not be quite perfect.

(1b) Require module developers to specify changes to MVs in their "update" operations. This takes a lot of the magic out of MVs because module developers now have to think about how MV hook changes affect the underlying schema and apply the changes themselves.

(2) Rebuild the table. Don't do anything special for reads and writes. Re-index all data into the MV. Site features have limited historical data -- though entirely fresh recent data -- while indexing continues backward. All rows in the MV are guaranteed to meet the latest specification.

I'm still pretty firmly in the (2) camp, though it might be worth sort-of allowing (1b) for certain use cases on very large sites that intimately understand their MVs.

A degenerate case for Fields

bjaspan:

I implemented the Per-Bundle Storage module, which is a sort of rotated-90-degrees special case of Materialized Views just for Fields. I think it should resolve all complaints about per-field storage, though it does not provide all the potential benefits that the more general MV can. Please see http://groups.drupal.org/node/18302.
