I am working on "fields in core" this week and recently found myself pondering the problem of very inefficient CCK-related Views queries. Here's a brief summary of the situation:
In D6 (and, through overwhelming though unjustified demand, probably in D7), CCK fields can exist in "per content type" tables that share many single-value fields or in "per-field" tables that contain a single field that is either shared among multiple content types, is multi-valued, or both. Also, in D6, the admin can change a CCK field from one to the other form of storage, perhaps without realizing it (not all admins are RDBMS wizards). If CCK exported information about fields to Views in a way that was tied to the actual database layout, then when an admin changed the database layout of a field, all Views using that field would break and require manual intervention to fix.
Therefore, when CCK exports its fields to Views, it always tells Views that every field is in its own table, even if it is not. Suppose that you define an Event type with a date and venue nodereference field:
Table content_type_event: nid, vid, field_date_value, field_venue_nid
CCK tells views that two tables exist, named something like "node_data_field_date_value" and "node_data_field_venue_nid." They both happen to be aliased to content_type_event, but Views treats them as separate tables. So, if you add both Date and Venue as fields to a View, you get a query something like this:
SELECT node_data_field_date_value.field_date_value, node_data_field_venue_nid.field_venue_nid
FROM node n
LEFT JOIN content_type_event node_data_field_date_value ON node_data_field_date_value.vid = n.vid
LEFT JOIN content_type_event node_data_field_venue_nid ON node_data_field_venue_nid.vid = n.vid
In other words, you get two joins to the same table (content_type_event) with the same criteria.
Here's my question: Can't this query be dynamically optimized to eliminate the redundant join? If I'm right, Views can do it, or a contrib module could do it with hook_views_query_alter().
My thinking is that when you have two joins to the same table using the same "ON" criteria, they can be merged. So:
SELECT t1.field_a, t2.field_b
FROM node n
JOIN table t1 ON t1.col = n.col
JOIN table t2 ON t2.col = n.col
can be rewritten as:
SELECT t.field_a, t.field_b
FROM node n
JOIN table t ON t.col = n.col
Am I missing something? When is this not a valid transformation? I'm sure that I am going down a well-worn path here and am either re-inventing something from the 1970s or missing the obvious problem with this approach. If I am missing something and this is only sometimes valid, CCK could certainly add a setting to the table definition it exports to Views saying that for this table it is a safe transformation.