Everything is a node

robertdouglass's picture
  • Users, comments, taxonomy terms and attachments become nodes
  • Why do this:
    • Simplicity, consistency and code reuse
    • Allow us to extend these content types as we do for nodes, without rewriting code
    • Allow relationships between these, without needing a mapping table
    • Why users:
      • Use CCK and taxonomy instead of profile.module
      • Use workflow for user signup/promotion
      • Location module does not need to deal with users separately; just location-enable the user node type
      • A user's birthdate can just be an event field. Birthdays are just repeating events
    • Why comments:
      • Use workflow to moderate comments
      • Use taxonomy to classify comments
      • Use relationships for threading
    • Why taxonomy:
      • The taxonomy builder becomes a streamlined interface to both build a tree of terms (nodes) and select an item from that tree to classify a node
      • Both the tree and each classification can be designated using relationships
      • All the taxonomy storage and retrieval functions can be encapsulated by the relationship API
    • Why attachments:
      • Less confusion about when to attach media directly to a node and when to add it using a media module node
      • The current attachment interface could then be a quick and easy UI to view/add/update attachments - uploading and adding attachment nodes, and relationships to those nodes (from the original node) on the fly. Expanding title/description fields that update the attached node could also be provided
      • A unified view of all attachments, whether they are added using the standard attachments control, or a more sophisticated media module
      • Potential to write generic gallery modules that don't need to understand different media module node types
  • Each of these already has a well-defined API, meaning that changing the backend storage to nodes should not be overly hard
  • Existing API code would generally move to node hooks, and the old API functions would just become wrappers
  • Modules that are directly accessing affected tables (most commonly the users table) will be broken by this change, but if their own uid columns are updated things should still work fine, as user metadata would still be stored in the users table as it is now (the only duplication of data might be user name - stored in both the node title and the user table)
  • What will be significantly harder is upgrading existing sites, as all the ids will change.
    It should be possible to use some mapping to work around this:
    • Find the highest 'old-style' id in existence.
    • Ensure that the next nid is higher than this (bump it up if it is not). This value becomes the threshold.
    • Create nodes for each object, adding their old and new ids to a mapping table.
    • Update all fields in existing tables to use the new ids for each object (this could also be done automatically for contrib module tables, assuming that uid fields are for users etc).
    • When a request is made for an id below the threshold, map it transparently to the new id. This could be done at the function level for greater compatibility, or (more simply) at the URL aliasing level.
    • This functionality could be wrapped in a module, which could be disabled for new sites as it is not needed.
  • Potential DEP Dependencies:
    • Relationship API. A lot of the benefits/simplicity of taxonomy and comments becoming nodes comes from the ability of relationships to provide a standardised way to store the node-relation metadata.
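
The upgrade steps above (find the highest old-style id, set a threshold, build a mapping table, translate requests below the threshold) could be sketched roughly as follows. This is a small Python model rather than Drupal code: the `migrate` and `resolve` names, the dicts standing in for database tables, and the sample ids are all assumptions made for illustration.

```python
def migrate(old_objects, next_nid):
    """Create a node id for each legacy object and record old -> new ids.

    old_objects: dict mapping a type name ('user', 'comment', ...) to a
                 list of legacy integer ids of that type.
    next_nid:    the next free node id before the migration.
    Returns (threshold, mapping), where mapping is keyed by (type, old_id).
    """
    all_old_ids = [i for ids in old_objects.values() for i in ids]
    highest_old = max(all_old_ids, default=0)
    # The next nid must be higher than every old-style id; bump it if not.
    threshold = max(next_nid, highest_old + 1)

    mapping = {}
    nid = threshold
    for obj_type in sorted(old_objects):
        for old_id in sorted(old_objects[obj_type]):
            mapping[(obj_type, old_id)] = nid
            nid += 1
    return threshold, mapping


def resolve(obj_type, requested_id, threshold, mapping):
    """Map a requested id to its node id, translating old-style ids."""
    if requested_id < threshold:
        # Old-style id: look it up; ids with no mapping pass through unchanged.
        return mapping.get((obj_type, requested_id), requested_id)
    return requested_id


# Hypothetical site: two users, two comments, existing nodes 1 to 4.
threshold, mapping = migrate({"user": [1, 2], "comment": [1, 7]}, next_nid=5)
```

With this sample data the threshold becomes 8, `resolve("comment", 7, threshold, mapping)` returns 9, and a request for an existing node such as `resolve("node", 3, threshold, mapping)` is left alone because pre-existing nodes have no mapping entry.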

Comments

Users as nodes

Gunnar Langemark's picture

CivicSpace people work on a membership admin system. If users are nodes, you can use views to search and sort users.

Gunnar Langemark

Denmark

Is node module ready for this?

Jaza's picture

I'm very much a part of the "everything-should-be-a-node" camp - in particular, I strongly believe that taxonomy terms and vocabularies should be nodes, which is a big reason why I wrote the category module. ;-)

But for the dream to be realised, node module needs to be improved. IMHO, node module is not ready to support all of these data types right now. It isn't generic enough. Nodeapi needs to support only a small 'core' of operations, and should support additional 'op's being custom defined for certain node types. Speaking of 'node types', there needs to be a new field for nodes called 'content', which is a boolean indicating whether the node type is a traditional 'content type', or whether it is some other type that is not content (e.g. user, comment). The whole concept of 'node' equals 'content' needs to change - because in this vision, content is only a subset of what nodes are.

There needs to be more control over the output of nodes, to make it easier to embed nodes within other nodes (e.g. for comments and attachments), and to control things like whether nodes get listed on the 'node/add' page, or the 'admin/node' page. There needs to be support for system-defined or module-defined nodes (and I mean actual nodes, not node types), as well as the traditional user-defined nodes.

These are just the logical barriers to the everything-as-a-node dream being realised. Another huge issue that needs to be addressed is performance. In particular, how can we maintain the current performance of the comment system when comments become nodes? This is the number 1 reason why comments aren't nodes already (AFAIK). I'm sure there are many other similar performance issues in this vision.

However, none of these barriers are insurmountable - I am confident that, with enough support, and with enough confidence (that we're going in the right direction), we will break through all of them eventually.

Jeremy Epstein - GreenAsh


great stuff

moshe weitzman's picture

jaza, your node.module improvements are really innovative and excellent. i encourage you and others to pursue this and submit patches.

Taxonomy

mfredrickson's picture

I spent some time thinking about the taxonomy aspect of this over the weekend, and I wanted to jot my thoughts down before they disappear.

  1. Assume that the generalized relationship is just the node reference field in CCK. By giving this field semantics (e.g. "parent of", "child of", etc) we have a relationship system.
  2. Taxonomy module gets gutted. Instead of implementing tables, it implements a system that creates CCK types on the fly. Each CCK type corresponds to a vocabulary. The nodes one creates using the "vocabulary" type are the terms. Taxonomy creates types with the appropriate relationship constraints in the node type (i.e. node reference fields for parent terms, synonyms, etc). In addition to the graceful aspect of this design, it also allows for some interesting per-vocab and per-term access control.
  3. Attaching terms to a node could be implemented by providing a CCK node reference field via nodeapi/form_alter. Views could supply a list of terms that could be attached.
  4. Free tagging would require creating nodes on the fly. Not impossible but CCK needs a better module API to do this.
  5. Term nodes could be themed to provide interesting views of nodes tagged with the term.
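
To make the list above concrete, here is a rough sketch of terms as nodes linked only by node reference fields. It is plain Python, not CCK: the `add_node`, `ancestors` and `tagged_with` helpers, the `parent` and `terms` field names, and the `vocab_topics` type are all hypothetical, and each term is assumed to have at most one parent.

```python
nodes = {}  # nid -> node dict; stands in for node storage


def add_node(nid, node_type, title, **refs):
    """Store a node; keyword arguments are node reference fields (lists of nids)."""
    nodes[nid] = {"nid": nid, "type": node_type, "title": title, "refs": refs}
    return nid


def ancestors(nid):
    """Follow 'parent' references upward from a term node (first parent only)."""
    result = []
    parents = nodes[nid]["refs"].get("parent", [])
    while parents:
        nid = parents[0]
        result.append(nodes[nid]["title"])
        parents = nodes[nid]["refs"].get("parent", [])
    return result


def tagged_with(term_nid):
    """List titles of nodes whose 'terms' field references the given term."""
    return [n["title"] for n in nodes.values()
            if term_nid in n["refs"].get("terms", [])]


# The node type "vocab_topics" plays the role of a vocabulary.
add_node(1, "vocab_topics", "Science")
add_node(2, "vocab_topics", "Physics", parent=[1])
add_node(3, "story", "Notes on optics", terms=[2])
```

Here `ancestors(2)` walks the hierarchy through the `parent` field and `tagged_with(2)` is the kind of listing a themed term node could show.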

I think this module could serve for other entities implemented as CCK types.

I am not convinced users can be nodes. I think profiles certainly can be. Comments to the contrary certainly welcome.

-Mark

heady stuff

moshe weitzman's picture

this sounds pretty great to me.

robertdouglass's picture

Jaza's category already handles containers as nodes and categories as nodes, including free tagging. It even works (very well) with CCK types, but doesn't force CCK to handle what is currently the job of taxonomy.

More review

mfredrickson's picture

I need to review category more. I have not spent a lot of time investigating it because it always seemed so closely tied to books, which I found strange. Who said I was open minded? ;-)

One thing comes to mind as to the advantage of this system: standardization. I mean this in several ways:

  1. Users are familiar with creating nodes. The process of clicking "create content" would be the same process to create taxonomy terms. Users would not have to create a node AND make it a term. (Maybe Category doesn't work this way anyway, as I said, more review necessary).

  2. Extensions to CCK benefit taxonomy. I'm thinking specifically of import/export functionality, but by using the CCK framework, taxonomy could benefit from improvements across the board (scaling, etc).

As I said, this is rough. I'll hit up category and see if it fits the bill.

-M

Way to approach category

robertdouglass's picture

Don't get caught up in how much it can do... just turn on category module itself and marvel at how similar it is to taxonomy... only better because everything is a node. Then, only then, is it worth considering if any of the other half dozen modules can add something to the mix. Jaza's got me hooked. I'm a complete Category convert.

Oh Catmod, with thine distant parents.. =)

Jaza's picture

Awesome! I noticed that you've started to make your mark in the category module scene. You are most welcome in the category module community. At the moment, we have too many users (we need to kick some out), and not enough developers.

Interested in joining me as a slave to the issue queue (particularly the bug queue)? ;-)

Jeremy Epstein - GreenAsh

robertdouglass's picture

Man I wish I had any spare capacity to work on developing Category. You might have to settle for a cheerleader for the time being.

Category approach vs CCK-nodereference approach

Jaza's picture

If I'm reading this correctly, then what Mark is suggesting is that CCK can handle any and all relationships-between-nodes needs - including everything that category currently does with its current taxonomy-and-book-like database tables - simply by using nodereference fields.

I think that this is a great goal to work towards, as it will certainly provide a much more generalised and 'clean' approach than what category currently does. But, of course, CCK right now is not ready for this. There's no way that CCK can currently provide the vocab-selection-like functionality, the free tagging system, the powerful menu-item-generation system (category_menu) and context-sensitive display system (category_display), that you will find in category.

Also, I have serious overall concerns about whether using simple nodereferences to emulate the full taxonomy / book / category system in Drupal can work performance-wise. The current solutions in Drupal have customised schemas, that make performance actually 'work'. But hey, only one way to find out - try it!

Jeremy Epstein - GreenAsh


Category on top of CCK?

Jaza's picture

Perhaps the goal (ultimately) should not be to get CCK by itself to do everything that category currently does, but instead to rewrite category so that it uses CCK as its foundation. So instead of storing relationships in its own custom schema, the category module simply stores them all in CCK nodereference fields. Category can then continue to handle things like menu item generation, custom navigational display, etc. It could even have a similar API to what it has now, except that the API works with data in CCK form, rather than the data in its current form.

The category module at the moment is a massive compromise, under the hood at least - see http://lists.drupal.org/archives/development/2006-03/msg00609.html, of which this is a quote:

At its core, [category is] a hack that merges book and taxonomy
functionality without re-assessing the underling architecture. It makes
sense as a contrib module for sites that really need it, but I think
moving core in that direction would be a mistake.

Far better, IMO, to explore a more robust relationships system that
*encompasses* taxonomy-style metadata and book-style relationships (as
well as other stuff). There are a couple of projects along these lines
already. If there's talk of revisiting something as fundamental as
taxonomy, make the project worth the effort.

I do not deny any of the statements made in this quote, and I am certainly open to the possibility of rewriting category to depend on a generalised relationships system under the hood. The more work that goes into making this a practical option, the better.

Jeremy Epstein - GreenAsh

Great idea

mfredrickson's picture

I think you're on to something: let CCK handle data storage and UI (the goal of CCK); let category provide interesting information based on the semantics of the data.

Again, the real key to this is making CCK module friendly (for example, an import/export feature/API :-)

I strongly encourage everyone to get involved with CCK and make it the center piece of later work (rather than put time now into recreating a subset of what CCK does).

-M

1++

robertdouglass's picture

imo too few people are actually getting involved in direct CCK development. I would have thought that people would jump at the chance to start making widgets for their modules. I guess we need developer documentation for it.

Docs are key

mfredrickson's picture

I'll take a stab. I've written some CCK field widget code (mostly by blindly copying the examples) but I can at least document that.

Does anyone know if there is an API reference yet? I think this would be the most helpful starting document - tell people what each hook does.

-M

views parallel

moshe weitzman's picture

"I think you're on to something: let CCK handle data storage and UI (the goal of CCK); let category provide interesting information based on the semantics of the data."

this reminds me of the relationship of views to cck: cck handles forms and data storage, while views provides interesting UI on top. not a perfect parallel, but FWIW.

yes, cck docs would be quite useful.

Where does CCK need to go?

mfredrickson's picture

"If I'm reading this correctly, then what Mark is suggesting is that CCK can handle any and all relationships-between-nodes needs - including everything that category currently does with its current taxonomy-and-book-like database tables - simply by using nodereference fields."

Yes. The nodereference field is the mechanics of the system; users and modules add semantics to the field to give it meaning. For example, on a page node, we could add a nodereference field to be the "next page" link.

"But, of course, CCK right now is not ready for this. There's no way that CCK can currently provide...."

In my mind, the real problem with CCK right now is the lack of an API. Everything is done through forms. It is nearly impossible (or at least a ridiculous amount of work) for modules to emulate form submission (as we've recently seen on the devel mailing list).

When this API matures (either through maturity of the CCK module itself, or the generalized FAPI 2.0) then I think most of these things can be done. Take free tagging: in my proposal free tagging is just creating new nodes (and/or looking up existing nodes by title) on the fly. This is not a conceptually difficult problem. It's only hard to do because CCK doesn't have a content_api_save($node) function.
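
As a rough illustration of "creating new nodes and/or looking up existing nodes by title on the fly", the sketch below models free tagging as a get-or-create step. It is Python, not CCK: `get_or_create_term`, `free_tag`, the in-memory index and the case-insensitive title matching are all assumptions, and nothing here stands in for an actual CCK save function.

```python
import itertools

_nodes_by_title = {}            # (type, lowercased title) -> nid
_next_nid = itertools.count(1)  # pretend nid sequence


def get_or_create_term(vocab_type, title):
    """Return the nid of the term node with this title, creating it if needed."""
    key = (vocab_type, title.strip().lower())
    if key not in _nodes_by_title:
        _nodes_by_title[key] = next(_next_nid)
    return _nodes_by_title[key]


def free_tag(vocab_type, tag_string):
    """Turn a comma-separated tag string into a list of term nids."""
    titles = [t for t in (p.strip() for p in tag_string.split(",")) if t]
    return [get_or_create_term(vocab_type, t) for t in titles]


first = free_tag("vocab_tags", "drupal, cck")   # creates two term nodes
second = free_tag("vocab_tags", "CCK, views")   # reuses "cck", creates "views"
```

The second call reuses the existing "cck" term rather than creating a duplicate, which is the lookup-by-title half of the idea.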

I have been approaching this from a Drupal 5.0 perspective, after CCK has been in core for a revision. Can we implement this today? No. Can we work on CCK to get it up to speed? Yes.

issue queue:
http://drupal.org/project/issues/cck

"Also, I have serious overall concerns about whether using simple nodereferences to emulate the full taxonomy / book / category system in Drupal can work performance-wise."

I am wary of saying something won't scale before it's implemented. Moreover, I'm worried about cutting corners in the name of performance. Premature optimization can really hamper a project. But of course, this is not exactly what you are saying, so I won't put words in your mouth. :-)

I didn't write this

robertdouglass's picture

I forget who did, but I posted it here from the DEPs pages on drupal.org. I don't necessarily agree with everything written here, but certainly most of it. Comments and profiles should immediately become nodes. If Jaza's category module proves itself (I certainly like it), then taxonomy terms and vocabs are nodes. Where I get a bit skeptical is users as nodes.

OK I own up

owen barton's picture

It was me.

Well, I got stuck in a hotel room for several hours after a Drupal meetup....what was I supposed to do?

This is indeed a bit of a 'dream plan', and that is how it is intended.

The users-as-nodes bit does seem to be the hardest, but if we can keep the basic user API the same, it is not actually all that difficult. The user_save and user_load functions (etc) can just fire off a node_save or node_load call, and the guts of these can move to the node hooks. Voila, users=nodes!

Of course, the tricky part here is the namespace problem, i.e. that the load and save function name suffixes are used by both the user API and the node hooks, but that's not insurmountable. Either we:
* Let a helper module do the node API stuff (to avoid the user_ namespace entirely)
* Have the functions check their arguments to work out whether they are being called as a node hook or a user API call (ugly)
* (For the long, but clean, haul) rename the user API functions, e.g. user_load() becomes user_api_load() or user_fetch(), via a big CVS search/replace job, to allow user.module to become a standard node module without the namespace issue

I have done some playing with the second option and had some basic user=node saving working for a while (before heading on a different track) with only 15 lines of code or so. I would be interested in hearing from people which (if any) of these approaches, or others might be best to explore.
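
For readers trying to picture the second option, here is a toy Python model of one function that serves both as a node "load" hook and as a user API call by inspecting its argument. It is not the code described above and does not mirror real Drupal signatures: the `Node` class, the `NODES` and `USER_EXTRAS` stores, and the dict-of-criteria calling style are all assumptions.

```python
class Node:
    """Stand-in for a node object passed to a node hook."""

    def __init__(self, nid, title):
        self.nid = nid
        self.title = title


NODES = {7: Node(7, "alice")}                     # pretend node storage
USER_EXTRAS = {7: {"mail": "alice@example.com"}}  # pretend users table, keyed by nid


def user_load(arg):
    """Act as the node 'load' hook or as the user API, depending on the argument."""
    if isinstance(arg, Node):
        # Called as a node hook: return the extra user fields for this node.
        return dict(USER_EXTRAS.get(arg.nid, {}))
    # Otherwise treat the argument as user API criteria, e.g. {"uid": 7}.
    node = NODES.get(arg.get("uid"))
    if node is None:
        return None
    user = {"uid": node.nid, "name": node.title}
    user.update(user_load(node))  # reuse the hook branch for the extra fields
    return user
```

Calling `user_load({"uid": 7})` returns the merged user record, while `user_load(NODES[7])` returns only the extra fields a node hook would contribute. The type check is what makes this approach "ugly": every such function needs a branch like this.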

narres's picture

This page is intended to be left blank ;)

A joke? perhaps not

mfredrickson's picture

You may have intended this as a joke (sarcasm is hard to read online), but it is a worthwhile question. While it is an interesting idea, the obvious reason to avoid this is security. With modules and themes as nodes, we would be executing code out of the database. If this code could be rewritten, that would be a big security risk.

But an intriguing idea, nevertheless.

Not a joke

narres's picture

..., but I didn't mean storing modules directly as nodes. The possibility of storing them as attachments exists, though.

The same arguments for users, comments, taxonomy terms and attachments are valid for modules and themes, too.

Sure, this may be a security risk, but this risk exists already, if users become a node-type.

The benefits of modules & themes as nodes would be:
- Revision (Versions history)
- Easier to upload

Everything is an object!

narres's picture

If "Everything is a node" in the database this would have some strange performance results.
A "typical site" with 900 users, 9,000 nodes and 100 taxonomy entries gives a three-way join a calculated weight of 810,000,000 (900 x 9,000 x 100).
If these 3 types are stored in one table, you get one table with 10,000 entries, and the calculated join has a weight of 1,000,000,000,000 (10,000 x 10,000 x 10,000), which is a factor of ~1200 relative to the first (current) scenario.
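
The arithmetic behind this estimate can be checked with a few lines of Python. The variable names are illustrative, and the model is the same one used in the estimate: an unindexed three-way cross product, which says nothing about what a query planner with indexes or a type filter would actually do.

```python
users, nodes, terms = 900, 9_000, 100

# Separate tables: a three-way join considers users x nodes x terms rows.
separate = users * nodes * terms

# One shared table: each of the three join sides sees all rows.
total_rows = users + nodes + terms
combined = total_rows ** 3

factor = combined / separate
print(separate, combined, round(factor))  # 810000000 1000000000000 1235
```

The exact ratio is about 1235, consistent with the rounded "~1200" figure.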

Sometimes "The developer's heaven is the database optimizer's hell".

But what shall we do now?
What about a set of OO-functions which are implemented as wrapper access to real nodes, users and so on?
They may be implemented as:

  • get_object_name($type, $id)
  • get_object_parent($type, $id)
  • get_object_children($type, $id)
  • ...
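
One way such wrappers might look, sketched in Python rather than PHP: each type keeps its own storage and the functions dispatch on the type name. The `TABLES` layout, the `parent` column and the sample rows are hypothetical; only the three function names come from the list above.

```python
# Hypothetical per-type storage; each inner dict stands in for a separate table.
TABLES = {
    "user": {1: {"name": "admin", "parent": None}},
    "node": {10: {"name": "Welcome", "parent": None}},
    "comment": {
        5: {"name": "First!", "parent": ("node", 10)},
        6: {"name": "Re: First!", "parent": ("comment", 5)},
    },
}


def get_object_name(obj_type, obj_id):
    """Return the display name of an object, whatever table it lives in."""
    return TABLES[obj_type][obj_id]["name"]


def get_object_parent(obj_type, obj_id):
    """Return the parent as a (type, id) pair, or None for a top-level object."""
    return TABLES[obj_type][obj_id]["parent"]


def get_object_children(obj_type, obj_id):
    """Return (type, id) pairs of all objects whose parent is the given one."""
    return [(t, i) for t, rows in TABLES.items() for i, row in rows.items()
            if row["parent"] == (obj_type, obj_id)]
```

Callers see one uniform interface, for example `get_object_children("node", 10)`, while users, nodes and comments stay in separate tables.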

Everything should be a node, but stored in different tables

mki's picture

In Drupal we call and process many objects as nodes. The main problem in discussions of "everything is a node" or "comments as nodes" is performance.

So what about having different tables for nodes? Which nodes go in which tables? The main criterion for choosing is content type, because different content types have different properties, like fields or special actions. At present we have nodes, comments and users, which are stored in different tables. But the point is that comments and users don't have the functionality that nodes have.

How to identify nodes:

http://example.com/story/1
http://example.com/page/1
http://example.com/comment/1
http://example.com/user/1

As you can see, we need to provide not only the node id but also the content type. So far this is needed only for comments and users, because story and page are both just nodes.

The problem is that every content type table needs the same set of common fields, for instance nid, vid, created, changed etc. Even so, this will result in better performance than at present, because there will be a few tables for each content type instead of one table for all content types, and any processing can be done on the table of the specific content type only.
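
A rough Python sketch of this layout: one table per content type, a shared set of common fields, and lookups keyed by the type and id taken from the path. The `TABLES` contents, `load_by_path` and `common_part` are hypothetical names used only to illustrate the idea.

```python
# Columns every content-type table shares.
COMMON_FIELDS = ("nid", "vid", "created", "changed")

# One hypothetical table per content type; ids are only unique within a table.
TABLES = {
    "story": {
        1: {"nid": 1, "vid": 1, "created": 100, "changed": 100, "body": "A story"},
    },
    "comment": {
        1: {"nid": 1, "vid": 1, "created": 200, "changed": 250, "subject": "Hi"},
    },
}


def load_by_path(path):
    """Resolve a path such as 'comment/1' to a row in that type's table."""
    content_type, _, raw_id = path.strip("/").partition("/")
    table = TABLES.get(content_type)
    if table is None or not raw_id.isdigit():
        return None
    return table.get(int(raw_id))


def common_part(row):
    """Project a row onto the fields shared by all content types."""
    return {field: row[field] for field in COMMON_FIELDS}
```

Both `story/1` and `comment/1` resolve, to different rows, because the content type in the path selects the table before the id is looked up.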

drupal-entities-part-1-moving-beyond-nodes

kenorb's picture

Drupal Entities - Part 1 - Moving beyond nodes

Article mentioned from here:
http://www.istos.it/blog/drupal/drupal-entities-part-1-moving-beyond-nodes

Everything is an entity in

adaddinsane's picture

Everything is an entity in Drupal 8 - even roles.
