Artesian: Initial brainstorming


This document dates from my first start on Artesian last May and has been moved here from the Forum Improvements group. I ended up having to put Artesian on hiatus all summer due to illness and another project with a deadline. I'll be starting back in on it soon and will be writing more documents to plan out each bit. I've moved this over here as a reference for my initial thoughts.

Entities

Forum entity
Replaces taxonomy for forum categories. The entity is either a container, normal forum, or link forum. Need to decide whether it makes more sense to have these be different bundles or simply an option field. Leaning towards an option since the fields are mostly the same and that makes it easier for the user to convert between them. Also contains some denormalized data to cut down on queries when displaying the list of forums.

  • [schema] fid (forum id - primary key)
  • [schema] parent_id (parent id)
  • [schema / field] type (bundle or option field? - container, forum, link)
  • [schema] posts (count of published posts in this forum - not children)
  • [schema] last_post (pid of last published post in this forum - not children)
  • [schema] last_author (uid of last author to post in this forum - not children)
  • [schema] last_time (timestamp of last post in this forum - not children)
  • [field] description
  • [field] image (if enabled)
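
A rough sketch of the forum record, written in Python rather than Drupal code just to make the shape concrete. It assumes the "option field" approach instead of bundles. The `Forum` class, the `ForumType` enum and the `convert` helper are hypothetical names; the fields follow the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ForumType(Enum):
    CONTAINER = "container"
    FORUM = "forum"
    LINK = "link"


@dataclass
class Forum:
    fid: int                           # primary key
    parent_id: Optional[int]           # None for a top-level forum
    type: ForumType                    # option field rather than a bundle
    description: str = ""
    # Denormalized data for the forum list (this forum only, not children).
    posts: int = 0
    last_post: Optional[int] = None    # pid of last published post
    last_author: Optional[int] = None  # uid of last author
    last_time: Optional[int] = None    # timestamp of last post


def convert(forum: Forum, new_type: ForumType) -> Forum:
    """Switch a forum between container, forum and link in place."""
    forum.type = new_type
    return forum


general = Forum(fid=1, parent_id=None, type=ForumType.CONTAINER)
support = Forum(fid=2, parent_id=1, type=ForumType.FORUM, description="Help")
convert(general, ForumType.FORUM)
```

With an option field, converting is a single value change; with bundles the same operation would mean moving the record to a different bundle.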

Thread entity
Contains denormalized data to cut down on queries and also some data that applies to the thread as a whole unit. This is meant as a way to handle a set of posts easily. Somewhat like a container but maybe more like a "summary".

  • [schema] tid (thread ID - primary key)
  • [schema] title
  • [schema] created
  • [schema] posts (count of published posts in this thread)
  • [schema] last_post (pid of last published post in this thread)
  • [schema] last_author (uid of last author to post in this thread)
  • [schema] last_time (timestamp of last post in this thread)
  • [schema] type (not sure if we need this - bundle type the thread is made up of)
  • [field] teaser (Maybe - Teaser of first post in thread. This would make some things easier but add a lot of data storage)

Post entity
All posts in a thread are the same entity / bundle type, including the starting post, to make splitting and merging threads easier and to keep consistency across all posts in a thread. The starting post is known because it has no parent ID. At this time, we are not worrying about threading but are storing the parent ID for the "Reply to #NN" link and so the data is there when we do add threading.

  • [schema] pid (post ID - primary key)
  • [schema] parent_id (ID of parent post to preserve threading for future use)
  • [schema] thread_id (thread ID - ID of thread entity post is associated with)
  • [schema] title
  • [schema] uid (author)
  • [schema] created
  • [schema] updated
  • [schema] ip (IP address)
  • [field] body
  • [field] image (optional)
  • [field] poll (optional)
  • [field] name (optional - for anonymous posting)
  • [field] email (optional - for anonymous posting)
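
A Python sketch (not Drupal code) of why a single post type makes splitting easier: a split only changes `thread_id` and `parent_id`. The `Post` class and the `split_thread` and `starting_post` helpers are hypothetical, and re-parenting orphaned replies onto the new starting post is an assumption, not something decided above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Post:
    pid: int
    thread_id: int
    parent_id: Optional[int]  # None marks the starting post
    created: int


def starting_post(posts: List[Post], tid: int) -> Post:
    """The starting post is the one post in the thread with no parent."""
    return next(p for p in posts if p.thread_id == tid and p.parent_id is None)


def split_thread(posts: List[Post], first_pid: int, new_tid: int) -> None:
    """Move first_pid and every later post of its thread into new_tid."""
    first = next(p for p in posts if p.pid == first_pid)
    old_tid = first.thread_id
    moved = [p for p in posts
             if p.thread_id == old_tid and p.created >= first.created]
    moved_pids = {p.pid for p in moved}
    for p in moved:
        p.thread_id = new_tid
        # Assumption: replies whose parent stays behind hang off the new start.
        if p.parent_id not in moved_pids:
            p.parent_id = first.pid
    first.parent_id = None  # the split point becomes the new starting post


posts = [
    Post(pid=1, thread_id=10, parent_id=None, created=100),
    Post(pid=2, thread_id=10, parent_id=1, created=110),
    Post(pid=3, thread_id=10, parent_id=2, created=120),
    Post(pid=4, thread_id=10, parent_id=1, created=130),
]
split_thread(posts, first_pid=3, new_tid=11)
```

After the split, post 3 is the starting post of thread 11 and post 4 replies to it; posts 1 and 2 stay in thread 10.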

(Note) Need to handle revisions as well. Currently Entity API (contrib) does not do this.

(Q) How do we handle "new" posts?

  • The current method stores, per user, the last time the starting post was read, then checks whether the node has changed since.
  • We want to exclude updates from making a post "new" so need to differentiate added reply from edits.
  • Any way to improve performance on counting new posts? Better way to store it? Can't de-normalize since it's individual to the user...
  • Possibly allow a system of "new" posts based on time of last visit rather than traditional "unread" posts for heavily used forums as an option.
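
One possible way to separate added replies from edits, sketched in Python (not Drupal code): compare the read marker against `created` and ignore `updated`. The `last_read` map and the helper names are hypothetical; both the traditional "unread" count and the "since last visit" option are shown.

```python
from typing import Dict, List, NamedTuple, Tuple


class Post(NamedTuple):
    pid: int
    tid: int
    created: int   # set once, when the reply is added
    updated: int   # changes on every edit


# Hypothetical per-user read marker: (uid, tid) -> time the thread was last read.
last_read: Dict[Tuple[int, int], int] = {}

posts: List[Post] = [
    Post(pid=1, tid=10, created=100, updated=100),
    Post(pid=2, tid=10, created=200, updated=900),  # edited later
    Post(pid=3, tid=10, created=500, updated=500),
]


def mark_read(uid: int, tid: int, now: int) -> None:
    last_read[(uid, tid)] = now


def count_new(uid: int, tid: int) -> int:
    """Traditional "unread": replies added since this user last read the thread."""
    seen = last_read.get((uid, tid), 0)
    # Compare against created, not updated, so edits never make a post "new".
    return sum(1 for p in posts if p.tid == tid and p.created > seen)


def count_new_since_visit(tid: int, last_visit: int) -> int:
    """Cheaper option: "new" means added since the user's last site visit."""
    return sum(1 for p in posts if p.tid == tid and p.created > last_visit)


mark_read(uid=7, tid=10, now=300)
```

Here user 7 has one new post in thread 10 (post 3); the edit to post 2 at time 900 does not count. The per-user marker still cannot be denormalized, so the second function trades accuracy for a single timestamp per user.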

Connections

  • forum -> parent forum (many to one)
  • post -> thread (many to one)
  • post -> parent post (many to one)
  • thread -> forum (many to many)
  • post -> forum (many to many)

(Q) How do we handle relating the threads/posts to the forum(s) they are in? Relating every post will take more storage but may make calculations easier. Relating just the thread to the forum(s) is better for storage but adds another table to query. Does the denormalizing negate this? Also need to keep in mind that threads should be allowed to be in multiple forums which means fixing shadow posts to be an explicitly set field/flag on that relation rather than an assumed one.
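
For discussion, here is what the "relate just the thread" option could look like, sketched with Python's built-in sqlite3 module rather than Drupal's schema API. The `thread_forum` table and its `shadow` column are hypothetical names; the point is that the shadow status is stored explicitly on the relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forum  (fid INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT);
CREATE TABLE thread (tid INTEGER PRIMARY KEY, title TEXT);
-- Hypothetical junction table: one row per (thread, forum) placement.
CREATE TABLE thread_forum (
    tid    INTEGER NOT NULL REFERENCES thread(tid),
    fid    INTEGER NOT NULL REFERENCES forum(fid),
    shadow INTEGER NOT NULL DEFAULT 0,  -- 1 = explicitly flagged shadow
    PRIMARY KEY (tid, fid)
);
""")
conn.executemany(
    "INSERT INTO forum (fid, parent_id, title) VALUES (?, ?, ?)",
    [(1, None, "General"), (2, None, "Support")],
)
conn.execute("INSERT INTO thread (tid, title) VALUES (10, 'Install help')")
conn.executemany(
    "INSERT INTO thread_forum (tid, fid, shadow) VALUES (?, ?, ?)",
    [(10, 2, 0), (10, 1, 1)],
)


def threads_in_forum(fid: int):
    """List (tid, title, shadow) for every thread placed in one forum."""
    return conn.execute(
        "SELECT t.tid, t.title, tf.shadow FROM thread t "
        "JOIN thread_forum tf ON tf.tid = t.tid "
        "WHERE tf.fid = ? ORDER BY t.tid",
        (fid,),
    ).fetchall()
```

Listing a forum costs one join against the junction table. Whether the denormalized counters on the forum make a per-post relation unnecessary is still the open question.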

Forum access

Since the forum access module is meant for core forum, it's questionable whether it can be made to work with this. These are just some notes at this point, to be fleshed out when we actually have a forum to restrict access to.

Permissions

  • view thread titles
  • view teasers
  • view complete threads
  • create post
  • create thread (requires or implies create post)
  • create thread in multiple forums (requires or implies create thread)
  • edit own post
  • edit all posts
  • delete own post (consider exempting starting post since that will delete all following posts)
  • delete own thread (if starting post exempted)
  • delete all posts
  • move own thread to another forum
  • move all threads to another forum
  • split thread
  • merge threads

(Q) Normal permission system is not really flexible enough for this. We need at least a role + forum combination and possibly a user + forum combination. So do we make a specific table for this?
(Q) Are permissions automatically granted to subforums or do they need to be set separately?
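
A minimal Python sketch (not Drupal code) of what a role + forum and user + forum lookup might look like if a specific table is used. The `role_grants` and `user_grants` structures and the `has_permission` helper are hypothetical; the permission strings come from the list above.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical grant tables, keyed by (role, fid) and (uid, fid).
role_grants: Dict[Tuple[str, int], Set[str]] = {
    ("authenticated", 2): {"view complete threads", "create post"},
    ("moderator", 2): {"edit all posts", "split thread", "merge threads"},
}
user_grants: Dict[Tuple[int, int], Set[str]] = {
    (42, 2): {"move all threads to another forum"},
}


def has_permission(uid: int, roles: Iterable[str], fid: int, permission: str) -> bool:
    """Check the user + forum grants first, then each role + forum grant."""
    if permission in user_grants.get((uid, fid), set()):
        return True
    return any(
        permission in role_grants.get((role, fid), set()) for role in roles
    )
```

For example, `has_permission(42, ["authenticated"], 2, "create post")` is true through the role grant, while the same user has no grants at all in forum 3.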

Other thoughts

Threading - Needs to be handled at some point. The sooner we can make that happen the better so we have the data storage we need. Storing the parent id is a start but not everything we need.
Viewing a thread - Do we use Views to display the threads? Not the thread list but the actual posts in the thread.
Thread lists - Should this make use of the thread entity and list those rather than a list of parentless posts?
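
On the threading point, this Python sketch (not Drupal code) shows how far the stored parent ID alone gets us: the threaded display order and depth can be derived from it, but only by loading every post in the thread and walking the tree. The `threaded_order` helper is hypothetical.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

# (pid, parent_id) pairs for one thread, in creation order.
posts: List[Tuple[int, Optional[int]]] = [
    (1, None), (2, 1), (3, 1), (4, 2), (5, 3),
]


def threaded_order(posts: List[Tuple[int, Optional[int]]]) -> List[Tuple[int, int]]:
    """Return (pid, depth) pairs in threaded display order (depth-first)."""
    children = defaultdict(list)
    for pid, parent_id in posts:
        children[parent_id].append(pid)
    ordered = []
    # Start from the parentless post(s); reversed() keeps creation order on pop.
    stack = [(pid, 0) for pid in reversed(children[None])]
    while stack:
        pid, depth = stack.pop()
        ordered.append((pid, depth))
        stack.extend((child, depth + 1) for child in reversed(children[pid]))
    return ordered
```

Because the order has to be recomputed from the whole thread, paging through a long threaded discussion would probably need extra stored data, such as a sortable path per post. That is an assumption about what "not everything we need" covers.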

Comments

Great start

quicksketch

Hey Michelle, still really excited to see the possibilities here. As I've said before I'd love to help with the initial build-out of this, but I'm probably too busy to provide long-term sustained support. I'll help where possible though. :)

optional poll field

Polls still aren't fields, afaik. We'd need a different module (besides core poll) for polls. A new "poll field" module with the possibility of replacing core poll would probably be a good approach here.

Revisions on by default (need to find out how entities handle revisions)

All entities are capable of revisions. I think it's just a property in hook_entity_info() (revision table and revision key).

We might also want fields attached to users to record things like "total forum posts", "rank/title", etc. Of course these could be separate modules (entity_count.module? badges.module?)

.

Michelle

Any support you are able to do would be great. I want to make sure this is done well. I'm a competent programmer but I'm not well versed in performance, which is key here, and having other eyes on the basic architecture will help ensure there's a solid foundation to build on.

There is a contrib pollfield module. I don't know if that is remotely core ready and, honestly, wouldn't be the one to make it so. I was including it here just for reference as something that can be added, not necessarily included.

Thanks for the info on revisions. I know I have a lot of research to do when it comes to actually implementing this. I just haven't had time for that, yet, and figured I could at least make a start by writing up some basic design thoughts and getting some input. :)

I'm not sure how far we want to go on the user stuff. We'd be duplicating a lot of contrib there. It's hard with forums because stand alone forum apps do so much but I need to be careful to balance making a complete forum with not re-inventing existing contrib. There's the extra issue, too, in that I'm hoping to get the core of this into Drupal core so we need to figure out what should come with Drupal "out of the box".

I have to run. Thanks for the thoughts!

Michelle

The thread entity sounds like

catch

The thread entity sounds like a good plan looking at the schema, if only for the denormalization. It definitely makes sense to maintain that information per thread (and not in the horrible node_comment_statistics table we have now), though what the actual queries end up looking like is going to determine some of this. I wasn't sure about the thread entity initially, but this seems a good use for it: basically metadata about the thread.

You can make a contrib entity that supports revisions by writing the functions yourself, similar to node (not that node is actually a good example of revision handling; when you look at it there are a lot of bugs/limitations), or it might be worth trying to just get revisions support into entity module.

Question I need answered

Michelle

I could really use advice on this question from up there:

(Q) How do we handle relating the threads/posts to the forum(s) they are in? Relating every post will take more storage but may make calculations easier. Relating just the thread to the forum(s) is better for storage but adds another table to query. Does the denormalizing negate this? Also need to keep in mind that threads should be allowed to be in multiple forums which means fixing shadow posts to be an explicitly set field/flag on that relation rather than an assumed one.

That's a pretty key bit that needs to be decided very early on. Any thoughts?

Thanks,

Michelle

Contrib module?

andypost

Is there a sandbox or module, or is it just thoughts about refactoring?

Michelle, have you seen the schemas for popular forum software like vB or Invision?

.

Michelle

It's a module, but it has been delayed a bit due to health issues: http://drupal.org/project/artesian

I haven't looked at the schemas for other apps all that much, just briefly at phpbb & mybb.

Michelle

Tested 3 forums

andypost

vBulletin, phpbb, mybb - all have a database table architecture very similar to what you proposed (denormalization of topics)

"total forum posts", "rank/title"....yes

greg.b

"We might also want a fields attached to users to record things like "total forum posts", "rank/title"" - I think that is a great idea.

I understand that it's best to keep it as simple as possible to get the foundations solid, but adding the above would be great since, imho, it needs to come in at some point.

Keep up the great work.

multiple forums per thread

mototribe

Hi Michelle,
it's great to be able to have multiple forums per thread. That limitation is currently preventing me from using forums.
I personally like a "tagging" system; for example, check out how drupal.stackexchange.com handles their "forum".
From a data structure standpoint it's pretty much the same, just a slightly different UI and user experience.

Being able to have discussions in groups would be another feature I would personally find useful.

all the best,

UWE

Can any entity be a 'post entity' ?

Vyoma

I am not sure if this has been brought up before, but I was wondering if we can design it such that any entity (article or story or my-custom-entity) can act as a 'post entity'. I am not saying this is a deal-breaker or anything but just putting the idea out there.

This would of course bring up the point of how we represent it in the schema. Perhaps parent_id and thread_id could be exposed as fields that someone would then attach to their entities. After that, any of those entities could be used as thread starters or thread replies.

The brainstorming above might work as well, since post entity is an entity as well. Any customization I want, I can go ahead and attach it as a field to the post entity.

.

Michelle

No, but you can make different entity types... bundles...? I'm a little fuzzy on the terms. Think of a "post entity" as "node" and then you can make "node types". Same idea here. It's a little restrictive in that you can't stick, say, an event node into the forum directly but I need to tone down the massive flexibility in order to have things I can count on for better performance.

Michelle

That makes sense. Not any

Vyoma

That makes sense. Not any node, but a node marked in some form, would make it 'stick'able as post entity. I think I understand now.

(Q) Are permissions

alexanderpas

(Q) Are permissions automatically granted to subforums or do they need to be set separately?

I would suggest both.

Permissions should be automatically granted to subforums unless they are set separately.
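
The "inherit unless set separately" rule could be sketched like this in Python (not Drupal code). The `parent_of` and `explicit` maps and the `effective_permissions` helper are hypothetical, and the sketch covers a single role for brevity.

```python
from typing import Dict, Optional, Set

# Forum tree: fid -> parent fid (None for a top-level forum).
parent_of: Dict[int, Optional[int]] = {1: None, 2: 1, 3: 2}

# Explicit settings for one role; forums without an entry inherit.
explicit: Dict[int, Set[str]] = {
    1: {"view thread titles", "create post"},
    3: {"view thread titles"},  # set separately: read-only subforum
}


def effective_permissions(fid: int) -> Set[str]:
    """Use the forum's own settings, else the nearest ancestor's."""
    seen = set()
    current: Optional[int] = fid
    while current is not None and current not in seen:
        if current in explicit:
            return explicit[current]
        seen.add(current)  # guard against a cycle in parent_of
        current = parent_of.get(current)
    return set()
```

Forum 2 has no settings of its own and so picks up forum 1's permissions, while forum 3 keeps its separately set, narrower list.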