Web Publication System for Newspapers

Mike Wacker's picture

A copy of the official proposal can be found at http://www.people.cornell.edu/pages/mew66/gsoc.pdf.


Hello everybody,

I'm Mike Wacker, Assistant Web Editor for The Cornell Daily Sun. Over the past school year, The Sun has used a web publication system, codenamed MustRun, to manage our content online from when the story is assigned to when the story is finished. I coded the group of modules that make up MustRun, and other web staff at The Sun have contributed to the project as well. Currently, I am working on the second version of MustRun, and I've found that there is a great deal that can be done with this version, especially after getting feedback on the current version, which has been in use for almost a year.

Thus, I thought I should mention this as a potential Google Summer of Code proposal. Such a project would help expand Drupal's reach in the realm of journalism. The Sun built MustRun in the first place because it could not find a comparable contributed module. As newspapers consider switching to Drupal, it will be important that they can have a module out-of-the-box to help manage the publication of their content via the website. This is especially important for those publications that are limited in terms of their ability to create custom modules or modify contributed modules to meet their needs.

First, I should start with an idea of how MustRun works. In MustRun, our editors first create story assignments online. Staff writers then log on to the website and take these assignments. They put the story on the website and mark it as finished. There is also a way to give other people access to the website to revise the articles, but in the end the editors mark the story as ready to publish. Then, when the online edition of The Sun is ready to go out, one person clicks one button and the entire online edition goes out on our website.
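As a rough illustration of the workflow described above (this is not MustRun's code, which is a set of Drupal modules), the stages can be modelled as a small state machine. The stage names, the role names and the functions `advance` and `publish_edition` are all assumptions made for this sketch.

```python
from enum import Enum


class Stage(Enum):
    ASSIGNED = "assignment created by an editor"
    TAKEN = "taken by a writer"
    FINISHED = "marked finished by the writer"
    READY = "marked ready to publish by an editor"
    PUBLISHED = "published online"


# Allowed forward transitions and the (hypothetical) role that performs each.
TRANSITIONS = {
    (Stage.ASSIGNED, Stage.TAKEN): "writer",
    (Stage.TAKEN, Stage.FINISHED): "writer",
    (Stage.FINISHED, Stage.READY): "editor",
    (Stage.READY, Stage.PUBLISHED): "publisher",
}


def advance(current, target, role):
    """Return the new stage, or raise ValueError if this role may not make the move."""
    if TRANSITIONS.get((current, target)) != role:
        raise ValueError(
            f"{role} cannot move a story from {current.name} to {target.name}"
        )
    return target


def publish_edition(stories):
    """The 'one button': publish every READY story, leave the others untouched."""
    return {
        title: (Stage.PUBLISHED if stage is Stage.READY else stage)
        for title, stage in stories.items()
    }
```

For example, `publish_edition({"Budget vote": Stage.READY, "Game recap": Stage.TAKEN})` would publish only the first story and leave the second in progress.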

For the second version of MustRun, I am going back to some problems I have looked at before and retackling them (as I often see a better way to do things now that I have worked with Drupal more), and I am also looking to incorporate new features. Here are some of the more important challenges:

Permissions:
Permissions on a node change frequently. All writers get access to pick up an assignment, then the one writer who picks it up has access to it, but once the node is finalized, only editors should have access to it. Others can access it to revise the article, and of course there could be a hierarchy of permissions for the editors, too. In addition, only sports writers should pick up a sports story, some stories should not be picked up by new writers, etc. Drupal's node grants have gone a long way in solving these challenges, but I am also experimenting with other ways to do permissions. For example, I tried using a form that programmatically updates a node when a writer checks a checkbox to pick up the node and then submits the form. This lets me use PHP and the Form API to control permissions on an even finer level.

Email Notifications:
Writers should get notifications if a story is available to pick up, if an editor directly assigns them a story, etc. Editors may also benefit from occasional email notifications.

Dates:
Articles don't just have a publication date. They have a due date; there may also be an associated event which occurs at a certain time. Thus, not only must these dates be added, but they have to be integrated into the system.

UI:
It's especially important that the user interface to access all content in MustRun be intuitive and easy to use. From one calendar, users should be able to find content to pick up, view content they have picked up, and see the dates for all relevant content, etc.

Ease of Use:
Along the same line, a simple staff writer should have a very easy time using the system. But this matters a lot for the editors, too. They're managing all this content in MustRun, and they may not want to send an email notification for this article, change a setting for feature X, etc. As the feature set for MustRun adds up, the complexity of the administration pages could potentially grow to an unmanageable level. This complexity needs to be managed.

These are just a compilation of my thoughts on the matter. I'm sure I've missed a few things, and I probably will have made a lot of progress on some of the other items before GSoC even starts. But I wanted to put my thoughts out here so I could get some feedback.


Version 1 of MustRun can be downloaded at www.people.cornell.edu/pages/mew66/mustrunv1.tar.gz. It's provided on an as-is basis. Note that the architecture behind it is quite esoteric (with changing node types and strange invocations of nodeapi) and has changed to something much more reasonable for version 2. Basically, the architecture is the closest you can get to hacking the core without actually hacking the core, and I needed the mustrun_image module just to make it work with the image module. mustrun_type and mustrun_group are two modules that provide MustRun with two more options on how to do permissions. I also believe that email notifications are spread between mustrun and mustrun_group. Keep in mind, this is the old version, and I didn't know as much about Drupal back then.


The newer version of MustRun in progress can be downloaded at www.people.cornell.edu/pages/mew66/mustrun.tar.gz. It contains all the features that are fully working; some incomplete portions of code have been excluded. Basically, this has three main features: it has the fundamental architecture of MustRun, the ability to clone nodes (for a recurring assignment), and the ability to publish many nodes at once. The way permissions work is (relatively) simple at this point.

Comments

Sounds like this would be

femrich's picture

Sounds like this would be very useful as a module (or as a recipe using existing modules). The trickiest part seems like first allowing reporters to submit and then re-edit their own content, and then revoking their editing privileges once the editor marks it for publication. (As a one-time editor, I have experienced the frustration once or twice when a reporter stepped in and started re-editing copy I had already finished editing...) How do you manage that now?

Nice

It should be posted under

Mike Wacker's picture

It should be posted under both the newspaper group and the soc group. I know it was appearing under both.


As for reporters re-editing articles: in the old MustRun there were many different story statuses, but they fell into four categories.
1. Statuses for stories not picked up or assigned
2. Statuses for when the story was being written
3. Statuses for when the story was being revised
4. Statuses for when the story was approved

These statuses were all listed in one big set of radio buttons, and the list was quite long. But when the status was something in category 3 or 4, the writer did not have access to edit the article (but could still view it).

For the new MustRun, I developed three different statuses: one for the overall status of the article, one for the author, and one for revision. This gave me three smaller and more comprehensible lists instead of one big, nasty list. Editors at any time could change the overall status from "Content in Progress (All Staff)" to "Content in Progress (Editors Only)" to lock out everybody except editors from the article.

Also, for the author status, there were two statuses for when the writer had finished writing the article. One gave the writer permission to change the article; the other didn't. Only the editor could select or deselect the status where writers could not change their articles.

Technical Note: In order to do this, I always have the node marked as published in Drupal (i.e., status is 1 in the node table). If it's unpublished, then the writer will always have access to their own article. However, node grants are used to ensure that the general public can't see the article when it's published in Drupal but not actually published in the traditional sense.
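To make the locking rules above concrete, here is a minimal sketch of the author's edit access in plain Python. It is an illustration only, not MustRun's Drupal code: the two overall statuses are quoted from this comment, while the author status labels, the `Story` class and `can_edit` are hypothetical, and the revision status is left out.

```python
from dataclasses import dataclass

# Overall statuses, as named in the comment above.
ALL_STAFF = "Content in Progress (All Staff)"
EDITORS_ONLY = "Content in Progress (Editors Only)"

# Author statuses: these labels are hypothetical stand-ins.
WRITING = "writing"
FINISHED_UNLOCKED = "finished, writer may still change the article"
FINISHED_LOCKED = "finished, writer may not change the article"


@dataclass
class Story:
    author: str
    overall_status: str = ALL_STAFF
    author_status: str = WRITING


def can_edit(story, user, is_editor=False):
    """Decide whether `user` may edit `story` under the rules sketched above."""
    if is_editor:
        return True  # editors are never locked out
    if story.overall_status == EDITORS_ONLY:
        return False  # an editor has locked out everybody else
    if user != story.author:
        return False  # this sketch only models the author's own access
    return story.author_status != FINISHED_LOCKED
```

With this model, an editor who switches the author status to the locked variant stops further changes by the writer without touching the overall status of the article.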

Good thinking

yelvington's picture

Good start. Certainly the Drupal "published" flag is inadequate and pretty much irrelevant in this context.

I worked with print newsroom production systems for many years, and back in the late 1990s I did some brainstorming with a major systems vendor about a potential future multimedia content management system. I do remember some of the details of that project, which the vendor canceled for financial reasons. Many of Drupal's fundamental tools (especially the flexible taxonomy system and the trigger/action system) are very applicable to the problem.

I want to think more about it, but I'm not sure your approach to handling permissions scales well to the needs of a newsroom that might have, say, 200 people working in parallel. It's not just about permissions; it's very much about communication. The old Atex model of mapping permissions to individual and shared "desks" with ACLs applied to the desks might be replicable in Drupal.

Is your code in a state that you would be willing to post it as a project?

I'll look into gathering my

Mike Wacker's picture

I'll look into gathering my code and posting it. It's probably not well-commented, though, and there may be parts not up to the Drupal coding standards. I can certainly post the code for version one, and I'll see what I can post for version two.

In terms of permissions, right now they are mapped to user roles, not individuals. So if someone was a sports writer, they would be in the sports group, giving them permission to pick up articles in sports. I've also tried to work the code in a way that someone could upload a module that would provide a whole new set of permissions (via node grants). The current system of permissions I have works at our office, but I could develop a module for a new way to do permissions based on a different model, upload it to the website and go to a menu to select that set of permissions. Thus, on the issue of permissions, perhaps I could develop several different modules, each one for a different paradigm of permissions.
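The idea of swappable permission schemes can be sketched outside Drupal as a registry of interchangeable rules, one of which is selected at a time. This is only a model of the design, written under stated assumptions: the scheme names, `register_scheme` and `can_pick_up` are hypothetical, and in MustRun the equivalent would be separate modules providing node grants.

```python
# Registry of available permission schemes, keyed by a (hypothetical) name.
PERMISSION_SCHEMES = {}


def register_scheme(name):
    """Decorator that adds a pick-up rule to the registry under `name`."""
    def decorator(func):
        PERMISSION_SCHEMES[name] = func
        return func
    return decorator


@register_scheme("by_role")
def by_role(user_roles, story_section):
    # A sports writer (role "sports") may pick up sports stories only.
    return story_section in user_roles


@register_scheme("open_newsroom")
def open_newsroom(user_roles, story_section):
    # Everybody on staff may pick up anything.
    return True


def can_pick_up(active_scheme, user_roles, story_section):
    """Apply whichever scheme the site administrator has selected."""
    return PERMISSION_SCHEMES[active_scheme](user_roles, story_section)
```

Switching paradigms then means changing the selected name, for example from `"by_role"` to `"open_newsroom"`, without touching the rest of the system.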

I just posted version 1 of

Mike Wacker's picture

I just posted version 1 of the code. I'll work on getting version 2 to a form where I can also post it. Look back to the bottom of the original post for more information.

The newer version is now

Mike Wacker's picture

The newer version is now posted as well. See bottom of the original post again.

Looks interesting, but some

stdbrouw@groups.drupal.org's picture

Looks interesting, but some of the features you describe are easily doable with existing modules (cck for a due date, workflow for, well, the workflow and permissions-based-on-article-status, and actions for the email notifications). Does MustRun make use of this existing functionality, with the goal of "wrapping everything together" so it's easier to use, or does it start from scratch? If it starts from scratch, what was your motivation and where were modules like workflow lacking?

In terms of workflow, I

Mike Wacker's picture

In terms of workflow, I haven't used that module, but there does not appear to be a version for Drupal 6. The second version of MustRun will be built with Drupal 6. But I'm not sure workflow could handle the permissions. Those get so complex that I have to rely on a system of node grants, and like I said I'm experimenting with PHP along with the FormAPI to control permissions for picking up the article.

For email notifications, if an editor directly assigns an article to a writer, an email notification would be sent. But if the editor just made the article available for someone to pick up, then an email would go to everyone who has permission to pick up the article. And sometimes, perhaps the editor may not want to send out an email notification in a specific case. This sounds doable with actions, but I would have to code in some of these actions myself.
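The recipient rule described in this paragraph is small enough to sketch directly. The function below is a plain-Python illustration, not a Drupal action; its name and parameters are assumptions.

```python
def notification_recipients(assigned_to, eligible_writers, send_email=True):
    """Return who should be emailed about a story.

    assigned_to: the writer an editor assigned the story to, or None if the
        story was simply made available for pick-up.
    eligible_writers: everyone with permission to pick up the story.
    send_email: lets the editor suppress the notification in a specific case.
    """
    if not send_email:
        return []
    if assigned_to is not None:
        return [assigned_to]
    return sorted(eligible_writers)
```

For instance, `notification_recipients(None, {"bob", "alice"})` returns both eligible writers, while passing `send_email=False` returns an empty list.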

As for dates, if I just needed one more date for the due date, then I could just use CCK. But suppose a node has a due date, a second due date (a more hard deadline), two related events which both occur on a certain date, etc. And perhaps another story will just have one due date.
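One way to picture the requirement is a story that carries any number of labelled dates instead of a fixed set of fields. The sketch below is an assumption about the data shape only, not how MustRun stores dates; the class names and the `kind` labels are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StoryDate:
    kind: str   # e.g. "due", "hard_due" or "event" (hypothetical labels)
    when: date


@dataclass
class Story:
    title: str
    dates: list = field(default_factory=list)  # zero or more StoryDate entries

    def next_date(self, today):
        """Return the earliest date on or after `today`, or None if there is none."""
        upcoming = [d for d in self.dates if d.when >= today]
        return min(upcoming, key=lambda d: d.when, default=None)
```

A story with two due dates and two events, and another with a single due date, both fit this shape, and a calendar view could be driven from `next_date`.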

Also note that field

KarenS's picture

Also note that field permissions are going to be added into the D6 version of CCK at some point (not immediately, but hopefully soon -- Moshe is working on it). That may affect this project.

Mike, it would probably

stdbrouw@groups.drupal.org's picture

Mike, it would probably entail more work indeed if you'd want to use and adapt existing modules rather than update the code that you have, but I think the benefits are clear. By adding features to existing modules or integrating your module with them, you'll be able to accommodate more use cases and benefit from the cooperation of others who work on Workflow or on modules that integrate with it. Workflow(-ng)/Actions will be available for Drupal 6 sometime in the future, and is a very solid module.

I looked at the workflow

Mike Wacker's picture

I looked at the workflow module; the last CVS messages date back to January, and they cover the 4.7 and 5 branches. I've seen no indication of work for a port to 6, so I can't bet anything on being able to use workflow over the summer.

Another idea I considered initially was using taxonomies for the status of the node (in the spirit of using other modules), and I actually did this in an earlier version of the code. However, one main issue bugged me: taxonomies can be modified. Since the status of the story is so closely related to permissions, any accidental modification of a taxonomy (or an intentional one by someone who just decides to add a new status to the taxonomy) could create all kinds of security holes, and fixing it could be a nightmare. Perhaps I could make sure those taxonomies revert to their original state whenever they're modified, but then I'm writing more code to achieve this, when I was relying on taxonomy so I could write less. So I abandoned that idea; it's not always a good idea to use another module.

I also took a look at what you did for multiple authors, using a multivalue userreference field. While that works, multiple authors shouldn't just be something that can be added to a node type via CCK. From a newspaper's perspective, it should be something that is integrated into all node types. These additional authors will also have to be integrated into MustRun's system of permissions, and CCK does not seem to have an easy solution for that.

I'll definitely be taking advantage of the trigger module which is included in Drupal 6, although it's clear that I'll have to write some of the actions myself. Taxonomy also looks to be very useful down the line for other purposes, especially for assigning a story to different beats. Organic groups is something I'm probably going to want to look at for beats as well. But another thing I have to consider is that I don't want to build MustRun so that it has an overwhelming number of dependencies.

Workflow:

stdbrouw@groups.drupal.org's picture

Workflow: https://more.zites.net/

Multiple authors: not sure I agree there, I want to have multiple authors for stories, but not for events, statistics and static page content types. You're right though that a userreference field doesn't do anything for your permissions.

That said, I don't really see the use of such "exact" permissions - I can understand that e.g. at a certain point you'll only want your editors to have access to an article, but I've noticed from my years here at our student newspaper that strict permissions usually only complicate things. For example: what if one person wants to read the unpublished article of someone else because he's working on something related, but can't because he's not an editor; or somebody wants a certain person to proofread his/her article because he's an expert on the matter but is not an editor... and so on. 'Real-life' rules usually work better. Ymmv, of course. :-)

Well there should always be

Mike Wacker's picture

Well there should always be the option of multiple authors. If they only want one author for that article, then that one author will be stored in the uid column of the node table. So there's no cost for node types that only take one author if you allow for multiple authors.
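A small sketch of that argument: the primary author stays in the node's `uid`, and any additional authors live in a separate lookup that is simply empty for single-author nodes. The dictionaries stand in for database tables, and the `extra_authors` structure is a hypothetical illustration rather than MustRun's schema.

```python
def authors_of(node, extra_authors):
    """Return all author uids for a node, primary author first.

    node: dict with 'nid' and 'uid', mirroring columns of Drupal's node table.
    extra_authors: dict mapping nid -> list of additional author uids
        (a stand-in for a hypothetical extra table).
    """
    result = [node["uid"]]
    for uid in extra_authors.get(node["nid"], []):
        if uid not in result:  # skip duplicates of the primary author
            result.append(uid)
    return result
```

A node with no entry in `extra_authors` costs nothing extra: `authors_of({"nid": 7, "uid": 3}, {})` returns just `[3]`.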

In terms of permissions, the complex rules do exist for a reason. For example, when I had a simpler permissions scheme, all writers could see the stories for all departments. Not only was this not desired, but additionally all these articles made the scroll bar quite long. I have also made a 'mustrun view' perm in case somebody needs read-only access to all the articles (e.g.: business department). I also have a set of permissions related to revising an article. And, as someone else said, it can be quite annoying when an article is finished but the author goes back and changes a few things without the knowledge of the editors.

No system will be perfect, and you can always copy and paste an article into Word if somebody needs to look at it. But a strict set of permissions can make the user interface less cluttered with articles and also prevents unexpected changes to an article.

I'm also a journalist/editor

benc's picture

I'm also a journalist/editor and I would appreciate it if Drupal had the features you described, including a workflow that is suitable for the publishing/newspaper industry.

Part of my problem with Drupal in a newspaper/magazine environment is how to manage the content queue. From an editor's point of view, Drupal's content admin is very basic. There needs to be a simple workspace that is suitable for editors and "simple" writers.


The Power of Drupal Categories
A Podcast for Mac Switchers

I've added a link at the top

Mike Wacker's picture

I've added a link at the top to a copy of the official proposal.

+1

vivekkhurana's picture

Hi!

I have some experience in the publishing sector. I built an online publishing system for a magazine, and I fully agree with you regarding the problems faced while using Drupal in any journalism- or publication-oriented setup. Have you found a mentor for the project yet? If not, I would be interested in mentoring or co-mentoring this project.

regards
Vivek

No, I haven't got a mentor

Mike Wacker's picture

No, I haven't got a mentor yet. If you want to be a mentor, feel free to do so.

Thanks,
Mike

Please add schedule

vivekkhurana's picture

Okay please add a delivery schedule to your proposal. Ideally this should be a weekly schedule.

regards
VK

It's been done. I mentioned

Mike Wacker's picture

It's been done. I mentioned this in a comment on my app, but to let everybody know, in addition to updating the proposal to include a weekly schedule, I also updated the code for the current version of MustRun in development.

Detailed deliverables

vivekkhurana's picture

I saw your update, but I think you need to write more about deliverables in the schedule. Since the application period is now closed, you will get only one update before Google locks it down. I suggest that you post the deliverable details as public comments on the application. Once we are okay with the deliverables, you can make the final update.

HINT: Add details such as what you will do when you say "Multiple Users/Dates", or what the deliverables will be for "Architecture Design and Review". Better still, break the schedule into one-week granularity instead of two weeks. Mention what the deliverables will be on the Friday/Saturday of each week.

OK, project schedule and

Mike Wacker's picture

OK, project schedule and deliverables have been replaced with this one detailed section. Tell me if this looks good before I commit it to the proposal.

PROJECT SCHEDULE:
Weeks 1-2 - Multiple Users/Dates
Deliverables: A lightweight module for multiple users by the end of week 1, everything else by week 2

Weeks 2-3 - Architecture Design and Review
Deliverables: Any design flaws in the current code hopefully go away here so they do not return later

Weeks 4-5 - Permissions
Deliverables: At least two modules (one per week), each with a set of permissions that can be plugged into MustRun

Weeks 6-7 - User Interface/Calendar
Deliverables: Calendar interface up in week six; integrated with MustRun in week seven

Weeks 8-9 - Email Notification
Deliverables: Framework for email notifications in MustRun after week eight; Completed in week nine

Weeks 9-10 - Buffer to Handle Delays; Documentation; Hardcore Debugging; Fine-Tuning
Deliverables: Catch up where I fell behind; make bugs go away; documentation started

Week 11 - Documentation; Beta Testing
Deliverables: In addition to complete documentation, I'll likely fire up the system on cornellsun.com

calendar picker

catch's picture

Only thing I'd say is that Date module has a js calendar picker now (I think), so with a bit of luck you'll be able to use that rather than having to write a new one.

Cool. I've mentioned

Mike Wacker's picture

Cool. I've mentioned looking into the Date and Calendar modules somewhere in my proposal, but if Date has its own js calendar, I suppose I wouldn't need the latter module once I figured out the Date API.

couryhouse's picture

Looking forward to using this to help our online news source!

As we are now just using flat HTML files, you can well imagine the mess we have now that we have a lot of content. Is there a sample site set up to test? See us at www.glendaledailyplanet.com; feel free to email me there too.

ok got it... the Sun is already using Drupal.

couryhouse's picture

ok got it... the Sun is already using Drupal.
Nice! We have massively more photo and video content, though... I will have to study this. Suggestions?

Ed

Newspapers on Drupal
