Drupal as a print CMS

yelvington's picture

As the introductions continue I'd like to start a substantive discussion: What roles can/should Drupal play as a content management system for print output?

Here's some context: I've been involved for the last couple of months in a background conversation in which one of the memes is the need for a single, unified system that can output to Web, print, audio, video and as-yet-to-be-defined channels.

I'm going to quote (without attribution) from a private email:

"It would be wonderful if a one-person weekly newspaper journalist could cover the news and write it up (with photos, video, audio clips) in a clever content management interface that would, in some wonderfully automated way, result in not only the production of a Web site, but also the creation of packaged files from which a low power FM broadcast can run, a community access cable TV program can air and, of course, a newspaper can be printed."

There's a lot to that statement and I'd like to set aside the audio/video issues for a separate discussion and focus now just on the print implications. Some things to consider:

  • Does a CMS really add value in a small-scale, totally local print context? My understanding of newsroom processes in Bluffton, which has 18 people on the news staff, is that there is no CMS involved -- just standard desktop tools (email, shared folders, Microsoft Word, Adobe InDesign).
  • When moving content from Web to print, how much automation is really needed? How much is better done with cut/paste?
  • Is there a real need for NITF and/or NewsML input/output? (In Savannah we are outputting NITF from a DTI system and loading it into Drupal for dynamic page rendering.) Would an NITF content type and parser module be worth developing as a public project?
  • Is the biggest opportunity in utility data, i.e. calendaring? How about a calendar configuration recipe and an output module? Obvious needs: Taxonomy-driven event selections, Quark/Adobe-friendly formatting, etc. (Ken Rickard wrote a preliminary event-dumper we use in Savannah.)
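As a rough illustration of the calendar idea in the last bullet, here is a minimal Python sketch of a taxonomy-driven event dump. It is not Ken Rickard's event-dumper. The event records, the term names and the `@EventDate:` / `@EventItem:` style tags are all hypothetical; the tags are only loosely modeled on the paragraph-style prefixes that tagged-text import formats use, and the exact syntax would need checking against the layout program's documentation.

```python
from datetime import date

# Hypothetical event records; in Drupal these would come from event nodes
# and the taxonomy terms attached to them.
EVENTS = [
    {"title": "Farmers Market", "date": date(2007, 5, 5),
     "place": "Town Square", "terms": {"food", "outdoors"}},
    {"title": "School Board Meeting", "date": date(2007, 5, 3),
     "place": "City Hall", "terms": {"government"}},
    {"title": "Jazz in the Park", "date": date(2007, 5, 5),
     "place": "River Park", "terms": {"music", "outdoors"}},
]


def select_events(events, term):
    """Return the events tagged with `term`, sorted by date, then title."""
    chosen = [e for e in events if term in e["terms"]]
    return sorted(chosen, key=lambda e: (e["date"], e["title"]))


def dump_tagged(events):
    """Emit one paragraph per line, each prefixed with a (hypothetical) style tag."""
    lines = []
    current = None
    for e in events:
        if e["date"] != current:
            # Start a new date heading whenever the date changes.
            current = e["date"]
            lines.append("@EventDate:{}/{}".format(current.month, current.day))
        lines.append("@EventItem:{}, {}".format(e["title"], e["place"]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(dump_tagged(select_events(EVENTS, "outdoors")))
```

The point of the split is that selection (which terms, which dates) and formatting (which tags the page designer wants) can change independently.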

I had a notion a while back that you could tie together Drupal, a Web browser, OpenOffice and a bit of XML-RPC code into an interesting, if primitive, print CMS.

Housekeeping note: You can subscribe to this group via email or RSS feed; see the "group notifications" block on the right side of the page.

Comments

NITF

Max Bell's picture

It's curious that you mention all this, because I've been looking at how NITF would be implemented in Drupal -- the biggest part of it is simply the XML parser. Yes, I'd like to see this implemented in Drupal, but GADS, is it UGLY. :D

That said, the initial premise is solid. Syndication, semantics, integration (page layout and media types) and a standardized taxonomy.

I suspect it's a bit bigger than merely tagging in a bit of XML-RPC code, though. It's possible that there's a GPL'd SAX/Java application out there that would provide the heavy-lifting aspect of XML parsing that could be integrated, but I haven't made it as far as looking for one (it's just made jokes like "Saying Java is good because it works on all platforms is like saying sex is good because it works on all animals" funny).

NITF parser

yelvington's picture

There is an NITF parser in the PEAR library, contributed by Town News.
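For a sense of how little parsing the core fields need, here is a minimal Python sketch (not the PEAR parser, and not PHP) that pulls the headline, byline and body paragraphs out of a simplified NITF document. The sample document is made up, and real NITF files carry far more metadata than this handles.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily simplified NITF document for illustration.
SAMPLE = """<nitf>
  <head><title>Council approves budget</title></head>
  <body>
    <body.head>
      <hedline><hl1>Council approves budget</hl1></hedline>
      <byline>By A. Reporter</byline>
    </body.head>
    <body.content>
      <p>The city council voted Tuesday to approve the budget.</p>
      <p>The vote was 5-2.</p>
    </body.content>
  </body>
</nitf>"""


def parse_nitf(xml_text):
    """Return headline, byline and paragraphs as a plain dict."""
    root = ET.fromstring(xml_text)

    def text_of(path):
        # Join nested text so inline markup inside an element is not lost.
        el = root.find(path)
        return "".join(el.itertext()).strip() if el is not None else ""

    paragraphs = root.findall("body/body.content/p")
    return {
        "title": text_of("body/body.head/hedline/hl1"),
        "byline": text_of("body/body.head/byline"),
        "body": ["".join(p.itertext()).strip() for p in paragraphs],
    }


if __name__ == "__main__":
    print(parse_nitf(SAMPLE))
```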

victorkane's picture

The CMS I am trashing now I wrote in Java two years ago, with the Spring framework and a native XML database, to make everything work with NewsML archives.

This CMS includes Java classes that build an object mappable to form fields from a NewsML XML file and that persist a form hash back into NewsML (XML), plus a database layer to talk to Apache's Xindice (still going strong with thousands of articles, even if that Apache project looks like it's dead).

NewsML has a bunch of Metadata and Topic infrastructure, and includes as a subset the NITF schema (from the waist down (body) so to speak).

I am positive I can port those classes to PHP, at the very worst PHP5 OOP; there is now XML support built in, the algorithms are clear (we could all work on that together, make a little prototype).

But the problem is not those classes. The real test of the system's usability is how maintainable it is as user needs grow, and they grow all the time. Not just multimedia, but also semantic tagging, workflow flexibility, and different forms of document databases and interoperability. AND the web.

So I am looking to port the whole thing to (beyond java... ?). I was thinking of Ruby on Rails, which I had been working with as a refugee from the bloated Java (write once, write again, write again...), but now I am drawn to finding a solution with the Drupal framework, because it is the most productive thing out there and the easiest to maintain.

Victor Kane
http://awebfactory.com.ar

Resurfacing: Web CMS as print CMS

yelvington's picture

I'm in Las Vegas at the Newspaper Association of America's Marketing/Connections conference, which combines digital media, classifieds, marketing, research and circulation tracks. I've been having a number of hallway conversations about Drupal. I'm not hawking anything; I'm being approached by others who want to know how we're using the software. (I was on a social networking panel Monday and mentioned Drupal.)

Several people have expressed interest in using Drupal as a print CMS as well as a Web CMS. In addition, I have a private email from a northeastern U.S. newspaper that's interested in that concept. I'm not sure what to think about it. Simple web to print export is no big deal. Replicating the functionality of DTI, CSI Europe, Atex, etc., is entirely another matter. However, I have to wonder what functionality is actually valuable and what's not. Newspaper workflow actually tends to be far more departmental (news, sports, business, features) than is generally assumed, and simple solutions may be possible. If you take a Christensen disruptive-innovation approach, what's a "good enough" CMS for a print organization?

interesting idea but a little scary

darthcheeta's picture

not saying it can't or shouldn't be done, but i'd be a little concerned about the node table, honestly.

sometimes, multiple users trying to hit/edit the same node at the same time can toss things out of whack and only rebuilding the table sets things right. this could potentially happen even more frequently if you were tapping mods like aggregator to automatically import feeds as a wire desk inside drupal, because the feeds would be grabbing node numbers that users might be creating simultaneously. session table might also become more vulnerable at that point. i have no doubt that mysql couldn't be relied on for this, it would have to be postgres. you would need serious redundancy, and you might even want to deploy two drupals -- one for back end and one for front end -- and dovetail them with hooks.

we've found that out-of-package drupal isn't real great about warning that a user already has a node open for editing until you've tried to make changes and save them, and it doesn't tell you which user has it. you'd need to use wiki or something to develop a more robust edit tracking module that addressed change management and provided other workflow, maybe coopting a project management mod for some functionality.
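the kind of edit tracking described here can be sketched as a small lock table that records who has a node open and when they took it. this is a minimal in-memory python sketch, not drupal code and not any existing module's implementation; the class name, the timeout and the user names are all hypothetical, and a real version would have to live in the database.

```python
import time


class EditLocks:
    """Track which user has which node open for editing (in memory only)."""

    def __init__(self, timeout=1800, clock=time.time):
        self.timeout = timeout  # seconds before an abandoned lock expires
        self.clock = clock      # injectable so the behaviour can be tested
        self._locks = {}        # node id -> (user, time acquired)

    def holder(self, nid):
        """Return the user holding the lock on `nid`, or None if it is free."""
        entry = self._locks.get(nid)
        if entry is None:
            return None
        user, acquired = entry
        if self.clock() - acquired >= self.timeout:
            # The editor walked away; free the node for others.
            del self._locks[nid]
            return None
        return user

    def acquire(self, nid, user):
        """Try to open `nid` for `user`. Returns (ok, current holder)."""
        current = self.holder(nid)
        if current is not None and current != user:
            return False, current
        self._locks[nid] = (user, self.clock())
        return True, user

    def release(self, nid, user):
        """Release the lock, but only if `user` actually holds it."""
        if self.holder(nid) == user:
            del self._locks[nid]


if __name__ == "__main__":
    locks = EditLocks()
    print(locks.acquire(12, "polly"))
    print(locks.acquire(12, "sam"))
```

the second `acquire` call fails up front and names the holder, which is the warning that is missing today: the user learns who has the node open before making any changes.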

but the really big question i have is how would you paginate out of drupal and get camera-ready or plates to the press? sure, going to a web-focused cms sounds cool, but cutting and pasting copy off of html pages, clicking around and waiting for databases to refresh so you can paste it into quark, doesn't sound like an efficient workflow. i suppose if you were to dig deep into an open source quark alternative like scribus, you might be able to bridge them. but you are really baking a pretty serious system at this point and would have to take a very hard look at the cost of development vs. what you could buy already from a very challenged and highly competitive vendor space.

i hate to say this, but making a whole paper adds in a whole lot of complexity that web-cms can totally ignore, preserving system resources, functionality and focus for things that web cms are good at. hooking into pre-press or the wire capture/management system pushing files out via xml and rss is pretty standard and a convenient way to push pure content back and forth across online and paginated platforms.

if everyone is thinking alike, chances are no one is thinking.
www.davidandrewjohnson.com

some very good points...

victorkane's picture
  1. Even in a small group, something would have to be done about locking a file. It shouldn't be terribly difficult to develop a locking system more evident and user friendly than the current one.
  2. To get stuff out of Drupal into Quark, Adobe products or Scribus, the most efficient answer is XML. Quark is so proprietary on this, tho, and the cost of Park Avenue stuff that doesn't even work that well makes it difficult, but that goes for any system that wants to feed to Quark (and get out!). In a small group, reading html files and pasting may not be prohibitive, but this has to be overcome. With other products this could be easier, but we could all share our Quark nightmares.
  3. What has to be recognized is that a CMS serves to keep track of your article list for an issue, lets you easily postpone an article to the next issue without losing it, and all sorts of stuff (see other posts).
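To make point 2 concrete, here is a minimal Python sketch of dumping one article to XML for a layout program to import. The element names (`story`, `headline`, `byline`, `body`, `p`), the `issue` attribute and the article fields are all hypothetical; an actual import would have to match whatever structure the Quark, InDesign or Scribus side expects.

```python
import xml.etree.ElementTree as ET


def article_to_xml(article):
    """Serialize one article dict to a small XML document (as a string)."""
    story = ET.Element("story", {"id": str(article["id"]),
                                 "issue": article["issue"]})
    ET.SubElement(story, "headline").text = article["headline"]
    ET.SubElement(story, "byline").text = article["byline"]
    body = ET.SubElement(story, "body")
    # Blank lines separate paragraphs in the stored body text.
    for para in article["body"].split("\n\n"):
        para = para.strip()
        if para:
            ET.SubElement(body, "p").text = para
    return ET.tostring(story, encoding="unicode")


if __name__ == "__main__":
    print(article_to_xml({
        "id": 42,
        "issue": "2007-05-10",
        "headline": "Bridge repairs begin",
        "byline": "Staff report",
        "body": "Work starts Monday.\n\nExpect delays & detours.",
    }))
```

Postponing an article to the next issue (point 3) would then be a matter of changing the `issue` value rather than moving files around.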

Victor Kane
http://awebfactory.com.ar

sun's picture

Regarding 1) you might have a look at Checkout module.

Daniel F. Kudwien
unleashed mind

Daniel F. Kudwien
netzstrategen

sime's picture

On 2), it would be great to see how this type of thing is progressing these days.

victorkane's picture

A year or so ago I would have tended to agree with you.
But with the "new generation" of modules I don't think we are really that far away.
With CCK, metadata fields can be added. All content can have states and can be easily inserted, detected and managed in a workflow, which can be visualized with Views.

With CCK it would be easy to create an NITF-style article content type, with states and an "assigned to" field (like the Project module).

So when anyone logs in, they can see what they've got the ball on, and where they have to pick up.
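Roughly, the state-plus-assignee idea looks like this. It is a minimal Python sketch rather than CCK or Views configuration; the state names, field names and users are hypothetical.

```python
from collections import defaultdict

# Hypothetical workflow states, in the order an article moves through them.
STATES = ["draft", "copy edit", "ready for layout", "published"]


def my_queue(articles, user):
    """Group the unpublished articles assigned to `user` by workflow state."""
    grouped = defaultdict(list)
    for a in articles:
        if a["assigned_to"] == user and a["state"] != "published":
            grouped[a["state"]].append(a["title"])
    # Report states in workflow order, skipping the empty ones.
    return {s: grouped[s] for s in STATES if s in grouped}


def advance(article, next_user):
    """Move an article to the next state and hand it to `next_user`."""
    i = STATES.index(article["state"])
    if i + 1 < len(STATES):
        article["state"] = STATES[i + 1]
        article["assigned_to"] = next_user
    return article


if __name__ == "__main__":
    articles = [
        {"title": "Council budget", "state": "draft", "assigned_to": "ann"},
        {"title": "Prep football", "state": "copy edit", "assigned_to": "polly"},
    ]
    advance(articles[0], "polly")
    print(my_queue(articles, "polly"))
```

After the hand-off, both stories show up under "copy edit" in the copy editor's queue and nothing is left in the reporter's.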

There are alternatives, like Word or Open Office talking directly to a native XML database (which I have also used very successfully) acting as a master archive, with Drupal just on the receiving end. But given the ease of programming real data objects in Drupal now (with CCK, Views, Panels, and the jQuery Interface Library just around the corner for dragging items around on sortable lists, or assigning articles to pages with automatic word count, etc.), the functionality we would be looking at is not that far away.

It will be work, of course.

Victor Kane
http://awebfactory.com.ar

Depends on what you're coming from

chrisyates's picture

If you're on a DTI, CCI, Saxotech, etc. system, Drupal probably isn't going to cut it, because you'll be looking for budgeting, advanced photo and wire workflows, concurrent page editing, etc. But if you're a small to midsize newspaper with fewer than 50 newsroom users, a web-based CMS might be a step up from, say... Baseview. Plus, if you're in said small newspaper environment, you're going to find it tough to ROI a big system, since the database licenses alone for Sybase or Oracle will cost more than upgrading your Baseview or other small system to the latest version.

Three big challenges I see are:

  1. Syncing story data between InDesign/Quark pages and Drupal after they've been placed on a page
  2. Managing wire feeds (though this might be an opportunity to step UP in functionality)
  3. Giving users the drag-n-drop UI experience they're used to with InDesign

One of the missing pieces might be some InDesign/Quark plugin code to make all the UI stuff happen outside of Drupal.

-chris

Definite need for CMS even in small working group

victorkane's picture

In my experience, with a small staff of 5 working on a weekly 16-page newspaper, legacy tools just don't cut it.

  • Just keeping the documents in a single centralized location with group editing facilities can be important: email and such just don't cut it, as versions get mixed, formats differ and styles are far from uniform, creating a nightmare for the copy editor.
  • Version control - Drupal revisions are great, for example, so that different versions can be labelled and reverted back to if need be.
  • Uniformity of archiving: here NewsML really shines, as all parts of the content (NITF domain) and the metadata are available and easily transformable to new formats if need be.
  • Semantic category and topic management. Once again NewsML really shines here, although of course there are many approaches (RDF, even RSS 1.0 Dublin Core). Something, though! Folks need to be able to pull up archived material from old issues in an organized manner.
  • Working from the organization's office, working remotely from home!!! And you know instantly if you are working from copy editor Polly's most recent corrected version, or the one she sent you last week, and whether or not it has the Managing Editor's comments included.
  • Style uniformity, granularity of article elements. People are forced to present content in a uniform manner (they just won't use those pesky .dot templates!). So you have in even the most rudimentary database, the titles, the teasers, the body, even if you don't go to NITF or NewsML lengths, at least all articles are made up of the same stuff and can be made available for Quark or for Publisher or for Scribe in the same manner!
  • Using a CMS means that you write once, and publish many!!!! That is, all the articles are sent from the CMS archive straight to PDF or printer film, or what have you... but (especially if they are in NewsML or some XML format) instantly ready to be transformed for Web, for WAP, for personal PDF, for Palm, for Plucker, for what have you...!

I could go on and on... and would like to!

And in the next few months I will definitely be working on using Drupal for newspaper production to replace an existing CMS, and there will be multiple outputs: printers, extranet web, website, and intranet archive, plus the occasional WAP and Palm.

Will report here.

Just as a case in point, right now I am scraping articles from an old static website into a new Drupal site. But, since the HTML was created with XSL directly from NewsML (also available online), it is a snap to read the content and metadata with SimpleXML parsing (PHP 5), for example, and import straight into Drupal. Will report on this too, in the next 10 days or so.
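To give an idea of the read-and-import step, here is a minimal sketch in Python rather than PHP's SimpleXML, and not the actual import code. The sample is a made-up, heavily trimmed NewsML 1.x-style document (real files carry envelope, identification and management sections as well), and the keys of the resulting dict are hypothetical stand-ins for node fields.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed NewsML-style sample for illustration only.
SAMPLE = """<NewsML>
  <NewsItem>
    <NewsComponent>
      <NewsLines>
        <HeadLine>Harvest festival opens</HeadLine>
        <ByLine>Staff report</ByLine>
      </NewsLines>
      <ContentItem>
        <DataContent>
          <nitf><body><body.content>
            <p>The festival opened Saturday.</p>
            <p>It runs through <em>Sunday</em>.</p>
          </body.content></body></nitf>
        </DataContent>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>"""


def newsml_to_node(xml_text):
    """Map headline, byline and NITF body paragraphs to a node-like dict."""
    root = ET.fromstring(xml_text)
    comp = root.find("NewsItem/NewsComponent")
    paras = comp.findall("ContentItem/DataContent/nitf/body/body.content/p")
    return {
        "type": "story",  # hypothetical content type name
        "title": comp.findtext("NewsLines/HeadLine", default="").strip(),
        "byline": comp.findtext("NewsLines/ByLine", default="").strip(),
        # Flatten inline markup and join paragraphs with blank lines.
        "body": "\n\n".join("".join(p.itertext()).strip() for p in paras),
    }


if __name__ == "__main__":
    print(newsml_to_node(SAMPLE))
```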

Victor Kane
http://awebfactory.com.ar

subscribing

olalindberg's picture

subscribing