Schamper: our student newspaper on Drupal

stdbrouw@groups.drupal.org's picture

We've recently redesigned our website, so I took the time to do a small writeup and look back on our switch to Drupal a while ago. Hope you enjoy the read.

Making the transition from Textpattern to Drupal

When I joined the newspaper four years ago, the website didn't have any content management; everything was put online directly in HTML. We wanted a CMS with a dual purpose: to make online publishing faster, but also to rethink the horrible workflow of our print edition. Articles used to be delivered by mail in a variety of formats, without uniform markup, and proofreading was done on print-outs that were scattered around the newsroom. This was a burden, especially on the DTP people.

We started out with Textpattern -- a simple and elegant system -- that we then modified to support a print-to-web workflow. The time needed to produce the paper was cut almost in half. After using Textpattern for two years, however, it became clear to us that we needed something more sophisticated if we wanted to grow. That concern became even greater when we found out that Textpattern 4 was actually a renamed Textpattern 1, and that new development was proceeding agonizingly slowly. So in the summer of 2007 we switched to Drupal.

Two things impressed me immediately. One was the exceptional handling of taxonomies in Drupal. The other was the great system of hooks and APIs that makes creating modules so easy. Textpattern also had a plugin system, but anything complex required hacking the distribution. I converted all my hacks into Drupal modules in a few days. "Pro Drupal Development" by VanDyk and Westgate did ease the learning process.

Data migration wasn't really an issue. We didn't have a very big archive, and we'd used few Textpattern-specific features, so writing a few SQL select-and-insert statements was a breeze. However, it did get very messy because of charset problems. The formatting in Textpattern was UTF-8, but the database and tables were in latin1, so the data was actually stored as latin1 but could only be read if it was interpreted as UTF-8. Textpattern did this correctly, but the SQL exports didn't. This took hours and hours to fix. In the end, I managed to get everything into "real" UTF-8 by first converting all special characters to HTML entities and then converting those to UTF-8. I had forgotten all about this episode, and remembering it still aggravates me.
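For readers who run into the same kind of mismatch (UTF-8 bytes that were read back as latin1), here is a minimal Python sketch of a common repair. It is not the entity-based route described above, just an illustration of reversing the wrong decoding step. The function name `repair_mojibake` is made up for this example.

```python
def repair_mojibake(text: str) -> str:
    """Undo a 'UTF-8 bytes read as latin1' mix-up.

    If the text cannot be round-tripped (for example because it is
    already correct), it is returned unchanged.
    """
    try:
        # Turn the characters back into the original bytes, then decode
        # those bytes the way they were meant to be decoded.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Simulate the problem: correct UTF-8 bytes interpreted as latin1.
garbled = "café".encode("utf-8").decode("latin-1")
print(garbled)                   # the broken form with stray accented characters
print(repair_mojibake(garbled))  # the repaired text
```

One caveat, stated as an assumption about your setup: MySQL's latin1 is close to Windows-1252 rather than strict ISO 8859-1, so a real dump may need "cp1252" instead of "latin-1" in the encode step.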

We did continue to use the Textile markup language that comes bundled with Textpattern but also exists as a module for Drupal, simply because it's (together with Markdown) the best thing out there. We also imported the PDFs of our print edition, and they are searchable thanks to the search_attachments module.

Content Management

It's preaching to the choir, but anyway: CCK is very, very handy. It allowed us to radically rethink our content management. We don't just use Drupal for our website; we've also added an internal discussion forum and a lot of internal documentation, lists of subscribers and advertisers, manuals and reports (which are automatically mailed to all our reporters with the help of Actions). A student newspaper typically loses about a fourth of its workforce every year as students graduate, and ours is no different. That makes effective organisation pretty hard, and a lot of expertise is lost forever when somebody leaves the paper. By having all that information in a central repository and encouraging people to write how-to's, checklists and so on, we've been able to alleviate that situation somewhat, and I'm convinced that this will be a godsend for those that are in charge when I leave.

Starting now, we'll also use Drupal to coordinate our photography. This is possible because we have a dedicated server that is located in our newsroom, and is connected to our local network and to a super-speedy university line, so our layout staff doesn't have to download anything because it's all on the local network. Pretty much everything that can be centralized has been centralized.

Integration with InDesign

I've hinted at our web-to-print workflow and will elaborate a bit on that. From the start, our back-end has been more important to us than the front-end website. (That's why our new front-end design lags half a year behind our switch to Drupal itself.) So the first thing I did when we switched to Drupal was convert my homebrewed automatic XML export code into a module. Every time an article is saved, it updates or creates an XML file on our server with all the articles in it for the edition that article belongs to (which it grabs from the taxonomy). It's a bit of a mess because I'm not actually a programmer by trade (I study philosophy) but I've vowed to make a version that is decent enough to release as a contributed module this summer.

Our DTP software is InDesign. We switched over from Quark the moment that InDesign 1 was available. We're currently using CS3, and I must say: it's a great piece of software, and the team behind it knows what it's doing and really listens to users' needs. Anyway, InDesign has advanced XML support, and gives you the possibility to "link" to the XML instead of doing a one-time import. That means that when the XML is updated (because editors have proofread an article) the content that is put on the page instantly refreshes as well. So the layout team doesn't have to wait on our proofreaders anymore, but can start working whenever a rough version of an article is available.

This kind of workflow, I think, is really worth a look for small papers and magazines that don't require the fancy functions of expensive publishing systems. It's free, it brings enormous time savings, and people can do more without having to be physically in the newsroom.

Various

A few things worth mentioning. I can give more detailed info or answer other "How did ya do that?"-questions if anybody has any.

Author handling

Basic author handling in Drupal is not geared toward publications at all. There is no support for multiple authors, it displays usernames instead of full names, and it doesn't support guest authors. Thankfully all of that was pretty easy to fix.

We made a user "Press photographer" and another user "Guest writer". We also added a CCK field "byline" so that editors can override the author information without having to register a new user every time. The theme works so that the user information is ignored every time a byline is filled in. This has the advantage that articles where a user wants another name displayed (e.g. for comical effect in a satire article) are still linked to them and part of their portfolio, and that pieces by guest writers display the name of that writer instead of just "Guest Writer".

We then added a multivalue userreference CCK field to the story to accommodate multiple authors. This is themed so that all authors get equal credit. Somebody surfing the site can't see the difference between the "real" author in Drupal and the additional authors.

We also used a template.php code snippet to display users' full names everywhere instead of usernames.
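To make the byline rules above concrete, here is a minimal sketch in Python (the site itself does this in the Drupal theme layer, in PHP). The class names, function names and sample people are made up for illustration. The only rules taken from the description are that a filled-in byline overrides user information, that additional authors get equal credit, and that full names are preferred over usernames.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class User:
    username: str
    full_name: Optional[str] = None


@dataclass
class Story:
    author: User                     # the "real" Drupal author
    byline: str = ""                 # optional override typed in by an editor
    co_authors: List[User] = field(default_factory=list)


def display_name(user: User) -> str:
    """Prefer the full name; fall back to the username."""
    return user.full_name or user.username


def credit_line(story: Story) -> str:
    """Return the text shown as the author credit for a story."""
    byline = story.byline.strip()
    if byline:
        # A filled-in byline wins; user information is ignored for display.
        return byline
    # Otherwise all authors are listed the same way, with equal credit.
    everyone = [story.author] + story.co_authors
    return ", ".join(display_name(user) for user in everyone)


guest = User("guest_writer", "Guest writer")
print(credit_line(Story(author=guest, byline="Jane Doe")))
print(credit_line(Story(author=User("jdoe", "Jane Doe"),
                        co_authors=[User("asmith")])))
```

The story stays attached to its Drupal user in both cases, so it still counts toward that user's portfolio; only the displayed credit changes.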

Image handling

I decided to use the standard Image functionality instead of imagecache/CCK for two reasons. One: I adopted the attitude not to use a custom module when something very similar is available in core, because of concerns about upgradability. Fewer modules means less to worry about. Two: images as nodes easily allow commenting, ratings et cetera. Because our images are nodes, the aforementioned handling of authors applies to images too.

(The frontpage contains a lot of photos. The beautiful information.dk website inspired us in this regard.)

Comment display

As you can see on our frontpage under "Commentaar", we've grouped recent comments per article, and display the title of the article and the last reaction to it.

This was actually not that easy to do. Views' "node: distinct" filter didn't work to filter out multiple recent comments to the same article, so I had to cook something up myself. The View now loads about 30 comments and then a .tpl for that view picks out the last comment on each article (a simple PHP loop and conditional) until it has found seven distinct nodes. This is a bit wasteful, and I probably should have just written my own SQL, but for the moment it'll do.
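The loop-and-conditional approach described above can be sketched as follows. This is a Python illustration, not the actual .tpl code, and the dictionary keys `cid` and `nid` are assumed names for the comment and node IDs.

```python
from typing import Dict, Iterable, List


def latest_comment_per_node(comments: Iterable[Dict], limit: int = 7) -> List[Dict]:
    """Pick the most recent comment for each node.

    `comments` must already be sorted newest first, the way a
    "recent comments" listing would deliver them.
    """
    seen_nodes = set()
    picked = []
    for comment in comments:
        nid = comment["nid"]
        if nid in seen_nodes:
            continue  # an older comment on an article we already have
        seen_nodes.add(nid)
        picked.append(comment)
        if len(picked) == limit:
            break  # enough distinct articles found
    return picked


# Newest first; node 3 has three recent comments.
recent = [
    {"cid": 9, "nid": 3},
    {"cid": 8, "nid": 3},
    {"cid": 7, "nid": 1},
    {"cid": 6, "nid": 3},
    {"cid": 5, "nid": 2},
]
print([c["cid"] for c in latest_comment_per_node(recent)])
```

The wastefulness mentioned above is visible here: if one article attracts most of the recent comments, the batch that was loaded may contain fewer distinct articles than the limit asks for.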

Design

Because our student newspaper is actually more of a newsmagazine, we wanted a look that is fresh like the site for a magazine, but with a lot of information clearly and efficiently placed like on newspaper sites. It also couldn't look too serious, because, well, we're not too serious ourselves.

We recently started doing daily news, and made the somewhat bold move to put that daily news at the top of our frontpage, the user-contributed stuff (comments, feeds) below that, and the content that is available offline (i.e. in our print edition) at the bottom of the page. Editors tend to think of our print content as more prestigious (although that is slowly changing - writers like the fact that people can comment on and rate their articles), but there really is no point in giving the content that is already available in our print edition prime coverage on the website as well.

Panels2 came in very handy, although the complex organisation of the frontpage proved to be a mess to theme for Internet Explorer. As usual. It's still not perfect in IE, but for the moment I've had enough of its quirks to do additional fine-tuning. This is why I like desktop publishing a lot more than web design.

I looked at and got my inspiration from an enormous number of existing websites, too numerous to mention here.

Challenges and lessons learned

Make sure your OS, PHP, MySQL, the Drupal database and all of its fields have UTF-8 encoding. Really.

Selecting modules takes a lot of research. There are quite a few unstable modules out there, modules that are no longer actively developed and so on. It is often best to just skim the lists instead of performing a search on drupal.org because it's nearly impossible to guess how the functionality you want will be described and titled. The upside to this approach is that it is sometimes possible to find a module with some interesting features you weren't even looking for.

Another challenge was that I had to do just about everything myself, because we have no real programmers or web designers at our newspaper, and hiring external help was prohibitively expensive for our limited budget. However, knowing a little PHP and MySQL goes a long way. Producing the site took a lot of time and effort, but most of the coding that I had to do was relatively simple. If you're looking to start a news site without any experience with coding and with no budget to hire someone, you'd do well to get some basic PHP under your belt.

Another thing I've learned is that it is worthwhile to take as much or more time thinking about what you want, making sketches (on paper, not in photoshop) and so on, before you start theming and building your site. It allows you to focus on the essentials and not the implementation.

Comments

Excellent write up

johnbeamer's picture

Great write-up and nice site. Can you pls list all the modules you used and which were the most critical (cck/ views obviously)!

Thanks

Impressive

Itangalo's picture

Being an old student magazine editor myself I'm genuinely impressed by your work - and the writeup is really good too. "I'm not actually a programmer by trade (I study philosophy)" - Yeah, right! :-)

Two questions concerning workflow:
* How did you find your solutions to the workflow (and general functionality)? Was it just scribbling notes, thinking, re-scribbling and re-thinking - or did you formalize this in any way? Did you try different solutions before finally deciding? (Maybe this is where your philosophy comes in handy?)
* Can you elaborate a bit more on what your XML export module does?

Thanks a lot for the writeup.

//Johan Falk, Sweden

workflow

stdbrouw@groups.drupal.org's picture

Thanks!

I'll comment on the XML export later (have a train to catch). About finding a good workflow: well, what has helped me immensely is that I have been part of every aspect of the production cycle these last few years. I've written articles, have done editorial work, the last two years I led the layout team, and this year I'm editor-in-chief. Having actually participated in each of these activities made it so much easier to see what was really needed. I guess this would be hard to do in a big enterprise where the complexity of each task is higher, but getting to be somewhat familiar with what everyone does is perhaps a more attainable goal. Drawing charts and making a lot of notes was a boon as well, but we didn't really formalize the process.

And it wasn't entirely painless either: we started by asking people to send their text with uniform XML formatting, which was a mess (people forgot closing tags, used slightly different class naming etc.). We also experimented with creating the site from those base XML files with XSLT, and with a few other things as well, but all these options required too much manual work.

Nice work -Thanks for the writeup

kommidi79's picture

Impressive work.

I have a question regarding your front page. Did you use Panels 2 to achieve the layout, with various blocks placed in the panels?

If so, how did you get individual blocks (in the same row) to align in a straight line? Panels is good at specifying the width but does not have a setting for height. This is actually creating a bit of a problem for me as the individual blocks don't line up.

Thanks

avatar: Yep, Panels 2. When

stdbrouw@groups.drupal.org's picture

avatar: Yep, Panels 2. When blocks are on the same row, they do start at the same height, don't they? Perhaps you're referring to the fact that "Schamper blogt" and "UGent blogt" are in the same box? That was done with mini-panels.

itangalo: about the workflow, once more, I think what also played a part was that we knew how powerful InDesign was, and so the scope of our search was considerably narrowed from "finding a good workflow" to "making sure that everything plays nice with InDesign". The scope was further narrowed because we couldn't afford a professional publishing system, and InCopy wasn't out yet (and in any case doesn't solve the web publishing problem). The more I think about it, the more a workflow centered around a web-based CMS seems, well, evident. (Off topic: I doubt philosophy did me much good in this regard, as the skills that I've learned during my studies, if any, are more geared towards analysis of arguments rather than to analysis of problems.)

Then, about the XML export. I've noticed in the past that my command of the English language sometimes falters when I have to get technical, but I'll try. There is one XML file for each edition, and it contains all the articles for that edition. Upon every save in Drupal the "xmlize" module is activated (with the nodeapi hook, I think). That module grabs the edition number (it's part of the taxonomy) and then selects all of the nodes of that edition in the database. This is with a custom SQL statement tailored to provide the kind of content we want in the XML. The data is then put into XML with a pretty generic sql-to-xml script and saved away to a directory on our server. The messy part about this is that (a) you have to know SQL and how Drupal stores things in the database to adapt it to your own situation and (b) it exports the raw data, so the Textile markup filter has to be applied again to turn the body field into valid XHTML.
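As a rough illustration of the "one XML file per edition, rebuilt on every save" idea, here is a minimal Python sketch. It is not the xmlize module (which is PHP and reads straight from the Drupal database), and the element names, attribute names, function names and sample data are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List


def build_edition_xml(edition: str, articles: List[Dict[str, Any]]) -> str:
    """Serialize all articles of one edition into a single XML document."""
    root = ET.Element("edition", {"number": edition})
    for article in articles:
        node = ET.SubElement(root, "article", {"nid": str(article["nid"])})
        ET.SubElement(node, "title").text = str(article["title"])
        ET.SubElement(node, "author").text = str(article["author"])
        # Stored as plain text here. A real export would first run the
        # markup filter so the body becomes valid XHTML.
        ET.SubElement(node, "body").text = str(article["body"])
    return ET.tostring(root, encoding="unicode")


def save_edition(path: str, edition: str, articles: List[Dict[str, Any]]) -> None:
    """Overwrite the edition file; meant to be called after every article save."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_edition_xml(edition, articles))


sample = [{"nid": 1, "title": "A & B", "author": "Jane Doe", "body": "First draft."}]
print(build_edition_xml("465", sample))
```

Because the whole file is regenerated each time, a layout program that links to the file rather than importing it once picks up the latest proofread text.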

There is probably more future in an approach that is based on Views and a minimalistic XML markup of the contents of that view. There has been talk about an XML export for Views for the Views Bonus Pack, but nobody seems to be working on that at the moment. However, some extra work would probably still be required on top of an XML export for Views, because our export is an automatic one that requires no action on the user's part at all, whereas (as I understand it) the XML export that people have talked about is a manual one, similar to a different theme and markup for a "print version" of a webpage.

On that note: for another project (a student guide to Ghent) we've actually used the print version of Drupal book pages as XML, because those are valid and clean XHTML. That worked fine too, although it provides less control over the formatting and is manual.

johnbeamer: hm, let's see.

Four custom ones:

  • xmlize: xml export
  • columnize: splits the node entry form in three parts and allows you to put fields in any of those three columns (after I wrote it I found out there is actually another module generally available that gives you complete control over the entry form layout, although I can't recall its name at the moment)
  • Smart Tags for Textile: Textile allows you to give classes to a paragraph, but we only want to allow a few like 'introduction', 'quote' and the like. This module converts all classes to the predefined allowed ones based on similarity. That means it also allows shortcut-tags such as 'intro' for introduction. This one is actually perfectly stable, but I haven't put it on d.o. yet because so few Drupalistas seem to use Textile, let alone have use for this specific functionality.
  • HideAuthor: to hide the author when the byline field is filled in
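The similarity matching mentioned for Smart Tags for Textile could look roughly like this. It is a Python sketch using the standard library's difflib, not the module's actual code. The two allowed classes come from the description above; the prefix rule for shortcuts, the 0.6 cutoff and the choice to return None when nothing matches are assumptions.

```python
import difflib
from typing import Optional

ALLOWED_CLASSES = ["introduction", "quote"]


def normalize_class(name: str) -> Optional[str]:
    """Map a class typed by a writer onto one of the allowed classes."""
    name = name.strip().lower()
    if not name:
        return None
    # Shortcuts: 'intro' is a prefix of 'introduction'.
    for allowed in ALLOWED_CLASSES:
        if allowed.startswith(name):
            return allowed
    # Typos: take the closest allowed class if it is similar enough.
    matches = difflib.get_close_matches(name, ALLOWED_CLASSES, n=1, cutoff=0.6)
    return matches[0] if matches else None


print(normalize_class("intro"))
print(normalize_class("qoute"))
print(normalize_class("zzz"))
```

The prefix check is there because a plain similarity score rates a short shortcut like 'intro' fairly low against the much longer 'introduction'.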

Contributed ones (not all of them, only those worth a mention):

  • CCK and Views. Heavily. More than most people, I think, because we use Drupal for general content management and not only to publish articles, so that requires a whole bunch of CCK types and Views for e.g. the phonebook and so on. Also, the userreference and nodereference submodules of CCK, to refer to additional authors and to link to sub-articles, respectively.
  • Taxonomy Multi Editor: does what the title says. This was especially useful when we had just switched over, so we could tag articles en masse.
  • Formfilter: dumbs down the content entry form (you can select which fields it shouldn't display, e.g. some superfluous log fields and the like that are only interesting for me as an administrator)
  • Contact: basic contact form. I've found that people contact us more this year because they can do it right on the site, and don't need to start up their mail application, so that's why I mention it.
  • Image_attach for image display. Downside is that it only allows a single image per article.
  • Actions, for automatic mailings
  • Autosave... although we don't really like that one that much anymore. It can do nasty stuff to new articles, and only works well once an article has already been submitted.
  • Checkout: Enables users to lock documents for modification. Doesn't play well with Autosave, so we've disabled Autosave. This is crucial so that proofreaders don't step on each other's toes.
  • Comment Closer
  • Masquerade: man, this thing is great
  • search_attachment to search through pdf and word files.
  • Taxonomy Theme to give more items the admin theme (Garland)
  • Pageroute to be able to rapidly enter multiple nodes.
  • Panels 2 beta
  • Captcha for spam control
  • Tagadelic for tag clouds
  • Fivestar for voting
  • Workflow for, well, workflows ("draft", ..., "second check", ..., "ready for publication")

Awesome

yelvington's picture

Really awesome. I don't know anything about InDesign (it's been over a decade since I worked in print) but I'm really buzzed that you have it updating pages automagically based on Drupal updates.

I added this to the "best posts" category, and blogged about it.

Thanks

johnbeamer's picture

Thanks for the module list. Most interesting. As I said before great site.

Good job

cctoide's picture

Good job on the site, it seems setting up a newspaper on Drupal has never been easier.

One thing, though, you might want to design a custom favicon instead of just resizing your logo, it looks rather unrecognizable when sized down.

cctoide

how do you process photos?

aeboettcher's picture

I work for a chain of small newspapers in Washington State, and we're finally getting our web sites into the 21st century. It's going to require a decent CMS, and I love the idea of using XML for the stories. Details on how you manage/process photos for print and web would be very helpful. Thanks for the great write-up.

Sorry for the late response,

stdbrouw@groups.drupal.org's picture

Sorry for the late response, I've been on holiday. Anyway:

cctoide: thanks for the tip

aeboettcher: the photo-processing for web is taken care of by Drupal. It makes the different sizes, and with recent versions of the Image module you can also crop images to have an exact size, which is especially useful for thumbnails. It's not optimal as the cropping is automatic and thus sometimes cuts off a head or other important parts of the photo, but it's really too much work to do all of that manually for such a marginal gain (prettier thumbnails).

We make very little alteration to our photography and if it's needed we ask the photographers to do it themselves, so the photographers can upload their images directly to Drupal and they're ready for use.

Drupal leaves the originally uploaded photo intact, so the photo-processing for print is done by taking the original images from the image directory, and converting them to CMYK at 300 dpi with a Photoshop Droplet that saves them as a new file. One problem with this is that Drupal saves images with their original filename (rather than e.g. the node title), which is usually just a bunch of numbers if it's straight from a digital camera, so we have to ask our photographers to rename their files to something sensible before uploading so we can find them again. It's manageable for us, but you'd probably want something a bit more advanced for a daily newspaper.

Thanks for the awesome

omnyx's picture

Thanks for the awesome writeup and for a great website!
A couple of questions - How important do you think it is to have a different article content/display for the front page, as explained in http://drupal.org/nyobserver? The way they do it there is that they have another content type that references the real article.
In your experience, what stance would you take on this?
I would only want to do that for a few articles that make up the central part of the front page. It's also a weekly magazine so there wouldn't be too much changing around.

Aren't there two ways to do this? One would be the same way they are doing it - i.e. a new content type that references the real node. Another one would be just to squeeze all info (title for the front page, image for the front page, body for the front page, title for the node...) into one content type and display it accordingly. I.e. display some fields when the node is in the front page view, and other fields when it's not (though I don't know exactly how I could do that - any ideas?)

Also, about images - would this work - have a couple of image fields: One would be fixed size scaled down image that will be used for the front page and then the other ones would just be the images that are in the article. I could then "print" those images in node.tpl.php and style their positions in .css file. What type of module would I need to use to get this functionality?

sorry for the incoherent thoughts
thanks again!

There's more than one way

yelvington's picture

There's more than one way to do everything. In your case Nodequeue, Views, the Views Theme Wizard, a bit of CSS/HTML work, and Imagecache for the scaled images would be an easy approach. Create some custom regions, use them only on your front page.

Image handling I decided to

omnyx's picture

Image handling

I decided to use the standard Image functionality instead of imagecache/CCK for two reasons. One: I adopted the attitude not to use a custom module when something very similar is available in core, because of concerns about upgradability. Less modules means less to worry about. Two: images as nodes easily allow commenting, ratings et cetera. Because our images are nodes, the aforementioned handling of authors applies to images too.

Can you please elaborate on this a bit? By 'standard Image functionality available in core' you mean the regular 'file attachments' tab when creating a node? If so, how do you control where all those pictures go - don't they just create a big folder?
are you using the image.module?

On the other hand, how do you align your pictures in posts? css? and how do you get the image captions? css again?

thanks a bunch - it's an awesome looking website!

late answer

stdbrouw@groups.drupal.org's picture

I'd kind of lost track of this thread. I'll answer anyway, for those that stumble on this thread in the future.

@Omnyx about the NYObserver way of doing things: well, they need a lot of flexibility and they have a sizable staff. We're a student newspaper that still focuses on its print product (I'm trying to change that, but that's a long-term process) and needs to be able to get an edition published in an absolute minimum of time.

They are more concerned about getting a front page that looks absolutely right, and so their approach is different. If I went that way, though, I'd consider just adding new fields to the existing story content type for front-page display (custom teaser etc.) and creating two edit screens - one for writers and one for front-page editing - that only display the relevant fields of the content type. Though their way of doing things is definitely viable as well.

Your suggestion - to do it with a custom type for the really important front page articles, but with views etc. for the rest, seems like a good idea.

@Yelvington: (a) nodequeue is a very handy module, but people often underestimate what's possible with the "published to frontpage" checkbox. We solely use that one together with some Views criteria (e.g. put the articles on top that are promoted to frontpage, are in the news category and not in culture etc.) to organize our frontpage content. It depends on how much control you need. (b) the more recent versions of the Image module also support scaling and cropping, so that's a possible alternative to imagecache.

@Omnyx again: the Image module does not appear to be in core in Drupal 6, but if I recall correctly it was in core in Drupal 5. That's the one we use. Pictures in the articles are done with Image Attach, some CSS and some theming in template.php. All those pictures go into one big folder (files/images), that's true, and that's not optimal for interaction with a print product (as finding the images you need can take some time). I'll work out a solution for that problem this summer, but for the moment it's workable with some effort. Organizing the images in Drupal is easy though: because images created with the Image module are nodes, you can just use taxonomies.
