Tearsheet: Exporting to Print

Posted by agentrickard on March 2, 2007 at 9:55pm

I mentioned when this group started that we had written a simple module called Tearsheet. Its purpose is to let editors search for content and export plain-text formatted for pasting into a front-end system.

The original module is 4.6 and relies on a custom module that is unreleasable. But I was talking to Tim from Maine this week and he's trying to do export using Views and having trouble.

Now Views is great, but to do print (not printer-friendly) export, we generally have to skip node_view() and process data gathered by node_load() instead. So I don't think that Views will hit the target that Tim has set. I think the export itself is actually pretty easy to build, so here's a framework and some questions.

Define an admin path that can be access controlled.
Run type-specific content searches
Preview search results
Export search results to a chosen format

This part is easy and could be written using a plugin architecture that lets coders add new features without hacking the module. The default module might handle blogs, books, and events, say.
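A minimal sketch of the plugin idea, assuming each content type registers one callback that turns a loaded node into plain text. Every name here (tearsheet_sketch_exporters(), tearsheet_sketch_export(), the event_date property) is hypothetical and not part of the existing Tearsheet module. In a real Drupal module the list would more likely be gathered through a hook than hard-coded, and the closures need a newer PHP than the releases discussed in this thread.

```php
<?php
// Hypothetical registry: content type => callback that formats a node-like
// object as plain text. A real module might collect these through a hook.
function tearsheet_sketch_exporters() {
  return array(
    'blog' => function ($node) {
      return $node->title . "\n" . strip_tags($node->body) . "\n";
    },
    'event' => function ($node) {
      // event_date is an assumed property, used only for illustration.
      return $node->title . ' (' . $node->event_date . ")\n" . strip_tags($node->body) . "\n";
    },
  );
}

// Look up the exporter for this node's type; return NULL if none is registered.
function tearsheet_sketch_export($node, array $exporters) {
  if (!isset($exporters[$node->type])) {
    return NULL;
  }
  return call_user_func($exporters[$node->type], $node);
}

// Stand-in for what node_load() would return.
$node = (object) array(
  'type' => 'blog',
  'title' => 'Council meets',
  'body' => '<p>The council met <em>Tuesday</em>.</p>',
);
echo tearsheet_sketch_export($node, tearsheet_sketch_exporters());
```

Adding a new content type then means adding one entry to the registry, without touching the export code.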

A second level of admin would give some advanced controls:

Settings for search options for each content type
Templates for exporting the data (possibly with multiple templates). These templates would need to exist outside the normal PHPTemplate theme structure.
Output previews.
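The export templates in the list above could be as simple as text with placeholders. A minimal sketch, assuming a made-up %name% placeholder syntax and a hypothetical tearsheet_sketch_render() helper; the template strings could just as well be read from text files or from the database.

```php
<?php
// Hypothetical helper: fill %name% placeholders in a template string.
// $escape is an optional callback applied to every value (e.g. for XML).
function tearsheet_sketch_render($template, array $values, $escape = NULL) {
  $replacements = array();
  foreach ($values as $name => $value) {
    $replacements['%' . $name . '%'] = $escape ? call_user_func($escape, $value) : $value;
  }
  // strtr() replaces all placeholders in one pass and does not rescan
  // the inserted values.
  return strtr($template, $replacements);
}

// Two illustrative templates; the element names are assumptions.
$templates = array(
  'txt' => "%title%\nBy %byline%\n\n%body%\n",
  'xml' => '<story><head>%title%</head><byline>%byline%</byline><text>%body%</text></story>',
);
$values = array(
  'title' => 'Parks & Rec budget',
  'byline' => 'Staff',
  'body' => 'The council met Tuesday.',
);

echo tearsheet_sketch_render($templates['txt'], $values);
echo tearsheet_sketch_render($templates['xml'], $values, 'htmlspecialchars') . "\n";
```

Because the templates are plain strings, an output preview is just the same call with a sample node's values.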

The questions:

Is there a stable module that will already do this? Like Import/Export?
Tearsheet would have output templates (for XML, plain text, LaTeX, etc.). Should these be text file includes or stored in the database?
Is anyone willing to be a co-maintainer? I can write the base module very quickly, but would love to do this project collaboratively.

Discuss.

Comments

alternative method

Posted by victorkane on March 3, 2007 at 11:27am

Actually, one idea is to use embedded views; and then strip tags on the result.

Here is how to code an embedded view that, instead of returning themed output (as it would with a first parameter of "embed"), returns a hash of the data itself (including the queries; see the function comments in views.module):

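$view = views_get_view('Pub_Issue_Section_Articles');
// View arguments, in the order the view expects them.
$args = array(
  $node->field_issue_publication[0]['nid'],
  $node->nid,
  $sec_nid,
);
// 'items' returns the data instead of themed output; no pager, no limit.
$the_articles = views_build_view('items', $view, $args, false, false);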

Now, if this is hard to follow, it simply says that we want to build the view "Pub_Issue_Section_Articles" (which, as the name suggests, lists articles according to a series of view arguments specifying node_reference issues and sections) as an array of items, with no paging or limit.

Now you can iterate over $the_articles, grab titles, headlines, bylines, teasers, bodies, etc., and use the php function strip_tags() to create an ordered stream of stuff for output:

<?php
        foreach ($the_articles['items'] as $article) {
          echo '<div class="issue-section-articles">';
          echo '<hr/>';
          if ($article->node_data_field_article_slug_line_field_article_slug_line_value) {
            echo strip_tags($article->node_data_field_article_slug_line_field_article_slug_line_value);
          }
          if ($article->node_data_field_article_head_line_field_article_head_line_value) {
            echo strip_tags($article->node_data_field_article_head_line_field_article_head_line_value);
//
// an example of non-hard-copy output if it is useful to someone:
//
//            echo '<strong>' . '<a href="' . base_path() . 'node/' . $article->nid . '">' . $article->node_data_field_article_head_line_field_article_head_line_value . '</a>' . '</strong><br/>';
          }
          if ($article->node_data_field_article_sub_head_line_field_article_sub_head_line_value) {
            echo strip_tags($article->node_data_field_article_sub_head_line_field_article_sub_head_line_value);
          }

// again, a non-hard-copy example, to create a teaser on the fly
// using a version of drupal's node_teaser() function placed into the local template.php:
//          if ($article->node_data_field_article_body_field_article_body_value) {
//            echo '<blockquote>' . phptemplate_teaser($article->node_data_field_article_body_field_article_body_value, 200) .
//              '&#8230; <a href="' . base_path() . 'node/' . $article->nid . '">' . 'Ver artículo' . '</a></blockquote>';
//          }

          if ($article->node_data_field_article_body_field_article_body_value) {
            // The second argument of strip_tags() is a list of allowed tags, not a length, so it is omitted here.
            echo strip_tags($article->node_data_field_article_body_field_article_body_value);
          }
          echo '</div>';
        }
?>

This way, using views as a majestic database independent query generator (!), you can have direct access to the stuff you need and do what you need!

Victor Kane
http://awebfactory.com.ar

Victor Kane
http://awebfactory.com

Ok

Posted by agentrickard on March 3, 2007 at 2:32pm

I get that Views could handle the search nodes functions.

I follow that, but it looks like a coder solution, not a UI solution to the issue that print folks would be comfortable with. I find the Views UI excessively difficult to manage.

But back to the primary questions.

How would your non-technical folks access the embedded view?
How would the creation of embedded views be managed?
How would you create different output of the data (XML vs. TXT) without writing new code?

That said, we could possibly do this as a Views plugin -- though I am hesitant to be chasing the Views code. If we did, what would that plugin need to do?

--
http://ken.therickards.com/
http://savannahnow.com/user/2
http://blufftontoday.com/user/3


This functionality simply requires some definitions

Posted by victorkane on March 4, 2007 at 9:56pm

I think it comes back to a definition of format: once the news format is decided upon, all of this can be pretty much coded in a permanent fashion, with toXML, toPDF, toTXT, etc., available on a GUI level.

It doesn't require NewsML or NITF, but some such format would enable a separation of form and content, so that it wouldn't be necessary to make constant changes, and consequently, it wouldn't be necessary to change that code once it were in place.

So once you had your XML persistence defined (some kind of schema) it actually could be handled with binding to Drupal forms (a one time requirement), on the one hand; and either something like I have here; or, as an interesting alternative, XSLT to create the output.

Victor Kane
http://awebfactory.com.ar

Victor Kane
http://awebfactory.com

persistence

Posted by agentrickard on March 4, 2007 at 11:48pm

What I'm envisioning is that news orgs will all need distinct XML schema (based, for example, on Quark's import XML function and how the news org uses Quark).

I am thinking that we enable users to map each $node element to a display element at the admin level, and then let non-technical users select the output format when producing the export.
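A minimal sketch of that mapping, assuming the admin setting is stored as a simple array of node property => output element. The function name tearsheet_sketch_map_to_xml(), the element names, and the "story" root are all hypothetical; each news org would save its own mapping.

```php
<?php
// Hypothetical saved mapping: node property => element name in the export.
$mapping = array(
  'title' => 'headline',
  'byline' => 'credit',
  'body' => 'text',
);

// Build one XML fragment for a node-like object, using only mapped properties.
function tearsheet_sketch_map_to_xml($node, array $mapping, $root = 'story') {
  $out = '<' . $root . '>';
  foreach ($mapping as $property => $element) {
    if (!isset($node->$property)) {
      continue; // Skip properties this node does not have.
    }
    // Drop HTML markup, then escape what is left for XML.
    $text = htmlspecialchars(strip_tags($node->$property), ENT_QUOTES, 'UTF-8');
    $out .= '<' . $element . '>' . $text . '</' . $element . '>';
  }
  return $out . '</' . $root . '>';
}

// Stand-in for what node_load() would return.
$node = (object) array('title' => 'Q&A', 'body' => '<p>Hi</p>');
echo tearsheet_sketch_map_to_xml($node, $mapping) . "\n";
```

Swapping in a different $mapping changes the element names without any new code, which is the part a non-technical user would pick at export time.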

Off coding something else right now...

--
http://ken.therickards.com/
http://savannahnow.com/user/2
http://blufftontoday.com/user/3


Standards based vs proprietary non-standards

Posted by victorkane on March 5, 2007 at 3:09pm

Well, here's the thing: Quark's park avenue and the other monstrosities it has marshalled to "allow" you a round-trip (export and import) connection to an XML schema, where you have to specify each node on the XML tree by hand based on a single document, are, in my view, simply another example of proprietary non-standards shackling all of us.

Scribus, for example, uses an XML format for persistence of its files, though I am not familiar with it. Quark's persistence of its files, by contrast, is a binary, messy disgrace. Adobe's products are also moving in the direction of open, XML-based persistence.

This is why I would not go for a node field based schema, but rather an industry recognized schema for all documents (NewsML, NITF, etc).

If what you are describing is a requirement, I would say it plugs us into Quark's limitations... but if that is the case, then some sort of GUI system will be required, as you say.

Victor Kane
http://awebfactory.com.ar

Victor Kane
http://awebfactory.com

requirements

Posted by agentrickard on March 5, 2007 at 9:58pm

Well, like many things in the web-to-print space, we're hamstrung in requirements by the production methods currently in use. If I have a magazine that uses Quark (and I have not one, but 12, actually), I can't really tell them to stop using Quark and use program X because it has a better XML schema.

I also have 30 newspapers, 12 of which use InDesign and the rest likely use Quark. And I bet none of them could import from a common standard.

I talked to our internal developer who created a 4.7 implementation of Tearsheet. I'll post it sometime this month and we can take a look.

--
http://ken.therickards.com/
http://savannahnow.com/user/2
http://blufftontoday.com/user/3
