PDF use cases

public
group: PDF
Egon Bianchet@d... - Tue, 2007-02-06 08:37

What do you need from a Drupal PDF module? Do you need it to format a web page just to send it to the printer, or do you need it to generate a document like an invoice, a book, or catalogue?

How much theming and customization flexibility do you need?

Use case - connecting the community plumbing to offline actors

ndru - Tue, 2007-02-06 12:44

This has the potential to be a great feature set. In many community organizations, for whatever reason, there's a large set of people who need to be included in the community who are not able to participate on-line. Paper is a great technology for including these users. To allow them to experience the "life" of the site, and the life of the community that the site represents, I think that some way to select a pile of nodes (like through a view, or a taxonomy query, or through a module like simpleNews) is needed. So the sequence would look something like:

  1. Select a pile of nodes somehow (see above)
  2. Choose some level of detail in which to represent the nodes (either titles, teasers, full)
  3. Apply some default theming or styling functions, allowing others to plug in their own fairly easily (realizing that needs will vary considerably about this)
  4. Generate a PDF based on the styled node collection
  5. Either display the PDF, or plop it in some directory, or attach it to a node and tag it with a particular taxonomy term.
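
The sequence above could be sketched roughly as follows. This is only an illustration in Python rather than Drupal's PHP: `Node`, `render_node`, `default_theme`, `placeholder_pdf` and `build_newsletter` are hypothetical names, and the "PDF" step is a placeholder that just encodes text, standing in for whatever PDF library a real module would call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Node:
    """Hypothetical stand-in for a Drupal node."""
    title: str
    teaser: str
    body: str


def render_node(node: Node, detail: str) -> str:
    """Step 2: represent a node as title only, teaser, or full text."""
    if detail == "title":
        return node.title
    if detail == "teaser":
        return f"{node.title}\n{node.teaser}"
    if detail == "full":
        return f"{node.title}\n{node.body}"
    raise ValueError(f"unknown detail level: {detail!r}")


def default_theme(parts: List[str]) -> str:
    """Step 3: default styling; a site can pass its own function instead."""
    return "\n\n".join(parts)


def placeholder_pdf(text: str) -> bytes:
    """Step 4 placeholder: NOT a real PDF writer, it only encodes the text."""
    return text.encode("utf-8")


def build_newsletter(
    nodes: Iterable[Node],
    detail: str = "teaser",
    theme: Callable[[List[str]], str] = default_theme,
    to_pdf: Callable[[str], bytes] = placeholder_pdf,
) -> bytes:
    """Steps 1-4: take the selected nodes, render, style and convert them."""
    parts = [render_node(node, detail) for node in nodes]
    return to_pdf(theme(parts))
```

Step 5 (display, save to a directory, or attach to a node) is left to the caller, since it only needs the returned bytes.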

I'm thinking of the case where an organization might want to periodically generate a newsletter / update for its offline members. Potentially it could generate several newsletters, or even generate individually customized newsletters, which a staff person could print and mail (or hand out) to the offline members.

Hope this helps you see some of the potential for this kind of module.

I'm actually interested in the reverse functionality

stormer - Sat, 2007-06-23 04:00

i.e. viewing PDF documents within Drupal. The site I am working on requires uploading a large number of PDF documents, and we'd love to be able to view them within the browser rather than launching an external viewer. It's pretty much what these guys are doing, but within a Drupal environment: http://view.samurajdata.se/


This looks amazing and would

s.Daniel - Wed, 2008-02-27 22:33

This looks amazing and would be an awesome feature.

Maybe Adobe could be asked to support a project like this, as it is the best marketing one could ask for: it makes others not only use your product but multiply its usage.


I want to generate a list of

Tobias Maier - Wed, 2007-02-07 13:50

I want to generate a list of events, laid out in a table.
This table should be downloadable as a PDF file,
so that every member has an offline version and members without internet access can also get a great-looking copy for their own use.
I need the same functionality for a telephone list, too.

What I personally don't need is a way to download a single node.

About theming:

  1. I think it is necessary to be able to theme everything. Beginning with the header and ending with the footer :)
  2. PDF output of views should be themeable individually - I should be able to create a special theme for every view or maybe even for every content type.
    One idea could be to extend the tcpdf class and add a new method, SetThemeFunctions(), which accepts an array of theme functions for the footer, body and head, plus a general one. If one of these functions is not specified, the default one should be used.
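
To illustrate the SetThemeFunctions() idea with its fall-back to defaults, here is a small sketch. It is written in Python, not PHP, and it does not use or extend tcpdf; `ThemedDocument`, `set_theme_functions` and the default functions are made-up names, and the "general" theme function is left out for brevity.

```python
from typing import Callable, Dict

ThemeFunction = Callable[[dict], str]


def default_header(doc: dict) -> str:
    return f"== {doc['title']} =="


def default_body(doc: dict) -> str:
    return doc["body"]


def default_footer(doc: dict) -> str:
    return "-- end --"


class ThemedDocument:
    """Hypothetical document class with replaceable theme functions."""

    SLOTS = ("header", "body", "footer")

    def __init__(self) -> None:
        self._theme: Dict[str, ThemeFunction] = {
            "header": default_header,
            "body": default_body,
            "footer": default_footer,
        }

    def set_theme_functions(self, functions: Dict[str, ThemeFunction]) -> None:
        """Override any subset of slots; unspecified slots keep the default."""
        unknown = set(functions) - set(self.SLOTS)
        if unknown:
            raise KeyError(f"unknown theme slots: {sorted(unknown)}")
        self._theme.update(functions)

    def render(self, doc: dict) -> str:
        return "\n".join(self._theme[slot](doc) for slot in self.SLOTS)
```

A view-specific theme would then only pass the functions it wants to change, for example `{"footer": my_footer}`.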

What I like is that tcpdf has an HTML interpreter, although it does not work perfectly.
What I really dislike about tcpdf is that it has no proper error handling: every single stupid error ends with a die().
I created a feature request for this at SourceForge: http://sourceforge.net/tracker/index.php?func=detail&aid=1652731&group_i...

Tobias Maier - http://www.tobiasmaier.info/
--
Switch to Firefox!
Steig auf den Firefox um!
mozilla.com


Problems with Drupal5

hstern - Wed, 2007-02-21 10:56

After installing pdfview and following the (rather incomplete) manual, I only get " | Array " instead of a "view as pdf" link.

Drupal 5.1
PHP5
Apache

Is there a bugfix for that?

Thanks in advance,
Hermann

Hello Hermann, please use

Tobias Maier - Wed, 2007-02-21 12:09

Hello Hermann,

please use the Drupal Issue Tracker for a support request or to submit a bug.

We are talking about use cases in this thread.
Please provide your use case here.

Thanks



Whole threads

wouter - Fri, 2007-03-23 13:21

Hi,

We're using the pdfview module on KnoSoS (http://www.knosos.be), an experimental knowledge sharing platform.

Case : Events are often used to schedule (online) meetings. Using the comments is an ideal way to collaboratively add agenda points and also to discuss the results of the meeting afterwards. Hence those comments are very useful information.

I would very much like to see a way to (optionally) include comments in the PDF view of a node.

I might have overlooked this feature, so my apologies if my case here is useless.

If not, what are your views on a possible implementation of this? Perhaps we could do some of the work, if you haven't got it on a to-do list already.

Best regards,
Wouter


PDF module 'Publication''Journal' handling-volume, multipagePDF

ica - Tue, 2007-04-10 14:58

A 'Publication' or 'Journal' output for the PDF module would be a good feature for such a module and would give Drupal one more advantage over other CMSs.

There is an open source project from Canada, Open Journal Systems (OJS), that does exactly that, and only that:
http://pkp.sfu.ca/?q=ojs
Example site:
http://ijoc.org/ojs/index.php/ijoc

I think a similar 'publication or journal' content type can be achieved by combining the book module with taxonomy (as volumes) and a calendar view output for monthly, weekly or other publication periods.
The challenge is the PDF output: a single multi-page document, structured as a publication in volumes, delivered as a single PDF file.

Just a thought on PDF module usage. (Unfortunately, I am not a coder, so I can't make a coding contribution.)


Role-playing Game Character Sheets

CleanCutRogue - Mon, 2007-05-21 00:59

I plan on using Drupal for an online role-playing game collaboration site. I'd like users to be able to enter data on a page, selecting things from combo boxes and typing things into text boxes, have that data stored, and then click a button to get a one- or two-page PDF of their character sheet with all relevant fields populated.

intranet use case

worldfallz - Fri, 2007-08-03 01:19

I use drupal for a corporate intranet site storing all types of corporate data including: policies, procedures, processes, memos, documents of all types. We use the PDFView module to output pdfs of nodes when we need to send it to someone who, for whatever reason, doesn't have access to the main site. Because we are an outsourced IT support organization we frequently have to post our process/policy/procedure docs to the mother company site b/c the intranets have no connection to each other.

For the most part the module meets our needs as is, but ideally should be more customizable through the user interface. Currently, as admin, I have to do all the customizations in the theme by overriding the function.

Features we'd like to see for our use case are:
1. custom output per content type
2. UI settings for header, footer, margins, logo, & cover page (per content type)
3. the ability to designate a role that could change #2 but not #1. Ideally this would extend to the OG group manager level so that group managers could change their format for their content only.
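
One way the three features above could fit together is sketched below, in Python for illustration only. Everything here is an assumption: `PdfLayout`, `PdfSettings` and the role name "pdf layout editor" are invented, and the per-content-type output template (feature 1) is assumed to stay in code so that the role can only touch the layout values of feature 2.

```python
from dataclasses import dataclass, replace
from typing import Dict, Set


@dataclass(frozen=True)
class PdfLayout:
    """Layout values a non-admin role may change (feature 2)."""
    header: str = ""
    footer: str = ""
    margin_mm: int = 20
    logo: str = ""
    cover_page: bool = False


DEFAULT_LAYOUT = PdfLayout()
LAYOUT_ROLE = "pdf layout editor"  # hypothetical role name


class PdfSettings:
    """Stores one layout per content type, falling back to the default."""

    def __init__(self) -> None:
        self._layouts: Dict[str, PdfLayout] = {}

    def layout_for(self, content_type: str) -> PdfLayout:
        return self._layouts.get(content_type, DEFAULT_LAYOUT)

    def update_layout(self, roles: Set[str], content_type: str, **changes) -> None:
        """Feature 3: only users holding the layout role may change values."""
        if LAYOUT_ROLE not in roles:
            raise PermissionError("user may not change PDF layout settings")
        current = self.layout_for(content_type)
        self._layouts[content_type] = replace(current, **changes)
```

Extending this to OG group managers would need an extra check that the content belongs to the manager's group, which is not shown here.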

Until I got this module working, I was using "Print Friendly Pages" and instructing users to output to the freeware CutePDF (with gsview) pdf printer utility and was researching customizing that module to offer the option of sending that output to tcpdf.

It seems to me that there is a great deal of commonality in uses between the two modules-- it might make sense to combine resources and develop jointly.

I'm still learning to program drupal, but am eager to volunteer my services on this project.


We need an ability to

ardas - Fri, 2007-08-03 11:50

We need the ability to generate invoices most of all. To do this effectively we need flexible support for table elements as well as automatic multi-line text (that is, when you pass a long description, it should be aligned in a table cell and split into several lines).
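
The multi-line cell behaviour described above can be shown with a plain-text sketch. This uses Python's standard `textwrap` module and fixed character widths as a stand-in for real PDF cell layout; `format_row` is a hypothetical helper, not part of any PDF library.

```python
import textwrap
from typing import List, Sequence


def format_row(cells: Sequence[str], widths: Sequence[int]) -> List[str]:
    """Wrap each cell to its column width and return the row's text lines."""
    # An empty cell still needs one (empty) line so the columns stay aligned.
    wrapped = [textwrap.wrap(str(cell), width=w) or [""]
               for cell, w in zip(cells, widths)]
    height = max(len(lines) for lines in wrapped)
    rows = []
    for i in range(height):
        parts = [
            (lines[i] if i < len(lines) else "").ljust(w)
            for lines, w in zip(wrapped, widths)
        ]
        rows.append(" | ".join(parts).rstrip())
    return rows


if __name__ == "__main__":
    for line in format_row(
        ["Hosting", "Annual shared hosting plan", "120.00"], [8, 14, 7]
    ):
        print(line)
```

The long description wraps inside its own column, and the shorter cells are padded with blank lines so the row keeps its shape.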

Regards,
Dmitry Kresin, ARDAS group - Drupal CMS web sites development, Software outsourcing

I second this motion.

jo1ene - Fri, 2007-12-28 17:25

I would also like to email a node (invoice) as a PDF - to a specified email address, at least. I am considering using OG to set up client home pages, so sending a node as PDF to all OG group members would be cool.

This would presuppose reliable handling of all CCK fields.

This has an even wider application, as I would think that emailing a read-only version of a node would be useful in other respects.


print a book

lejon@drupal.org - Tue, 2007-10-09 06:01

I would like to be able to print a book along the lines of the facility in PMwiki:

http://www.wikipublisher.org/wiki/

This would mean that people would be able to print off the very latest version of an online manual at any time.

wow--- wikipublisher is very

worldfallz - Tue, 2007-10-09 13:27

wow--- wikipublisher is very cool. I'd not seen that one before. And yes, having that functionality would be awesome. I'll have to keep that solution in mind for sites that require nicely formatted output.


yup, maybe it could be adapted?

lejon@drupal.org - Tue, 2007-10-09 14:10

At the moment it's only used for PMwiki. Maybe it could be adapted?

I'm in no way shape or form

worldfallz - Tue, 2007-10-09 17:52

I'm in no way shape or form a "coder" ( i know just enough to hack myself into trouble, lol) but I plan to take a look at the code when I get a chance to see how big a deal it would be.


Wikipublisher

samrose - Fri, 2007-12-28 18:17

Hell yeah!

This is exactly the reason why I am lurking here. This is a "missing link" in Drupal for publishing.

Although, I can tell you that implementations of web-to-pdf-to-book are usually quite problematic, and not that "clean" if you actually want to create a printed book.

But it looks like Wikipublisher is headed in the right direction. http://www.wikipublisher.org/wiki/index.php?n=Wikipublisher.InstallTheSe... uses LaTeX.

I think that Drupal could instead build on http://wiki.contextgarden.net/What_is_ConTeXt which integrates vector graphics, instead of the ImageMagick bitmaps that Wikipublisher uses.

This argument about integrating http://wiki.contextgarden.net/What_is_ConTeXt into Drupal goes beyond our discussion of PDF integration.

But, basically, I have to come out and say that it would be awesome to emulate and improve upon what you see at Wikipublisher, but using http://wiki.contextgarden.net/What_is_ConTeXt as part of the engine, so that you could take collections of practically any type of node and combine them into a "book" that is typeset and ready to print as PDF, which could then be used by a print-on-demand printer to output actual books (part or all of a book). It would be cool to be able to buy just the parts of large books that are needed, for instance. (We discuss this at http://socialsynergyweb.net/cgi-bin/wiki/MicroBook)

Sam Rose
Social Synergy
Blog


Use Cases

johnbarclay - Fri, 2007-12-21 15:26

I need to be able to:

  1. select which parts of node to convert to pdf (don't show nav, logon etc.)
  2. use existing node security for viewing the pdf
  3. be able to control the mapping of html elements and classes to particular pdf rendering (Paragraph text is block, 10 pica height; H3 is inline, italic)
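
For the third requirement, a lookup table from HTML elements and classes to rendering rules might look like the sketch below. It is in Python for illustration, the style keys (`display`, `height`, `italic`) are invented, and the two default entries simply restate the examples given in item 3.

```python
from typing import Dict, Optional

Style = Dict[str, object]

# Defaults taken from the examples in the list above; keys are hypothetical.
DEFAULT_STYLES: Dict[str, Style] = {
    "p": {"display": "block", "height": "10 pica", "italic": False},
    "h3": {"display": "inline", "height": "10 pica", "italic": True},
}


def resolve_style(
    tag: str,
    css_class: Optional[str] = None,
    overrides: Optional[Dict[str, Style]] = None,
) -> Style:
    """Find the rendering rule for an element.

    Per-document overrides win over defaults, and within each table a
    "tag.class" entry wins over a plain "tag" entry.
    """
    keys = [f"{tag}.{css_class}"] if css_class else []
    keys.append(tag)
    for table in (overrides or {}, DEFAULT_STYLES):
        for key in keys:
            if key in table:
                return table[key]
    raise KeyError(f"no PDF rendering rule for <{tag}>")
```

A document type that needs different rendering would pass its own `overrides` table and leave the defaults untouched.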

I had to do something similar for some custom built form generation software that needed view only pdf versions of forms and used the following approach:

  1. Start with xhtml for web browser + xml document articulating exceptions for the particular document
  2. transform with xslt to xsl-fo
  3. apply apache fop to get to PDF (or jfor for rtf)

Ignoring the tools used, which are probably not appropriate for a Drupal module, the problem with HTML to PDF is that the developer will want to determine exactly how each HTML element maps to PDF rendering. This can end up involving a whole new intermediary language. I started off with Doug Tidwell's XSLT at http://www.ibm.com/developerworks/library/x-xslfo2app/ for the transformation and allowed for a per-document-type override of all HTML element to XSL-FO mappings.

It's vaguely outlined at:

http://www.johnbarclay.com/programming/plus_flowchart.pdf and

http://www.johnbarclay.com/programming/plus_documentation.pdf

But who wants to develop an intermediary language for mapping HTML to XSL-FO, or to whatever document precedes the PDF document in the transformation process?

When print CSS came along, I ended up repeating in print CSS the same info that was in the configuration XML files (though I could have generated the print CSS from the config XML, I suppose).

A solution could use print CSS as the XHTML-to-PDF configuration language. The print CSS file could be the same one used by the browser for printing, or a second print CSS file used only for converting to PDF. Aside from print CSS being something people are used to working with, the display:none rule could be used to hide unwanted parts of a page from PDFs, fulfilling my use cases 1 and 3.
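
As a rough illustration of the display:none idea, the Python sketch below drops every element whose class is hidden by the print CSS before the markup would be handed to a PDF converter. It is deliberately simplified and makes several assumptions: only plain `.classname { ... }` rules are recognised, comments and doctypes are discarded, and the input is assumed to be well-formed. The function names are hypothetical.

```python
import html
import re
from html.parser import HTMLParser
from typing import List, Set

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


def hidden_classes_from_css(css: str) -> Set[str]:
    """Collect class names from simple `.name { display: none }` rules."""
    hidden = set()
    for match in re.finditer(r"\.([\w-]+)\s*\{([^}]*)\}", css):
        if re.search(r"display\s*:\s*none", match.group(2)):
            hidden.add(match.group(1))
    return hidden


class _PrintFilter(HTMLParser):
    def __init__(self, hidden: Set[str]) -> None:
        super().__init__(convert_charrefs=True)
        self.hidden = hidden
        self.depth = 0  # greater than zero while inside a hidden element
        self.out: List[str] = []

    def _is_hidden(self, attrs) -> bool:
        classes = (dict(attrs).get("class") or "").split()
        return self.depth > 0 or any(c in self.hidden for c in classes)

    def handle_starttag(self, tag, attrs):
        if self._is_hidden(attrs):
            if tag not in VOID_TAGS:
                self.depth += 1  # void tags have no end tag to balance
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self._is_hidden(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth > 0:
            if tag not in VOID_TAGS:
                self.depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(html.escape(data, quote=False))


def strip_hidden(markup: str, print_css: str) -> str:
    """Return the markup without elements hidden by the print CSS."""
    parser = _PrintFilter(hidden_classes_from_css(print_css))
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

A real implementation would need a proper CSS parser to handle IDs, descendant selectors and the cascade; this only shows that the hiding rule can be driven from the same file the browser uses for printing.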

I'd be willing to help on this module. I've been working with Drupal for 6 months or so and php for a couple of years.

I have different needs

Patola - Sat, 2008-01-05 19:16

Ok, I may be saying lots of stupid things here, but the main point I wish to raise is that Drupal is not about PDF, it is about HTML. It was made for the web, so it will render HTML. PDF is an afterthought. As such, although XSLT and LaTeX are obviously superior approaches, they would just be excess layers for building the final output. If one can't stand to lose formatting,

So, I think the best approach would be to generate a really good HTML output that could be nicely converted to PDF. With line breaks, Table of Contents (generated from the book structure), Page Numbering (is it possible in HTML?) and everything else. Also, it would use drupal filters, so a LaTeX formula could appear as a PNG and be smoothly converted to a PDF picture. Then, an external program (or maybe a library) would just get this HTML-with-pictures and generate a PDF file.

My main need for this is with books. So there's little need for individually selecting pages; it would follow the book's structure. The options would be something like this:

  • ( ) Print bibliographic reference (using biblio)
    • [ ] At the end of each chapter
    • [ ] At the end of the book
  • Page size
    • [ ] Letter
    • [ ] A4
    • [ ] A5
  • ( ) Print comments
    • [ ] As an appendix at the end of the book.
    • [ ] On each node.
    • [ ] At the end of each chapter.
  • ( ) Extract links
    • ( ) Keep links clickable
    • ( ) Generate links reference at the end of a
      • [ ] Node
      • [ ] Chapter
      • [ ] Book
  • ( ) Add line feed at the end of each page
  • ( ) Print page numbers (maybe that would only be possible with the previous option set. How could I number pages in HTML? Would I have to resort to tricks like adding a line before the page break or adding spaces?)
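
The option list above could be captured in a small settings object, sketched here in Python. The field names and allowed values are my reading of the list, not an agreed design, and the rule that page numbers need page breaks only encodes the guess made in the last bullet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BookPdfOptions:
    """Hypothetical settings mirroring the option list above."""
    bibliography: Optional[str] = None    # None, "chapter" or "book"
    page_size: str = "A4"                 # "Letter", "A4" or "A5"
    comments: Optional[str] = None        # None, "appendix", "node", "chapter"
    keep_links_clickable: bool = True
    link_reference: Optional[str] = None  # None, "node", "chapter" or "book"
    page_breaks: bool = False
    page_numbers: bool = False

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        found = []
        if self.bibliography not in (None, "chapter", "book"):
            found.append("bibliography must be unset, 'chapter' or 'book'")
        if self.page_size not in ("Letter", "A4", "A5"):
            found.append("page_size must be 'Letter', 'A4' or 'A5'")
        if self.comments not in (None, "appendix", "node", "chapter"):
            found.append("comments must be unset, 'appendix', 'node' or 'chapter'")
        if self.link_reference not in (None, "node", "chapter", "book"):
            found.append("link_reference must be unset, 'node', 'chapter' or 'book'")
        if self.page_numbers and not self.page_breaks:
            # Assumption taken from the last bullet: numbering needs breaks.
            found.append("page_numbers requires page_breaks")
        return found
```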

Well, whoa, that is what I need. If I have enough time, I'll try to program it myself. I am still experimenting and trying to choose among html2pdf, htmldoc and dompdf to generate the final PDF. I'd probably have to use temporary files for this, though, because of the images/filters. I don't know if I can generate it inline.

With this kind of HTML output that can be nicely converted to PDF, one could always keep using XSLT or LaTeX to reprocess it and adapt it to his/her needs. So it would be a good solution, in my opinion. What do you think?


This is pretty much what I

worldfallz - Mon, 2008-01-07 15:04

This is pretty much what I need also. I'm sort of cheating by using the Printer-Friendly pages module to control the HTML via print css specification then "printing" to a PDF print driver.

Seems to me there's actually a lot of overlap between the approach you outline and the Printer-Friendly pages module-- the only difference being where the output is directed (physical printer vs. PDF creation utility). Regardless of output device, the main idea in both cases is to add some controls for specifying the HTML created for certain output devices.


A spectrum of needs

samrose - Sat, 2008-01-05 21:34

It seems to me, based on feedback here and my own experience, that there is a spectrum of needs.

And, as such, it would be probably best to have a basic "glue" module for output of Drupal content to PDF, that would allow other processing to attach to it.

This basic module would accomplish part of what Patola outlines above: to "generate a really good HTML output that could be nicely converted to PDF."

This basic module could allow a choice among multiple open source PDF processing libraries, including TCPDF, html2pdf, htmldoc, dompdf, or anything else. Thus, site admins would not be restricted to one HTML-to-PDF processing solution.
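
The "glue" idea of swappable converters can be sketched as a small registry. This is Python for illustration only; `register_backend` and `html_to_pdf` are hypothetical names, and the single registered backend is a dummy that returns placeholder bytes rather than calling TCPDF, dompdf or any other real library.

```python
from typing import Callable, Dict

Converter = Callable[[str], bytes]

_BACKENDS: Dict[str, Converter] = {}


def register_backend(name: str) -> Callable[[Converter], Converter]:
    """Decorator that makes a converter available under the given name."""
    def decorator(func: Converter) -> Converter:
        _BACKENDS[name] = func
        return func
    return decorator


def html_to_pdf(html: str, backend: str) -> bytes:
    """Convert HTML using whichever backend the site admin selected."""
    try:
        convert = _BACKENDS[backend]
    except KeyError:
        raise LookupError(f"no PDF backend registered as {backend!r}") from None
    return convert(html)


@register_backend("dummy")
def dummy_backend(html: str) -> bytes:
    # Stand-in only: a real backend would call an actual PDF library here.
    return b"PLACEHOLDER:" + html.encode("utf-8")
```

Each wrapper around a real library would register itself the same way, so the core module never needs to know which libraries are installed.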

Then, there could also be modules that connect different processing abilities, such as print CSS, XSLT, LaTeX, ConTeXt, etc. And modules could be developed and deployed that would allow for complex, paginated PDF creation based on multiple nodes of different types. This way, a PDF could be generated from "book" content, but also from any other content.



Glue and Forms

light-blue-pdx - Mon, 2008-04-28 06:46

My PDF requirement is to convert a collection of in-house paper forms into a digital format, so that once filled out over HTML, a final PDF version becomes available (not unlike the on-line IRS tax forms approach).

In that use case, it seems to me that Sam's "glue" might be something like http://www.ros.co.nz/pdf/ -- though honestly, it might not handle the positioning of HTML elements (the print page to PDF approach) in the way this group requires, or at least not without tweaking.

Given my forms requirement, I'm curious about your thoughts on the following approach: write a module to hook into certain content types and provide additional fields requesting X and Y coordinates for that content type's form elements on the final PDF document. Such a module would ideally be CCK-field aware. This user-collected positioning data would then be fed to the "glue" above.
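
A minimal sketch of that approach, in Python for illustration: each form field gets a stored X/Y position, and submitted values are turned into a list of placed strings that a PDF library could then draw. `FieldPosition` and `place_fields` are invented names, and millimetres measured from the top-left corner are an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FieldPosition:
    """Where a field's value should appear on the page (assumed mm units)."""
    x_mm: float
    y_mm: float


def place_fields(
    values: Dict[str, object],
    positions: Dict[str, FieldPosition],
) -> List[Tuple[float, float, str]]:
    """Pair submitted values with their coordinates, top-to-bottom order.

    Fields without a submitted value are skipped; values without a
    configured position are ignored.
    """
    placed = [
        (pos.x_mm, pos.y_mm, str(values[name]))
        for name, pos in positions.items()
        if name in values
    ]
    return sorted(placed, key=lambda item: (item[1], item[0]))
```

The resulting `(x, y, text)` tuples are what would be passed on to the PDF-writing "glue", typically drawn over a scanned image of the original paper form.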

Thoughts?

CSS support

seutje - Fri, 2008-05-23 07:28

my kingdom for CSS support in the PDFView module :(

and a more flexible theming way would also be much appreciated, since I can't seem to have this thing spit it out the way I want it

but I suppose it's better than nothing, considering I'm forced to use drupal5 and the printer friendly module doesn't support D5 :x

<3