DITA Publishing on Drupal

suzannewang's picture

Per gippy's request, I started this thread.

As a technical writer at a software company, I am responsible for publishing documentation on Drupal. We need to move from PDF publishing to online publishing on Drupal.

After some careful research on Drupal.org, I found the Import HTML module. Since we author in FrameMaker's DITA authoring tool, our source is a set of DITA files. To be able to import them, we first convert the DITA files to XHTML using DITA-OT. This step is straightforward.
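A minimal sketch of the conversion step, assuming a DITA-OT release that ships the `dita` command-line tool (older releases are driven through Ant instead). The file name `userguide.ditamap`, the output folder, and both function names are placeholders for illustration, not part of the original setup:

```python
import subprocess
from pathlib import Path


def build_dita_command(ditamap, out_dir, dita_exe="dita"):
    """Return the argument list for a DITA-OT XHTML build.

    Assumes the `dita` launcher is on the PATH; pass dita_exe to point
    at a specific installation instead.
    """
    return [
        dita_exe,
        f"--input={Path(ditamap)}",
        "--format=xhtml",
        f"--output={Path(out_dir)}",
    ]


def convert_to_xhtml(ditamap, out_dir, dita_exe="dita"):
    """Run DITA-OT; raises CalledProcessError if the build fails."""
    subprocess.run(build_dita_command(ditamap, out_dir, dita_exe), check=True)


if __name__ == "__main__":
    # Only print the command here; call convert_to_xhtml() to run it.
    print(" ".join(build_dita_command("userguide.ditamap", "out")))
```

The resulting output folder is what you would then copy to the server that Import HTML reads from.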

Other Preparations:

  1. Drupal doesn't support DITA map import, and we need a hierarchical structure to organize the imported HTML files. Before importing, we created that hierarchy in taxonomy by adding new vocabularies and terms. The taxonomy appears in the main menu, enabled by the taxonomy menu function. In the import module's settings, you can also link the imported HTML files to a vocabulary/term.

  2. We created a new content type named Imported HTML File and added the corresponding fields. These fields are linked to the related vocabulary/term to categorize the HTML files.

  3. To ensure imported HTML files appear under the related vocabulary in the main menu, go to Home » Administration » Structure » Import HTML Site » Manage settings » Replication Options and customize these settings:
    a. Node type for new pages
    b. Menu settings: add each page to menu
    c. Select a vocabulary term as index pages.
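Because the taxonomy hierarchy has to be created by hand, it can help to print the outline of the DITA map first and use it as a checklist for the vocabularies and terms. The following is only a sketch: it assumes a simple map with directly nested `topicref` elements (no `topichead`, `mapref`, or specialized elements), and the sample map and function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample map; a real one would be read from the .ditamap file.
SAMPLE_MAP = """
<map>
  <title>User Guide</title>
  <topicref href="install.dita" navtitle="Installation">
    <topicref href="requirements.dita" navtitle="Requirements"/>
  </topicref>
  <topicref href="config.dita"/>
</map>
"""


def topicref_outline(element, depth=0):
    """Yield (depth, label) for each nested topicref below element."""
    for child in element:
        if child.tag != "topicref":
            continue
        # Prefer the navigation title, fall back to the topic file name.
        label = child.get("navtitle") or child.get("href", "(untitled)")
        yield depth, label
        yield from topicref_outline(child, depth + 1)


def outline_from_ditamap(xml_text):
    """Parse the map text and return the outline as a list of tuples."""
    return list(topicref_outline(ET.fromstring(xml_text)))


if __name__ == "__main__":
    for depth, label in outline_from_ditamap(SAMPLE_MAP):
        print("  " * depth + label)
```

Each depth-0 entry would be a candidate top-level term, with deeper entries as child terms.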

Importing

Go to Home » Administration » Structure » Import HTML Site, select the server storing your HTML files, and start importing.

Summary

The results are acceptable. However, some CSS styles are missing from the imported HTML, so you need to add these styles manually in main.css.

It's not a simple process; it requires some Drupal configuration and CSS skills to accomplish.
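One rough way to find out which styles are missing is to compare the class names used in the generated XHTML with the class selectors present in main.css. This is a sketch under simplifying assumptions: the CSS check is a plain regular expression, not a real CSS parser, and the sample markup and function names are invented for illustration.

```python
import re
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name used in class="..." attributes."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_used(html_text):
    collector = ClassCollector()
    collector.feed(html_text)
    return collector.classes


def classes_styled(css_text):
    """Return names that appear as .classname tokens in the CSS (rough)."""
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.S)  # drop comments
    return set(re.findall(r"\.(-?[_a-zA-Z][\w-]*)", css_text))


def missing_classes(html_text, css_text):
    """Classes used in the HTML that have no selector in the CSS."""
    return sorted(classes_used(html_text) - classes_styled(css_text))


if __name__ == "__main__":
    html = ('<h1 class="title topictitle1">Install</h1>'
            '<p class="shortdesc">Intro</p><div class="note">Tip</div>')
    css = ".topictitle1 { font-size: 1.5em; } .note { margin: 1em; }"
    print(missing_classes(html, css))
```

In practice you would feed it the text of each imported page and of main.css, then decide which of the reported classes actually need styling.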

If you need further clarification, please don't hesitate to contact me.

By the way, I have attached a sample page of the HTML publishing.

Thanks,
Suzanne

Attachment: A sample.png (143.53 KB)

Comments

Import DITA via (X)HTML

Frank Ralf's picture

Hi Suzanne,

Many thanks for these detailed instructions! The Import HTML module was also high on my list of favorite modules. I will give it a try following your instructions and report back :-)

Cheers,
Frank

Styles, docbook, etc.

jhodgdon's picture

I have a module (currently an unsupported "sandbox") that is aimed at displaying AsciiDoc output in a Drupal site:
https://www.drupal.org/sandbox/jhodgdon/2265553

The idea there is that you first take the AsciiDoc text and convert it to DocBook using the "asciidoc" command. If you have a way to convert your DITA to DocBook, you could enter the process at this point.

Then you use the "xmlto" processor, with a custom stylesheet that is specific to the module, to convert the DocBook to bare XHTML.

Finally, the Drupal module reads the special XHTML files and is able to display them, with navigation, in a Drupal site. The content is not actually stored in the database -- it is stored in the XHTML files, and the module reads the files as needed to generate particular pages. At least, that is how it is done currently... it could probably be adapted so that instead of displaying the files directly, they are read into Drupal nodes or a custom entity type.

But the basic idea is there: rather than trying to read the generic XHTML that would be generated by default by the AsciiDoc -> DocBook -> XHTML tool chain, you instead generate specific XHTML (by specifying a style sheet) that the module can parse more easily.

Anyway... I'm not sure if this module could be used as-is for DITA output, but maybe the ideas could be adapted? It sure was easy to write -- I've put in probably 10 hours of development total on that module. Of course it's a bit rough around the edges... but the kernel of an idea may be there?

DITA to XHTML

Frank Ralf's picture

Hi Jennifer,

Many thanks for your input. DITA XML is usually transformed to other formats with the DITA Open Toolkit (DITA-OT). As the DITA-OT already provides a couple of XHTML output formats, I don't think it is necessary to use DocBook as an intermediary format.

I started playing a bit with the Import HTML module and I am quite impressed by how sophisticated it is. The mapping between input and output format is done with XSLT stylesheets, which probably need some tweaking to make them usable with DITA XML. However, I will definitely have a look at your module to see how you do the transformation.

BTW, I stumbled across the pandoc tool by John MacFarlane, a universal document converter that converts files from one markup format into another. Perhaps this can be another source of inspiration ;-)

Kind regards,
Frank

Docbook chain vs. DITA chain

jhodgdon's picture

The point of using the DocBook chain is that the AsciiDoc in Drupal module uses a custom style file to transform the DocBook to a particular known format of XHTML, so that when the Drupal module reads the output, it can immediately pick out precisely what needs to go on each page, and also use the generated table of contents as the navigation.

Maybe Import HTML is doing something similar... but the post above says that DITA maps are not handled. That's one thing my module does with the DocBook toolchain: because it outputs the tables of contents in a special format, the Drupal module is able to read and use them for navigation.

Re-creating DITA document structure

Frank Ralf's picture

Hi Jennifer,

Thanks for the clarification. The Import HTML module should be able to re-create a menu structure from an index.html file which is indeed created by the DITA-OT transformation. However, I'm still struggling with some technical details so I haven't been able to test this yet. I will keep you posted ;-)
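To illustrate what re-creating a menu structure from index.html involves, here is a small sketch that reads nested lists of links into (depth, href, title) entries. It assumes the table of contents is written as nested `<ul>` lists with one `<a>` per entry; it does not reflect how the Import HTML module does this internally, and the class name and sample markup are invented.

```python
from html.parser import HTMLParser


class TocParser(HTMLParser):
    """Collect (depth, href, title) for links inside nested <ul> lists."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current <ul> nesting level
        self.entries = []
        self._href = None     # href of the link being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.entries.append((self.depth - 1, self._href, title))
            self._href = None

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)


def parse_toc(html_text):
    parser = TocParser()
    parser.feed(html_text)
    return parser.entries


if __name__ == "__main__":
    sample = """
    <ul>
      <li><a href="install.html">Installation</a>
        <ul><li><a href="requirements.html">Requirements</a></li></ul>
      </li>
      <li><a href="config.html">Configuration</a></li>
    </ul>
    """
    for depth, href, title in parse_toc(sample):
        print("  " * depth + f"{title} ({href})")
```

The depth values correspond to menu levels, which is the information a menu import needs to place each page under its parent.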

Kind regards,
Frank

If you select Add each page

suzannewang's picture

If you select "Add each page to menu" in the settings of the Import HTML module, the imported pages will appear in the selected menu. However, the menu structure still needs fine-tuning in terms of hierarchy and order.

Cheers,
Suzanne.

DITA template for Import HTML module

Frank Ralf's picture

I've created a first version of an XSLT template which prepares the XHTML output of the DITA Open Toolkit for further processing with the Import HTML module, see https://www.drupal.org/node/2470337. (I will write some detailed instructions on how to use this template as soon as I find the time.)

You should use one of the dev versions of the module for testing with DITA XHTML content because there was a minor issue of the module not recognizing the XHTML files, see https://www.drupal.org/node/2448437 for details.

Cheers,
Frank

index.html

jhodgdon's picture

Yeah, the custom stylesheet in the Asciidoc in Drupal sandbox module is largely aimed at making an index.html file that is more easily parsed by the module. ;)

That seemed a lot easier to do when I was developing that module than trying to parse the default docbook -> html index file. You may find something similar can help with DITA too.
