Automation from 3B2 CMS

amax

Hi Everyone,
We currently publish a printed weekly magazine with a circulation of over 60K readers and now wish to start publishing to the web. I have pretty much decided on Drupal as the platform of choice, and loved the NYobserver case study. However, we have one major concern. Our journalists use the 3B2 Advent system. The requirement from the top is that this definitely has to remain in place, so having journalists enter and publish stories through a custom CCK content type in Drupal is not an option. What we can do, however, is output all journalist stories as clean XML from 3B2 / Advent, but we have no idea where to go from here. Can anyone advise on:
1) The best way to export this XML for Drupal, and
2) Whether it is possible to import this XML into Drupal nodes, and what structure we should use to make the most of our XML files.

Kind Regards
amax

Comments

I've had good luck with Node

stdbrouw@groups.drupal.org

I've had good luck with Node import (CSV only, though), which lets you map the content to CCK fields and works exactly as advertised. I'm quite curious myself whether a good XML import is available. Supposedly few people work with the Import/Export API, but perhaps there are alternatives. FeedAPI together with Feed Element Mapper might provide a starting point?

Information.dk seems to have developed a custom module to do the job, and good NITF/NewsML imports don't seem to be very easy, so there are no easy solutions at hand afaik.
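For anyone wondering what such a custom import boils down to, here is a minimal sketch in Python of the first step: turning one exported story into a flat set of fields that could then be saved as a node. The XML layout, the element names and the field names (`field_byline`, `field_section`) are all made up for illustration; the real 3B2 / Advent export will look different, and the actual node save would happen in Drupal itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format: the real 3B2 / Advent XML will differ.
SAMPLE_STORY = """<story id="2008-0142">
  <headline>Council approves new bridge</headline>
  <byline>Jane Doe</byline>
  <section>Local News</section>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</story>"""

# Hypothetical mapping from XML element names to node field names.
FIELD_MAP = {
    "headline": "title",
    "byline": "field_byline",
    "section": "field_section",
}


def story_to_node(xml_text):
    """Convert one story document into a plain dict of node fields."""
    root = ET.fromstring(xml_text)
    # Keep the source id so a later import can recognise the same story.
    node = {"type": "story", "source_id": root.get("id")}
    for tag, field in FIELD_MAP.items():
        # A missing element becomes an empty string instead of failing.
        node[field] = (root.findtext(tag) or "").strip()
    # Join the body paragraphs, keeping any text inside inline markup.
    paragraphs = ["".join(p.itertext()).strip() for p in root.findall("body/p")]
    node["body"] = "\n\n".join(p for p in paragraphs if p)
    return node


if __name__ == "__main__":
    print(story_to_node(SAMPLE_STORY))
```

Keeping the element-to-field mapping in one table means a change in the export format only touches `FIELD_MAP`, not the parsing code.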

Other people on this group are more experienced on this front than me, though, so stick around ;-)

experience for importation ?

JBI

Are you referring to http://drupal.org/project/node_import?

Could you tell us what your turnaround was with it?
We have two types of article node, with 700 items in CSV, and another XML file of 4 GB.

node import

stdbrouw@groups.drupal.org

Yup. I've only used it for small-scale imports, so I can't really comment on its performance. For daily or regular imports, you would need to code some custom functionality to automate the process, because Node Import (at least the version I tried in May) doesn't remember your mappings (so you'd have to re-map on every import) and doesn't do automatic imports (e.g. with cron). Shouldn't be too much work though, if the module fits your bill otherwise.
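To give an idea of how little is involved in remembering a mapping, here is a rough Python sketch (not Node Import's own code, which is PHP): the column-to-field mapping is saved to a JSON file once and reloaded on every run, so a scheduled job such as cron never has to re-map. The function names, the file layout and the field names are assumptions made up for this example.

```python
import csv
import io
import json


def save_mapping(mapping, path):
    """Persist a {csv_column: node_field} mapping so later runs can reuse it."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(mapping, fh, indent=2)


def load_mapping(path):
    """Load a mapping previously written by save_mapping()."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def rows_to_nodes(csv_file, mapping, node_type):
    """Yield one dict of node fields per CSV row, using the stored mapping.

    Columns that are not in the mapping are ignored.
    """
    for row in csv.DictReader(csv_file):
        node = {"type": node_type}
        for column, field in mapping.items():
            node[field] = (row.get(column) or "").strip()
        yield node


if __name__ == "__main__":
    # Hypothetical column and field names.
    mapping = {"Headline": "title", "Author": "field_byline"}
    sample = io.StringIO("Headline,Author\nBridge approved,Jane Doe\n")
    for node in rows_to_nodes(sample, mapping, "story"):
        print(node)
```

A cron entry would then only need to call `load_mapping()` and `rows_to_nodes()` on the latest export file.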

The current version of Node

Garrett Albright

The current version of Node Import remembers column mappings quite well. The Eureka Reporter uses it for importing events into their event calendar. However, it has its share of other issues which required some hacking of the module in order to get things working. It's worth a try, at least.

FeedAPI, Atom

yelvington

We've created a CCK content type for news stories that are being exported from our legacy systems in Atom format. Tobby set it up, and at the moment he's on vacation so I can't provide much in the way of details.
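Until those details are available, here is a small Python sketch of what reading such an Atom export could look like. It is not the setup described above: the sample feed and the output field names (`guid`, `author`, `categories`) are invented for illustration. Only the Atom namespace and element names come from the Atom format itself.

```python
import xml.etree.ElementTree as ET

# The Atom 1.0 XML namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Invented sample feed, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example newsroom export</title>
  <id>urn:example:newsroom</id>
  <updated>2008-08-01T12:00:00Z</updated>
  <entry>
    <title>Council approves new bridge</title>
    <id>urn:example:story:142</id>
    <updated>2008-08-01T11:30:00Z</updated>
    <author><name>Jane Doe</name></author>
    <category term="Local News"/>
    <content type="text">Story text goes here.</content>
  </entry>
</feed>"""


def parse_atom(feed_xml):
    """Return one dict per <entry>, ready to be mapped onto node fields."""
    root = ET.fromstring(feed_xml)
    stories = []
    for entry in root.findall("atom:entry", ATOM_NS):
        stories.append({
            "guid": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
            "author": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "categories": [c.get("term") for c in entry.findall("atom:category", ATOM_NS)],
            "body": entry.findtext("atom:content", default="", namespaces=ATOM_NS),
        })
    return stories


if __name__ == "__main__":
    for story in parse_atom(SAMPLE_FEED):
        print(story["guid"], story["title"])
```

The entry `id` is worth keeping on the imported node, since it is what lets a later import tell a new story from a corrected one.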

thanks everyone, lots of

amax

Thanks everyone, lots of stuff here for me to research. We have come up with a way to standardize the flat plan by tagging each "topic" of our paper from a drop-down in 3B2. We are now looking at perhaps reading the 3B2 database entries directly from Drupal rather than outputting XML. The FeedAPI sounds promising, so I will get on that this week. Should I be concerned about using Drupal when the number of nodes grows to tens of thousands?

No

yelvington

Should I be concerned about using Drupal when the number of nodes grows to tens of thousands?

Drupal.org is up to 284,789 nodes at the moment. While database size can create MySQL issues (google "mysql limit") there is fairly little impact on performance, as retrievals based on indexed fields are fast.

I would strongly urge you to create a data feed, and NOT implement a model that interacts with your 3B2 database every time someone looks at your website. You'll create performance issues with your Arbortext system, as it wasn't designed for Web scale, and you'll create extra work for yourself trying to figure out how to reinvent all the goodness that native Drupal nodes and views give you.
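As a rough illustration of the feed model, the Python sketch below copies feed entries into a local store on each scheduled run, so page views only ever read the local copy and never touch the editorial system. The `guid` and `updated` keys and the dict used as a store are assumptions made for this example; in Drupal the store would be the node table.

```python
def sync_entries(entries, store):
    """Copy feed entries into a local store keyed by guid.

    Returns a (created, updated, unchanged) tuple. Running it again on the
    same feed changes nothing, so it is safe to call from a scheduled job.
    """
    created = updated = unchanged = 0
    for entry in entries:
        guid = entry["guid"]
        existing = store.get(guid)
        if existing is None:
            store[guid] = dict(entry)
            created += 1
        elif existing["updated"] != entry["updated"]:
            # The story was corrected upstream: replace the local copy.
            store[guid] = dict(entry)
            updated += 1
        else:
            unchanged += 1
    return created, updated, unchanged


if __name__ == "__main__":
    store = {}
    feed = [{"guid": "story:1", "updated": "2008-08-01T10:00:00Z", "title": "First"}]
    print(sync_entries(feed, store))  # first run creates the entry
    print(sync_entries(feed, store))  # second run finds nothing to do
```

Because the import is keyed on a stable id, a late correction from the newsroom updates the existing story instead of creating a duplicate.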

If you're running a

stdbrouw@groups.drupal.org

If you're running a dedicated box and add the nodes gradually so the Drupal search index can catch up, that shouldn't be a problem. Some people prefer to use Apache Solr (or similar) as the search engine rather than the standard Drupal search, and depending on the traffic you get there are a number of caching possibilities as well. I'd say: just keep an eye on the performance and solve problems as they come up.

Newspapers on Drupal
