Import XML news feed into Drupal

Vic96's picture

I run a news website. We're considering subscribing to some XML (not RSS) news feeds, and I'm trying to figure out how to import them into Drupal. Is there a module that can help with this? The feeds will be uploaded to an FTP location and I need to import them from there. Appreciate any guidance. Thanks.

Comments

Format?

yelvington's picture

What format are they in?

XML

Vic96's picture

Thank you for your reply.

This is what the XML looks like:

<articles>
<article>
<date>2006-05-12</date>
<time>11:34:07</time>
<author>John Doe</author>
<headline>
Blue widgets are better than red.
</headline>
<cats>International</cats>
<subcats>Asia</subcats>
<content>
Blue widgets are certainly better than red because they're blue and not red. Try green ones sometime.
</content>
<summary>
Blue widgets are better but green aren't so bad either.
</summary>
<copyright>ABC</copyright>
</article>
</articles>

Is there a way to import this into Drupal so that each article becomes a node? Several such files may be uploaded to the same directory within a day.

XML to RSS

kreynen's picture

If you are using PHP5 (which everyone should be!), it should be relatively easy to write a script to convert the XML into RSS with the SimpleXML object. Then you can use one of the existing modules that expect the XML to follow the RSS standard to suck the content into your Drupal install.
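A minimal sketch of that conversion, assuming the feed looks like the sample posted above. The function name `articles_xml_to_rss`, the channel values, and the choice of which source element maps to which RSS element are all assumptions for illustration, not something the feed provider or any Drupal module prescribes. It needs PHP 5.1.3 or later for `addChild()`.

```php
<?php
/**
 * Convert the <articles> XML shown earlier in this thread into an RSS 2.0 string.
 * The function name and the element mapping are hypothetical.
 */
function articles_xml_to_rss($xml_string, $channel_title, $channel_link) {
  $articles = simplexml_load_string($xml_string);
  if ($articles === FALSE) {
    // The input was not well-formed XML.
    return FALSE;
  }

  $rss = new SimpleXMLElement('<rss version="2.0"><channel/></rss>');
  $channel = $rss->channel;
  // Assigning to a property creates the child element and escapes &, < and >.
  $channel->title = $channel_title;
  $channel->link = $channel_link;
  $channel->description = $channel_title;

  foreach ($articles->article as $article) {
    $item = $channel->addChild('item');
    $item->title = trim((string) $article->headline);
    $item->description = trim((string) $article->summary);
    $item->category = trim((string) $article->cats);

    // Combine the separate <date> and <time> elements into an RFC 822 date.
    // The source gives no timezone, so the server default is assumed.
    $timestamp = strtotime(trim((string) $article->date) . ' ' . trim((string) $article->time));
    if ($timestamp !== FALSE) {
      $item->pubDate = gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
    }
  }
  return $rss->asXML();
}
```

The returned string could be written to a file that an aggregator module then polls. Note that this sketch puts only the summary into the item; carrying the full `<content>` body would need an additional element.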

This post describes the configuration we used on OurTahoe.org during the Angora Ridge Fire in Tahoe, but there have been several changes to aggregation in just the last 2 months. You should really join the RSS Aggregation Group and dig into what Aron Novak has been working on for his Google Summer of Code project. If you weren't already aware, Aron's was one of 19 Drupal-related SoC projects that Google funded this summer. Someone in that group might have a better idea.

XML to RSS

Vic96's picture

Thanks for the tip. I'll join the RSS group and look into Aron's project! I've got a ton of modules installed and I'm not sure they would all work with PHP 5.

XML parsing

agentrickard's picture

Two options on a follow-up note.

One, Aron's parsing in FeedAPI is extensible. You don't have to pull RSS. You can pull any feed of any type. You just have to write a parser to handle it. That said, the code might not be production ready. [Note: I am Aron's SoC mentor]

Two, you could always use built-in XML-RPC handling.

http://api.drupal.org/api/function/xmlrpc/5

We wrote some routines like this for SavannahNOW. Ideally, you would parse the data into a $node structure and then use drupal_execute to insert the data.

http://api.drupal.org/api/function/drupal_execute/5
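The parsing half of that approach might look roughly like the following, assuming the feed matches the sample posted earlier in the thread. The function names and the array keys are hypothetical; they would have to be mapped to whatever your content type's node form expects before being passed to drupal_execute(), which this sketch deliberately does not call.

```php
<?php
/**
 * Parse the <articles> XML from this thread into one array of values per article.
 * The function name and array keys are hypothetical placeholders.
 */
function articles_xml_to_node_values($xml_string) {
  $nodes = array();
  $articles = simplexml_load_string($xml_string);
  if ($articles === FALSE) {
    // Not well-formed XML: return nothing rather than partial data.
    return $nodes;
  }
  foreach ($articles->article as $article) {
    $nodes[] = array(
      'title' => trim((string) $article->headline),
      'body' => trim((string) $article->content),
      'teaser' => trim((string) $article->summary),
      'author' => trim((string) $article->author),
      // The source gives no timezone, so the server default is assumed.
      'created' => strtotime(trim((string) $article->date) . ' ' . trim((string) $article->time)),
      'categories' => array(
        trim((string) $article->cats),
        trim((string) $article->subcats),
      ),
    );
  }
  return $nodes;
}

/**
 * Collect values from every .xml file in an upload directory (hypothetical helper).
 */
function articles_dir_to_node_values($directory) {
  $all = array();
  $paths = glob($directory . '/*.xml');
  if (!$paths) {
    // glob() found nothing or failed.
    return $all;
  }
  foreach ($paths as $path) {
    $all = array_merge($all, articles_xml_to_node_values(file_get_contents($path)));
  }
  return $all;
}
```

Because several files may arrive in the same directory each day, the second helper could be run on a schedule, with processed files moved or deleted afterwards so they are not imported twice.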

--
http://ken.therickards.com/
http://savannahnow.com/user/2
http://blufftontoday.com/user/3

Parse XML

Vic96's picture

Thanks Ken. I know very little about this. Is there anyone you can point me to who could take care of this?

Moshe?

agentrickard's picture

Aside from the APIs, no. But Moshe and some of the other professional service providers may have some ideas.


Parsing XML with PHP4

yelvington's picture

If you absolutely must run PHP4, you can install the PEAR XML_Serializer package, which offers rough equivalence to SimpleXML at some performance cost.
http://pear.php.net/package/XML_Serializer

Yahoo Pipes

eli's picture

I'm pretty sure you can use Yahoo Pipes (http://pipes.yahoo.com/) to build a little module that will automatically fetch the XML and parse it into a regular RSS feed.

Pipes

agentrickard's picture

You'd still have to write the parser / use SimpleXML. Pipes can also return JSON, as other Yahoo! APIs do.

I have a Pipes demo module, if anyone is interested.

Here's the magic function for fetching the pipes data.

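<?php
/**
 * Get a pipe from Yahoo!
 */
function pipes_fetch($pipe, $location = NULL, $keywords = NULL) {
  // Build the cache ID once so the get and set calls cannot drift apart.
  $cid = 'pipe:' . $pipe->pid . ":l:$location:k:$keywords";
  $cache = cache_get($cid, 'cache');
  // On a cache miss there is no object to read, so check before using ->data.
  $data = $cache ? unserialize($cache->data) : NULL;
  if (empty($data)) {
    // URL-encode the arguments so spaces and ampersands do not break the query.
    $url = $pipe->url . '&location=' . urlencode($location) . '&keywords=' . urlencode($keywords) . '&_render=json';
    $file = file_get_contents($url);
    $data = json_decode($file, TRUE);
    // Same argument order as the original call: cache ID, table, data.
    cache_set($cid, 'cache', serialize($data));
  }
  return $data;
}
?>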

The cache step is for efficiency. The $url takes the form:
http://pipes.yahoo.com/pipes/pipe.run?_id=PIPE_ID&arg=KEY1&arg2=KEY2&_render=json

Note: json_decode() requires PHP 5.2 or later.


Import XML data into Drupal

rkedemi's picture

I have developed a site on Drupal and am now trying to import XML data into it using the Feeds module. I have created the relevant node type using FeedAPI, with fields to map the imported fields to. Is there someone with a step-by-step tutorial that could guide me? And do I have to write anything custom to import an XML news feed into Drupal? Appreciate any guidance. Thanks.

Feeds Guide

KarimB's picture

@rkedemi, you can check this guide to Feeds: http://drupal.org/node/622696

Hope this helps.

Newspapers on Drupal
