FeedAPI processor to modify item titles before saving
I tried writing a FeedAPI processor module to remove the "Username:" string that precedes each item on a Twitter feed. To this end I copied code from the example processor posted on this group and took the preg_replace code from the Activitystream module. However, it doesn't seem to work, even when feedapi_node is enabled and placed at a higher weight (0) than my own processor module (-1).
I'm not very good with PHP or Drupal (just someone who understands a little programming and does a lot of copy/pasting), so it's probably a fault in the code. I would really appreciate it if someone could look at it and point me in the right direction. :)
Code after break.
<?php
/**
 * Implementation of hook_feedapi_item(). Timestamp of each feed_item
 * will be parsed from the URL instead of being taken from the timestamp field.
 */
function feedapi_twitter_feedapi_item($op) {
  switch ($op) {
    case 'type':
      // Return
      return array("XML feed");
    case 'update':
      break;
    case 'delete':
      break;
    case 'load':
      break;
    case 'unique':
      // Simplest is to make this return true so that we only have to implement the 'save' operation
      // and not the 'update' operation.
      return true;
    case 'fetch':
      break;
    case 'expire':
      break;
    case 'purge':
      break;
    default:
      if (function_exists('_feedapi_twitter_'. $op)) {
        $args = array_slice(func_get_args(), 1);
        return call_user_func_array('_feedapi_twitter_'. $op, $args);
      }
  }
}

function _feedapi_twitter_save($feed_item, $feed_nid, $settings = array()) {
  $item_author = $feed_item->options->original_author;
  $item_title = $feed_item->title;
  // Removes username: at the front of title
  $item_title = preg_replace('/^'. $item_author .' /', '', $item_title);
  $feed_item->title = $item_title;
  return $feed_item;
}
?>
Maybe your processor isn't being called
Perhaps your processor is not being run at all! Try adding some print statements to your code. If you see them printed at the top of the page, then you know that your processor is being run. If not, then your feed or processor may not be configured correctly to let your processor run.
I've copied your code and marked some good spots to try print statements. Let me know what you find out.
<?php
/**
 * Implementation of hook_feedapi_item(). Timestamp of each feed_item
 * will be parsed from the URL instead of being taken from the timestamp field.
 */
function feedapi_twitter_feedapi_item($op) {
  print 'My processor is enabled and being called'; // <-- your processor is enabled if you see this message
  switch ($op) {
    case 'type':
      // Return
      return array("XML feed");
    case 'update':
      break;
    case 'delete':
      break;
    case 'load':
      break;
    case 'unique':
      // Simplest is to make this return true so that we only have to implement the 'save' operation
      // and not the 'update' operation.
      return true;
    case 'fetch':
      break;
    case 'expire':
      break;
    case 'purge':
      break;
    default:
      if (function_exists('_feedapi_twitter_'. $op)) {
        $args = array_slice(func_get_args(), 1);
        return call_user_func_array('_feedapi_twitter_'. $op, $args);
      }
  }
}

function _feedapi_twitter_save($feed_item, $feed_nid, $settings = array()) {
  print 'My save function is being called'; // <-- Your function is being called.
  $item_author = $feed_item->options->original_author;
  $item_title = $feed_item->title;
  // Removes username: at the front of title
  $item_title = preg_replace('/^'. $item_author .' /', '', $item_title);
  $feed_item->title = $item_title;
  print_r($feed_item); // or use krumo($feed_item); if you have the Devel module installed!
  return $feed_item;
}
?>
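One caveat: if the feed is refreshed during cron, plain print output may never reach the browser. Here is a minimal sketch of an alternative, assuming PHP's built-in error_log() is usable on your server. The helper name feedapi_twitter_debug() is made up for illustration and is not part of FeedAPI.

```php
<?php
/**
 * Hypothetical debug helper (not part of FeedAPI): builds a one-line
 * message, sends it to the PHP error log and returns it.
 */
function feedapi_twitter_debug($label, $value = NULL) {
  $message = 'feedapi_twitter: ' . $label;
  if ($value !== NULL) {
    // print_r() with TRUE returns the dump as a string instead of printing it.
    $message .= ' ' . print_r($value, TRUE);
  }
  error_log($message);
  return $message;
}

// Usage: call it in the same spots as the print statements above.
feedapi_twitter_debug('save called, title is', 'matafleur: hello world');
```

The message ends up wherever your PHP error log is configured to go, so it works the same whether the processor runs from a page request or from cron.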
Hey benroot, thanks a lot
Hey benroot, thanks a lot for the tip!
I tried adding some print statements before, but somehow didn't make the connection between them not appearing and the processor not running at all. I thought that it was just normal since they're processed as a batch. Silly me. :(
Thanks to your nudge I found out that the processor wasn't enabled in the content type, and that I forgot to append a colon to $item_author, so preg_replace couldn't find what it needed to find. :D
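For anyone landing here later, this is a minimal sketch of the corrected replacement, pulled out into a standalone function so it can be tried outside Drupal. The function name is hypothetical, and it assumes titles look like "username: tweet text". The preg_quote() call is an addition of mine so that any regex metacharacters in the author string are matched literally.

```php
<?php
/**
 * Hypothetical helper: strips a leading "username: " from an item title.
 */
function feedapi_twitter_strip_author($title, $author) {
  // Escape the author name, including the '/' delimiter used below,
  // then match the colon and any whitespace that follows it.
  $pattern = '/^' . preg_quote($author, '/') . ':\s*/';
  return preg_replace($pattern, '', $title);
}

$title = feedapi_twitter_strip_author('matafleur: testing my processor', 'matafleur');
// $title is now 'testing my processor'
```

Titles that do not start with the author name are returned unchanged.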
nice!
Hi Matafleur,
Glad you're all set! Good on you for writing a custom processor. After the first one it is kind of fun!
This looks like a great
This looks like a great processor, can you provide any more information about where to put this code?
Twitter usernames and urls
I'm sort of interested in this type of processor as well.
The main thing I want is easier access to the original author and/or the URL of their Twitter page, which can be parsed out of the original URL if I'm not mistaken.
Next, I would also like any reference to @username to be linked to their username, and #hashtag to be linked to that search. From what I understand, template.php would be where you would preprocess that.
Removing the username is easy since it can be done through the feed URL with an argument.
Lastly, the user's image is also needed, but I haven't had luck with the XML feed returning results, and that's the only feed format in which I've noticed the image.
Any help on this would be greatly appreciated.
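On the first point, here is a rough sketch of pulling the username out of an item URL with plain PHP. It assumes status links of the form http://twitter.com/<username>/statuses/<id>, and the function name is made up for illustration.

```php
<?php
/**
 * Hypothetical helper: returns the first path segment of a twitter.com
 * URL, or NULL when the URL does not look like a Twitter link.
 */
function example_twitter_username_from_url($url) {
  $parts = parse_url($url);
  if (empty($parts['host']) || empty($parts['path'])) {
    return NULL;
  }
  // Accept twitter.com with or without a leading "www.".
  if (!preg_match('/^(www\.)?twitter\.com$/i', $parts['host'])) {
    return NULL;
  }
  // "/someuser/statuses/123" becomes array('someuser', 'statuses', '123').
  $segments = explode('/', trim($parts['path'], '/'));
  return $segments[0] !== '' ? $segments[0] : NULL;
}

$username = example_twitter_username_from_url('http://twitter.com/someuser/statuses/1234567890');
// The profile page would then be 'http://twitter.com/' . $username
```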
The reason why I'm stripping
The reason I'm stripping the username from the title is that I don't want it to show up in the site RSS feed as well. Better to do it before the nodes are saved than after, hence the FeedAPI processor.
I'm actually using the custom FeedAPI processor I indicated above to do this as well, rather than through a template function. I've already implemented it and it's working nicely.
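A minimal sketch of what that linking could look like as a standalone function. The function name and the target URLs (twitter.com/<username> and search.twitter.com/search?q=...) are assumptions for illustration, and the text is expected to be escaped for HTML output already.

```php
<?php
/**
 * Hypothetical helper: wraps @username and #hashtag in links.
 */
function example_twitter_link_text($text) {
  // @username: word characters only, and not preceded by a word character,
  // so e-mail addresses are left alone.
  $text = preg_replace(
    '/(?<!\w)@(\w+)/',
    '<a href="http://twitter.com/$1">@$1</a>',
    $text
  );
  // #hashtag: %23 is the URL-encoded '#'. A '#' preceded by '&' is skipped
  // so numeric HTML entities such as &#039; are not touched.
  $text = preg_replace(
    '/(?<![\w&])#(\w+)/',
    '<a href="http://search.twitter.com/search?q=%23$1">#$1</a>',
    $text
  );
  return $text;
}

$linked = example_twitter_link_text('thanks @benroot for the #drupal tip');
```

Doing this in the processor means the links are stored with the node, so they also show up in the site RSS feed.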
I think the Feed Element Mapper does this quite well. I'm using it to "map" original author and URL to CCK fields in nodes at the moment.
Of course, all this assumes you are going to save each RSS item as a node. :)
Maybe it's that the RSS and
Maybe it's that the RSS and Atom feeds don't have the fields I'm looking for, because your example above works fine on my current setup.
The XML feed doesn't seem to work with FeedAPI, and I'm not sure why. Is there a way I can troubleshoot why I can't access this XML feed?
RSS format, custom Parser?
Hi Illmatix,
Just so you know, the only fields available to the processor are those set by the parser. Parsers are called before processors and are responsible for validating that a feed has the right fields and for extracting data from those fields. So if you are receiving an "RSS" feed that includes a field that isn't in the RSS 2.0 spec, the parser will ignore it and it won't be available to your processor. Therefore, sometimes the best way to manipulate a FeedAPI feed (and the only way to work with a non-standard XML feed) is to write both a custom parser and a custom processor.
Here is a snippet of valid RSS 2.0 from Wikipedia:
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Lift Off News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <language>en-us</language>
    <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <generator>Weblog Editor 2.0</generator>
    <managingEditor>editor@example.com</managingEditor>
    <webMaster>webmaster@example.com</webMaster>
    <ttl>5</ttl>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians aboard the
      International Space Station? They take a crash course in culture, language
      and protocol at Russia's Star City.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
    </item>
    <item>
      <title>Space Exploration</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Sky watchers in Europe, Asia, and parts of Alaska and Canada
      will experience a partial eclipse of the Sun on Saturday, May 31.</description>
      <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/30.html#item572</guid>
    </item>
  </channel>
</rss>
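To make the parser/processor split concrete, here is a rough sketch in plain PHP using SimpleXML. It is not FeedAPI's actual parser code. The function name, the <avatar> element and the list of known fields are assumptions chosen to show why an element outside that list never reaches a processor.

```php
<?php
/**
 * Hypothetical stand-in for a parser: copies only the known child
 * elements of each <item> into a plain object.
 */
function example_parse_items($xml_string, $known_fields) {
  $items = array();
  $xml = simplexml_load_string($xml_string);
  if ($xml === FALSE || !isset($xml->channel->item)) {
    return $items;
  }
  foreach ($xml->channel->item as $xml_item) {
    $item = new stdClass();
    foreach ($known_fields as $field) {
      if (isset($xml_item->$field)) {
        $item->$field = (string) $xml_item->$field;
      }
    }
    $items[] = $item;
  }
  return $items;
}

$feed = '<rss version="2.0"><channel><item>'
  . '<title>Star City</title>'
  . '<avatar>http://example.com/a.png</avatar>'
  . '</item></channel></rss>';

// <avatar> is not in the list, so a processor would never see it.
$items = example_parse_items($feed, array('title', 'link', 'description', 'pubDate', 'guid'));

// Adding 'avatar' to the list is the "custom parser" part.
$items_with_avatar = example_parse_items($feed, array('title', 'avatar'));
```

In the first call the item only carries a title; in the second it also carries the avatar URL, which is the kind of extra field a custom parser would have to extract before your processor can use it.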