New Aggregator for Drupal

Project information

Project page on drupal.org: http://drupal.org/project/new_aggregator

Current status: A basic design is ready; see the project page. Latest result: a list of problematic things in content syndication. Please take a look and add comments!

Description

Main goal:
Create a simple but extensible API for aggregation that ships in a configuration that covers the most common use case of aggregating feeds as nodes on a web site.

The basic architecture consists of three pieces:
- an aggregator.module that defines the API, invokes callbacks, and manages settings and cron processing.
- a parser_common module that downloads and parses XML data into a PHP array. This parser should be independent of aggregator.module.
- an aggregator_node module that creates nodes from feed items. This is a standard add-on module that depends on aggregator.module.
The planned architecture layout can be found here: http://feedapi.novaak.net/design.png
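
The split into three pieces can be illustrated with a small, language-neutral sketch. It is written in Python rather than Drupal's PHP and is not the project's actual API: the names parse_feed, Aggregator, NodeProcessor and refresh are hypothetical, the download step is left out, and a "node" is just a dictionary here.

```python
import xml.etree.ElementTree as ET

# A tiny inline feed stands in for downloaded data (downloading is omitted).
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>First post</title><link>http://example.com/1</link></item>
  <item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Parser piece: turn RSS XML into plain data, knowing nothing about the aggregator."""
    channel = ET.fromstring(xml_text).find("channel")
    return {
        "title": channel.findtext("title", default=""),
        "items": [
            {"title": item.findtext("title", default=""),
             "link": item.findtext("link", default="")}
            for item in channel.findall("item")
        ],
    }


class Aggregator:
    """Core piece (hypothetical): keeps registered processors and invokes them as callbacks."""

    def __init__(self, parser):
        self.parser = parser
        self.processors = []

    def register(self, processor):
        self.processors.append(processor)

    def refresh(self, xml_text):
        """Roughly what a cron run would do for a single feed."""
        feed = self.parser(xml_text)
        for processor in self.processors:
            for item in feed["items"]:
                processor(feed, item)
        return len(feed["items"])


class NodeProcessor:
    """Add-on piece (hypothetical): creates one node-like record per feed item."""

    def __init__(self):
        self.nodes = []

    def __call__(self, feed, item):
        self.nodes.append({"type": "feed_item", "title": item["title"],
                           "source": feed["title"], "link": item["link"]})


aggregator = Aggregator(parse_feed)
node_processor = NodeProcessor()
aggregator.register(node_processor)
count = aggregator.refresh(SAMPLE_RSS)
print(count, [node["title"] for node in node_processor.nodes])
```

Because the core only calls whatever processors are registered, a second processor that stores lightweight records instead of nodes could be registered alongside NodeProcessor without touching the parser or the core.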

Project schedule

Before the last exam (20th of June):
- fine-tune the design and the implementation plans
- improve FeedAPI's common syndication parser (maybe it can be reused)

23rd of June to 4th of July: initial version
7th of July to 11th of July: adjustments
14th of July: midterm evaluation, full patch should be available here
From 15th of July on: adjustments to the patch in the queue, upgrade patches (if time allows).

Status updates

  • 2008-05-27: The first design suggestion has been published (http://groups.drupal.org/node/11409). Discussion with the community led to supporting feed items as nodes AND feed items as lightweight database records, instead of supporting only feed items as nodes with an optional lightweight-record design in contrib. As scheduled, Aron is wrapping up the current semester at his university (he's in the middle of his exam period). Expect more activity again from June 23 on.
  • 2008-06-03: Aron's exam period.
  • 2008-06-10: Aron's exam period.
  • 2008-06-24: Various possible designs for the aggregator were outlined and discussed with the mentor. I also studied the new coding conventions for Drupal 7 :) This week I plan to refine the design and start coding. (Decision: feed items as non-nodes will definitely be available in the new aggregator because of the high demand from the community.)
  • 2008-07-02: The alpha release is out.
  • 2008-07-15: Done since the alpha release: taxonomy-based categorization for light items; the aggregator_light processor now has the same features as the old aggregator; a node processor; various improvements. This week: roll a patch against the old aggregator.
  • 2008-07-17: The patch has been rolled against HEAD (http://drupal.org/node/236237#comment-923418). The recent discussion about the patch can also be found there.
  • 2008-08-05: Another, updated patch has been rolled against HEAD (http://drupal.org/node/236237#comment-951354), so people can apply it to the latest CVS checkout (HEAD has changed since the previous patch).

Comments

Morbus Iff

I'm slightly concerned that all this work has failed to address the case scenarios and user archetypes for how people actually use aggregator, and that both the mentor and student are fans of "nodes as feed items" with little apparent concern for the existing and implemented alternative.

See more comments here: http://groups.drupal.org/node/11413#comment-37056.

I think having both is important...

webchick

I still do use core's aggregator module on some "portal" sites because I'm doing things like following commit activity, or latest headlines or whatever. I don't care about what they were a week ago. I only care about what's "in the now" and storing 10,000 feed items as nodes would be a tremendous amount of overhead for this use case. I also like the ability to randomly blow away and re-import all the feeds, if I make a mistake with what tag I'm filtering by or something like that, without also losing user-submitted data such as comments.

However, for PlanetSoC, where the feed items are first-class content, I turn to one of the feeds-as-nodes modules so that I can promote them to the front page, have comments on them, and so on. There are also some neat data import tricks you can do with feeds-as-nodes.

Supporting both use cases in a module that's intended to replace the core aggregator module is imperative, imo.

Aron Novak

I don't see any difficulty in adding two default processors to aggregator. One for classic items and one for nodes.

SeanBannister

Hey Aron,
I'm just wondering if there will be some type of upgrade path from FeedAPI to the New Aggregator. Also, what will happen to FeedAPI in Drupal 7? Will there still be a need for it, or will the New Aggregator replace it or provide its functionality? And will modules like Feed Element Mapper be able to use the new Aggregator API instead of FeedAPI in Drupal 7?

Thanks for all your work, I'm really looking forward to this module.

Aron Novak

Well, in the long term (Drupal 7) FeedAPI users should certainly move to the core aggregator too, so I think an upgrade path from FeedAPI is a must sooner or later.