Posted by alex_b on June 29, 2007 at 10:51pm
This is a question I have wanted to ask for a long time. Kreynen's story of how the University of Nevada could use an aggregator to cover the Tahoe fire finally made me post: What are you using an aggregator for?
- What's its purpose?
- What feeds are you subscribing to?
- Who is reading it?
- Who is creating and moderating feeds and feed items?
- What module(s) are you using and why?
- Are you using the aggregator for collecting other items than news?
- What's cool about it?
- What are the frustrations?
- ...
Let's collect our use cases here to get a better picture of what people expect from aggregation, and to have a good basis for discussing some design issues with aggregators.
Please also feel free to add a question to the list above. I made it up off the top of my head.

Comments
I would like to use an aggregator for...
Just a few off the top of my head - would be interested in seeing any tips and examples.
Gus Austin
PepperAlley Productions
My main goal is to have
My main goal is to have several feeds as input, and as output a web page containing only the feed items I like.
Ideal for me would be the following:
- Specify several feeds of news articles (e.g. 10 feeds, each supplying 3 feed items per day).
- Have all feed items added to the database (e.g. one node each, so about 30 nodes per day).
- Then give the good feed items a rating from 1 to 5; the others stay at rating 0 (about 10 rated nodes per day).
- Then display the rated items in a "view", sortable by date, rating, author, etc.
- This view could in turn be an RSS feed, or send out notification e-mails when a rated feed item is added.
- Have a second "view" showing all feed items, even the unrated ones.
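To make the workflow above concrete, here is a minimal sketch in Python (not Drupal code). The `FeedItem` class, the two view functions, and the sample data are all made up for illustration; in Drupal the items would be nodes and the views would come from a module.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FeedItem:
    """Hypothetical stand-in for a feed-item node."""
    title: str
    author: str
    published: date
    rating: int = 0  # 0 means "not rated"; 1-5 are ratings given by hand


def rated_view(items, sort_key="published", descending=True):
    """First view: only the rated items, sorted by the chosen attribute."""
    rated = [item for item in items if item.rating > 0]
    return sorted(rated, key=lambda item: getattr(item, sort_key), reverse=descending)


def all_items_view(items):
    """Second view: every item, rated or not, newest first."""
    return sorted(items, key=lambda item: item.published, reverse=True)


# Made-up sample data, only to exercise the two views.
items = [
    FeedItem("Story A", "ann", date(2007, 6, 27), rating=4),
    FeedItem("Story B", "bob", date(2007, 6, 28)),
    FeedItem("Story C", "cy", date(2007, 6, 29), rating=2),
]

for item in rated_view(items, sort_key="rating"):
    print(item.rating, item.title)
```

Sorting by `"author"` or `"published"` works the same way, since the sort key is just an attribute name.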
My main problem is that every feed is a little different. Some supply a link to the content, others supply a teaser, and still others supply a full article. Some use a link to their website as the title, others just give a text title. Some supply a photo of the author, others supply an additional mp3 file.
I don't know how to handle this, but I would be thankful for pointers to the right modules.
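One common way to cope with feeds that differ like this is to map every item onto one fixed shape before storing it. The sketch below is only an illustration in Python: the input keys (`title`, `link`, `summary`, `content`, `enclosures`) are an assumed layout for an already-parsed item, not the output of any particular Drupal module or parser.

```python
def normalize_item(raw):
    """Map a loosely structured, already-parsed feed item (a dict) onto one fixed shape.

    The input keys are assumptions made for this sketch.
    """
    # Prefer the full article, fall back to the teaser, else an empty body.
    body = raw.get("content") or raw.get("summary") or ""
    # Enclosures (an mp3, an author photo, ...) are optional.
    enclosures = raw.get("enclosures") or []
    return {
        "title": (raw.get("title") or "(untitled)").strip(),
        "link": raw.get("link"),
        "body": body,
        "has_full_text": bool(raw.get("content")),
        "audio": [e["url"] for e in enclosures
                  if (e.get("type") or "").startswith("audio/")],
        "images": [e["url"] for e in enclosures
                   if (e.get("type") or "").startswith("image/")],
    }


# A teaser-only item and a full-text item with an mp3 end up in the same shape.
teaser = normalize_item({"title": " Hello ", "link": "http://example.com/1",
                         "summary": "short"})
podcast = normalize_item({"link": "http://example.com/2", "content": "full text",
                          "enclosures": [{"url": "http://example.com/a.mp3",
                                          "type": "audio/mpeg"}]})
```

Whatever displays the items then only has to deal with one structure, and can check `has_full_text` to decide between showing the body and linking out.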
My personal goal is to have
My personal goal is to have something close to Reblog, with an easy UI.
Tango Diggs
Hi,
I use leech on my site to aggregate various news about tango in Italy.
I use several content types and several node templates to pull in the information, and Yahoo terms to create tags on the nodes.
The site is: http://yoga.netsons.org
Traditional stuff; pulling in others' blogs
What's its purpose?
On Planet SoC I'm pulling in external blog entries from Summer of Code students and mentors to Drupal to centralize them in one place.
What feeds are you subscribing to?
Personal blogs from student/mentor developers from all over the world. So we get everything from Wordpress to Blogspot to weird hand-rolled XML. It's been fun debugging some of them. ;)
Who is reading it?
Mostly other SoC developers, open source enthusiasts, or anyone interested in Summer of Code. But I think by far more people subscribe to RSS feeds than actually use the site.
Who is creating and moderating feeds and feed items?
I have a manual approval process where I check to ensure the student or mentor is actually a student or mentor on Google's list, and then add them to a role with "create feeds/feed items" permissions.
What module(s) are you using and why?
I'm using SimpleFeed because I needed the entries to be nodes, so I could do stuff like promote them to the front page and enable subscription options on them. I realize that there are other modules that create feeds/feed items as nodes, but Ted (a fellow Lullabot at the time) was talking up SimpleFeed quite a bit, so I tried that one on for size (and helped fix a bunch of bugs with it in the process).
What's cool about it?
I like the flexibility that nodes provide; SimplePie is routinely cited as the "best" parser, so it's nice to have that as a back-end. After the initial few weeks of bug-fixing, the site is basically completely self-sufficient now. I just log in every couple days to check for new accounts that need approval, and any spam comments or forum posts that got through.
What are the frustrations?
Beyond the typical stuff of installing an unstable module (which is also using an unstable version of a third-party parser [SimplePie]) and having to deal with growing pains associated with that, there was also something unexpected... A user actually asked to have his blog removed from Planet SoC because his blog entries under Drupal were getting better page ranks than his personal blog. I never heard Drupal's SEO referred to as a 'bug' before, but there you are. ;)
Team aggregator "Managing News"
What's its purpose?
Managing News is a multi-user feed reader for organisations or teams. Users can create groups, add feeds to those groups, and comment and vote on feeds. There are also some tag-based analysis tools.
What feeds are you subscribing to?
Almost exclusively news feeds directly from blogs or other news sites or from news or blog search engines.
Who is reading it?
The system is closed. So the readers are the creators of the feeds: teams of 2 to 20+ people. Up to now we mostly installed this tool for organizations in the international development and advocacy scene.
Who is creating and moderating feeds and feed items?
Everybody who has an account on the system.
What module(s) are you using and why?
We chose to use leech. When we started the project, leech was the aggregator2 successor in the starting blocks. We chose it because we wanted to create nodes from feed items and represent feeds as nodes. We also did a couple of more specific things, like integrating with organic groups and inheriting taxonomy from feeds to feed items. We also use url_profile, which identifies the source of articles that do not come directly from the feed you are subscribed to. This is a very powerful feature in conjunction with keyword feed searches on search engines like news.google.com.
What's cool about it?
As leech is fully node based, it gave us all kinds of flexibility: using voting, commenting, and custom stuff with the nodeapi. Its parser is pretty fast, which is a huge thing given the number of feeds some of the Managing News systems are subscribed to. Today there are other parsers in the field (simplefeed, aggregation, feedparser) that I would definitely give a closer look if we were to make the aggregator decision again.
What are the frustrations?
* The architecture of leech is pretty complex, and its feature-richness makes it a big module. I am now setting my hopes on a new, more modular aggregator solution as a viable replacement for leech.
* Downloading, parsing and storing feeds is a slow business. We are constantly fighting to improve the details to make sure we suck all the new stuff out there into the system.
http://www.twitter.com/lxbarth
pulling in blogs from departmental staff
What's its purpose?
We have a blog pilot which in essence creates a separate blog for each faculty member, staff member, and student at the university, but nothing to tie them all together. The second challenge was our community sites, based on services that were developed and supported by our departmental staff, which run on separate Drupal installations. Using Simplefeed we can pull all this content together into one large aggregation for searching. All tags are also imported and used for indexing.
What feeds are you subscribing to?
Feeds from personal blogs of Staff and Community support sites: approximately 30 sites at the moment, but that number is increasing rapidly.
Who is reading it?
Most of the department's Staff, including upper management, who would never have found all the sites or taken the time to add each individual feed to their own RSS reader.
Who is creating and moderating feeds and feed items?
No moderation is being done, but only Staff can add new feeds to be aggregated. Comments are only allowed on the Original Posts, though it would be nice to aggregate the conversations developed around the feeds too.
What module(s) are you using and why?
Simplefeed, based on SimplePie, has the right flexibility for our needs. It is simple enough for Staff to add feeds, who have no idea if a particular feed is ATOM, RSS, or XML based. I can also aggregate all of the tags with the content, and have started to submit patches and code back to the developer.
Are you using the aggregator for collecting other items than news?
Actually we aren't aggregating any news, unless you consider public announcements to our community sites. The news directed at Staff is added to the internal portion of the same drupal site in the form of a newsletter or private posts and indexed with all the other content since everything is a node.
What's cool about it?
I think the best thing has been the aggregation of the Tags for quick indexing, and in general all the content being searchable. We have several Staff blogging about the iPhone and I can select just the iPhone tag to see all the posts from everyone. In Simplefeed I can also mark certain tags to be added automatically to feed items if a particular feed is specific to a topic.
What are the frustrations?
Currently just small coding problems with the module, including a more recent one with SimplePie only aggregating one tag per feed item in the 1.0 release. There were also schema changes from the Alpha version to the Beta version of Simplefeed that I had to make up our own migration path for. We have been extremely excited about how it is turning out.
Linkblog
I use aggregator, Google Reader and del.icio.us to provide a linkblog as a sidebar on my site.
Here's a how-to post from last week.
What's cool is that it's easy to create a linkblog with very few keystrokes while reading RSS or surfing the web.
My biggest frustration is that for some other projects I'd like to find a way to tag items into a feed without using del.icio.us (i.e. news stories about my company which I don't want to tag publicly). I'd also love to have a way to easily apply annotations or additional fields to items ... perhaps an easy means to promote items with a single click to a node type of the site owner's choosing, which could contain additional fields.
--
Blog: Joshua Brauer dot com
MotoGPod
Hi,
I just used an aggregator for a recent project, so perhaps have something to add here...
What's its purpose?:
A friend wanted to convert his 80-episode podcast to Drupal without re-entering all the data, which he'd already entered manually with his RSS feed editor. I configured the aggregator to use an RSS feed from his existing site, and wrote a bit of code to pull in his iTunes tags, etc., and populate the nodes with all the available data.
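For anyone curious what pulling the iTunes tags out of a podcast feed involves, here is a rough sketch in Python using only the standard library. It is not the code from this project: the `parse_episodes` function, the choice of fields, and the sample feed are invented for illustration. The only fixed part is the iTunes podcast namespace URI.

```python
import xml.etree.ElementTree as ET

# Namespace used by the itunes:* podcast tags.
ITUNES = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}

# Tiny made-up feed, only to exercise the function below.
SAMPLE = """<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
<channel>
<title>Example Podcast</title>
<item>
<title>Episode 1</title>
<itunes:author>Host</itunes:author>
<itunes:duration>42:10</itunes:duration>
<enclosure url="http://example.com/ep1.mp3" type="audio/mpeg" length="1"/>
</item>
</channel>
</rss>"""


def parse_episodes(rss_text):
    """Return one dict per <item>, including a few iTunes tags and the mp3 URL."""
    root = ET.fromstring(rss_text)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        episodes.append({
            "title": item.findtext("title", default=""),
            "author": item.findtext("itunes:author", default="", namespaces=ITUNES),
            "duration": item.findtext("itunes:duration", default="", namespaces=ITUNES),
            # The enclosure is optional, so guard against a missing element.
            "mp3_url": enclosure.get("url") if enclosure is not None else None,
        })
    return episodes


episodes = parse_episodes(SAMPLE)
```

Each resulting dict could then be used to fill in the fields of one audio node.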
What feeds are you subscribing to?
Just the feed from the original site.
Who is reading it?
A few thousand listeners to the podcast (via iTunes etc)
Who is creating and moderating feeds and feed items?
I did it as a one-time import exercise.
Are you using the aggregator for collecting other items than news?
Yup, it's all podcast episodes (with mp3 enclosure)
What module(s) are you using and why?
Leech, because it gave me more control over the nodes that would be created for each item within the feed. In my case, I was creating "audio" content items, and needed to pull the mp3 itself to a local file, then fix up the template node to refer to the file just fetched. I started with SimpleFeed but it didn't offer the flexibility I needed (though it was certainly simple! :)
What's cool about it?
It can create nodes based on a node-template, which means that I can just set up the template properly (from the UI), rather than doing a whole lot of coding to set the fields of the object.
What are the frustrations?
It took a bit of mucking around to get Audio nodes working nicely with Node Templates, though that wasn't the fault of the aggregator. The way that the whole RSS feed was included in $node during construction was a bit of a drag - I had to hunt through it to find the item being created.
Simon Roberts
Taniwha Solutions
Member blogs on a community site - creative sources
The PHP-GTK community site gathers the individual (non-Drupal) blogs that community members have declared in their profiles into a category block (right column, "community news").
This is useful, but I had to do a custom block in the site module, because of a missing feature in aggregator module, and I'm not sure how the new FeedAPI will help solve this. One of the bloggers outputs almost-daily posts (sometimes more). Since his output is almost always on-topic and valuable, it is out of the question to remove that blog from the list, but on the other hand, using the default category block, it means that posts from all the other bloggers combined are rarely seen because they're hidden in the flow from that particular blogger.
The custom block limits the number of items from the same feed (by fid) to a fixed amount. It would certainly be nice if the future aggregator implementations could similarly have a setting to limit the number of items kept for display when adding a feed (like on this page).
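The idea behind such a per-feed limit can be sketched in a few lines. This is Python rather than the actual PHP block code, and the `cap_per_feed` function and its `(fid, title)` input format are assumptions made for the example.

```python
from collections import defaultdict


def cap_per_feed(items, max_per_feed=2, total=10):
    """Pick items for a block while limiting how many come from one feed.

    `items` is a list of (fid, title) tuples, already sorted newest first.
    """
    taken = defaultdict(int)  # fid -> number of items already in the block
    block = []
    for fid, title in items:
        if taken[fid] >= max_per_feed:
            continue  # this feed has used up its share; skip to older items
        taken[fid] += 1
        block.append((fid, title))
        if len(block) == total:
            break
    return block


# Feed 1 posts far more often than feeds 2 and 3.
recent = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (1, "e"), (3, "f")]
print(cap_per_feed(recent, max_per_feed=2))
```

With a cap of 2, the prolific feed keeps its two newest posts and the items from the quieter feeds still make it into the block.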
Gregarius -> Drupal Module
Would be neat to transform http://gregarius.net into a fully functional module for Drupal.
Sam Rose
Social Synergy
Blog
Sam Rose
Hollymead Capital Partners
P2P Foundation
Social Media Classroom
What features do you NOT see
What features that gregarius offers do you NOT see in Drupal plus one of the aggregator contrib modules?
http://www.twitter.com/lxbarth
Right now
We're using it as a blog aggregator for http://bfwatch.barcampbank.org/?q=/aggregator. It's just a collection of feeds at this time.
Sam Rose
Social Synergy
Blog
Sam Rose
Hollymead Capital Partners
P2P Foundation
Social Media Classroom
With all of the differences
With all of the differences, I'm trying to find some way to have all feeds display on the front page, whether they are RSS, XML, OPML, Atom or something else.
Talk about jumping in at the deep end!
I don't know what language your CAPTCHA machine is reading, but it's not English, so how do I tell the 5th word if it's not really a word? Some kind of practical joker ... hmm.