Posted by Arto on March 25, 2009 at 1:42pm
I've just posted step-by-step instructions on how to use the Drupal 6.x RDF module to upgrade your Drupal site's RSS feeds for compatibility with RDF and the Semantic Web:
http://ar.to/2009/03/rdfizing-drupal-rss-feeds

As I don't do comments on my site, please feel free to discuss here.

Comments
Hi, something went wrong with taxonomy
Hi, when I enabled this module as the tutorial said, my taxonomy terms stopped working. And when I turned the module off, everything was fine again. Oh, one thing: when I made the changes there was no ARC yet. Can someone say how to fix this problem? My website URL is www.web3.ee
Missing taxonomy terms?
What version of the RDF module are you using? I just was able to RDFify my feed following the tutorial using RDF 6.x-1.0-alpha6. I had a problem with my RSS feed that was caused by some of my code within a node. Once I fixed that, the feed worked perfectly as RDF (Here's my feed: http://www.juliakm.com/rss.rdf).
Also, did you happen to change your taxonomy term feed instead of the front page one by accident?
Recommend using 6.x-1.x-dev
First: for anyone experiencing problems, I'd recommend giving the latest development snapshot a go, as given the pace of ongoing development it is practically guaranteed to always be of higher quality than the module's latest alpha release.
@Kaido, could you describe in more detail how the taxonomy terms stopped working? Do you mean that the pages disappeared, or that their feeds disappeared, or something like that?
@Julia, excellent - may I add a link to your site to the blog post?
Re: RDFizing Drupal RSS
@arto Sure. Thanks for the great tutorial!
@Kaido24 I wonder if it's a clean URLs problem. Do you have clean URLs turned on? Can you access the term pages with the non-clean URL path and RDF turned on? For example, mysite.com/user becomes mysite.com/?q=user.
Special note to Views users
Note also that in case you have enabled the frontpage or taxonomy_term default views shipped with Views 2.x, those will take precedence over both the RSS 2.0 feeds published by Drupal core as well as the RSS 1.0 feeds provided by the RDF module.
That's not a big problem to overcome, however, as the RDF module also includes Views integration that you can utilize to output RDF-compatible RSS 1.0 feeds from any view; this will be the topic of an upcoming article in the RDFizing Drupal series, but in the meantime ping me if you need a hand figuring it out. (You could also just disable those two views, in which case everything will work per the blog post above.)
RDF Site Summary 1.0 vs Really Simple Syndication 2.0
For those curious about how RSS 1.0 constitutes an upgrade over RSS 2.0, see my comment at the Reddit thread:
http://www.reddit.com/r/programming/comments/87d5h/rdfizing_drupal_upgra...
The upcoming parts in the article series will demonstrate ample reasons to prefer RSS 1.0.
Semantics vs syntax....
Semantics vs. syntax is an interesting debate to have, but the unfortunate fact with RSS is that the syntactic model has already won. Readers and aggregators are equipped to handle the "extra" tags that come with RSS 2, and they can already make use of the metadata contained in RSS 2.
RSS 1, while infinitely (semantically) extensible (in theory), still requires that syntax-specific aggregators and clients know how to use the tags. Do most readers/aggregators even handle the common Dublin Core metadata? If not, then for pragmatic reasons it seems that letting Drupal provide an RSS 2 feed lets it convey the most data to the greatest number of aggregators and readers.
But this needn't be an either/or question. You should be able to provide both (using, say, rss.rdf and setting the link for that feed to handle type="application/rdf+xml").
I would be even more enthusiastic to see a linked data feed that actually served clean linked data instead of an RSS 1 implementation that didn't quite hit all of the LD high points (e.g. it still uses rdf:Seq, which, IIRC, is frowned upon in LD).
This way, you still play to the dominant form of XML consumption -- syntax driven -- while providing for LD-enabled consumers to do their thing well.
For example, I'd LOVE to see something that, say, overrode hook_init() and returned linked data when the user agent reported itself as consuming application/rdf+xml. I'm not sure how exactly that would play out (and I haven't really looked into existing modules), but that -- to me -- is the more appealing solution.
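As a rough illustration of that idea (a sketch in Python rather than Drupal's PHP, and not code from the RDF module), the selection step could look like the following. The function names, the set of offered media types, and the fallback are all assumptions made for the example; wildcards such as */* are deliberately not handled.

```python
# Sketch only: choose a feed representation from an HTTP Accept header.
# OFFERED and the default fallback are hypothetical choices for illustration.

OFFERED = ("application/rdf+xml", "application/rss+xml")


def parse_accept(header):
    """Return (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # q defaults to 1 when the client gives none
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_feed(accept_header, default="application/rss+xml"):
    """Pick the best offered media type, or fall back to the default."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in OFFERED:
            return media_type
    return default
```

A client sending only application/rdf+xml would get the RDF representation; a browser sending text/html would fall through to the default.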
That said, I'm actually really looking forward to the remaining articles. I'm eager to see what else can be done with the RDF module.
Matt
Blog: http://technosophos.com
QueryPath: http://querypath.org
Code talks ;-)
I don't know what it means to say that something has "won", but then I was never a big fan of popularity-based metrics.
I also don't know that there is any actual incompatibility to address, here. RSS 1.0 was around a couple of years before RSS 2.0 (so chronologically, at least, the purported version number has some validity), and continues to be widely published; at present not as widely as RSS 2.0, this is true, but more than widely enough for ubiquitous aggregator support. What's more, any namespaced XML properties that you'd use to extend RSS 2.0 or Atom feeds, you can equally use in RSS 1.0 feeds as well (now, you couldn't necessarily dereference them, i.e. there might not be any RDF vocabulary out there; but you can certainly use them). This is, in fact, what the sponsors of the RDF module are doing with regards to e.g. calendar information and the geolocation information provided by the Location module; I might detail that in a future post.
I'm not opposed, in principle, to the RDF module providing backwards-compatible HTTP content negotiation as you suggest. Given, though, that I don't myself have any vested interest in preserving the RSS 2.0 feeds, I'm not all that enthusiastic about coding up that bit. I would consider including a well-written patch to that effect, but the RSS 2.0 content negotiation would have to be an optional setting, as application/rss+xml is used also for RSS 1.0 feeds and I would certainly not want to inadvertently be publishing RSS 2.0.
I should also say that my original purpose in providing RSS 1.0 feeds for Drupal and writing the blog post in question wasn't to try and convince anyone why they ought to use RSS 1.0 over RSS 2.0, which is why I did not belabor the point. Rather, this was targeted at people who already know that they would like to publish RDF and don't really care about RSS 2.0 one way or another. This includes, for example, the Drupal-using bloggers at Planet RDF, who have until now had to write custom modules in order to publish an RSS 1.0 feed. That process now, as the blog post shows, requires no development effort beyond a few clicks of the mouse.
(About the use of rdf:Seq in RSS 1.0: I believe the intent here is that it allows the RSS document to include arbitrary supplemental RDF descriptions of resources referenced by the feed items; in addition, RDF/XML resource descriptions are not inherently ordered, whereas blog posts certainly are; hence a separate ordered index of the feed items (sans supplemental resources) is necessary, and the way to do that in RDF is with rdf:Seq or rdf:List. This couldn't really be done away with if the feed ought to remain RDF and yet be traversable in a guaranteed order. In any case, RSS 1.0 precedes the Linked Data concept, that being only one useful subset of the longer-established Semantic Web picture.)
er, well...
A big fat bold "recommended" note and the lack of a description why people might want to stick with the RSS 2.0 default... if I'm an inexperienced user and don't really care about 1.0 vs. 2.0 one way or another, that page seems pretty convincing to switch to RSS 1.0 without knowing the rationale behind that move. As long as the RDF module is only installed by people who know RDF, that's probably ok... but if it gets more popular and becomes a dependency for other modules, you might want to reconsider the wording on that page.
I used
2009-Mar-02 version, alpha-6. There are no CCK or Views modules installed. I use the Taxonomy module a lot. The taxonomy problem was this: no term page showed; every one returned a 404.
Thank you all for your help but still
I got an idea: maybe I should clean some data from the database and install this plugin again. Maybe something went wrong there.
I tried everything you have said: disabled clean URLs, upgraded the module to the newest version, and so on. Nothing helped, but I still want to get this done. Could there be a problem with my server, maybe? On the other server I used RDF + Calais and it worked fine.
Could be a module conflict
Kaido, unfortunately I can't think of any proximate cause for the problem you're experiencing, so I suspect a bad interaction with some other module. Hopefully it will be solved if you try this on a clean install. If you do discover the cause for the problem, would love to hear about it.
What about Atom?
Thanks for this tutorial. Can't wait to read more of RDFizing Drupal.
I wonder:
what about the Atom Syndication Format (application/atom+xml)? It's an IETF proposed standard (RFC 4287) and it has various advantages over the "RSS jungle".
I'm no expert, but as far as I understand, you can use any formats inside of an Atom feed - so you could use RDFa, right?
Simplifying RDFa Notation (page 5):
I try to use Atom wherever I can. It would be great if that were possible here, too. Maybe in combination with the Atom module?
See also / Related:
AtomOwl Vocabulary Specification
Atom and RDFa
AtomOwl can be used in RSS 1.0 feeds, as can any other RDF vocabulary. You can see this in practice in some of the feeds syndicated at Planet RDF. So, AtomOwl is certainly useful in providing another standard vocabulary for syndication purposes, in the same way as the various RSS 1.0 extensions. Think of RSS 1.0 as the standard-sized, RDF-compatible envelope and AtomOwl & other vocabularies as the contents of the envelope.
However, the Atom syndication format, per se, isn't designed for the Semantic Web any more than RSS 2.0 is. It's just another XML format, and while you could certainly stuff RDFa into it and use an RDF extractor to get some triples out, that's true for any legacy data. The benefit of RSS 1.0 is that you don't have to convert it to RDF, as it is RDF. It's usable as Linked Data from the get-go.
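To make the "it is RDF" point a little more concrete, here is a minimal Python sketch that reads a tiny RSS 1.0 document with the standard library only. The sample feed and its example.org URIs are invented for illustration, and the triple extraction is deliberately naive (literal-valued item properties only); a real consumer would use a proper RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

# Hypothetical sample feed; every URI below is made up.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/rss.rdf">
    <title>Example feed</title>
    <link>http://example.org/</link>
    <description>Hypothetical sample data</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/node/2"/>
        <rdf:li rdf:resource="http://example.org/node/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/node/1">
    <title>First post</title>
    <link>http://example.org/node/1</link>
  </item>
  <item rdf:about="http://example.org/node/2">
    <title>Second post</title>
    <link>http://example.org/node/2</link>
  </item>
</rdf:RDF>
"""


def item_order(root):
    """Item URIs in the order given by the channel's rdf:Seq."""
    seq = root.find(f"{{{RSS}}}channel/{{{RSS}}}items/{{{RDF}}}Seq")
    return [li.get(f"{{{RDF}}}resource") for li in seq.findall(f"{{{RDF}}}li")]


def literal_triples(root):
    """Naive (subject, predicate, object) triples for each item's properties."""
    triples = []
    for item in root.findall(f"{{{RSS}}}item"):
        subject = item.get(f"{{{RDF}}}about")
        for prop in item:
            # ElementTree tags look like "{namespace}local"; join them into a URI.
            predicate = prop.tag[1:].replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(item_order(root))
    for triple in literal_triples(root):
        print(triple)
```

Note that the item elements appear in one order in the document while the rdf:Seq lists them in another; the sequence is what fixes the reading order.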
Okay, I see - thanks.
Okay, I see - thanks.
Do you know who is responsible for RDF 1.0? Is there any standardization organisation behind it?
Ah, and a different question/topic:
are there any triples included (or is it expressed in the schema/ontology? Sorry, I haven't dipped fully into this topic) that tell: "this is a feed, which means that it shows the last X nodes; it is updated regularly" - you know what I mean? So that RDF parsers / agents get the idea that this is no "static" document?
Syndication hints
RDF is a W3C standard, RSS 1.0 is a product of the RSS-DEV Working Group, an informal working group of industry participants.
As for the "regularly updated", yes, you can express that e.g. using the syndication hints vocabulary, which the RDF module includes support for (see the "Item settings" section in the blog post).
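For readers who have not seen the syndication hints before, the following Python sketch builds a bare RSS 1.0 channel carrying sy:updatePeriod and sy:updateFrequency from the RSS 1.0 Syndication module. It only illustrates the vocabulary; it is not how the RDF module generates its output, and the function name, defaults, and feed URI are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
SY = "http://purl.org/rss/1.0/modules/syndication/"

# Period values defined by the syndication module.
PERIODS = ("hourly", "daily", "weekly", "monthly", "yearly")


def channel_with_hints(feed_uri, period="daily", frequency=1):
    """Build an rdf:RDF element whose channel says how often it is updated."""
    if period not in PERIODS:
        raise ValueError(f"unknown update period: {period!r}")
    # Register readable prefixes so serialized output uses rdf:, rss:, sy:.
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("rss", RSS)
    ET.register_namespace("sy", SY)
    root = ET.Element(f"{{{RDF}}}RDF")
    channel = ET.SubElement(root, f"{{{RSS}}}channel", {f"{{{RDF}}}about": feed_uri})
    ET.SubElement(channel, f"{{{SY}}}updatePeriod").text = period
    ET.SubElement(channel, f"{{{SY}}}updateFrequency").text = str(frequency)
    return root


if __name__ == "__main__":
    # "Updated twice a day" for a made-up feed URI.
    root = channel_with_hints("http://example.org/rss.rdf", "daily", 2)
    print(ET.tostring(root, encoding="unicode"))
```

An agent that understands the vocabulary can read the period and frequency together as "this resource changes about this often", which is the non-static signal asked about above.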
Thanks // Link with hash?
(ah yes, I meant "RSS 1.0", not "RDF 1.0")
Okay, I understand.
There is even a Wikipedia article: RSS-DEV Working Group
Would it make sense to use example.com/feed as URL, which redirects (with Content Negotiation) to whatever version the client likes (RSS 1.0 version for machines/agents and Atom version for humans, for example)? (okay, at the end it doesn't matter, because users won't realize the format in their feed reader, I think).
Ah, and another question:
if you've got a FOAF profile, you can use /foaf.rdf for the document itself and /foaf.rdf#me for the person, right? What about the feed URI? What does /feed.rdf#it mean? The feed itself (so you should use this one to link to it?) or maybe the blog? Or what?
Now it works but one tiny thing
I had to disable taxonomy RDF and then it works well. I can't understand why, but there are more issues with my taxonomy. I noticed I can't see any RSS feed when I'm in the administration area :)
I want to thank @arto for this great tutorial and help, and I would also like to thank Julia :). It was really helpful for me. I couldn't get this to work without you both :). And to Arto: if you would like to link, here it is: http://web3.ee/rss.rdf
Oh, one more question for Arto: can I translate and use your tutorial and pics on my tutorial site? I would link back too :)
Links added
Great that you got it working. I've now added links to both you and Julia at http://ar.to/2009/03/rdfizing-drupal-rss-feeds.
And yes, feel free to translate the tutorial if you like (you can do anything you want with it).
Me too!
My feeds are RDF-compatible now:
http://ludovf.net/rss.rdf
Now, these feeds contain different data about the posts than what you get by going to /node/*/rdf; is that a problem?
Great work!
Arto,
You continue to amaze!
R,
C
Another site RDFized
Set this up mostly over the weekend and had the front page http://charlotteregionalstc.com/rss.rdf working immediately, but had issues with the blog and aggregator feeds not always validating via the W3C Validation Service. As of today those two seem to be working okay in Google Reader.
Nothing interesting being published at the moment from any of these, but the front page rss.rdf will eventually have our local STC chapter monthly meetings and events and I'm really looking forward to seeing how it displays the Date and GMap Locations info in it.
Thanks for a well written tutorial arto!
Cheers,
Varenne
I appreciate the work, but I
I appreciate the work, but I don't think this is a necessary part to RDFalize your site.
RSS 1.0, or even RSS 1.1, really isn't supported anymore; most folks have gone on to use Atom. Our interests have progressed more to adding RDFa to our web sites, which provides all that we could find in RSS 1.0, and then some.
RSS 1.0 hasn't been updated in close to a decade, but times have changed. Atom has kept up with changes associated with web content. I'd rather see people incorporate RDFa into their sites, but use a more modern syndication feed format, which is Atom 1.0. That's just me, though.
Now what I'd really like, is access to drupal.org data, for documentation, modules, etc, in an RDF format. Perhaps using RDFa.
XML feeds available
There is project data available via http://updates.drupal.org -- the same thing the update module reads. Please put in a request on the infrastructure list -- I meant to bug dww about this, since I asked about it at Drupalcon
Where is the infrastructure
Where is the infrastructure list? Who is dww?
Issue queues
Infrastructure issues queue: http://drupal.org/project/issues?projects=Drupal.org+infrastructure -- you can submit a request from there.
dww designed and maintains the system for releasing modules on Drupal.org -- including the XML that is output by the updates module. There is some info on this XML patch issue thread, which is the XML patch that I was bitching at Drupal Modules about to help actually finish, rather than screen scraping.
I'm totally for making that info available and remixable externally as much as possible. Please come back and add the link to the issue you create, as I'd like to follow up with it.
Edit: the xml is available (e.g. http://updates.drupal.org/release-history/install_profile_api/6.x) -- I just don't know what the path is to e.g. retrieve a list of all projects, etc. so that's what you'll need to ask for.
Thanks a bunch, Boris. I
Thanks a bunch, Boris. I will submit a request at the list and link here. Someone had pointed out the patch thread, and I'll look at it more closely, too.
The path to retrieve all the
The path to retrieve all the projects is http://updates.drupal.org/release-history/project-list/all
There is a discussion about using DOAP and BAETLE for marking up the update status feed.
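The two URL shapes mentioned in this thread can be written down as a small Python helper. This is only a sketch based on the example URLs posted above; the function names are hypothetical, and nothing here says what the server returns or whether other paths exist.

```python
from urllib.parse import quote

# Base path taken from the URLs quoted in this thread.
BASE = "http://updates.drupal.org/release-history"


def release_history_url(project, core="6.x"):
    """URL of the release-history XML for one project and core version."""
    return f"{BASE}/{quote(project, safe='')}/{quote(core, safe='')}"


def project_list_url():
    """URL of the full project list, as given in the comment above."""
    return f"{BASE}/project-list/all"


if __name__ == "__main__":
    print(release_history_url("install_profile_api"))
    print(project_list_url())
```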