Just wanted to get some advice on a content aggregation project I'm working on.
I'll be sucking content from multiple XML feeds (Atom and custom XML feeds). I'm going to use either FeedAPI or SimpleFeed. I haven't decided that yet, but I'm sure either will work just fine for the basic requirements. I'm using Drupal 6.x.
The content I'm sucking in should adhere to a fixed taxonomy I already have defined. It's a hierarchical taxonomy with (theoretically) unlimited depth. Due to some other requirements I have to store the taxonomy details in a separate, custom database. Whether that database lives in the same physical db as the Drupal tables doesn't matter. If it needs to be there, I'll put it there. Tagging the content with the proper taxonomy is most likely going to be largely a manual process, unfortunately. There will be fewer than a thousand records a day in those feeds, and not all of them will be tagged with the taxonomy.
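To give a rough idea of the shape of the taxonomy data, here is a small sketch. It is for illustration only: it assumes the hierarchy is stored as an adjacency list (each term points at its parent), and the table name `term`, its columns, and the sample terms are all made up, not my real schema. It uses Python with SQLite just to stay self-contained.

```python
import sqlite3


def build_taxonomy(conn):
    """Create a hypothetical adjacency-list taxonomy table with sample terms."""
    conn.execute(
        "CREATE TABLE term ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES term(id),"  # NULL marks a top-level term
        " name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO term (id, parent_id, name) VALUES (?, ?, ?)",
        [
            (1, None, "Science"),
            (2, 1, "Physics"),
            (3, 2, "Optics"),
            (4, None, "Arts"),
        ],
    )


def descendants(conn, term_id):
    """Return the names of a term and everything below it, at any depth."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, name) AS (
            SELECT id, name FROM term WHERE id = ?
            UNION ALL
            SELECT t.id, t.name
            FROM term AS t
            JOIN subtree AS s ON t.parent_id = s.id
        )
        SELECT name FROM subtree ORDER BY id
        """,
        (term_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_taxonomy(conn)
    print(descendants(conn, 1))
```

The recursive query is what handles the unlimited depth here. A database without recursive queries would need a loop in application code or a different storage layout.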
I will not be using Drupal for the user interface, just for publishing (or admin) functionality. The user interface will be custom PHP code using custom SQL queries against multiple data sources, incl. the Drupal db. Yes, I'm aware of the downsides.
What are my best options for using this external database with Drupal as I ingest content from my content publishing sources?
One potential solution I have is to ingest the content in multiple steps:
1. Use a non-Drupal process to pre- or post-process the data feeds for the taxonomy information. Store a unique identifier (URL, whatever) for each content item, along with its taxonomy, in a custom db, so the identifier acts as a foreign key between the custom db and the Drupal db records.
2. Use Drupal to ingest the actual content of the feeds into Drupal's db.
3. Link the two databases at the user interface with the unique identifier.
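To make the multi-step idea concrete, here is a minimal sketch of the linking part. Everything in it is a stand-in: the `feed_item` table plays the role of whatever Drupal stores for an ingested item, `taxo.item_term` plays the role of my custom taxonomy db, and the function names are hypothetical. It uses Python with two in-memory SQLite databases purely so it runs on its own; the real thing would be PHP against the actual databases.

```python
import sqlite3


def open_databases():
    """Open a stand-in content db and attach a separate stand-in taxonomy db."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS taxo")
    conn.executescript(
        """
        CREATE TABLE main.feed_item (url TEXT PRIMARY KEY, title TEXT);
        CREATE TABLE taxo.item_term (url TEXT NOT NULL, term TEXT NOT NULL);
        """
    )
    return conn


def tag_item(conn, url, term):
    """Step 1: record the taxonomy for an item, keyed by its unique URL."""
    conn.execute(
        "INSERT INTO taxo.item_term (url, term) VALUES (?, ?)", (url, term)
    )


def ingest_item(conn, url, title):
    """Step 2: store the feed content itself (Drupal's job in the real setup)."""
    conn.execute(
        "INSERT OR REPLACE INTO main.feed_item (url, title) VALUES (?, ?)",
        (url, title),
    )


def items_for_term(conn, term):
    """Step 3: join the two databases on the shared URL for the UI."""
    rows = conn.execute(
        "SELECT f.url, f.title "
        "FROM main.feed_item AS f "
        "JOIN taxo.item_term AS t ON t.url = f.url "
        "WHERE t.term = ? ORDER BY f.url",
        (term,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = open_databases()
    ingest_item(conn, "http://example.com/a", "First item")
    ingest_item(conn, "http://example.com/b", "Second item")
    tag_item(conn, "http://example.com/a", "physics")
    print(items_for_term(conn, "physics"))
```

Items that were never tagged simply don't show up in the join, which matches the fact that not all of my content will carry taxonomy.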
Could I do something like this entirely within Drupal, or do I have to create an out-of-Drupal process for tacking the taxonomy onto my content? I think either way will be ok, but I'd, of course, prefer to work entirely within Drupal if I can.
Is there a way to bend Drupal to use this "taxonomy db" rather than its own taxonomy db tables? If so, how would I go about making that happen? I have relatively little experience with Drupal in this respect.
Has anyone done something similar before?
Thanks to everyone for any suggestions.