Updating site from external database

Events happening in the community are now at Drupal community events on www.drupal.org.
Cousken's picture

Hello!

I will soon begin development of a new module ordered by my employer. I'm posting the details here in hope that some of you can tip me about more efficient ways of doing things, or suggesting already existing modules as building blocks. I also hope that this module will be useful to more people than just my employer.

The general idea is that we have a site, an education catalogue, which need to be updated from an external database. This normally would be no problem if all information could simply be pulled from the database, but unfortunately we only have lists of educations and schools available and we have to provide them with descriptions. I'll use our website as an example to explain the functionality of the module to make things easier.

My solution to the problem is a Notify Administrator About Change module. Everything that can should be automatically updated - things such as links, category changes, removal of educations or name changes. But when a new education or specialisation appears in the database, attention from the administrator is required - a describing text needs to be written before a new node can be published.

Technically speaking it would look like this - 2 'groups' of database tables (identifiable by prefix) or databases are created. Database1 will be updated daily from the external database, and contain information which is most up to date. Database2 will reflect what is already available on the site.

When an Admin enters the site, he can press a 'Check for updates' button. Data from the external database will then be pulled and stored in Database1 (in our case from a MySQL file stored on our ftp). Then the module will compare the differences between Database1 and Database2.

Two lists will be then presented to the administrator:

* The fist one will be of things that can be automatically updated like i mentioned before, all it requires is a press of a button and all the changes will be implemented, including in Database2. A link to each updated node will also be present, i case the Admin wants to check that everything is displayed correctly.
* The second list will present a list of new nodes that need to be created. Each position will have a link to a create node page, preferably with all the right settings already configured (menu, name, URL override). When the node becomes published, the position in Database2 should become updated to indicate that this change is now taken care of.

I hope i made the functionality of this module clear. Any contributions are welcome - ideas, questions, critique.

Comments

maybe 1 database is enough

pedrorocha's picture

i didn't see the need for this other "clone" tables just to compare. From what i understood, an extra boolean column in the database 1 tables to say if it's created or not would do the job.

I figured out something similar using the Feeds module(http://drupal.org/project/feeds). The main difference is that i was going to fetch the data directly from other databases.

Maybe the Data module(http://drupal.org/project/data) will be usefull too.

Thanks for the reply While

Cousken's picture

Thanks for the reply

While writing an answer i gave your proposal a lot of thought, and kept changing what i want to say. First i thought that i would need to republish everything, and we want to avoid that because there are some 100 pages with custom text. But then i thought that if you can have a relation table between nodes and each position in the database, it should be possible.

But then there is the questions of links leading to external sites. What i need is a way to check that every position in the updated database is already there, I don't think i can check that directly from drupals tables, unless there is a way to store links?

Right now the node hierarchy looks like this: Educations --> Education types (several) --> Educational Programs (several) --> Specialisations (many)

On each Specialisation node, there will be a list of links to each school that offers that specific specialisation, and a filter to only display schools in the regions that interest you... I don't think there is a way to create that with Drupal without custom PHP.

Perhaps one could create node-aware menu, if i have a database table that associates a node with a database entry (so that i know what links to display on it if any)... Is there a module that does that? I should perhaps consider creating one.

Somehow just having two databases and custom PHP in the node appear simpler.

Thanks for the tips about already existing modules, i will check them out.

Even with Feeds and/or Data

pedrorocha's picture

Even with Feeds and/or Data modules you will need a lot of custom php. And i agree: use the 2 databases looks simpler for me too, but will generate 2x amount of data in the database for almost nothing.

I think it's all about how much time you have to do this and if you think this will grow in the future.

I already know this will grow

Cousken's picture

I already know this will grow a bit - we need a 'snapshot copy' of all the nodes generated by the module. the idea is that there are 2 categories of educations: one that is now permanently fixed and you can apply to, and one general for the future where the newest changes appear as soon as they are decided.

The whole site needs to be done by September 1st, so i dare say i can do things 'the proper way'
. Although, i'm not so afraid of doubling the amount of database data, because there really isn't that much there, around 5 tables with a list of relationships and links.

I looked at Feeds and Data and they seemed to do something on the lines of what i need, i will investigate further. Even if i think the 2 database way of doing it is simple, i will most likely do it the way you proposed, as it seems more proffesional and Drupal integrated.

Thank you for your input :)

DROP table

Cousken's picture

Apparently the people who send us the SQL updates will include DROP TABLE statements in them, which pretty much stops my plan of having only one table.

But it also made me realise one thing which i haven't thought of earlier - things will disappear from the database, and based on that content on the website needs to be deleted or unpublished.

I'm not going to have copies of the tables, i will create an extra table/tables which will show the relation of posts in the database to Drupal elements, most likely nodes. Then, when an admin updates the database, appropriate actions will be taken - removing and adding of content, and updating links.

Contributed Module Ideas

Group organizers

Group notifications

This group offers an RSS feed. Or subscribe to these personalized, sitewide feeds: