News on project node integration

jpetso's picture

Actually, I wanted to write a piece on how the Version Control API is maturing, but I've got to leave here in a few minutes and won't have internet available until tomorrow, so that must wait for now. Instead, here's a short look at the Version Control / Project Node Integration module (versioncontrol_project.module).

Yeah, you might hate me because the Version Control API still contains only function signatures but no real implementation (apart from a few functions that delegate their job to the backend). But alas! Things are improving. Apart from the other things that my proposed schedule demands, I've been looking into implementing some of the more straightforward functions.

As expected, the API part of the project node integration module is easy compared to the Version Control API itself. Basically, it's only a thin wrapper on top of the Version Control API that adds information about projects and project maintainers. It has an extended get_commits() function that adds two more possible constraints to the standard versioncontrol_get_commits(). The nice thing is that it doesn't need to deal with versioncontrol.module's database at all: versioncontrol_get_commits() provides all the querying features that project node integration needs.
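To make the wrapper idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration only: the example_ function names, the in-memory commit, project and maintainer data, the 'directories' base constraint, and the two extra constraint keys ('nids' and 'maintainer_uids'). The real module's constraint names and storage may well differ. The point is just that the wrapper translates its own constraints into ones the base function already understands and then delegates.

```php
<?php
// Hypothetical stand-in for the base API function. Only a 'directories'
// constraint is sketched; the commit data is made up.
function example_get_commits($constraints = array()) {
  $commits = array(
    array('revision' => '1', 'uid' => 3, 'directory' => '/modules/foo'),
    array('revision' => '2', 'uid' => 4, 'directory' => '/modules/bar'),
    array('revision' => '3', 'uid' => 3, 'directory' => '/modules/bar'),
  );
  if (isset($constraints['directories'])) {
    $commits = array_filter($commits, function ($commit) use ($constraints) {
      return in_array($commit['directory'], $constraints['directories']);
    });
  }
  return array_values($commits);
}

// Hypothetical project table: project nid => project directory.
function example_project_directories() {
  return array(10 => '/modules/foo', 11 => '/modules/bar');
}

// Hypothetical maintainer table: project nid => maintainer uids.
function example_project_maintainers() {
  return array(10 => array(3), 11 => array(3, 4));
}

// Wrapper: resolves the two assumed extra constraints to project
// directories, then lets the base function do the actual querying.
function example_project_get_commits($constraints = array()) {
  $directories = example_project_directories();
  $nids = array_keys($directories);
  $has_project_constraint = FALSE;

  if (isset($constraints['nids'])) {
    $has_project_constraint = TRUE;
    $nids = array_intersect($nids, $constraints['nids']);
  }
  if (isset($constraints['maintainer_uids'])) {
    $has_project_constraint = TRUE;
    $maintainers = example_project_maintainers();
    $uids = $constraints['maintainer_uids'];
    // Keep only projects maintained by at least one of the given users.
    $nids = array_filter($nids, function ($nid) use ($maintainers, $uids) {
      return count(array_intersect($maintainers[$nid], $uids)) > 0;
    });
  }
  if ($has_project_constraint) {
    $project_dirs = array_values(
      array_intersect_key($directories, array_flip($nids))
    );
    // Narrow an already given 'directories' constraint instead of replacing it.
    $constraints['directories'] = isset($constraints['directories'])
      ? array_values(array_intersect($constraints['directories'], $project_dirs))
      : $project_dirs;
  }
  unset($constraints['nids'], $constraints['maintainer_uids']);
  return example_get_commits($constraints);
}
?>
```

Calling example_project_get_commits(array('nids' => array(10))) in this sketch only returns commits below the directory of project 10, and the wrapper itself never touches the commit storage.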

So, as project node integration will essentially just consist of some database access functions and some more Forms API hooking, I did my first actual implementation in this SoC project (early, right? :-}) and kinda finished the API side of the project node integration module, taking care of the database access functions. Allowing multiple repositories per project node is something that I still need to look into, but it shouldn't be too hard. Caching will probably also need to be done sometime. The rest of the module (Forms API magic) is scheduled for way later, when cvs.module's code will be ripped apart.

And so that you finally get to see something in this blog, er, update post series, here's the current API of project node integration, minus most of the API docs, because they really don't fit into this browser window. Sorry if the PHP filter screws up a bit, but it should look reasonably ok.

<?php
// Retrieve a set of commits for the specified repository if they match the
// given constraints. If no single commit matches these constraints,
// an empty array is returned.
function versioncontrol_project_get_commits($constraints = array());

// Retrieve a list of all maintainers for each of the given project nodes.
function versioncontrol_project_get_maintainers($project_nids);

// Retrieve a list of all projects for each of the given node ids, including
// detailed project data for all of the found projects. If no list of node ids
// is given, all existing projects will be retrieved.
function versioncontrol_project_get_projects($project_nids = NULL);

// Retrieve a list of all projects for each of the given Drupal uids
// where the corresponding user is marked as maintainer of the project node,
// including detailed project data for all of the found projects.
// If no list of uids is given, all existing maintainer/project mappings
// will be retrieved.
function versioncontrol_project_get_projects_by_maintainer($maintainer_uids = NULL);

// Retrieve the project for the given item (file or directory) in a repository.
function versioncontrol_project_get_project_for_item($repository, $path);

// Add or update a project. This operation will fail if the given project's nid
// doesn't correspond to an existing node, or if another project with the same
// project directory as the given one already exists.
function versioncontrol_project_set_project($project, $maintainer_uids = array());

// Delete a given project and its maintainer associations from the database.
function versioncontrol_project_delete_project($project_nid);

// Assign one or more maintainers for the given project node. Any previously
// existing maintainer entries will be replaced by the new set of maintainers.
// This operation will fail if the given project node doesn't exist or has
// no project assigned.
function versioncontrol_project_set_maintainers($project_nid, $maintainer_uids);
?>
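The failure conditions described for the set and delete functions can be sketched roughly as follows. This is not the module's implementation: the example_ names, the in-memory $store array standing in for the node, project and maintainer tables, and the project array keys ('nid', 'repo_id', 'directory') are all assumptions. The sketch also assumes that "same project directory" means the same directory within the same repository, and that setting a project replaces its maintainer list.

```php
<?php
// Hypothetical in-memory storage standing in for the database tables.
$example_store = array(
  'nodes' => array(10 => TRUE, 11 => TRUE), // existing node ids
  'projects' => array(),                    // nid => project array
  'maintainers' => array(),                 // nid => array of uids
);

// Add or update a project. Returns FALSE if the node doesn't exist or if
// another project already uses the same directory (assumed: per repository).
function example_set_project(&$store, $project, $maintainer_uids = array()) {
  $nid = $project['nid'];
  if (!isset($store['nodes'][$nid])) {
    return FALSE;
  }
  foreach ($store['projects'] as $other_nid => $other) {
    if ($other_nid != $nid
        && $other['repo_id'] == $project['repo_id']
        && $other['directory'] == $project['directory']) {
      return FALSE;
    }
  }
  $store['projects'][$nid] = $project;
  $store['maintainers'][$nid] = $maintainer_uids;
  return TRUE;
}

// Remove the version control data for a project. The node itself stays.
function example_delete_project(&$store, $project_nid) {
  unset($store['projects'][$project_nid]);
  unset($store['maintainers'][$project_nid]);
}
?>
```

In this sketch, a second project claiming an already used directory is rejected until the first one's version control data has been deleted, and deleting never removes the entry in 'nodes'.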

As always, input is appreciated. And remember to come back for my next post on the actual Version Control API :)

Comments

A few questions and replies.

dww's picture

Great to see some real progress again. A few questions/comments:

// Retrieve a list of all projects for each of the given node ids, including
// detailed project data for all of the found projects. If no list of node ids
// is given, all existing projects will be retrieved.
function versioncontrol_project_get_projects($project_nids = NULL);

Neither the comment nor the name of this API function make much sense. Can you elaborate? How is this fundamentally different from node_load()?

// Retrieve a list of all maintainers for each of the given project nodes.
function versioncontrol_project_get_maintainers($project_nids);

Why does this take an array of nids and return a (presumably) nested array of nid to maintainer mappings? Wouldn't it be clearer to just take a single nid and return a single array? If you happen to need N of them, you call N times. Are you just worried about N separate queries? What's the use-case for getting the maintainers for N projects all at once that you seem to be trying to optimize for?

function versioncontrol_project_get_projects_by_maintainer($maintainer_uids = NULL);

Same story here -- why the array for input and nested array for output?

// Add or update a project. This operation will fail if the given project's nid
// doesn't correspond to an existing node, or if another project with the same
// project directory as the given one already exists.
function versioncontrol_project_set_project($project, $maintainer_uids = array());

It's not clear why the maintainer_uids is separate from the $project. The $project is basically the $node object for the project node, right? So, it's going to have a bunch of version-control related fields, no? Why not just have an array of maintainers as one of the version-control related fields directly in the $project node? I'm not convinced this is a good idea, I'm just asking what you think.

// Delete a given project and its maintainer associations from the database.
function versioncontrol_project_delete_project($project_nid);

That comment probably isn't accurate. You're not actually deleting the project from the database. You mean something like "Delete all version control data for the given project", right?

A few answers.

jpetso's picture

// Retrieve a list of all projects for each of the given node ids, including
// detailed project data for all of the found projects. If no list of node ids
// is given, all existing projects will be retrieved.
function versioncontrol_project_get_projects($project_nids = NULL);

Neither the comment nor the name of this API function make much sense. Can you elaborate? How is this fundamentally different from node_load()?

This is different from node_load() in that no node is loaded within the function. It's basically the API counterpart of "SELECT * FROM {versioncontrol_project_projects}". In the actual module file, which contains the complete API documentation, you can also see that the $project return value is not a complete node but only a triple of project nid, repo id and project directory.
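As a rough illustration of that SELECT-like behavior, here is a sketch that works on an in-memory array instead of the database. The example_ function names, the array keys ('nid', 'repo_id', 'directory') and the sample rows are assumptions; only the idea of returning a triple per project, and all projects when no nids are given, comes from the description above.

```php
<?php
// Hypothetical stand-in for the {versioncontrol_project_projects} table.
function example_project_table() {
  return array(
    array('nid' => 10, 'repo_id' => 1, 'directory' => '/modules/foo'),
    array('nid' => 11, 'repo_id' => 1, 'directory' => '/modules/bar'),
  );
}

// Return project triples keyed by nid. With NULL, all projects are returned.
// No node is loaded anywhere in here.
function example_project_get_projects($project_nids = NULL) {
  $projects = array();
  foreach (example_project_table() as $row) {
    if ($project_nids === NULL || in_array($row['nid'], $project_nids)) {
      $projects[$row['nid']] = $row;
    }
  }
  return $projects;
}
?>
```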

I thought about calling the function "versioncontrol_project_load()", but decided against it in order to keep the Version Control API more consistent within itself. (Every "fetch" function is called "[module]_get_*()", and especially with versioncontrol_project_get_projects_by_maintainer() also in place, those two need similar function names.)

That said, it might actually be a good idea to autoload this project information into project nodes via hook_nodeapi(). Should we do that?

// Retrieve a list of all maintainers for each of the given project nodes.
function versioncontrol_project_get_maintainers($project_nids);

Why does this take an array of nids and return a (presumably) nested array of nid to maintainer mappings? Wouldn't it be clearer to just take a single nid and return a single array? If you happen to need N of them, you call N times. Are you just worried about N separate queries? What's the use-case for getting the maintainers for N projects all at once that you seem to be trying to optimize for?

I guess you're right here. Yes, I was worrying about unnecessary multiple querying being done, and I imagined it would be a good idea to do the same as a lot of other Version Control API functions and return multiple results at once, but it seems that versioncontrol_project_get_maintainers() and also versioncontrol_project_get_projects_by_maintainer() don't have a real use case for multiple input / multiple output calls. I'll change that to a simpler return value scheme.
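The difference between the two return value schemes can be sketched like this. The example_ function names and the in-memory maintainer data are assumptions for illustration, not the module's code.

```php
<?php
// Hypothetical maintainer table: project nid => maintainer uids.
function example_maintainer_table() {
  return array(10 => array(3), 11 => array(3, 4));
}

// Multiple input / nested output: array of nids in, nid => uids mapping out.
function example_get_maintainers_multiple($project_nids) {
  $table = example_maintainer_table();
  $result = array();
  foreach ($project_nids as $nid) {
    $result[$nid] = isset($table[$nid]) ? $table[$nid] : array();
  }
  return $result;
}

// Single input / flat output: one nid in, plain array of uids out.
function example_get_maintainers($project_nid) {
  $table = example_maintainer_table();
  return isset($table[$project_nid]) ? $table[$project_nid] : array();
}
?>
```

With the simpler scheme the caller gets the uids directly instead of having to pick them out of a nested array by nid, at the cost of one call per project.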

// Add or update a project. This operation will fail if the given project's nid
// doesn't correspond to an existing node, or if another project with the same
// project directory as the given one already exists.
function versioncontrol_project_set_project($project, $maintainer_uids = array());

It's not clear why the maintainer_uids is separate from the $project. The $project is basically the $node object for the project node, right? So, it's going to have a bunch of version-control related fields, no? Why not just have an array of maintainers as one of the version-control related fields directly in the $project node? I'm not convinced this is a good idea, I'm just asking what you think.

The rationale behind this was that if the maintainers go into the $project array, they should always be there, not only in this function but also in the return values of versioncontrol_project_get_projects(). That means the project and maintainer tables would always need to be joined, even when the maintainers are not needed. So I thought it would be better to have the caller fetch maintainer associations explicitly when they are actually needed, which in turn makes the $maintainer_uids array a separate entity.

So the actual question is whether maintainers always need to be known for every possible use (or at least the majority of uses) of project information. If the list of maintainers is always required, we could indeed make the $maintainer_uids array a member of the project data array.

// Delete a given project and its maintainer associations from the database.
function versioncontrol_project_delete_project($project_nid);

That comment probably isn't accurate. You're not actually deleting the project from the database. You mean something like "Delete all version control data for the given project", right?

That's much better than my initial comment, I'll change it at once. Thanks for the pointer.
