One part of my GSoC project is to add support (of some sort) to the VersionControl API for having multiple branches within a project (like a drupal.org project) with different user permissions. The idea is to allow a more DVCS-like workflow, where people can work independently on their own branch and then ask an admin user to pull their changes into the master repository. I won't be starting this for a few weeks, but I wanted to get a discussion going about what it should look like.
The ultimate goal is to let unprivileged users, those who currently have no commit access to a project, commit code with the same tools and environment as privileged developers. This would greatly lower the barrier to entry for contributing code, and would make it easier for new developers to bring their expertise to a project.
The two main options
There are two main ways to go about a branching-based workflow:
Allow non-admin users to create their own repository, to which only they (and possibly users they authorize) have write access, but everyone has read access. This is the way that GitHub does it, and it seems to work pretty well there. They have included within their site an interface for branching from a repository and submitting a merge request.
Allow non-admin users to create branches within the main repository, but restrict their write access to exclude certain branch or tag names. This would cut down on the number of repositories created, but would require some more complex logic in the access checking functions.
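To make the difference between the two options concrete, here is a minimal sketch of what the write-access check might look like in each case. It is written in Python purely for illustration. The `Repository` record, the function names, and the `<user>-<topic>` naming rule are all assumptions made for this example and are not part of the VersionControl API.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """A hypothetical repository record (not a VersionControl API structure)."""
    name: str
    owner: str
    collaborators: set = field(default_factory=set)


def can_write_own_repo(user: str, repo: Repository) -> bool:
    """Option 1: access is decided once, per repository."""
    return user == repo.owner or user in repo.collaborators


def can_write_branch(user: str, branch: str, admins: set) -> bool:
    """Option 2: one shared repository, access decided per branch name."""
    if user in admins:
        return True
    # Assumed naming rule: unprivileged users may only write to a branch
    # named after themselves, or to "<user>-<topic>".
    return branch == user or branch.startswith(user + "-")


fork = Repository("project-chrono325", owner="chrono325")
print(can_write_own_repo("chrono325", fork))
print(can_write_branch("chrono325", "chrono325-topic", {"maintainer"}))
print(can_write_branch("chrono325", "master", {"maintainer"}))
```

In the first option the branch name never enters the check. In the second, every write has to be matched against a naming rule, which is where the extra complexity comes from.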
Advantages and disadvantages
It is not yet clear which approach is better, and the choice depends largely on the selection of VCS, since different systems make one or the other option easier or harder.
Advantages of Multiple Repositories
Permission-checking logic would be simpler. There would be no need to check whether a particular user had access to a particular branch. Instead, users would have full access to their own repository, and the existing restrictions would apply to the main repository.
More freedom within a forked repository. Most DVCSs are fairly agnostic to the particular name of the branch on which development is done, but it is nice to have the freedom to create branches and tags of any desired name and push them to the remote repository without restriction. Rather than having to work on a branch called chrono325 (for example), a user could create their own trunk branch and work on that.
More than one user branch. Related to the previous point, no extra thought would have to be devoted to how many branches a user would need and what to call them, since the upstream developer would not be affected by the unprivileged user's choice of branch name. With per-user branches, if a user wanted multiple branches on the same project, they would have to be called something like chrono325-topic and so on. With multiple repositories, there is no need to worry about that.
More "first-class" feel for unprivileged developers. Rather than having a restricted set of branches within the upstream repository, the unprivileged user would have full control over their own repository. This is mostly a psychological difference, but could create an atmosphere of greater inclusiveness and a more level field between admins and unprivileged users.
Disadvantages of Multiple Repositories
Inapplicable to centralized VCSs. CVS and Subversion do not have any means to perform cross-repository merges, so they would not be able to make use of this option at all.
More drastic change to project* schema. As far as I know (and I haven't yet checked), the VersionControl-project integration assumes that each project has exactly one repository. Depending on how deeply ingrained this assumption is, it may be difficult to add this support to versioncontrol_project.
Potentially higher disk usage. Not all VCSs have an efficient way to set up multiple repositories with mostly-similar contents. Git does, but I don't know about the others. If a VCS does not support this, then the disk space would increase linearly with the number of repositories, rather than with the number of differences between revisions. This would be a big problem for the drupal.org servers, since the disk usage could quickly balloon.
Need to handle repository access control for additional repositories. Many of the access controls rely on writing configuration files which are used by the VCS to handle authentication. More repositories would increase the number of entries in those files.
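The disk-usage concern above comes down to simple arithmetic, sketched here in Python. The function names and all of the numbers are made up for illustration and are not measurements of any real project or VCS.

```python
def disk_usage_full_copies(base_mb: float, forks: int, delta_mb: float) -> float:
    """Each fork stores a complete copy of the history plus its own changes."""
    return (1 + forks) * base_mb + forks * delta_mb


def disk_usage_shared_objects(base_mb: float, forks: int, delta_mb: float) -> float:
    """Forks share the common history and store only their own changes."""
    return base_mb + forks * delta_mb


# Hypothetical figures: a 50 MB project, 200 forks, 1 MB of changes per fork.
print(disk_usage_full_copies(50, 200, 1))
print(disk_usage_shared_objects(50, 200, 1))
```

With full copies, the cost is dominated by the size of the project multiplied by the number of forks. With shared objects, it is dominated by the changes themselves, which is why the choice of VCS matters so much here.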
Advantages of Per-User Branches
Works with centralized VCSs. This is probably the only viable solution for centralized VCSs like Subversion, since they were not designed to operate on multiple different repositories and do not include support (as far as I know) for merging a branch from one repository into another.
Cuts down on the number of repositories created. The overhead for a bare repository may not be that high, but if the VCS does not have an efficient way of sharing common objects across repositories, then having a repository for each user would quickly use up a great deal of disk space.
Simpler to (re)view branches with existing tools. Especially in DVCSs, there are good tools to view the differences between multiple branches within the same repository. The support may not be as good when the branches are not within the same repository. Git has "remote tracking branches" which accomplish this easily, but I don't know about other systems.
Also, it would be readily apparent which branches are unmerged, since all of the unprivileged branches would be viewable to anyone who downloaded the repository.
May be easier to add to project* modules. There is already a mature system for dealing with multiple branches within a repository, but there aren't yet tools for associating multiple repositories with a single project. It would therefore likely be simpler to add extra permission-checking code than to associate multiple repositories with a single project. Again, I have not yet looked at the project* modules closely, so I don't know how invasive this would be.
Disadvantages of Per-User Branches
Namespace of main repository gets polluted. The main repository would have a larger number of branches within it, which could make it more difficult for the admin to filter the signal from the noise.
Increased complexity of permission checks. Every write would have to be checked against both the user and the target branch (or tag), rather than against the repository as a whole.
Multiple branches per user. If a single user wanted to have multiple branches, they might have to name them something like chrono325-topic and so on. This would further complicate the logic of branch permission checking. It could also present problems for users with special characters in their names, especially if the username contains characters not permitted in branch names for a particular VCS.
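The username problem can be illustrated with a short Python sketch. The mapping below is deliberately conservative and is an assumption made for this example; it does not reproduce the exact branch-naming rules of any particular VCS. The helper names `branch_prefix` and `owns_branch` are hypothetical.

```python
import re


def branch_prefix(username: str) -> str:
    """Map a username to a conservative branch-name prefix."""
    # Replace every run of characters outside [A-Za-z0-9_] with a single "_".
    # The dash is excluded on purpose so it can serve as the separator
    # between the prefix and the topic name.
    prefix = re.sub(r"[^A-Za-z0-9_]+", "_", username).strip("_")
    if not prefix:
        raise ValueError(f"cannot derive a branch prefix from {username!r}")
    return prefix


def owns_branch(username: str, branch: str) -> bool:
    """True if the branch is "<prefix>" or "<prefix>-<topic>" for this user."""
    prefix = branch_prefix(username)
    return branch == prefix or branch.startswith(prefix + "-")


print(owns_branch("chrono325", "chrono325-topic"))
print(owns_branch("chrono", "chrono325-topic"))
print(branch_prefix("jane doe"), branch_prefix("jane.doe"))
```

Note that "jane doe" and "jane.doe" map to the same prefix in this sketch, so a real implementation would also have to detect or prevent such collisions, which is one more piece of logic that the multiple-repositories option avoids.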
Making a decision
There doesn't appear to be a solution which is uniformly better, especially since the multiple repositories option doesn't work for centralized VCSs. Probably the best solution is to implement both and let the site administrator choose.
There is also the issue of deciding what to do for drupal.org, which depends on the choice of VCS.
I tried to include everything I could think of, but obviously there are more issues to consider.