Mercurial findings

jpetso's picture

So, Mercurial, or "hg" for short. It is one of the two revision control systems originally written to replace BitKeeper for Linux kernel development. In the end, git was chosen for that job, but Mercurial lived on nevertheless. Being suitable for kernel development implied two requirements: performance (although git is still a tiny bit faster) and support for distributed development. git meets both as well, and in fact the two systems are very similar in scope, paradigms and usage patterns.

While git is written mostly in C, Mercurial consists mostly of Python code, with only small, performance-critical code paths implemented in C. This benefits Mercurial's portability, as can even be observed with other SoC students: git is pretty much focused on just Linux, whereas Mercurial can be installed more easily on Windows and other Unixes. That said, Mercurial's graphical tools are not yet as mature as the ones that exist for CVS or SVN.

Mercurial seems to be gaining popularity. Prominent users include OpenSolaris, the Xine video player, and Xen. Most recently, the Mozilla community has also decided to switch (read here and here for their VCS decision process). Google Video also has a presentation on Mercurial that explains its most important points, and there's even an extensive, well-written online book available for further reading.

Basics

Naturally, Mercurial provides all the standard commands, invoked as 'hg command', with help available through 'hg help command'. Like git, it has all the distributed VCS niceties in addition to the common add, delete, move, commit, diff, log, etc. commands, which means that push and pull operations are in there too. As with all distributed version control systems, everything except exchanging changes with other repositories happens locally and is very speedy as a consequence. Worth a mention is probably the fact that pull (fetch changes from another repository into the local repository) and update (bring the working copy up to date from the local repository) are separate commands, which is supposed to offer better control over merging operations. Mercurial uses SHA-1 hashes as revision identifiers, and commits are usually called "changesets", or alternatively "revisions".
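To make the pull/update split a bit more tangible, here is a toy model in Python. It is only a sketch and has nothing to do with Mercurial's real storage code: the ToyRepo class, its methods and the way IDs are derived (SHA-1 over the parent ID plus the commit text) are all made up for illustration, and history is assumed to be linear.

```python
import hashlib

NULL_ID = "0" * 40  # stand-in for "no parent" in this toy model


def changeset_id(parent, text):
    """Toy ID: SHA-1 over parent ID plus content (not Mercurial's real format)."""
    return hashlib.sha1((parent + text).encode("utf-8")).hexdigest()


class ToyRepo:
    """A history store plus a pointer saying what the working copy is based on."""

    def __init__(self):
        self.changesets = {}           # id -> (parent id, text)
        self.tip = NULL_ID             # newest changeset in the store
        self.working_parent = NULL_ID  # changeset the working copy sits on

    def commit(self, text):
        node = changeset_id(self.working_parent, text)
        self.changesets[node] = (self.working_parent, text)
        self.tip = node
        self.working_parent = node
        return node

    def pull(self, other):
        """Copy missing changesets into the store; the working copy is untouched."""
        added = 0
        for node, entry in other.changesets.items():
            if node not in self.changesets:
                self.changesets[node] = entry
                added += 1
        if added:
            self.tip = other.tip
        return added

    def update(self, node=None):
        """Move the working copy to a changeset that is already in the store."""
        target = self.tip if node is None else node
        if target != NULL_ID and target not in self.changesets:
            raise KeyError(target)
        self.working_parent = target


upstream = ToyRepo()
first = upstream.commit("initial import")

local = ToyRepo()
local.pull(upstream)
local.update()                      # working copy now sits on `first`

second = upstream.commit("fix typo")
local.pull(upstream)                # the new history arrives ...
after_pull = local.working_parent   # ... but the working copy has not moved
local.update()
after_update = local.working_parent
```

The point of the split is visible in the last few lines: after the second pull the new changeset is already stored locally, yet the working copy only moves once update is called.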

Being a Python program, Mercurial can easily include plugins ("extensions" in Mercurial-speak) and makes good use of this opportunity. Thus, a couple of extensions already ship with the default distribution, and additional ones can just as well make use of the flexible Python API that Mercurial provides. There are neat extensions like access control lists, Bugzilla integration, or a totally awesome quilt-like patch management solution called Mercurial queues. (I really need to try quilt for managing my filefield patches.)

Branching, tagging and file structure

Tags in Mercurial are really simple: a tag is just a string that is assigned to some revision, and all existing tag/revision associations are stored in a plain text file that is itself under revision control. Branches, especially working branches, are usually created by cloning the whole repository and pulling the changes back. Originally it seemed to me that "normal" branches were not supported in favor of full repository clones, but it turns out that they are just less visible, equally capable, and called "named branches" (as opposed to plain diverging commit trees, which regularly occur when pulling from other repositories).
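For illustration, here is a small Python sketch that reads such a tag file. As far as I can tell the file is called .hgtags and holds one "changeset ID, space, tag name" pair per line, with later lines overriding earlier ones; the function name and the sample IDs below are made up, and the sketch ignores anything beyond that basic format.

```python
def parse_hgtags(text):
    """Map tag name -> changeset ID from .hgtags-style content.

    Each non-empty line is assumed to be "<40-hex changeset ID> <tag name>".
    When a tag appears more than once, the last line wins.
    """
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, so tag names may contain spaces.
        node, _, name = line.partition(" ")
        if len(node) != 40 or not name:
            continue  # skip malformed lines in this sketch
        tags[name] = node
    return tags


# Hypothetical sample content; the IDs are placeholders, not real changesets.
SAMPLE = (
    "a" * 40 + " release-1.0\n"
    + "b" * 40 + " release-1.1\n"
    + "\n"
    + "c" * 40 + " release-1.0\n"
)

sample_tags = parse_hgtags(SAMPLE)
```

Because the file is just another versioned file, moving a tag amounts to committing a new line for it, which is why the last entry for a name takes precedence here.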

With all kinds of tagging and branching supported, the file layout is entirely up to the project creator. Partial checkouts like in CVS, Subversion or Bazaar are not (yet?) supported, so here's one more VCS (in addition to git) that would require a separate repository for each project.

Authentication and hook scripts

Remote repository access is done through the built-in 'hg serve', over SSH, or by means of a CGI script running on Apache/lighttpd/whatever (that is, over HTTP or HTTPS). For central repositories, you'll likely use one of the latter two and rely on their authentication mechanisms. Once users are authenticated, they can comfortably be granted or denied access by means of the access control lists extension.

Hooks can take two forms: standard shell scripts, or more tightly integrated Mercurial extensions written in Python. As is the case with git, the possible hooks are a bit more diverse than for centralized systems such as CVS or SVN, but they should nevertheless be able to handle drupal.org-like commit restrictions. (I'm not perfectly sure about branches, but then we're not evaluating a drupal.org switch here. Such restrictions would be optional for any backend to implement anyway.)
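To give an idea of the Python variant, here is a rough sketch of an in-process hook that rejects incoming changesets with a missing or very short commit message. The policy, the MIN_SUMMARY_LENGTH constant and the problem_with helper are invented for illustration and are not drupal.org's actual rules. I'm assuming the usual hook signature (ui, repo, hooktype, plus keyword arguments such as node), a true return value meaning "reject", and byte-string messages as in current Mercurial releases; check this against the version you run.

```python
# Hypothetical wiring in the server-side hgrc (path and name are made up):
#   [hooks]
#   pretxnchangegroup.checkmsg = python:/path/to/checkmsg.py:hook

MIN_SUMMARY_LENGTH = 10  # hypothetical policy, not an actual drupal.org rule


def problem_with(description):
    """Return a byte-string complaint about a commit message, or None if it is fine."""
    lines = description.strip().splitlines()
    if not lines:
        return b"empty commit message"
    if len(lines[0]) < MIN_SUMMARY_LENGTH:
        return b"summary line is too short"
    return None


def hook(ui, repo, hooktype, node=None, **kwargs):
    """Check every incoming changeset; returning True rejects the whole group."""
    if node is None:
        return False
    # `node` is assumed to identify the first incoming changeset; everything
    # from its revision number up to the end of the repository is new.
    first = repo[node].rev()
    for rev in range(first, len(repo)):
        problem = problem_with(repo[rev].description())
        if problem:
            ui.warn(b"rejecting changeset %d: %s\n" % (rev, problem))
            return True
    return False
```

Keeping the policy in a plain function like problem_with means it can be tried out without a repository at all, and the same check could be reused from a shell-script hook if that turns out to be easier to deploy.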

In other news

Enough research for today, I'm heading off for the local party. Heh.

Comments

Implications of cross-repository branching

jpetso's picture

I'm currently asking myself whether it would be appropriate to promote branched repositories (which are obviously common in distributed revision control) to first-class citizens, as a separate entity in addition to (but not replacing) traditional in-repository branches. We could then have something between this thing on launchpad.net and this thing ("gitweb-style"), so that personal branches per project would be possible, with the branch creator rather than the regular project maintainer holding all the access rights to the branch repository.

Of course, this is not a core requirement, and it can be done without. I would still consider support for cross-repository branches a major coolness, and I'll keep it in the back of my mind when fleshing out the API.
