A peek at Subversion

jpetso

My project definition says that I still have to document the specifics of SVN and CVS (git has been taken care of). Coming from the KDE camp, I'm already acquainted with Subversion to a certain degree, as the whole huge KDE repository switched from CVS in May 2005. (No one, including the SVN admin, regretted the switch.)

Subversion (SVN) came up as a replacement for CVS - motto: "CVS done right" - and incorporates its centralized development methodology and its straightforward workflow, while improving on its weak parts. Compared to CVS, Subversion features good stuff like atomic commits, renaming, diffs that work without contacting the server, symlink support and version-controlled directories, to name the most popular ones. It fulfills its role as a drop-in replacement brilliantly and has more or less superseded CVS, at least for newly installed repositories. (That, of course, doesn't stop Linus Torvalds from claiming that centralized development methodologies are outright wrong.)

For two or three years, until the very recent rise of distributed version control systems (git, Mercurial, Bazaar), Subversion was the single dominant open source version control system, and many major open source projects now operate on SVN repositories: for example, the Apache Software Foundation, KDE, GNOME (which only switched from CVS a few months ago), Python, Samba, Mono, and also beloved sister projects like Joomla and Plone.

But let's just skim through the important points.

Basics

With Subversion, you can do everything that you can do in CVS, even if some things work slightly differently. Every CVS user will feel at home immediately: 'svn command' (built-in help: 'svn help command') does all the standard stuff like checkout, add, delete, diff, status, annotate ("blame") and commit. Copy and move/rename operations preserve history and work not only on files, but also on directories. Subversion stores a pristine copy of the checked-out revision in a hidden ".svn" directory in each directory of your working copy, which makes it possible to do the diff and revert-to-checked-out-revision operations locally. There's also an 'svn merge', but unlike git or other VCSs, it doesn't preserve the history of the merged branch; you need a third-party tool called svnmerge for that.
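To make the command set above a bit more concrete, here is a minimal sketch of a typical working session, written as Python argument lists that are only printed, not executed. The repository URL, file names and commit message are made up for illustration; the subcommands themselves are the standard ones mentioned above.

```python
import shlex

# Hypothetical repository URL and file names, used only for illustration.
REPO_URL = "https://svn.example.org/repo/trunk"

# One plausible edit cycle. Everything after the checkout would be run
# from inside the working copy directory "wc".
SESSION = [
    ["svn", "checkout", REPO_URL, "wc"],
    ["svn", "add", "newfile.txt"],
    ["svn", "move", "old_dir", "new_dir"],   # renames keep history, also for directories
    ["svn", "status"],
    ["svn", "diff"],                         # answered from the local .svn copy
    ["svn", "blame", "README.txt"],
    ["svn", "revert", "README.txt"],         # also answered locally
    ["svn", "commit", "-m", "Rename old_dir and add newfile.txt"],
]


def render(session):
    """Turn argument lists into copy-and-pasteable shell command lines."""
    return [shlex.join(argv) for argv in session]


for line in render(SESSION):
    print(line)
```

Keeping the commands as argument lists means the same data could later be handed to subprocess.run() instead of being printed.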

Subversion uses sequential numbers to identify commits (1, 2, 3, ..., 147062), and any file that is affected by a commit can be uniquely identified by the same number and the filename. This means that you can revert to previous revisions by specifying the number, whereas in CVS you need to specify a date/time value as there are no repository-wide commit identifiers. In addition to files, directories and file/directory metadata ("properties") are also put under version control.
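As a rough illustration of how revision numbers are used on the command line, the following sketch builds (but does not run) three common invocations. The helper names are hypothetical and exist only for this example; the svn options used (-r for a revision, -c with a negative number for a reverse merge) are the regular ones.

```python
def update_to(revision, path="."):
    """Bring the working copy (or a single path) to the state of `revision`."""
    return ["svn", "update", "-r", str(revision), path]


def cat_at(revision, path):
    """Print the contents of `path` as it was in `revision`."""
    return ["svn", "cat", "-r", str(revision), path]


def undo_commit(revision, path="."):
    """Reverse-merge one commit into the working copy.

    The result still has to be committed, which creates a new revision;
    history itself is never rewritten.
    """
    return ["svn", "merge", "-c", "-%d" % revision, path]


print(" ".join(update_to(147062)))
print(" ".join(cat_at(147061, "README.txt")))
print(" ".join(undo_commit(147062)))
```

Because the number identifies the state of the whole repository, the same revision works for any path, which is exactly what CVS lacks.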

Branching, tagging, and file structure

This is where Subversion really differs from most other revision control systems. Internally, Subversion hasn't got any notion of branches or tags at all; only convention makes branches and tags out of regular directories. The common convention is that the main branch (HEAD) goes into the /trunk directory, branches go into /branches/*, and /tags/* is where tags reside. So instead of doing 'cvs tag DRUPAL-5--1-1' in your DRUPAL-5 branch, you'd copy your module directory from /branches/drupal-5/* to /tags/drupal-5.x-1.1/*.
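The tagging step just described boils down to a single server-side 'svn copy'. Here is a small sketch that builds that command for the conventional layout; the repository root URL and the function name are assumptions for illustration, while the branch and tag names are the ones from the example above.

```python
# Hypothetical repository root; replace with a real one.
REPO_ROOT = "https://svn.example.org/repo"


def tag_command(branch, tag, message, root=REPO_ROOT):
    """Build the 'svn copy' call that turns a branch directory into a tag.

    Both arguments are URLs, so the copy happens on the server and
    creates one new revision without touching any working copy.
    """
    source = "%s/branches/%s" % (root, branch)
    target = "%s/tags/%s" % (root, tag)
    return ["svn", "copy", source, target, "-m", message]


cmd = tag_command("drupal-5", "drupal-5.x-1.1", "Tagging the 5.x-1.1 release")
print(cmd)
```

Nothing stops anyone from committing to a directory under /tags afterwards; keeping tags read-only is again a matter of convention or of access control.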

For our API module, this means that you can't expect branches and tags to be in the same location. (Even without SVN, you still couldn't rely on this, as long as revision control systems that can do renames are supported.)
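One way to cope with this is to treat the directory layout as configuration rather than as a fixed rule. The sketch below is not the actual API design, just an illustration of the idea: the Layout and Location types and the classify() function are hypothetical names, and only single-segment prefixes such as "trunk" or "tags" are handled.

```python
from typing import NamedTuple, Optional


class Layout(NamedTuple):
    """Top-level directory names that a repository uses by convention."""
    trunk: str = "trunk"
    branches: str = "branches"
    tags: str = "tags"


class Location(NamedTuple):
    kind: str             # "trunk", "branch", "tag" or "unknown"
    name: Optional[str]   # branch or tag name, None for trunk/unknown
    rest: str             # path below the trunk/branch/tag directory


def classify(path, layout=Layout()):
    """Decide whether a repository path lies in trunk, a branch or a tag."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return Location("unknown", None, "")
    head, tail = parts[0], parts[1:]
    if head == layout.trunk:
        return Location("trunk", None, "/".join(tail))
    for kind, prefix in (("branch", layout.branches), ("tag", layout.tags)):
        if head == prefix and tail:
            return Location(kind, tail[0], "/".join(tail[1:]))
    return Location("unknown", None, "/".join(parts))


print(classify("/branches/drupal-5/api.module"))
print(classify("/releases/1.0/api.module", Layout(tags="releases")))
```

A repository that keeps its tags somewhere else only needs a different Layout value, not different code.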

Authentication and hook scripts

Subversion provides two methods for repository access: HTTP(S)/WebDAV (http://, https://) through Apache modules, and/or a specialized SVN protocol (svn://) served by the 'svnserve' executable. In the majority of cases, the SVN protocol is tunneled over SSH, which gives you svn+ssh:// URLs. Most SVN installations provide both methods, and users can choose their preferred one. The WebDAV approach is the most capable one, and additionally offers an optional Apache module (mod_authz_svn) that can do per-directory access control without making use of hook scripts.
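Since the access method is visible in the repository URL, a tool can tell them apart just by looking at the scheme. The following is a minimal sketch of that idea; the function name and the description strings are made up, and the host names in the usage lines are placeholders.

```python
from urllib.parse import urlsplit

# URL scheme -> how the repository is reached.
ACCESS_METHODS = {
    "http": "WebDAV through Apache",
    "https": "WebDAV through Apache, encrypted with SSL",
    "svn": "svnserve, plain SVN protocol",
    "svn+ssh": "svnserve, SVN protocol tunneled over SSH",
}


def access_method(url):
    """Return a description of the access method implied by a repository URL."""
    scheme = urlsplit(url).scheme
    try:
        return ACCESS_METHODS[scheme]
    except KeyError:
        raise ValueError("not a recognized remote repository URL: %r" % url)


print(access_method("svn+ssh://dev@svn.example.org/repo/trunk"))
print(access_method("https://svn.example.org/repo/trunk"))
```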

That makes for three potential places for configuring authentication: Apache settings (basic HTTP authentication or SSL), SSH settings, and/or svnserve's built-in authentication settings. The latter is easy to set up and doesn't require real user accounts to be created, but stores passwords as plaintext in a password file next to the configuration file.
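To show what the plaintext issue looks like in practice, here is a sketch that reads an svnserve-style password file with Python's configparser. The file contents are embedded as a string, and the user names and passwords are invented; a real setup would point the password-db option in svnserve.conf at such a file.

```python
import configparser

# Invented example of an svnserve password file: one "[users]" section
# with "name = password" entries, passwords stored in the clear.
PASSWD_FILE = """\
[users]
alice = alices-secret
Bob = bobs-secret
"""


def load_users(text):
    """Parse an svnserve-style password file into a {user: password} dict."""
    parser = configparser.ConfigParser()
    parser.optionxform = str   # keep user names case-sensitive
    parser.read_string(text)
    return dict(parser["users"])


users = load_users(PASSWD_FILE)
print(sorted(users))
```

Anyone who can read this file can read every password, so file permissions on the server matter a lot with this method.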

As for hook scripts, SVN essentially offers three hooks: start-commit for determining if the committer has access rights at all, pre-commit for implementing finer-grained access control of any kind, and post-commit for notifying the outside world of the commit that just happened. There's the 'svnlook' executable that exposes all the repository contents and properties by inspecting the server-side repository, and a couple of helper scripts for access control, commit mails and the like. It's indeed possible to implement drupal.org's stringent access control policies on SVN, as aclight has impressively shown.
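A pre-commit hook along these lines could look roughly like the sketch below. It is not aclight's implementation and not drupal.org's policy: the WRITE_ACCESS table, the user names and the function names are all invented for illustration. What it relies on is that the hook is called with the repository path and the transaction name, that 'svnlook author' and 'svnlook changed' accept '-t TXN', and that a non-zero exit status rejects the commit.

```python
import subprocess
import sys

# Invented policy: path prefixes each committer may write to.
WRITE_ACCESS = {
    "alice": ["trunk/", "branches/"],
    "bob": ["trunk/modules/bobs_module/"],
}


def changed_paths(svnlook_output):
    """Extract paths from 'svnlook changed' output.

    Each line starts with status flags; the path begins at column 4.
    """
    return [line[4:] for line in svnlook_output.splitlines() if line.strip()]


def denied_paths(author, paths, acl=WRITE_ACCESS):
    """Return the paths that `author` is not allowed to modify."""
    allowed = acl.get(author, [])
    return [p for p in paths if not any(p.startswith(a) for a in allowed)]


def svnlook(subcommand, repos, txn):
    """Run svnlook against the not-yet-committed transaction."""
    result = subprocess.run(["svnlook", subcommand, "-t", txn, repos],
                            capture_output=True, text=True, check=True)
    return result.stdout


def main(argv):
    repos, txn = argv[1], argv[2]
    author = svnlook("author", repos, txn).strip()
    denied = denied_paths(author, changed_paths(svnlook("changed", repos, txn)))
    if denied:
        # stderr is passed back to the committing client.
        print("%s may not commit to: %s" % (author, ", ".join(denied)),
              file=sys.stderr)
        return 1
    return 0
```

Installed as hooks/pre-commit in the repository, the script would end with sys.exit(main(sys.argv)). Keeping the parsing and the policy check in separate functions makes them testable without a repository.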

In other news

I think I've finally got a proper approach to branches and tags in the API, and I'll commit it as soon as the new name of the Revision Control API is fixed. The name change is a consequence of me developing in DRUPAL-5 as opposed to HEAD, which is not the best idea according to Derek, so we decided to start over from scratch and create the module directory anew. rcs.module is maybe not the best name for the module, because there's already a (really old-fashioned, but still) revision control system out there which is named RCS. After a very fruitful discussion on the development list, it seems that the vast majority of those developers who weighed in on the discussion prefer versioncontrol.module. That may be the way to go.

Upcoming work

Up next is a short write-up on CVS, similar to this one. Also, I still want to investigate Mercurial and Bazaar, although it seems that I might do that while already drafting the API in a more detailed way. Both items from the last report have probably been resolved: I've got a first attempt at the branches/tags issue on my system, and I'm reasonably confident about the way that directories will be handled. So when CVS has been covered, it's time to get to the real thing and have a prototype of the VCS API module.

Comments

versioncontrol is fine with me

dww

sorry for the delayed reply, i was out of town all weekend, and buried with work since i got back. after thinking about this for a while, it seems everyone has converged on versioncontrol.module, so please proceed with that.

thanks,
-derek

p.s. nice work on the other write-ups. this is very useful stuff.

VCS API is a good idea

lopolencastredealmeida

Check SVK at http://svk.bestpractical.com/view/HomePage

The VCS API is a great idea since it will allow every project's team to choose which VCS they want to use.

Best,
Lopo
