Evaluation discussion for how to move Drupal.org off of CVS


Conclusion

Git has been selected as our new VCS!

http://groups.drupal.org/node/48818?page=2#comment-133893
http://sf2010.drupal.org/conference/sessions/exodus-leading-drupal-out-cvs
http://groups.drupal.org/drupal-org-git-migration-team

Debate archived below:

We actually had quite a productive IRC discussion tonight (no, that is shockingly not an oxymoron!) about the general migration to distributed development, the community fragmentation that it causes, and how the tools on drupal.org might be improved as a way to combat this.

After spirited discussion, we brainstormed this "hit-list" of things that need to happen in order for Drupal.org to ever move to a distributed version control system. These are pretty loose, but there are quite a few actionable tasks that came out of it, which folks interested in scratching this particular itch could start picking off.

If you want to help, please clearly mark one or more of these things with your name (or the name of you and your buddies, make a game of it!), and comment below with progress. Or, help flesh this out a bit more by providing links, things we haven't thought of yet, etc.


Summary of Contenders

We've narrowed the search to Bazaar and Git. For the purposes of our community, the two appear to be basically functionally equivalent (yes?). Here's how they stack up on the other merits:

Bazaar

  • Provides the benefits of a distributed version control system, while also supporting a traditional centralized workflow, which will make transition easier for people. This advantage should not be under-stated.
  • Existing drupal.org infrastructure team already has knowledge of bzr, and has agreed internally to move drupal.org to it.
  • Bazaar is headed up by Canonical, which is likely not going anywhere anytime soon. Good for future-proofing.

Git

  • Much higher percentage of Drupal community members have familiarity with it, which will help ease transition headaches, which may be considerable since there are no familiar "anchors" with Git as there are with Bzr. This advantage should not be under-stated.
  • Existing work has already started to integrate Project* modules with Git: http://drupal.org/project/versioncontrol_git
  • Metrics show a general trend towards Git being the leader in the distributed VCS space with a huge, thriving community. Good for future-proofing.

Bzr vs. Git Smackdown at Drupalcon SF

The best way to figure out which of these two options would be the best fit for our community is to actually see with our own eyes. Drupalcon SF offers a tremendous opportunity for this, with the added advantage of having all the major players in place to lead sessions and make a final decision. Therefore, what we'd like to organize are Git/Bzr "info sessions" where people knowledgeable in the technology sit down with volunteers and observe and take notes, so we can try and get a sense of

So please help by filling out this stuff:

  • Drupalcon SF attendees who are willing and able to lead these kinds of "info/tutorial" sessions:
    • Bzr: Senpai, matt2000, NAME
    • Git: sdboyer, scor, ceardach, sirkitree, NAME
  • Drupalcon SF attendees who are willing and able to be Guinea pigs, mainly those with knowledge only of CVS (or CVS and Subversion): <-- yes?
    • Core developers: webchick, beeradb, NAME, NAME
    • Module developers: webchick, NAME, NAME, NAME
    • Theme developers: NAME, NAME, NAME
    • Site builders who manage deployments through version control: webchick, beeradb, NAME, NAME
    • Wtf is a patch? What's version control?: Elijah Lynn, NAME, NAME
  • Drupalcon SF attendees who are willing and able to act as "clipboard" folks, gathering data during these sessions:
    • webchick, NAME, NAME, NAME

Then we need to brainstorm stuff like:

  • What criteria are the clipboard folks looking for?
  • What documentation needs to be prepped ahead of time?
  • Who's going to prep it?
  • What infrastructure needs to be prepared to perform this testing?
  • Who's going to prep it?
  • What sort of data are we hoping to gather from this session?
  • What sort of infrastructure do we need to capture it?
  • Who's going to prep it?
  • Other...?

Why are we doing this?

Things about CVS that currently suck:

  • It's an archaic system that no one in their right mind chooses as a version control system today, and thus it's a big barrier to entry because new contributors must learn it "specially" in order to contribute to drupal.org
  • Things that should be really simple (submitting changes that add/remove files, renaming files, etc.) are absolutely horrendous in CVS.
  • It's really difficult to "chase HEAD" since CVS merge tools totally suck, making contributing major changes to Drupal core incredibly painful when there are constant re-rolls due to whitespace changes in other places.
  • It's difficult for patch authors to have discipline to keep changes to one "context", resulting often-times in "mega-patches" that are impossible to review.
  • We lose the incremental commit history that happened on said "mega-patches."
  • It's difficult to share experimental code and encourage others to improve it. <-- this is a d.o issue, not a CVS issue
  • The credit on commits (as in cvs annotate credit) currently goes to the committers of the code, not the people who actually wrote (and reviewed) them.
  • It's impossible with CVS to do much of anything while offline, because even a 'diff' operation needs to "phone home" to the parent server. Lots of us are on planes a lot of the time.
  • In general, the lack of modernization of our contribution tools is causing people to move to other places like Launchpad and GitHub, thus fragmenting both our community, and the strength of Drupal.org as the central hub to find any and all Drupal stuff. <-- both a d.o issue and a CVS issue.
  • Security in CVS sucks. pserver's encryption is sub-standard, and only a small handful of people (mainly core maintainers and infra team) use public-key encryption.

What don't we want to lose?

  • Drupal.org being the central collaboration hub for Drupal core and contributed projects: code, reviews, and discussions take place here, where the entire community can participate and learn from them.
  • The "incremental changes posted to an issue queue where they then get peer review" workflow allows people who are non-developers to participate, and for reviewers/maintainers to see incremental progress instead of the whole thing at once.
  • Keeping only one canonical "target" for each project in all of Drupal, for the entire community to buzz around. This makes peer reviews much easier, since the way to test changes is always the same, and keeps everyone collaborating on the same stuff, not a fork of a fork of a fork of a....

Things that we've ruled out

  • Moving to $better_centralised_VCS, such as Subversion. The amount of work required to pull this off is substantial; we need to make sure that we're set for another 10 years, and distributed VCS is clearly the way of the future.
  • Any non-free (as in beer and freedom) VCS.
  • Pretty much anything but Git, Bzr, and Hg at this point, unless someone can make an extremely compelling case for something else.
  • Mercurial. There just aren't enough existing community members who know it, nor do the folks on the infrastructure team have a background there. Sorry, Hg!

Concrete to-dos for moving Drupal.org to $distributed_VCS

Documentation/Training

HEY! You there! Put your name here. :D


Git: yhager, mike booth, sdboyer, mig5, marvil07, lut4rp, scor, ceardach, reglogge, kyle_mathews, bdragon, asimov, hugowetterberg, tobiassjosten, fago, DamienMcKenna, Frando, Will White, Jeff Miccolis, psynaptic, anarcat (willing to help with CVS -> git migration), slantview, EclipseGc, VoxPelli, gordon, schoobidoo, Pisco, Sean Bannister, mikl, jrglasgow, stepmoz, Jeremy, seutje, corni, ben-agaric, sirkitree, chachasikes NAME
Bzr: David Strauss, Peter Wolanin, Narayan Newton, Josh Koenig, bdragon (again), chx, NeverGone, a_c_m, Nuno Veloso, cha0s, Heine, Garrett Albright, Emma Jane Hogbin, TBarregren, dixon_, frjo, ximo, matt2000 NAME
Hg: Heine, mikey_p, mcrittenden, NAME

  • Compile a list of our current "use cases" around version control: (DONE?)
    • Collaborative patch authoring: Huge changes like form API, field API, core themes, etc. that are more-or-less equally authored by multiple people.
    • Primary patch authors: There's primarily one person who's "in charge" of a patch, but is taking contributions from other authors.
    • Patch contributors: Contribute more minor things like code-standard compliance, spelling corrections, etc. to a primary patch author.
    • Patch reviewers: Eyeball the code, take a change from the issue queue and ensure that it's properly working, post back their results.
    • Contrib module developers: Handle primary development of module, pull in patches from others, maintain branches and releases.
    • Contrib theme developers: Handle primary development of theme, pull in patches from others, maintain branches and releases.
    • Co-maintainers: Help maintain a module/theme, but often checking first with the primary author before changes are made.
    • Users: Run their Drupal site on code directly checked out from d.o repositories, they update to the latest release tag and rebase patches on top of that.
    • ____ (other use cases?)

Comparisons

Compare Git/Bzr/Hg on the following items (see also DVCS-specific pages for Git: http://groups.drupal.org/node/48843, Bazaar: http://groups.drupal.org/node/48848, and Mercurial: http://groups.drupal.org/node/48853 for more details). Comparison charts git-hg-bzr: InfoQ dvcs-guide, May 2008

(Note: Some of these might be "Well all of them do that" and that's cool, then just mention it there; I was trying to pull from further down the wiki.)

  • Community: How big is the pool of developers working on extending and maintaining $vcs? How's their IRC channel for asking questions? Do we have a contact who would be willing to help us with the technical side of the migration?
  • Prominent users of the technology: What are other open source projects or other big players using $vcs?
    • Git: Linux Kernel, X.org, Perl, Ruby on Rails, jQuery, Debian, Gnome, Fedora, QT, KDE, Android, VLC, Wine, Facebook, MongoDB (wikipedia)
    • Bzr: Launchpad, Mozilla, MySQL (and derivatives like MariaDB), emacs, Squid, Mailman, Ubuntu packaging, Drizzle, Debian Apt (wikipedia) (official)
    • Hg: Python, Mozilla, OpenSolaris, Java/OpenJDK, Adium, Google Code (wikipedia)
  • Features: What cool things does $vcs offer that neither of the others do?
    • Git: Interactive rebase (see also here): hack on several things at once, commit to a branch on your local machine whenever you want in random order, then reorder and edit the changes into a smaller (or larger) number of logical, sensible patches before sending the results to anybody else. No need to plan for this in advance; no extra plugins required. Interactive add and commit allow committing only chunks of a file. Supports cherry-picking.
    • Bzr: Why switch to bazaar? bzr also supports a rebase functionality, although it requires a plugin. Also, I believe bzr does not support interactive rebase as does git, but perhaps someone can fill that information in?
    • Hg: Mercurial has a large number of available extensions and includes the excellent MQ or Mercurial Queues which excels at handling dependent patches (similar to stacked git, or quilt) as well as an included RebaseExtension with a collapse option similar to common usage of git rebase
  • Use cases: How would $vcs specifically help/hurt/be neutral towards the use-cases defined above that the Drupal community has?
    • Git: Webchick provides a possible scenario that could be used to handle the use cases (for Git, Bazaar and Mercurial). Git also allows for keeping the current workflow intact for a period of time to ease the transition.
    • Bzr: Bazaar allows module maintainers to use the familiar checkout/update/commit workflow popularized by CVS and Subversion while providing the standard distributed capabilities as an option. This allows Drupal to adopt a single VCS tool while lowering barriers to contribution.
    • Hg: Mercurial emphasizes simple commands that map nearly 1-1 with SVN or CVS. No staging area is used, and most commands give helpful feedback about next steps, such as 'hg merge' or 'hg pull'. Most workflows do not require any switches on commands for common tasks.
  • Access control features: How well can we 'lock down' access in each of the solutions? Can we do things like maintainer/co-maintainer relationships? Can we block commits that contain "black-listed" stuff, such as security holes and LICENSE.txt?
    • Git: Basic and global access control can be implemented with using different protocols (ssh, http/https, git) for accessing a central repo. More finegrained access control (down to controlling access to directories on a per-user basis) is also possible, through "hooks" (executables that can be written in any language).
    • Bzr: Out of the box, Bazaar uses POSIX permissions and ACLs on a branch-by-branch basis, which maps well to Drupal's maintainership model. Bazaar supports Python-based pre- and post-commit hooks.
    • Hg: Utilizes basic authentication using the host protocol, i.e. ssh, or http/https auth. Allows custom scripts to be implemented for many hooks, using Python.
  • Migration path from CVS: How do the migration tools fare with each option? Links to tools that we can use to help make this easier?
  • Speed/Size: How fast is it? How much space does it take up relative to CVS?
  • Scriptability: From both our (drupal.org) point of view, and also from a contributor's point of view, how much flexibility is there in making the tool do other things? Links to resources? (For maximum in-PHP scriptability, we're going to want an analogue to svnlib - fortunately, not a very difficult task)
    • Git: Git exposes the entirety of its guts for use in system calls (which is why it comes with some 140+ commands). Many of these commands are designed to interact with each other via stdin/stdout. In other words, it's built to be scripted on. Some people use git to manage deployment of dev-staging-live servers Example1 or "versionize" an Aegir-Drush-make workflow Example2
    • Bzr: Bzr has a robust API (in Python) that can be easily extended with highly-portable, easily installed and distributed plugins.
    • Hg: Mercurial has a robust number of hooks (http://mercurial.selenic.com/wiki/Hook) that support scripting with Python.
  • Credit tracking: How well does $vcs do with tracking who gets credit for a change, where that change came from (issue queue #) etc.

    • Git: Each commit contains the name, email and date of both the author of the patch and who committed it, and it also allows for a 'Signed-off-by' field (see example). Issue numbers should go in the commit message, and can be enforced through commit hooks.
    • Bzr: Bazaar associates every commit with a name and email address-type field. Bazaar treats issue numbers as first-class fields for commits. For the ultimate in attribution, Bazaar supports optional or required digital signatures on a branch-by-branch basis.
  • ____ (Other criteria?)
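
As one illustration of the hook-based enforcement mentioned for Git above (issue numbers in commit messages), here is a minimal sketch of a commit-msg hook in Python. Git runs this hook with the path to the commit message file as its first argument, and a non-zero exit status rejects the commit. The "#" plus digits issue pattern and the function names are assumptions made up for this example, not an agreed drupal.org convention.

```python
#!/usr/bin/env python3
"""Sketch of a Git commit-msg hook that requires an issue reference."""
import os
import re
import sys

# Assumption: an issue reference is "#" followed by digits, e.g. "(#707526)".
ISSUE_PATTERN = re.compile(r"#\d+\b")


def has_issue_reference(message):
    """Return True if any non-comment line of the message cites an issue."""
    for line in message.splitlines():
        if line.startswith("#"):
            # Lines starting with "#" are comments in Git's message template.
            continue
        if ISSUE_PATTERN.search(line):
            return True
    return False


def main(argv):
    """Read the message file Git passes in and return an exit status."""
    if len(argv) < 2:
        print("usage: commit-msg <message-file>", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8") as handle:
        message = handle.read()
    if has_issue_reference(message):
        return 0
    print("Commit rejected: please cite an issue number, e.g. (#707526).",
          file=sys.stderr)
    return 1


# Only act as a hook when installed under the name Git expects.
if __name__ == "__main__" and os.path.basename(sys.argv[0]) == "commit-msg":
    sys.exit(main(sys.argv))
```

Saved as an executable file at .git/hooks/commit-msg, this would only police a contributor's local clone; enforcing the same rule on a central repository would need a server-side hook instead, which is a separate design decision.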

Comparison of issue trackers on other Open Source projects

Note: While these statistics are interesting, note that a remotely-hosted service such as GitHub is off the table, because it violates the #1 thing we don't want to lose: keeping drupal.org the central collaboration hub for Drupal.

Project | Repo | Bug tracker | Notes
Linux kernel | gitweb (http://git.kernel.org/) | bugzilla (http://bugzilla.kernel.org/) | mostly patch + email based workflow; http://www.wlug.org.nz/KernelDevelopmentWithGit; http://lwn.net/Articles/160191/; http://kernelnewbies.org/UpstreamMerge
Ruby on Rails | github | lighthouse | patch-based workflow; example ticket with patch; link to related changeset
JQuery | github | trac | merged fork example ticket; merged patch example ticket
KDE | gitorious (migration in progress) | bugzilla | merged fork example ticket; merged patch example ticket; KDE git migration docs; KDE git tutorial
GNOME | cgit | bugzilla |
CAKEPHP | github | lighthouse | Bakery
SproutCore | github | tasks | tasks source
MySQL | launchpad | bugs.php.net variant | mailing list/patch based; example ticket linking to commits; commits mailing list
Mailman | launchpad | launchpad | both patches and branches; ticket example with patch

Drupal.org integration

(This part still needs major exploration work done, to determine what exactly we want, and what work is actually involved. It's also the thing that's going to hold everything else up, so the sooner someone jumps on this the better.)

"Phase 1" probably just looks like keeping our existing workflows in place (i.e. sharing changes via patches in the issue queue, hosting only the "canonical" repository for each project), and replacing CVS with $vcs. This requires research into the following:

  • Evaluate feasibility of moving off of Project* modules in favour of $vcs issue tracker/source code integration tool.

    • Gitorious (Git)
      • Pros: __________
      • Cons: ____________
    • Launchpad (Bzr)
      • Pros: __________
      • Cons: ____________
    • I posted a comparison that wouldn't fit here in a comment below. --David Strauss
    • _________ Other? Neutral?
      • Pros: __________
      • Cons: ____________
    • Existing infrastructure that would need to be modified to work with $vcs tool
      • Testing bot
      • _____ (lots and lots of things, I'm sure...)
    • Bad stuff
      • We'd end up forcing our community through two significant learning curves at once (new VCS, new project management tools), which should definitely not be under-stated.
      • In addition to the already significant documentation requirements for simply the VCS move, we'd probably sextuple those having to re-write all of our existing documentation that makes reference to how to use the issue queue, project management tools, etc. (otoh, we could probably externally link to a lot of it)
      • _____ (lots and lots of things, I'm sure...)
    • Good stuff
      • Not having to port project* module every time we want to upgrade drupal.org. :P
      • _____ (lots and lots of things, I'm sure...)
  • Identify and document places where in our current scripts, etc. there are hard-coded assumptions about CVS. For each item, identify the piece of code currently responsible for performing this job, determine if there are CVS assumptions, and if so create a set of issues for discussing/tracking these, and link them here.

  • Determine logistics for how $vcs replacing CVS actually looks:
    • What does the new directory structure look like in core/contrib?
    • How do we manage "official" releases (tags) vs. development releases (branches)?
    • How do we facilitate (or don't we, in phase 1) "spooning" of code?
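
For the "hard-coded assumptions about CVS" audit above, a first pass could be as dumb as grepping the scripts. Here is a rough sketch in Python, assuming the code lives in a directory tree on disk. The search patterns and the list of file extensions are guesses for illustration only; the real lists would come from whoever knows the actual infrastructure scripts.

```python
import os
import re

# Assumption: illustrative guesses at what a CVS dependency looks like.
# Over-reporting is fine here; a human reviews every hit anyway.
CVS_PATTERNS = re.compile(r"cvs|pserver|\$Id[:$]", re.IGNORECASE)

# Assumption: file types worth scanning; adjust to the real code base.
DEFAULT_EXTENSIONS = (".php", ".module", ".inc", ".install", ".sh")


def find_cvs_references(root, extensions=DEFAULT_EXTENSIONS):
    """Yield (path, line_number, line) for each line that mentions CVS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip CVS's own metadata directories so they do not drown the results.
        dirnames[:] = [d for d in dirnames if d != "CVS"]
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if CVS_PATTERNS.search(line):
                        yield path, number, line.rstrip("\n")
```

Each hit could then become one issue (or one line in an issue) for the tracking list described above.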

Stuff we need to do /after/ we have chosen a $vcs and have the rest of the stuff above in progress

Documentation/Training

  • An equivalent of http://drupal.org/handbook/cvs/quickstart for $vcs
  • FAQs/Troubleshooting/OMG I BROKE IT HELP!!! docs
  • Screencasts on how to use it.
  • Scheduling of "info sessions" on IRC (or Skype, or whatever) for $vcs brigade to train existing contributors on the new system.

Future feature requests

  • When patches are uploaded to d.o, do automatic generation of tarballs with project + patch applied already, to facilitate reviews by non-technical users. (Feature request posted here: http://drupal.org/node/707526)
  • Crazily pimping out drupal.org to take advantage of more advanced $vcs features, e.g. feed from commit logs on forked branches to the issue queue, create and post patch automatically to issue with a button, etc.
  • Exploring alternative core contribution workflows, e.g. "authorized" branches for specific features like fieldable user profiles in D8 core, with someone "deputized" to accept changes.
  • Browser-based editing to files in repo leading to automatic patch creation, for designers + commits via the issue queue/web UI <-- /really/ not sure of this idea....
    • This is really appealing in my mind, since it blows away the barrier to entry for the vcs, so devs can commit easily. ao5357
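
To make the "tarball with project + patch already applied" idea a bit more concrete, here is a minimal sketch in Python. It assumes the system `patch` tool is installed and that patches are rooted at the project directory (-p0, as is typical for `cvs diff` output). All function names are hypothetical, and this is not how the drupal.org packaging scripts work today.

```python
import shutil
import subprocess
import tarfile
import tempfile
from pathlib import Path


def apply_patch(tree, patch_file, strip=0):
    """Apply patch_file inside tree using the system `patch` tool.

    Assumption: -p0 patches; Git-style patches would usually need strip=1.
    """
    # The patch path is made absolute because -d changes directory first.
    subprocess.run(
        ["patch", f"-p{strip}", "-d", str(tree),
         "-i", str(Path(patch_file).resolve())],
        check=True,
    )


def package_tree(tree, out_path, top_level):
    """Write tree to a gzipped tarball with a single top-level folder."""
    with tarfile.open(out_path, "w:gz") as archive:
        archive.add(str(tree), arcname=top_level)
    return out_path


def build_review_tarball(project_dir, patch_file, out_path):
    """Copy the project, patch the copy, and package the patched copy."""
    project_dir = Path(project_dir)
    with tempfile.TemporaryDirectory() as scratch:
        work = Path(scratch) / project_dir.name
        shutil.copytree(project_dir, work)
        apply_patch(work, patch_file)
        return package_tree(work, out_path, project_dir.name)
```

Working on a scratch copy keeps the original checkout untouched, so a patch that fails to apply never leaves a half-patched project behind.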

(Props to at least Benjamin-Melancon, Lizzard, walkah, yhager, JohnAlbin, dbabbage, catch, jensimmons, hefox, and whoever else helped work on this. :))

Edit: I've removed all +1 comments. Please, no more

Comments

I can't edit anymore, everything treated as spam. grrr

adrinux's picture

Was just fleshing this list out but my efforts have now 'been flagged as spam and will not be accepted'. So I'm giving up, which is a pity, because I had a lot more to add. No wonder so little has been added to this.

Can't we get away from the Not-invented-here syndrome and use tools that actually work? Like say http://etherpad.com/
Even google wave is better than this.

Immensely frustrating.

Mollom fail?

lut4rp's picture

I think something's wrong with Mollom. I had to enter the captcha more than 5 times to create the 3 vcs-specific wiki pages.

We'll do some Mollom backend

Dries's picture

We'll do some Mollom backend debugging to see what happened (if anything at all).

For those having problems with Mollom...

webchick's picture

I've unpublished a couple of garbage "-1" comments about Mollom. Please do not pollute this thread with stuff like that; keep comments focused, informative, and on-topic.

If you're having trouble with Mollom, please post over here instead: http://drupal.org/node/684424

/me grins webchick dear, some

sdboyer's picture

/me grins

webchick dear, some of that verbiage looks suspiciously like something I sent you in an email last fall...or is that just my imagination?

EDIT: oh wait. i think credit for including those goes to marvil07.

I'll stop being lazy

haxney's picture

I was one of the students working on the version control integration modules last summer for the Google Summer of Code, but haven't really touched them since the end of that. Life has a nasty habit of getting in the way.

Anyway, I am now using Drupal at my shiny new job (which I got largely due to GSoC, yay synergy!), and I'll be able to spend some time working on the version control stuff. I had gotten a fair amount of work done on versioncontrol_git, but hadn't finished it, as it turned out to be a lot more complicated than SVN. My latest work is here, and I should be able to devote some time to polishing it up over the next few weeks.

I also worked on extending versioncontrol_project to work with multiple branches per project, but that still needs some love as well.

This obviously doesn't have much to do with the community support or training and such, but it is pretty important for the infrastructure side. Feel free to ask me any questions.

Awesome!

webchick's picture

That's great news! :D Thanks for your offer to pitch in!

my tuppence worth

adrinux's picture

When the end of SVK was announced in 2009 I had a good look around at where I was going to move VCS-wise, so here are my thoughts based on that and my ongoing transition from svk to git, plus some direct responses to points listed.

Of the three serious contenders – bzr, git and hg – bzr and git seem to have much more going for them in terms of developer community, user community and documentation. hg also seems to have certain quirks (seems history is pretty much un-modifiable). The choice then is between bzr and git.

There are two points above which bzr addresses:
- As I added above – some effort has been made to make bzr use the same commands as SVN (and thus CVS), making the transition a little easier for some people.
- You can check out part of a repo and keep everything in one big repo if you desire – in other words it supports both a centralised workflow style as well as a distributed one. (in contrast git pretty much forces every project to have its own repo)

Git wins on speed. It's very fast.

GUI apps for working with either are in their infancy but do seem to be in development, there are plenty of ways to get a graphical overview of a repo and its history, but it's back to the command line to actually do any serious work.
- GitX is one of the better GUI Git apps available for OSX: http://gitx.frim.nl/

Can't screw an overwhelming number of current Drupal developers' workflows <-- could this be clarified?
We're not going to make a transition to a new VCS without doing that. Change is inevitable, change is good! Learning new workflows can be time consuming, but it's worth it in the long term. I think that point should be removed from the list above, it's silly, we'd never change anything if that was of prime importance! We're not changing VCS every six months...

Internal support for basically packaging versioned drupal install profiles/distros
This is irrelevant and should be removed. We already have drush/drush_make for that, they're even working on d.org, and can use many VCS backends. Plus distributing make files has the potential to work around licensing issues with third party libraries – we won't be keeping them in a d.org hosted repo so VCS internal packaging is a non-issue.

is it free? might it cost? Free as in beer, AND/OR free as in freedom!
Git is of course famously open and free. Bzr is also open, but development was initially at least driven by Canonical, possibly still is.

Personally I'd say git is the way to go, it seems to offer more flexibility in terms of what we build on top of it, at the cost of having to do the building. There's potential for a code.d.org distro that competes with the likes of git-hub...but I digress.

All that said, @yhager has it right. The info is out there, both git and bzr have plus points. We need a decision on this from the top - Dries basically, nothing ever seems to get real traction without his say so for some reason :) Unfortunately Dries has little experience of distributed VCS and until recently doesn't seem to have viewed that as a problem. Lack of leadership has allowed this issue to flounder for too long. Perhaps he could defer decision to someone else for once? We really don't need any more of these debating point lists, what we need is someone with the power and gumption to decide.

There's a simple choice, and a complex choice here. The simple weighs the cost of moving to a distributed VCS – it will disenfranchise the less capable developers who use GUIs to work with CVS, there simply aren't replacements yet – against the cost of sticking with CVS – more capable developers (dare I say the bulk?) will continue to move away from drupal.org as a place to host their projects, it's already happening and gathering pace.
The more complex choice is between git and bzr. Good luck on that one.

Response...

webchick's picture

"Can't screw an overwhelming number of current Drupal developers' workflows <-- could this be clarified?"
... Learning new workflows can be time consuming, but it's worth it in the long term. I think that point should be removed from the list above, it's silly, we'd never change anything if that was of prime importance! ...

I think instead it should probably be moved down to the Documentation/Training component. Knowing that it's inevitable the workflow is going to be totally different (or is it? can we still emulate it, even in a distributed VCS, to help ease the learning curve?), we probably need a "conceptual cheat-sheet" (or link off to one).

The challenge here is that while probably 20-25% (gut feel, based on interactions in #drupal, Twitter, etc.) of our contributors are the "uber-geek" type, and are probably playing with tools like Git/Bzr in their spare time anyway, the vast majority fall under the "just blindly copy/pasting from the quickstart guide (or clicking around in $GUI) and praying for the best." These folks, by and large, do not grok concepts like branches and repositories, and are not going to be able to make the transition without a great deal of assistance and support. Anything we can do to help cushion the blow will help mitigate either a massive outflux of contributors, or an enormous support burden on our "$vcs brigade."

We need a decision on this from the top - Dries basically, nothing ever seems to get real traction without his say so for some reason :) Unfortunately Dries has little experience of distributed VCS and until recently doesn't seem to have viewed that as a problem. Lack of leadership has allowed this issue to flounder for too long. Perhaps he could defer decision to someone else for once? We really don't need any more of these debating point lists, what we need is someone with the power and gumption to decide.

I 110% disagree. I think that this needs a strong bottom-up, grassroots movement, and that's the point of this page, as a central point of coordination. For too long, we've been sitting around waiting for a decision from on top, and that's not the way things work around here. Someone gets an itch, they build a team of people with similar itches, they all work their ass off and show great progress toward $goal, and Dries goes, "Cool. Let's do it." It doesn't make any sense to me on this to hand down judgment from on high (esp. if Dries's experience in distributed VCS is lacking, which sounds like conjecture to me), because the only way this move is going to work is if we have a HUGE team of enthusiastic people, who are going to roll up their sleeves and do the work. Show that you're actively doing the work, and I'm quite confident that a "blessing" will follow.

Can't we get away from the Not-invented-here syndrome and use tools that actually work? Like say http://etherpad.com/
Even google wave is better than this.

We did originally throw this together in Etherpad, but as a result I have no idea who actually contributed to the document, what those people changed, etc. Plus as far as I know, they're going belly-up anytime here. ;P

At this point, I think we do need an audit trail, so we can see who's actively working on this, and we need to keep the infrastructure team in the loop. So though g.d.o's over-zealous spam filter is annoying, I think it is the right tool for the job.

response to response...

adrinux's picture

I think instead it should probably be moved down to the Documentation/Training component.

True. But I think that's a different point. Documentation is always good :)

The challenge here is that while probably 20-25% (gut feel, based on interactions in #drupal, Twitter, etc.) of our contributors are the "uber-geek" type, and are probably playing with tools like Git/Bzr in their spare time anyway

I trust you on that one, but wow, I'm surprised that's so low. I don't think it's spare time though, they're using these tools day to day already, even to the extent of managing contrib modules, submitting patches etc. It all ends up in CVS, so in a sense it's invisible (well, apart from those projects now hosted elsewhere).

For too long, we've been sitting around waiting for a decision from on top, and that's not the way things work around here. Someone gets an itch, they build a team of people with similar itches, they all work their ass off and show great progress toward $goal, and Dries goes, "Cool. Let's do it."

Yes, a lot of things work that way. But large architectural changes don't. It's simply not worth anyone investing in any single VCS until someone at the gate says 'go'. We need a decision that the time to move VCS is now, and we need a decision what to move to. Once we know ($goal = git) or ($goal = bzr) or whatever then sure, you can mobilise grassroots support around that.

The problem is that there's no clear advantage to git or bzr, each has its good/bad points and which choice you make depends on what weight you give each point. Somebody in a position of power needs to sit down with those that have merit in regards to this issue – dww springs to mind, chx is a bzr user (I think?), I'm sure you know better than I – and make a choice. Without that we'll have endless lists like this and lots of people +1 on what they already use. I really don't think the community as a whole can make a clear decision on this one.

Plus as far as I know, they're going belly-up anytime here

Point taken :( That was mostly driven by frustration. I was prevented from editing the original post further.
It even took me 5 attempts to post the comment you responded to. Only by removing a link to an external resource did it finally accept my comment. You're right about audit trails etc., but I was prevented from contributing. I can't see that anyone else is editing that 'wiki' page either, so maybe it's not just me.

Responses to responses to responses (this is fun! :D)

webchick's picture

For too long, we've been sitting around waiting for a decision from on top, and that's not the way things work around here. Someone gets an itch, they build a team of people with similar itches, they all work their ass off and show great progress toward $goal, and Dries goes, "Cool. Let's do it."

Yes, a lot of things work that way. But large architectural changes don't. It's simply not worth anyone investing in any single VCS until someone at the gate says 'go'. We need a decision that the time to move VCS is now, and we need a decision what to move to. Once we know ($goal = git) or ($goal = bzr) or whatever then sure, you can mobilise grassroots support around that.

And I'm saying, we're not making any forward momentum on this issue until the following things happen:

  • Evaluating and analyzing distributed VCS options to elaborate on pros/cons, particularly in regards to our specific community's needs.
  • Itemizing the bits of our infrastructure that make hard-coded assumptions about CVS, and a plan of attack for moving them away from that (e.g. Version Control API module).
  • Coding those changes.
  • Gathering up of documentation and other resources for people making the switch.
  • Doing planning work around what our tools (issue queues, repositories, packaging scripts, etc.) will look like when merged with a distributed VCS.
  • Coding those changes.

...none of which requires us to know what system we're changing to. In practice, the decision of what system we change to is not going to be made by Dries, it's going to be made by who shows up to do the actual work. If it's Git-loving people, we'll be moving to Git. If it's Bzr-loving people, we'll be moving to Bzr. And I think that's great; it makes absolutely no sense to choose a technology in a vacuum, without the demonstration of strong community support behind it. And the best way to demonstrate strong community support for something is to start picking off action items above.

I'm Responsed out.

adrinux's picture

Evaluating and analyzing distributed VCS options to elaborate on pros/cons, particularly in regards to our specific community's needs.

Well I tried to contribute, as I said, but g.d.o deemed my contribution spam, particularly when I tried to provide supporting information in the form of links to external sites. I'm stymied.

Sorry, I can't do anything

webchick's picture

Sorry, I can't do anything about that. :( Could you post it elsewhere and link in a comment?

it's not really a complex choice

chx's picture

git is way too complex. There is no question about that. Its staging concept is so confusing that the git authors confused themselves and can't even call it what it is. It's not about perceived features or speed that I am against git but that the very concepts are confusing and too complex.

About complexity

marvil07's picture

Disclaimer: I do not want to fuel the "versus" tone this discussion is increasingly taking on; I just want to give specific feedback on this point. We all want a better VCS on d.o, so let's just talk about that, no internal wars ;-)

Complexity, as proposed above, can be measured in different ways.

I can say there are a lot of resources that let you understand what git is doing (please take a look at the free online documentation, especially the kernel.org wiki list of links).

About the blog post: I do not think the author of that blog post is a git author(git log --format="%an"| sort| uniq on git source)

About staging: personally I love staging, since that is the feature that lets you add exactly what you want to a commit.

A scenario:
- working on bug X
- OMG there's another not-related bug Y there
- make fix for bug X and bug Y

So, in that scenario (which IMHO is common), it is pretty natural to use git add -p (which inherently uses staging), letting you add the hunks that belong to bug X, then commit them, then add the rest of the hunks, and finally make the commit for bug Y. Again, IMHO, that's handy.

bzr does patch-level commits

ksenzee's picture

Staging is either love-it or hate-it. Personally I love it. But it's a complicated concept, and bzr also offers ways to tease two commits out of a workspace (bzr shelve used interactively, for example), which is the real reason staging is so useful. So as a git user, I don't see the loss of a staging area as a dealbreaker.

Heh, hey, I've used

Dries's picture

Heh, hey, I've used distributed VCS since late 2004 before you had even heard about them. ;-)

In my post 8 steps for Drupal 8 I made it clear that I'm willing to experiment with Git or Bzr somewhere in the Drupal 8 development cycle.

Not sure what else you want me to commit to at this point, but I'd be happy to discuss it more.

Apologies then.

adrinux's picture

I got the impression from that very post:

I see myself experimenting with a distributed revision control system like Git or bzr.

That you hadn't.

patches accepted?

ksenzee's picture

At chx's urging, I'm posting my opinion that whatever we do has to still facilitate a patch workflow for those who use other VCSs. I'm probably stating the obvious -- I'm pretty sure everyone agrees that patches should still be accepted. :) But I do think it's important. Whatever VCS we choose, somebody's going to hate it, and they'll probably start a mirror in their VCS of choice. That has to be okay.

decide based on use cases

ksenzee's picture

Posting this idea on behalf of chx, who suggested it in IRC and was met with widespread agreement (but who was also headed to bed at the time, so I agreed to post for him). We should be making our decision based on use cases, not on features or speed or whatever else. We agree on a list of use cases, determine how you achieve them in git/bzr/whatever, and make our decision based on the results. Laundry lists of features are less helpful.

An excellent point!

webchick's picture

One place to start might be brainstorming a list of such use cases, and then documenting what the process would be to satisfy them in each of the given options in the version control system of choice at each of the linked wiki pages.

Also

chx's picture

It might be the seed for the tutorials on the new system.

Issue queue/VCS integration proposal + some notes

webchick's picture

Here's a proposal we kicked around in IRC for how this might work. It is probably totally wrong, but will create a starting point for discussion, anyway.

Regardless of how we set this up though, the main thing we absolutely do not want to lose is the collaborative problem-solving via the centralized issue queue workflow that we have now. I would argue that it is the very cornerstone of our awesome community. It provides a single place to look for any and all decisions made on any line of code in all of core/contrib, one set of tools for contributors to learn, metric tonnes of mentorship both direct and indirect from the discussions, etc.

So here's an idea of how a hybrid might work. Please ignore my legacy terminology below, which I've put in em tags, so we can later replace it with whatever bzr/git/whatever calls them. For now, I'm sticking with CVS terminology so we can all stay on the same page.

  1. Bug reporter: Someone finds a bug. They go to the issue queue and create a new issue, say, issue #987654. Based on the issue metadata they select (project, version, etc.), this will automatically create a branch of that project called '987654' in our repository.
  2. Patch author: Someone who wants to work on that issue will create a clone branch of that 987654 in their own drupal.org-hosted repository, at xxx.drupalcode.org/users/[username]/projects/[projectname]/987654, and start making lots of commits. When they hit a good stopping point, they'll post back to the issue and say "Hey, check this out! Here's a summary of what I changed..." and then fill in the commit ID (or URL or whatever) to their latest changeset in the reply form.
  3. Patch reviewer: They go surfing through the issue queue and come across something interesting that's marked "needs review." They pop it open and scroll down to the latest patch issue reply. The reply includes a link to the full commit logs to this point, a link to the diff of changes, and (if I can have unicorns AND kittens), a tarball of the parent project + diff together so that folks like the UX team can easily test them. They apply the patch using $vcs, or they download a tarball and test it and report back.
  4. Patch reviser: Someone else who wants to work on the same issue can either branch off of the commit indicated in the patch reply, and begin committing incremental improvements to their own xxx.drupalcode.org/users/[username]/projects/[projectname]/987654 directory, which the original author can choose to incorporate or not, or can say "No, you're totally wrong" and "branch" off of the main 987654 branch instead, to start all over again.

One concern I have is that as far as I can tell, in order to accomplish this, every single registered user on Drupal.org would get their own /username/ repository hosted on our central server, in order to satisfy two critically desirable situations that we currently have:

  1. Absolutely anyone can make a patch to any project in all of Drupal using the same set of tools.
  2. The entire back-story of all development decisions in every project in all of Drupal is here on drupal.org. (a couple of outliers notwithstanding)

So I assume we'd need some pretty fancy safeguards in place (extension checking, filesize checking, etc.) to ensure that this does not turn into a gigantic warez repository. :P

One of the reasons that most people right now are using Github is as more or less a "sandbox" area to shove code into. If Drupal.org's set-up could better facilitate this, I think that would help stem some of the bleeding in terms of people going elsewhere.

I also haven't quite figured out how we're going to handle granting commit access to the contributions repository, where our users go to get all of their modules and themes, and so therefore necessarily needs to be more tightly locked down. But whatever VCS we choose needs to support this kind of user-based role granting system like we currently have with CVS (the concept of maintainers and co-maintainers, etc.)

The Git Approach

Pisco's picture

Here is my approach (with some differences) to solving the described use case with Git. I try to follow the KISS principle: Keep It Simple and Stupid.

Prerequisites

Setup

Project owners and selected users (or groups of users) have write access to specific projects, e.g. my_project.

Workflow

  1. Bug reporter finds a bug and opens an issue ticket just as it is the case now. Absolutely nothing happens automagically! Keep it simple and stupid!

  2. Patch author clones the repository to work with it locally: git clone git://drupal.org/project/my_project.git. Supposing he works on what CVS people might call HEAD, he then makes incremental changes locally and commits them: git commit -a. When they hit a good stopping point, they have several choices:

* they can create a patch: git format-patch origin and attach that patch to the ticket, just as we know it.
* they can publish their clone of the repository with the locally committed changes in a publicly accessible place, e.g. their private server (http://my.domain.org/drupal-contrib/my_project.git), or on github.com, or ...
* there sure exist other options with Git :-)

In the first case (patch) we already know the workflow. In the second, and following, cases you would add a comment to the ticket, telling reviewers that they can pull in the changes from the URL where the clone was published.

  3. A patch reviewer either applies the attached patch (git apply <patch>), reviews and tests it, or pulls the changes from the published repository (private server, GitHub or wherever) with git pull http://my.domain.org/drupal-contrib/my_project.git myCurrentBranch, and then reviews and tests the changes. If it looks good they add a comment to the ticket and may change the ticket status; if not, they make further changes and create a patch or publish their clone (or, if they have already published it, push the latest commits to it).

  4. When a patch has been reasonably reviewed and tested, or even better: when the unit tests run successfully :-), the maintainer of the project decides to apply the patch or, if he hasn't already, pull the changes in from the published repo. He then commits (git commit) the changes to his clone and pushes them to the main repo at drupal.org (git push).

  5. Have a coffee :-)
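The patch variant of this workflow can be tried end to end on one machine. The following is only a sketch under a few assumptions: local directories stand in for the drupal.org repository and for each person's computer, all names and file contents are made up, and origin/master is spelled out where the text above says origin.

```shell
#!/bin/sh
# Sketch: clone, commit locally, format-patch, apply, push (all paths hypothetical).
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the main repository: seed one commit, then publish it as a bare repo.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "version 1" > my_project.module
git add my_project.module
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed central.git

# Patch author: clone, commit locally, turn the local commits into a patch.
git clone -q central.git author
cd author
git config user.name "Patch Author"
git config user.email "author@example.com"
echo "version 2" > my_project.module
git commit -q -a -m "Fix the bug from the ticket"
git format-patch -o "$work/patches" origin/master > /dev/null
cd ..

# Reviewer/maintainer: apply the attached patch, test it, commit and push.
git clone -q central.git maintainer
cd maintainer
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
git apply "$work"/patches/*.patch
git commit -q -a -m "Fix the bug from the ticket (patch from the issue)"
git push -q origin master
```

Nothing here needs any support from the issue tracker beyond a file attachment, which is the point of keeping it simple.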

As explained elsewhere in this thread, access rights can be easily managed using Gitosis, or with proven ssh and linux configurations, if needed even using additional tools like LDAP. See the Gitosis example config to get an idea of what is possible with it. One could possibly create something like user accounts on linux systems and manage access rights with ssh configurations in conjunction with LDAP directories.

I don't see a need to automagically create issue branches or anything else, let everyone work with Git the way they prefer. Just give guidance and advice on best practices.

I would also refrain from hosting everyone's clones somewhere at xxx.drupal.org; why put that burden on the Drupal infrastructure? Let drupal.org be the meeting point and melting pot, but let people work how they want and where they want. If a patch author likes to publish his clones at GitHub let him do so; GitHub does its job well and it wouldn't be that easy to do their job as well as they do. And then why should we? Let us concentrate on Drupal :-)

This would be my first approach and I think it suits very well what we already know at drupal.org.

In this description we have only very marginally seen the power of Git, but if I understand webchick correctly, we should concentrate on this main use case and always try to keep simple things simple.

What do you think?

side note: starting from what I outlined here, we could then start using Git hooks to trigger unit tests when updates are pushed to the repositories, automatically update documentation, contribution stats and so on.
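To give an idea of what such a hook could look like, here is a small sketch. It assumes a local bare repository as a stand-in for the server, and the hook only writes a log line where a real one would start the tests; the paths and the log file name are made up.

```shell
#!/bin/sh
# Sketch: a post-receive hook that reacts to every pushed ref.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git
mkdir -p central.git/hooks

# post-receive runs in the receiving repository after a push; git feeds it one
# "<old-sha> <new-sha> <ref-name>" line per updated ref on standard input.
cat > central.git/hooks/post-receive <<EOF
#!/bin/sh
while read oldrev newrev refname; do
    # A real hook would kick off the test run here (hypothetical step).
    echo "would run tests for \$refname at \$newrev" >> "$work/hook.log"
done
EOF
chmod +x central.git/hooks/post-receive

# Push one commit to trigger the hook.
git clone -q central.git dev 2>/dev/null
cd dev
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "First commit"
git push -q origin master

cat "$work/hook.log"
```

Updating documentation or contribution stats would hang off the same hook in the same way.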

What do you think? You can do

David Strauss's picture

What do you think?

You can do the exact same workflow with Bazaar (right down to the built-in convenience command for applying patches), but Bazaar also allows module maintainers to work as they always have: bzr checkout, bzr update, bzr add, bzr commit.

If we want to keep things simple, we don't change the main workflow to something very new that does the same thing we do now.

Github for social coding

kim.pepper's picture

I think the Github hosting model makes a lot of sense.

The process for contributing code is basically:

  • Fork the project
  • Commit changes to your own repo
  • Send a pull request to the origin repo owner
  • Origin repo owner can review, send feedback, or simply reject pull requests
  • When finally accepted, owners can easily merge code back into the origin repo.

This offers a cleaner approach IMHO.

This also works for a number of other open source projects like Ruby on Rails.

Kim

Reasons I don't like this...

webchick's picture

I realize that this is the 'natural' way that tools like git are supposed to work, but I have some concerns, detailed below. I'm really hoping the response is, "Oh, no! We don't have to lose that! It would work this way..." cos that would be awesome! :) But I fear instead I'll hear "Yes, this is exactly why we want to move this way because everything you're talking about sucks and slows us down," because that's usually the reaction. :P~ But yet, this stuff is what I consider the very foundations of our awesome community, and the reason I both got started, and continue to stick around. Losing it would basically be gutting Drupal's soul, at least from my POV.

  1. We lose the mentorship aspect. This is critical. Just as a random example in my queue right now, http://drupal.org/node/448292. This issue holds an archive of discussions by several prominent accessibility experts discussing pros/cons of various approaches, looking at how other people solve the same issue, tools that can be used to test accessibility, etc. I've learned a ton by reading that issue and its replies, and by reviewing the incremental code that's been posted and discussed. That's one of several hundred thousand, all of which have built up my knowledge of Drupal over the past 4.5 years of contributing to core. It's what sucked me into Drupal, and kept me here. We absolutely need to retain this going forward.

  2. We lose the project design decision history. Because all decisions need to be discussed in the issue queue now, you can use cvs annotate for any line of code in any project on drupal.org, and find out not only the commit message associated with it, but the entire history of why the decision was made, who was involved, what other options they explored, etc. I don't get that from a simple commit message.

  3. We lose contributions from people who aren't developers. Right now, people who are subject-matter experts (including usability, accessibility, design, etc.) who have zero code ability, can still contribute to and shape the development of features. There is one place to check (the issue queue) for all activity going on, and all discussions taking place, and you can just dive in to whatever looks interesting and fun. These people cannot read diffs. They don't grok commit messages. They need a "forum" to voice their opinions and spread their knowledge.

  4. We lose the collaboration aspect. Right now, you can fork code all you like, but if you want it hosted on Drupal.org, you need to work with the maintainer of module X to get your changes accepted. There's an issue posted, a patch posted, a discussion happens, some back and forth with additional changes, and then finally a commit. Being on a centralized system /forces/ this type of collaborative workflow, but where's the incentive for contributing back to the upstream project if you can have your own little sandbox to play in? Instead, the maintainer has to play "catch-up" with you.

  5. It fragments and silos our development resources. The Drupal core issue queue at any given time probably has 2-300 people browsing, responding, posting patches, etc. There is exactly one canonical repository, with a small handful of gatekeepers, so everyone who's not one of those gatekeepers by necessity needs to work with everyone else to push changes through. Distributed VCSes make forking easy (in fact that's kind of the point), which means people are no longer "chained" to this central hub of buzzing activity, and start to make their own; Drupal for project managers, Drupal for performance freaks, Drupal for designers. These people are no longer part of the "global" community, but part of their own little sub-communities. And we as a project miss out on that, because we don't have their input on the "global" changes to the upstream branch.

  6. Reviews become impossible. Right now, patch reviewers learn a single set of tools, go to a single set of pages, and do a single workflow for every proposed improvement, no matter if it's for Drupal core or Views module or for whatever. At a glance, I can see ALL activity that's happening on a given project, and can go into any of those and jump in. If improvements are happening willy-nilly everywhere, and there are 50,000 commits a second, I have no idea how I could possibly keep up with it all.

Not sure if I've communicated these concerns properly, since I seem to fail every time. :\ But do you (or anyone else) have some responses, or thoughts on how we could merge the best from the current overly-centralized workflows and tools we have now, with the best of the overly-decentralized solutions like github?

Well said. These things are

Dries's picture

Well said. These things are more important than a tool, as it is what makes our community successful. If we're switching to Git or Bzr, we have to make sure we can maintain our collaborative values and the important knowledge sharing workflows.

Unlike webchick, I don't think we should be able to see ALL discussions. It is OK to see only the important discussions, as that makes reading the issue queues more scalable and might increase the average quality of an issue comment. (In today's world, some discussions happen in IRC already.)

Some good points, some confusion.

adrinux's picture

I have a worry that you're viewing drupal.org + CVS as one integrated thing too much. I don't think any of us want to do away with the issue queues or any of the other community workflow goodness. I don't think you have anything to fear as far as 1, 2, 3, 5 and 6 go.

Four - Is true up to a point. Sometimes commits don't get made. Sometimes we end up with 4 modules doing similar things. Yes git/bzr facilitate forking, but they also have better merge tools. We may well end up with several variants of a module, but all in effect branches, and all sharing new code. Or not. It's possible for this to happen now of course, the point is that git/bzr make sharing a little easier.

Five - It's no harder to fork Drupal in CVS than it is in git/bzr. All you need to do is checkout or export and start working then never commit back. Drupal for performance freaks == Pressflow, and so on. From what I can recall the CVS merge tool is so bad as to be unusable between branches (I might be wrong).

Actually the more I think about this, the more it seems most of your fears are rooted in a misunderstanding of terminology.

Generally when we talk about a fork it's a fork to an entirely separate project. With git/bzr (especially in the case of git) because every clone is a complete copy of the repository it tends to get referred to as a fork. In CVS terms it's probably more correct to think of a clone as a branch. It just happens that the branch is on a remote machine.

Don't just think fork. Think fork and merge. Fork and merge! They are the yin and yang of DVCS workflow.

I suppose diff and patch is in effect a very crude merge tool. Modern VCS systems with decent merge are far better, with a merge you don't just get a finished patch, you can potentially get back all the change history that went into making the patch!

I'd always imagined that the move to a DVCS on drupal.org would happen in two phases, phase one being simply the transfer of the existing workflow onto the DVCS, as near as possible – it seems enough work in itself – phase 2 would be adding new features. I'm probably not very ambitious though.

i think overthink...

sime's picture

Hey webchick. I'm a VCS hack who uses Git (I can't speak for BZR). I guess you could say I sit somewhere between "don't make me learn a new rule book" and "DVCS is so awesome it's a no brainer".

The git repositories that I use are hosted on github or with a private repository run by the client. "cvs checkout" and "git clone" are analogous to me. I don't do anything different from the way CVS works, I just have a different set of commands. I don't think in a different way.

To emphasise: I have never used the distributed tools of Git. I love it because it's simpler to get started on a new repo. Doesn't spew .cvs/.svn folders everywhere. It's faster. And I can make lots of little local commits to keep my mind clear.

So my understanding is that there is no need to change the requirement for people to post patches to an issue queue. That's what I'd do with my Drupal projects at first.

"I really appreciate that I could do a git fetch on your repository my friend, but I prefer you post a patch in the queue."

We seem to be worried that some mysterious git workflow will be forced upon us. I don't think that's the case. The newbie will continue to have a set of basic instructions for testing/reviewing/submitting a patch. The newbie won't be working collaboratively with others on a fork - they'll see Drupal.org as the mothership and they'll easily learn how to "phone home."

Keeping it simple...

reglogge's picture

Adding on to what sime said.
A useful and simple first step might indeed be to just keep the existing workflows on d.o. which rely heavily on posting and reviewing patches and only change the underlying vcs to git (or bzr, which I'm not familiar with). Only centralized repositories with limited write access for Drupal core and contributed modules would be hosted on drupal.org. Everyone who is not a core or module maintainer only gets read access. Checking out a copy of Drupal core to work on would be as simple as "git clone git://anonymous@drupal.org:drupal.git".

Here's how this would affect the use-cases webchick has outlined above:

  1. Bug reporter:
    Nothing would change. We would also leave out the whole automatic creation of a branch.
  2. Patch author:
    - First make sure you are working against the current HEAD with a simple "git pull".
    - Create your own local branch with for example the name #987654 (corresponding to the node-id of the bug report)
    - work commit, work commit, work commit
    - when satisfied, create a patch with "git diff master > whatevername.patch" assuming that the branch with the current HEAD is called "master"
    - post the patch in the issue queue
  3. Patch reviewer:
    Nothing would change. You just apply the patch as you were doing with cvs. Alternatively the reviewer could also ask the patch author for a tarball, which is as easy to create as "git archive '#987654' | gzip > '#987654.tar.gz'" (the quotes keep the shell from treating # as the start of a comment).
  4. Patch reviser:
    The patch reviser can create his own local branch, apply the patch there and work on it, then post his revised patch in the issue queue again - just like we are all used to.
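As a rough illustration, the four roles above can be walked through in a single throwaway repository. This is a sketch under assumptions: the repository, file name, and commit messages are made up, the branch is called issue-987654 rather than #987654 so that no shell quoting is needed, and the initial "git pull" is skipped because there is no remote here.

```shell
#!/bin/sh
# Sketch: local issue branch -> one patch for the issue queue -> review branch.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "original" > example.module
git add example.module
git commit -q -m "Upstream state"

# Patch author: one local branch per issue, then work/commit as often as needed.
git checkout -q -b issue-987654
echo "first attempt" >> example.module
git commit -q -a -m "Issue 987654: first attempt"
echo "second attempt" >> example.module
git commit -q -a -m "Issue 987654: address review"

# Roll everything on the branch into one patch to attach to the issue.
git diff master issue-987654 > "$work/987654.patch"

# Optional tarball for reviewers who do not want to apply patches.
git archive --format=tar issue-987654 | gzip > "$work/987654.tar.gz"

# Patch reviewer or reviser: apply the patch on a fresh branch off master.
git checkout -q master
git checkout -q -b review-987654
git apply "$work/987654.patch"
git diff --stat
```

The reviser would carry on committing on that review branch and post a new diff against master in the same way.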

This would eliminate the need for automatic branch creation as well as the need for every d.o. user having his own hosted repository (which would kinda defeat the purpose of a distributed version control system anyway).
Core and contrib maintainers would be the only ones with write access to the core and contrib repositories hosted on drupal.org - just as they are now. Git certainly provides for this (as does bzr, AFAIK).

The main advantages of this approach are:

  • The learning curve is significantly less steep than with the completely distributed approach webchick outlined
  • We gain access to the vastly superior speed and capabilities of git (or bzr) vs. cvs (adding and renaming files comes to mind...)
  • As a developer (or patch reviewer) working on lots of different issues at the same time, it's really easy to switch from one setup (branch #987654) to another one (branch #123456) and back whenever necessary
  • Keeping my changes properly diffed against the current head is as simple as "git rebase master" whenever I work for a long time in a new branch, making it far less likely that my current work diverges from the work that others do and that gets committed into head

Later steps could then be the introduction of the fully distributed workflow first to a subset of d.o. users (like a beta-phase) and then, when everybody has gotten comfortable with using a dvcs going the fully distributed way with users pulling and pushing from and to each other directly. Or maybe just never do it for the reasons webchick has outlined.

I guess what I am trying to say is that there is no need to go all the way and use each and every functionality of a new dvcs when this has potentially negative effects on the community. Using git (or bzr) even in a trimmed-down fashion would still have enormous advantages for the community.

Interesting!

webchick's picture

Yeah, that certainly seems like the best move for "baby-stepping" into this whole thing. If we continue to share patches in the issue queue, we still lose the ability to bring in incremental commit messages, but we're also losing that now, too.

But then I guess I don't understand what would make this move any different than core/contrib developers using the git/bzr/hg mirrors that exist? Just that an extra step of CVS merge wouldn't be involved?

But then I guess I don't

yhager's picture

But then I guess I don't understand what would make this move any different than core/contrib developers using the git/bzr/hg mirrors that exist? Just that an extra step of CVS merge wouldn't be involved?

That extra step of sending commits back to CVS is a PITA. It is such a pain that projects get hosted elsewhere (rhymes with 'pub'), and get imported back to CVS at the end of the development (or not at all) - just to avoid this extra step.

Here's how to do it - http://drupal.org/node/288873 - it sucks (not the howto page). I tried once, and never went that route again.

If we continue to share

reglogge's picture

If we continue to share patches in the issue queue, we still lose the ability to bring in incremental commit messages, but we're also losing that now, too.

Not necessarily. If e.g. the maintainer of a module successively commits various patches that reflect an ongoing patch/review-patch/review process, and finally merges these commits from his new bugfix-branch into the module's master branch, all his/her commit messages would be transported into the master branch. The same holds true for every commit a core maintainer would make to Drupal core if he/she wrote a nice commit message. How about referencing the issue thread id (e.g. http://drupal.org/node/448292) in the final commit message? Then everybody with a local checkout/clone would have an easy way to check all the discussion that's been going on around this very commit that shows up in the repo history.
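A quick way to see this is the sketch below. It is only an illustration: the repository, branch name, and commit messages are made up, and the issue URL is simply the one used as an example in this thread.

```shell
#!/bin/sh
# Sketch: incremental commits on a bugfix branch survive the merge into master,
# and each one points back at the issue discussion.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Module Maintainer"
git config user.email "maintainer@example.com"
echo "base" > example.module
git add example.module
git commit -q -m "Initial commit"

# One commit per patch/review round, each referencing the issue.
git checkout -q -b bugfix-448292
echo "round 1" >> example.module
git commit -q -a -m "First patch from the issue queue" \
    -m "See http://drupal.org/node/448292"
echo "round 2" >> example.module
git commit -q -a -m "Revised patch after review" \
    -m "See http://drupal.org/node/448292"

# Merge into master; the individual commits and their messages come along.
git checkout -q master
git merge -q --no-ff -m "Merge bugfix for http://drupal.org/node/448292" bugfix-448292

# Anyone with a clone can now find every commit tied to that discussion.
git log --format=%s --grep='node/448292'
```

The --no-ff flag is a choice made for this sketch so that the merge itself also gets a message; a plain fast-forward would carry the two commits over just the same.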

But then I guess I don't understand what would make this move any different than core/contrib developers using the git/bzr/hg mirrors that exist? Just that an extra step of CVS merge wouldn't be involved?

The big difference would be that
a) nobody would have to deal with cvs anymore (big +)
b) git/bzr/hg would be the "official" vcs of Drupal, thereby ending all the trouble you have with nagging people to put their stuff on cvs (Bartik and Kiwi themes lately ;-)). Also, the mirrors are what they are - mirrors - and not the real thing.
c) if Drupal were to provide a service similar to github for hosting repositories of Drupal-related projects it would further consolidate the community. Just think of somebody trying to write a new module or theme. The hassle at the moment to set this up on d.o. as an official project drives many devs to github or other places. On github it's a question of 5 minutes to set up a repo and start committing, branching, forking, collaborating.

Total agreement.

sun's picture

Total agreement.

Not sure what happened to the etherpad-doc we worked on in IRC, but most of your points boil down to:

  • Drupal, the product, and Drupal, the eco-system, and Drupal, the community, heavily depend on peer reviews
  • Peer-reviewers need targets.
  • A "target" must be concrete, unique, up-to-date, and a common target across developers and reviewers.
  • Right now, we have a single target, CVS HEAD. (YMMV)
  • Anything new is coded against that single target, and everyone in the process knows that target.
  • Anything that does not apply (anymore), or is using outdated APIs or patterns, needs to be re-rolled against that target.
  • ...

Stuff we have to prevent:

  • Outdated targets. It makes zero sense to review or even test a patch that applies against "HEAD" from 2009-08-24.
  • Gazillions of targets that cannot be tracked or followed. Evolution and innovation happens through involvement of people having very different backgrounds.
  • Code, reviews, and discussions outside of drupal.org. Prevents reviews from the right people and therefore effectively wastes energy and time of everyone.
  • Overly large changes turning into entire rewrites over time. Forking from a branch from a forked branch from a branch, and so on.

Daniel F. Kudwien
unleashed mind

Yes, yes, 1000x yes

webchick's picture

Please ignore my non-sensical blathering above and instead pretend I had said this. ;)

So the question before us is really what drupal.org and Drupal community integration for a distributed vcs looks like, taking these fundamental requirements into account.

Some more agreement

reglogge's picture

There really is nothing that would prevent us from using git (again, I don't know the others, it might be the same with them) in a way that preserves one definitive HEAD of core or of a module/theme. Only core or module maintainers would have write access to it.
Now if anybody works on a checkout of this HEAD, he/she would have to get a maintainer to commit the changes, which are delivered as patches or tarballs.

Git provides the "git rebase" command that allows me to keep changes in my local branch constantly up to date against the HEAD of core or a module. A typical workflow before posting a patch would be:
- update local master branch (which is linked to Drupal HEAD)
- work, work, work... in my local issue branch
- update local master branch (which is linked to Drupal HEAD) again
- rebase the local issue branch against my local master branch
- create a patch with git diff
This way a patch can always be applied against the current HEAD.
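The last steps of that workflow can be sketched as a list of git commands. This is only an illustration: the branch name, the `origin` remote and the `master` default are assumptions, and the script prints the commands unless `dry_run` is switched off.

```python
import subprocess


def rebase_workflow(issue_branch, main_branch="master", remote="origin"):
    """Return the git commands for the update/rebase/diff steps, in order."""
    return [
        ["git", "checkout", main_branch],
        ["git", "pull", remote, main_branch],   # update the local master branch
        ["git", "checkout", issue_branch],
        ["git", "rebase", main_branch],         # replay the issue work on top of it
        ["git", "diff", main_branch],           # the patch, written to stdout
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them inside the current repository."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


# Hypothetical issue branch name.
run(rebase_workflow("448292-fix-notice"))
```

Redirecting the output of the final `git diff` into a file gives the patch to attach to the issue.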

As much as I like git for

pwolanin's picture

As much as I like git for certain workflows, I think bzr (at this point) would be a more accessible and appropriate tool.

@webchick - minor point, but with git/bzr the "official" repo for any given contrib can be totally separate from the per-user repo.

w.r.t. preventing the per-user repo from being a repository for trash, perhaps we could have some restriction such that you could only create branches based on an existing official branch.
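One way to picture such a restriction is a check that a pushed branch shares history with an official branch. The sketch below works on a toy commit graph held in a dict; the commit names and the `parents` structure are made up, and a real server-side hook would have to ask the VCS itself.

```python
def ancestors(commit, parents):
    """Return the set of commits reachable from `commit`, including itself."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def is_based_on_official(branch_tip, official_tips, parents):
    """True if the branch shares at least one commit with an official branch."""
    history = ancestors(branch_tip, parents)
    return any(history & ancestors(tip, parents) for tip in official_tips)


# Toy history: a <- b <- c is official, "fix" forks from b, "junk" is unrelated.
parents = {"a": [], "b": ["a"], "c": ["b"], "fix": ["b"], "junk": []}
print(is_based_on_official("fix", {"c"}, parents))
print(is_based_on_official("junk", {"c"}, parents))
```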

If you want to go down this route...

chx's picture

... then bzr is the tool of choice again because of its shared repository feature, which is made for this. To recap, in bzr, a directory==branch==repository, but if you have a lot of similar directories then one directory up you can create a shared repo and bzr will only store the differences. I.e., if /var/bzr/ is a shared repo, and /var/bzr/d7 contains HEAD, then storing /var/bzr/d7-somepatch will take minimal space. Which, I believe, is just what you wanted. Edit: one HEAD takes about 52MB to store (this is every revision ever made); one branch, once that's stored in the shared repo, takes 264KB.

I just want to give

dawehner's picture

I just want to give comparable figures for git and hg
git:

  • Pure Clone: 34728KB
  • One branch: 6KB additional storage

hg:
One thing I noticed when cloning from http://hg.shomeya.com/drupal-7/rev/c3067104bf71: it takes a lot of time compared to git, but this could perhaps just be the mirror.

  • Pure Clone: 48712KB
  • One branch: 0KB additional storage; I guess hg branch is not the right command
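To put the quoted numbers side by side, here is a rough back-of-the-envelope sketch. It assumes storage grows linearly with the number of extra branches, treats 52MB as 52 * 1024 KB, and leaves hg out because its per-branch measurement above is flagged as unreliable. The figures come from different people and machines, so treat the result as indicative only.

```python
# Figures quoted in this thread, in KB.
MEASURED_KB = {
    "bzr": {"full_history": 52 * 1024, "per_branch": 264},  # shared repository
    "git": {"full_history": 34728, "per_branch": 6},
}


def total_kb(vcs, branches):
    """Estimate storage for the full history plus `branches` extra branches."""
    figures = MEASURED_KB[vcs]
    return figures["full_history"] + branches * figures["per_branch"]


for vcs in sorted(MEASURED_KB):
    print(vcs, total_kb(vcs, 100), "KB for history plus 100 branches")
```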

Actually no. This is a reason

sdboyer's picture

Actually no. This is a reason to NOT use bzr - because the "shared repository" system is one that has to be specifically set up (as you've described here). Git works this way natively.

I'm surprised that there

AdrianB's picture

I'm surprised that there seems to be no one rooting for Mercurial so far.

Now, I personally can't speak on the matter, I know way too little about any of these VCSs to form an opinion.

But from what I read elsewhere I got the impression that hg was on the rise and on par with git on most critical features. In the Mac community it seems popular with projects and developers switching (i.e. Adium and Daniel Jalkut) and Google chose hg over git for Google Code after extensive analysis.

Sure

chx's picture

hg is great too. I do not have any problems there.

This "extensive analysis" can

CorniI's picture

This "extensive analysis" can be questioned at least for general use cases, and besides, Google is a Python company, so hg is the logical choice for them.

That is probably true, I just

AdrianB's picture

That is probably true, I just used the words from Ars Technica.

Bazaar is more purely Python

David Strauss's picture

Bazaar is more purely Python than Mercurial.

There's barely any difference

lut4rp's picture

There's barely any difference between hg and git in terms of complexity and power. Both of them have the same complex interfaces and advanced toolset. But we as a community don't seem to have many hg proponents, giving git the edge.

I would have rooted for

Garrett Albright's picture

I would have rooted for Mercurial, were it not too late to do so - this thread (or, rather, Webchick's post linking to this thread) appeared on the Planet just today. =[

What has sold me on Mercurial is that it sells itself as being easy to use - and, indeed, once you wrap your head around the concept of a DVCS in general, it is. Its ease of use would make it a natural choice for Drupal, particularly when we have to consider that a good deal of people wanting to use this system are going to be non-programmers (themers).

But Hg has been (prematurely, IMO) nixed, so I cast my lot with Bazaar. Our experiences with the difficulties people (myself included) have had with CVS tells me we should not choose a system because lots of programmers are familiar with it, or because it is fast; we should choose the one that's easiest to use and understand, and that is not Git.

Find me some d.o infrastructure folks who know Hg...

webchick's picture

...well enough to deploy and support it in place of CVS, and a huge crew of Hg fans who can help folks on IRC who are having trouble, and we can add it back to the list.

There don't seem to be either in our community, however.

I can confirm the ease of use

AdrianB's picture

I can confirm the ease of use in Mercurial from a complete newbie perspective.

I'm new to the whole VCS thing, I've never before used svn, almost never cvs and none of the new DVCSs. But I know god kills kittens if you don't use VCS (ask johsw about DrupalCamp Sthlm :)) and 2010 would be my year of VCS.

So I did some reading on the subject and found out that Mercurial seemed like a good fit. It was easy to begin with but still powerful enough to compete with git. I read convincing arguments like git is pc, hg is mac, bzr is os/2 which sold me :)

Seriously though, what I found out with my very limited research was that some of those who actually tried both git and hg seemed to favor hg. And the comments here are telling because I haven't found a single one negative about hg. Most comments are "I use git and I like it a lot but I haven't tried the others." Git seems good and people like it. But most of them haven't tried the others.

Now, I haven't used any others either, so I wouldn't raise my voice about hg because I can't say that it's better or worse.

But what I can say is that I now - after reading the guides I put in the wiki - can use hg to develop locally on my Mac and diff, commit and then push commits over ssh to my shared hosting account where I log in and update. I still only know the most basic commands, but it's enough for now and way better than no VCS at all. I've yet to use it on a Drupal project, but that is my next step.

If Drupal should use whatever is most popular among the developers then go with git, it's obviously the most well known and used DVCS so far.

I'll end with this tweet: "Feels like #hg will go down as the betamax of version control."

Mercurial will NOT be the betamax of Version Control

cossovich's picture

HG will not be the betamax of VCS due to the fact that Mozilla, Python and Java (not to mention OpenSolaris) are some of the large projects that are using hg and have been for a while. Mercurial has a friendly learning curve, is fast and I think is the only competitor to Git.

Git gets a lot of mileage because it was written by Linus and it's used for the Linux kernel (and it's wickedly fast). The downsides are the workflows and the learning curve. Technically the only system that I think is an equivalent is hg, and in terms of user-friendliness hg is miles ahead of Git for new users to VCS and people migrating from SVN or CVS.

I'm disappointed I found this thread so late and that hg has somehow been ruled out already.

(Also, I'm sure if the Drupal community reached out to the Mercurial community there might be an opportunity to get some input on some of the issues we face moving from CVS.)

Problem is not just migration...

webchick's picture

...but ongoing maintenance, and ongoing developer support.

We have experts in both the bzr and git camps who already are on the d.o infrastructure team and have signed up to do the work required to move us, and continue to take care of it after the move long-term. We also have a large contingent of bzr and especially git users who have signed up to help with answering questions on IRC, providing documentation and tutorials in the handbook (some of this has already been started in the case of Git), etc.

Hg looks like a great VCS, but the traction is just not there in our particular community. That is what is important, for our purposes.

bzr is also python it seems

pwolanin's picture

bzr is also Python, it seems (and Launchpad is built on Zope), so I'm not sure why Google would prefer hg to bzr if language was the sole issue.

Aside from its potentially better merging algorithms for renames, I think I'd favor bzr for drupal.org since it supports a checkout mode where it operates just like svn (update and commit talk directly to the remote branch) - also it can integrate with svn or git or hg remotes, comes with a GUI, etc.

I guess Google didn't want

lut4rp's picture

I guess Google didn't want developers to feel hampered with bzr. Bzr is still slow to work for larger projects. IIRC, Mozilla tried to shift to bzr from CVS (when they shifted mozilla-central) and bzr just crapped out for them.

I'll drop my git favoritism here for a while (:-D) and consider both the Python systems: I would pick Hg. Bzr is the least used of the three. Both hg and git have very large projects using them. Bzr is almost all Launchpad/Canonical. Hg has GUIs and interfaces available for all 3 OS's as well.

But for git there is

CorniI's picture

But for git there is Gitorious, and for bzr, there is Launchpad, and in theory you could install both on drupal.org. I'm not aware of any similar solution for hg.

I have nothing against

gordon's picture

I have nothing against Gitorious, but I have actually written my own version so you can maintain your ssh keys via Drupal, and then when you are pushing/pulling it will check to see if your user has access to an organic group before letting you access the code.

Gitorious is ok, but we could do so much better without too much code, and do things like check access directly on the project.

--
Gordon Heydon

would you mind sharing your

CorniI's picture

would you mind sharing your code or even contributing to the versioncontrol and versioncontrol_git modules?
These modules try to implement something similar.

Yes I would be happy to

gordon's picture

Yes, I would be happy to contribute some of my code. I built a lot of this stuff when developing a github-like Drupal system using og as the core of sharing the code.

Basically I have a couple of modules which would make a good start.

  1. an SSH key handling module that allows keys to be added and saved against users/nodes, and that also updates .ssh/authorized_keys to allow a git@example.com login.

  2. a heap of shell and php scripts which validate user access to organic groups based upon the ssh key of the user, and I will be extending it to allow differentiation between read only and read/write accounts for a user.

Once you pull gitosis apart you see that it is very easy to do all this. Linking this with the current project system is not a big deal.
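The access check described in the list above might look roughly like this. It is a sketch under assumptions: the `permissions` dict stands in for an organic groups lookup, the user and repository names are invented, and only the dashed `git-upload-pack` / `git-receive-pack` command forms that git sends over SSH are handled.

```python
import shlex

READ_COMMANDS = {"git-upload-pack", "git-upload-archive"}   # fetch / clone
WRITE_COMMANDS = {"git-receive-pack"}                       # push


def required_access(original_command):
    """Map the command requested over SSH to ('read' or 'write', repository)."""
    parts = shlex.split(original_command)
    if len(parts) != 2:
        raise ValueError("unexpected command: %r" % original_command)
    verb, repo = parts
    if verb in READ_COMMANDS:
        return "read", repo
    if verb in WRITE_COMMANDS:
        return "write", repo
    raise ValueError("not a git command: %r" % original_command)


def is_allowed(user, original_command, permissions):
    """Check a request against a (user, repo) -> 'read' | 'write' mapping."""
    try:
        access, repo = required_access(original_command)
    except ValueError:
        return False
    granted = permissions.get((user, repo))
    return granted == "write" or (granted == "read" and access == "read")


# Hypothetical stand-in for group membership data.
permissions = {("alice", "views.git"): "write", ("bob", "views.git"): "read"}
print(is_allowed("alice", "git-receive-pack 'views.git'", permissions))
print(is_allowed("bob", "git-receive-pack 'views.git'", permissions))
```

In a real setup the requested command would come from the `SSH_ORIGINAL_COMMAND` environment variable that sshd sets when a forced command is in effect.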

Most of the methods for doing this can be found by just doing a "man authorized_keys" and it is all there. All you need to do is put the command option before the key and it all works.
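For illustration, an authorized_keys entry with a forced command could be generated like this. The script path, user name and key material are placeholders, not anything from the code described above; the `command=` and `no-*` options are the standard sshd ones.

```python
import re


def authorized_keys_line(username, public_key,
                         script="/usr/local/bin/git-access-check"):
    """Build one authorized_keys entry that forces logins through `script`.

    The script path is a hypothetical wrapper; it receives the user name so
    it can look up that user's access before running any git command.
    """
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", username):
        raise ValueError("unsafe username: %r" % username)
    options = [
        'command="%s %s"' % (script, username),
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    return ",".join(options) + " " + public_key.strip()


# Placeholder key material, not a real key.
line = authorized_keys_line(
    "alice", "ssh-rsa AAAAB3NzaC1yc2EXAMPLEKEY alice@example.com")
print(line)
```

Every key in the shared git account's file gets its own line, so one system account can serve many Drupal users.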

I am not 100% sure as I have not really used bzr, but it should be able to be used across any of the platforms.

Gordon.

--
Gordon Heydon

There's BitBucket for hg.

mcrittenden's picture

There's BitBucket for hg. It's nice.

Well, I don't think anything

pwolanin's picture

Well, I don't think anything in Drupal core or contrib is "large" on a scale where speed differences matter much.

hg seems to be supported/developed by one small company? would be nice to see where the usage trends are heading.

Speed always matters

Coornail's picture

As a Gentoo user I have to disagree on this.

First, I think Drupal cvs is big enough to make a difference.

We know from web interface design that even the smallest wait affects the user at an unconscious level (that's why we have content-first and script-last).
Going a step further, half a second can make the difference between "Ah, committed. A great workday again" and "This thing screwed with me for 5 hours and now it doesn't want to commit..."

So I'm on git's side because of this.
Also I have two other reasons:

  • Git has a cvs integration, so we potentially could have an overlap period while everyone gets used to it
  • Git has proven itself pretty nicely in Linux kernel development

(I don't know the others to make the comparison, sorry)

The size of Drupal CVS

David Strauss's picture

The size of Drupal CVS overall is irrelevant to DVCS considerations.

FYI Drupal's complete history

gordon's picture

FYI

Drupal's complete history of the last 10+ years is 57MB in CVS, and 22MB in git.

In CVS this is not really a problem as you only check out one version at a time, but with git and (I think) bzr a complete checkout will be 22MB every time.

So we do need to consider this.

We could do things like purge some of the older history, but I think that it is best to keep it all. I find that it is a great method to see where drupal came from.

--
Gordon Heydon

Before you think it's a

David Strauss's picture

Before you think it's a problem, read about:

  • Bazaar lightweight checkouts
  • Bazaar stacked branches
  • Bazaar shared repositories
  • git shallow branches

Yes I do know about that, but

gordon's picture

Yes I do know about that, but the majority of people will just do a "git clone" and be done with it. And these days, unless you are on dial-up, 22MB is not a big deal.

--
Gordon Heydon

With the number of people

sdboyer's picture

With the number of people already using the various cvs-to-something integration methods, we effectively are in an 'overlap' period. And it sucks, and needs to be ended as soon as possible.

I'll also second David's comment - size of Drupal's CVS is irrelevant - and would go the additional step to say that the difference in speed between bzr and git is, for our purposes, negligible.

Radically adjusted wiki content

webchick's picture

Our action list has gotten a lot less action-y. :) So I got it back to concrete to-dos, along with ____ to represent blanks that need filling in. Please go nuts filling in said blanks, signing up for $vcs brigade, and investigating into drupal.org integration issues and posting back with your results.

version control documentation

chachasikes's picture

I am not familiar with bzr/hg - but I can say that I very much appreciate how git's documentation tries to be friendly, approachable, and clear for new users.

especially pages like this: http://learn.github.com/

Just a random observation...

webchick's picture

I find it extremely interesting that despite the concerns about git being overly complex (which I have no doubt are true), the Git documentation resources listed so far are actually really, really great. And of the people involved in this thread so far who have voiced a preference (which presumably means they use these systems and would be able to help others who were having trouble), about 10 of them support Git, 2 of them support Bzr, and 1 supports Hg. This is a huge indicator to me, as someone whose primary concern is keeping the community intact and couldn't care less about what we choose as the technology, that Git is worth some serious consideration, and may indeed be the easiest path forward for our particular community.

Is this actually reflective of our community's preferences and general comfort level, or have the bzr/hg people just not heard of this initiative yet? I've Twittered about it a couple of times, but I'll also try and write up a Drupal Planet post early this week to get the word out better.

I've stayed out of this

EclipseGc's picture

I've stayed out of this thread so far, since I like to argue my case(s) in irc more often than not, but I'll respond to this particular point as I think it's a really really good point.

The Drupal community is the reason I'm using git in the first place. I'll admit to not having used bzr, or hg, or even svn (well there was that one time but...). Point being, I've limped along utilizing d.o's CVS for at least 2-3 years now, and I already feel more empowered utilizing git than I EVER have with CVS and I've been using it for a rather short period of time now. It's the documentation, and especially github's attempts to make git easy to learn that have really pulled me into it. Now, with that said I'll make two more points.

1.) We all know CVS is not where we want to be any longer, so I'm not trying to raise that point, just contrasting against git since they're my two major experiences.
2.) I obviously have a preference, and I'd love to see d.o using git, but I'm certainly not an expert in this area, so if those who are wiser than myself feel bzr or hg would be a better fit for d.o's use case... so be it. However, I would heavily advise that we look at the ubiquity of tools like github and gitorious and ask ourselves if the momentum inherent in what they're doing isn't worth considering as well, since I think it might be a powerful argument/indicator concerning the direction of VCS in general.

Eclipse

well, just one note: The

CorniI's picture

well, just one note:
The people from versioncontrol API (jpetso, sdboyer, marvil07, haxney, me) already coded for a d.o on vcs, and these are the only ones who coded for a move, and everyone there (silently) agreed on git, and the versioncontrol api backend for bzr is nonexistent and the one for bzr very old. I don't say versioncontrol_git is in good shape, or ready, but at least people worked on it at all.
Does this make a difference?

I think the entire Version

David Strauss's picture

I think the entire Version Control API approach is flawed. If we move to a DVCS, we want to use the advanced features, not be restricted to interfacing the project module with what CVS, Subversion, git, Bazaar, and Mercurial all universally support (which is pretty pathetic).

so you want to hardcode

CorniI's picture

so you want to hardcode project* to one specific DVCS?
and even then, there is parser code in versioncontrol_git which could be reused, or which someone needs to write for bzr. The same goes for experience in writing this code, which is nontrivial, because of some inherent properties of the DVCS's of our choice (especially forming a DAG which is presented more or less in linear form, stuff like branch deletes not being recorded in the commit log, ...)
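To show why presenting a DAG in linear form takes some care, here is a small sketch that flattens a toy commit graph so every commit appears after all of its parents, with timestamps breaking ties. This is a generic topological sort, not the versioncontrol_git parser; the commit names and timestamps are invented.

```python
import heapq


def linearize(parents, timestamps):
    """Return commits oldest-first so each commit follows all of its parents.

    `parents` maps every commit to a list of its parent commits.
    """
    children = {commit: [] for commit in parents}
    pending = {}
    for commit, commit_parents in parents.items():
        unique_parents = set(commit_parents)
        pending[commit] = len(unique_parents)
        for parent in unique_parents:
            children[parent].append(commit)

    # Start with root commits, always taking the oldest ready commit next.
    ready = [(timestamps[c], c) for c, count in pending.items() if count == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, commit = heapq.heappop(ready)
        order.append(commit)
        for child in children[commit]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, (timestamps[child], child))
    return order


# Two branches fork from "a" and are merged again in "m".
parents = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}
timestamps = {"a": 1, "b": 3, "c": 2, "m": 4}
print(linearize(parents, timestamps))
```

Note that the linear list no longer says which branch "b" and "c" were made on, which is part of what makes the parser code nontrivial.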

webchick's picture

Reasons why modules such as Views get a lot of wide usage and lots of patches submitted from various people are because the module is a generic tool built to handle a lot of problems.

Integrating Project* with Version Control APIs means people running CVS, SVN, Bzr, Git, etc. can use and build from the same code, and invites more collaboration into keeping the code maintained.

It's a long-term strategy for more crowd-sourced sustainability into our infrastructure. But that said, there has been a tremendous amount of thought behind version control API, so I'd take a close look before dismissing it outright.

My opinions here are also

David Strauss's picture

My opinions here are also influenced by my lack of interest in developing Project* as a general-purpose system for use outside d.o. It hasn't gotten much traction elsewhere in the past, and I'm not convinced lack of broad VCS support is the issue. Every bit of abstraction we add just increases our own maintenance burden on a system few people understand already.

Sorry, but I totally don't understand this position..?

webchick's picture

The biggest hold-up porting Drupal.org to the next version each and every time is Project* module, and this is explicitly because it has historically had so many hard-coded drupal.org assumptions in it so only people who really, really care about Drupal.org bother to put effort into porting it. That is a number several thousandths the size of people who need a halfway decent issue tracking system for their project management.

The work that dww and hunmonk put into cleaning up Project*'s hard-coded listing queries and migrating to Views not only made this module more appealing to folks outside of Drupal.org, but also allowed us to leverage the existing work the rest of the community has already been actively using/bug fixing/feature adding. They also managed to flesh out some really tweaky advanced edge-case bugs on Views while they were at it. :) This kind of symbiotic relationship between Drupal.org tools and tools the entire community uses can only help us, IMO.

Another option is to scrap project module and install Gitorious/Launchpad instead, and that's a whole separate discussion. But I certainly don't understand resistance to making our tools more generally useful so that the community can pick up some of the maintainership aspects instead of just the module maintainers themselves.

I'm in the "let's just

David Strauss's picture

I'm in the "let's just install Gitorious or Launchpad" camp. I'm not sure where Gitorious stands, but we just can't compete with the millions of dollars Canonical has invested in Launchpad development. I'd rather migrate the projects once (which would not be hard with the Launchpad API) and put the entire era of slow Drupal.org upgrades and neglected project tools behind us.

Free and open-source software is important to me, not dogfooding.

Hi all, great to see you again

jpetso's picture

As much as it seems weird of me writing this, I second David's stance on moving out of project*. Version Control API was the most steadily developed piece of open source software I ever attempted, and yet it only covers the basics of what other VCS/collaboration tools offer. There is no commercial support behind it, and I think the current volunteer community behind it is not strong enough to make it a viable contender for other tools that are specialized in supporting code collaboration.

If this is a hard blow for marvil or CorniI, please accept my apologies. I still love Version Control API and I think over time and with a clear vision where to go, it can still evolve into something unique. But, and that goes out to you webchick, I honestly think that the required work to get to that point is not going to get done in anything remotely like the timeframe that you're looking at. And actually, that's not even the main issue, I'm sure with some involvement of the Drupal association and maybe one supporting company (er... parent poster?) it could be done if there is strong will behind it.

The issue that I think is key is that by trying to eat our dogfood without issue tracking being our core competence, the Drupal community loses its ability to choose the best system that matches our needs. As great as project* is from a workflow perspective, its co-dependence on the various parts of the module is death to modularity. If there is an issue tracker that supports all kinds of crazy notification stuff and comes with a hook script for your favorite VCS that updates or closes the issue when stuff is being committed, you cannot "upgrade" because project* is all tied together and if you remove one part, it will break all workflows terribly. If you want to improve the process with neat tools like Review Board, you can't do it because someone will fear the cost of comments or even code being dealt with in two separate tools when everything ought to be centralized in project* on drupal.org.

By themselves, those arguments may well be valid, but summed up I think Drupal is losing out on a sizeable number of opportunities. That includes switching VCS when the time was right for switching VCS instead of switching from project/cvslog to project/Version Control API. Drupal being a community of capable web hackers, I think we probably would have come up with all kinds of funky scripts that made integration of the other VCS and related external tools into the rest of drupal.org a lot nicer. By delaying that switch until we had a full replacement (Version Control API), we kept the disadvantages of our old system while not gaining enough traction for the new thing to appear.

The key issue is which tool is more important; is it commits from all over d.o aggregated on a single page and automatic assignment of commit privileges based on associated users' project maintainer status, or is it usable diffs, upgradeability and a VCS that can do merges without inducing headaches? Do we actually need the ability to centralize everything in a single issue and then move that across between projects, or is it sufficient to crosslink where necessary and go on from there? Is it more important to not lose features or is it more important to gain new ones?

There are a lot of good reasons for not losing existing features, but in the end I think it's time for you core guys to say "yes, we are losing features; yes, we may be losing lazy programmers that hate to learn new stuff; but we, the people who do stuff, have fun trying out new stuff, and by getting new awesome stuff, we'll also get new awesome features and new awesome contributors". Or something. Sorry, I tend to drift off into drama, please excuse me for doing so :P

I suggest to switch to separate tools not because I think project* or Version Control API is an inferior solution, but because modularization of tools will get your freedom back, and you'll have more fun upgrading them or swapping them out. Take an array of modular tools, and connect them first by cross-linking, and over time by an elaborate network of simple bots.

Oh, and the choice of VCS really doesn't matter at all. I think git has more traction to become the main and most popular VCS (therefore attracting more people who already know how to do awesome stuff with it), but in the end whatever Dries and webchick like more should win. Bazaar has the advantage of having a Launchpad that doesn't suck. On the other hand, Launchpad is a ginormous one-size-fits-all beast in itself, and about as modular as project*. My experience so far is that modular always trumps featureful. I might be wrong about Drupal though, who knows.

... well, damn!

webchick's picture

Great response, Jakob. Certainly some things to ponder here, hmmm... It'd be interesting to get dww and hunmonk's feedback on this point (and this whole issue, really).

Since bzr and git are basically neck-and-neck atm as far as I can see, it might also make sense to evaluate Project* replacements for each VCS, just to see what that avenue might look like should we wish to pursue it. I'll add a section about it to the wiki.

I just wanted to say

jpetso's picture

Please note that I'm not implying that I ask for project* being traded for something different. I think my above comment made it sound like that, which is absolutely not my intention. The issue tracking parts of project* are amongst the finest that I've seen to date, maybe not in features but in attention to detail on workflow stuff, and honestly, I think d.o's issue tracking plays an important role for making the community appear so inclusive to yet non-contributing newcomers.

The co-dependency goes mainly one way, meaning cvslog is not so very useful without the rest of project*, but project_issue and even project_release, with a few adaptations, could do without cvslog just fine. sdboyer's list of options that includes git/project*, bzr/project* and bzr/Launchpad seems spot on to me.

What I do think is that the all-encompassing regulation scheme for CVS commits that d.o employs right now is maybe not wholly incompatible with DVCS, but far less necessary than our current CVS requires it to be. By moving projects (Drupal core + all the separate contrib modules/themes/etc.) to separate repositories, the vast majority of access control requirements magically disappears. I can't commit to Views because it's a different repository with a different list of committers anyways, and path-based access control becomes totally unnecessary. Branch restrictions are also less tragic, because even if unintended branch names are pushed, they don't clobber the whole rest of the (CVS) repository, and can be just as easily deleted. My favorite mode of working would actually be to allow all branch names for a repository, but let releases still only be created for those that match the well-established branch/tag regexp.
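The "allow any branch name, but only cut releases from matching ones" idea could be sketched as follows. The patterns are assumptions modeled on names such as DRUPAL-6--1 and DRUPAL-6--1-0, not the actual expressions drupal.org uses.

```python
import re

# Assumed naming patterns, for illustration only.
RELEASE_BRANCH = re.compile(r"DRUPAL-(\d+)--(\d+)")
RELEASE_TAG = re.compile(r"DRUPAL-(\d+)--(\d+)-(\d+)(-[A-Z]+\d*)?")


def release_eligible(name):
    """True if a branch or tag name may be used to create a release."""
    return bool(RELEASE_BRANCH.fullmatch(name) or RELEASE_TAG.fullmatch(name))


def accept_push(name):
    """Every branch name may be pushed; only releases are restricted."""
    return True


for name in ["DRUPAL-6--1", "DRUPAL-6--1-0-BETA1", "my-experiment"]:
    print(name, accept_push(name), release_eligible(name))
```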

Right now, project*'s approach to version control has no notion of different users' clones of an upstream repository, and even Version Control API has the weakness of currently requiring pretty much a full clone of the commit history (minus the actual file contents) in the Drupal site's database. I believe sdboyer is well aware of this shortcoming and once d.o switches version control systems, my gut feeling is that some nice Drupal-based git/bzr integration would appear rather quickly (whether or not it's based on Version Control API). But it'll still take time, and this is where external tools can help us bridge the gap between the current lack of DVCS collaboration support provided (or rather, not provided) by Drupal modules.

Personally, I'm in favor of project_issue being preserved as is, project_release adding a new packaging script for the other repo (ok, not quite that straightforward as it currently depends on cvslog, but not all too hard either) and an external pure-code collaboration tool like Gitorious, which is gradually integrated with project* and might even be replaced by Drupal modules in the end - go Version Control API! - or maybe not.

Agreed

David Strauss's picture

I agree that the choice of project tools is more important than the choice of VCS. Part of my advocacy for Bazaar is really just a proxy for wanting the opportunity to install Launchpad for Drupal work (or having the option of migrating to it open in the future).

Brilliant post, Jakob, and I

sdboyer's picture

Brilliant post, Jakob, and I agree completely:

  • I also agree with David - deep integration is the way to go, and we're shooting ourselves in the foot if we don't - but I also don't think that moving out of project* entails moving TO a fully external system.
  • I'm personally advocating git because it's the one I know really well. I had some difficulty getting into bzr initially, but that was mostly because I started with the biggest, fugliest challenge (lossless cvs->bzr migration). But you're right, the vcs we pick ultimately doesn't really matter, rendering this discussion mostly pointless unless/until it identifies what old features we can and can't lose, what new features we do and don't need.
  • Yes, non-modularity is absolutely what screws us, and switching one behemoth for another is a scary, bad prospect. We've got all the APIs already written to do most of the individual things that project* does - manage emails, add & change properties on issues/tickets - and I don't know why we aren't talking more here about the work required to start using those.

Meh

Heine's picture

about 10 of them support Git, 2 of them support Bzr, and 1 supports Hg

This only indicates what VCS people invested in (or has the most evangelism). IMO, these VCSes are mostly interchangeable; I only switched from bzr to hg purely for mercurial-server (bazitis is pretty much broken).

git is a no go for me personally atm, because there is no release of msysgit for Windows. As we host the repo on d.o., this might not be such a big issue.

msysgit is available for Windows

scor's picture

Please download the latest version of msysgit from http://code.google.com/p/msysgit/. I was able to install it quite easily on Windows and, using Git Gui, browse the HEAD log.

previews

Heine's picture

All that's available is a preview from October or earlier. VCS + beta software don't go well together. As I said; we host the repo on d.o., so this might not be such a big issue.

It's not beta software; I

sdboyer's picture

It's not beta software; I dunno why it's marked as preview, but msysgit is pretty stable.

Add one more person to the Hg column please!

cossovich's picture

Although I'm not a core maintainer or anything, I'd like to put some support behind Hg. We recently looked at moving from SVN at my work and we came up with Hg because it has a nice learning curve, lets us use a bunch of different workflows, is well supported and has a great community.... and is a technical equivalent of Git (or as far as I can tell from reading stuff by people way smarter than me).

Has this decision been time-boxed? ... What would it take to get Hg back in the mix?

Without much experience in

nenne's picture

Without much experience in the matter I do feel we should pick the one that is the easiest to start using. A smooth learning curve for anyone wishing to contribute is important. And even though we have 10 people willing to support git users, we might need a lot fewer of them for an easier system, with people becoming independent users far quicker. I don't know which one is the easiest of these but I heard bzr is supposed to be really intuitive and easy?

If that is the case I think bzr is what we should use.

On the topic of learning curves...

webchick's picture

I'm definitely concerned about learning curve, as you are. But the general architecture of all three systems is the same, as far as I know. Centralized vs. Decentralized is going to be a big mental leap to make it through, but all three systems have this hurdle. It then primarily comes down to a couple of extra features, and some syntax differences in the actual commands. (Please someone correct me if I'm wrong.)

But in terms of syntax differences, I have a feeling that how most people are going to interact with $vcs is how most people interact with CVS right now; they just blindly copy/paste stuff from http://drupal.org/handbook/cvs/quickstart and hope for the best. They couldn't care less whether the command they copy and paste is git clone or bzr branch.

What is more important is that if they accidentally mess something up, or get an error, or need some fundamental basic concepts explained to them, they're able to get help. Right now, I can count on one hand the number of the people in the Drupal community who understand CVS to the degree that they're able to provide meaningful support on CVS. And as a result, these people are extremely busy. It doesn't help that there isn't a thriving open source community around CVS, as most CVS users have long-ago switched to Subversion, so our community is on the hook for providing this support.

If we switch to something with popular support behind it, which a huge swath of our community is using day-to-day on their own, non-drupal.org projects, suddenly we vastly increase the support network available to new users, which will itself address much of the learning curve. So while this definitely isn't strictly a popularity contest, popularity is definitely an important angle to consider, to help ease the learning curve that will be associated with changing to any of these systems.

not quite

pwolanin's picture

That's one feature of bzr I keep trying to point out - it has a checkout mode where the workflow is just like using svn. That's why I think it would have the lowest threshold to adoption.

For some reason...

webchick's picture

I have it in my head that when you set bzr up this way, you are basically stuck with it, and you can't then go back and use a distributed workflow on the same checkout. Which means we'd have to lock the entire community back into a centralized workflow, when the developer/project hemorrhaging we're seeing is due at least in part to it.

Is that total crap that I must've picked up from somewhere, or is there some truth to that?

I don't think there is any

pwolanin's picture

I don't think there is any truth to that - this is just one possible way to set up your local workspace, and would potentially make sense for the case of someone working on their own smallish module or theme where they want all commits to go to the central repo without having to remember to push later.

The branch that you are referencing is just a normal bzr branch, so it can still be cloned, merged, etc.

I think you can convert the

pwolanin's picture

I think you can convert the checkout back to a normal clone branch (certainly the docs describe going the other direction). In the worst case you just make another local clone of the branch if you want to use pull/push.

There is no truth to that

David Strauss's picture

There is no truth to that whatsoever. Checkouts are known in the internal Bazaar architecture as "bound branches." Really, what happens is Bazaar does a two-phase commit on both the local and remote branches. Thus, converting a checkout to a normal branch just means telling Bazaar to only commit to the local branch; it's trivial. Converting independent branches to one being a checkout of the other is a bit more complex, but it has to be.

In addition, the project itself isn't set up for a "checkout" or "branching" workflow. Developers individually choose how they want to pull the project's code and work with it. Drupal.org would not have to choose.
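
To make the bound-branch idea concrete, here is a toy Python model, not Bazaar's actual code. The class and method names (`Branch`, `WorkingBranch`, `unbind`) are invented for illustration. The only behavior it tries to capture is the one described above: a bound branch (a "checkout") records each commit on both the remote and the local branch, and unbinding just stops the remote write.

```python
class Branch:
    """A toy branch: nothing more than an ordered list of commit messages."""

    def __init__(self, name):
        self.name = name
        self.commits = []


class WorkingBranch(Branch):
    """A local branch that may be bound to a remote branch (a 'checkout')."""

    def __init__(self, name, bound_to=None):
        super().__init__(name)
        self.bound_to = bound_to
        if bound_to is not None:
            # A fresh checkout starts with the remote branch's history.
            self.commits = list(bound_to.commits)

    def commit(self, message):
        # While bound, every commit is recorded on the remote branch as well.
        if self.bound_to is not None:
            self.bound_to.commits.append(message)
        self.commits.append(message)

    def unbind(self):
        # Hypothetical analogue of `bzr unbind`: later commits stay local.
        self.bound_to = None


trunk = Branch("trunk")
checkout = WorkingBranch("my-checkout", bound_to=trunk)
checkout.commit("Fix typo in README")      # lands on trunk and locally
checkout.unbind()
checkout.commit("Experimental refactor")   # lands locally only
```

In this sketch the choice between the two workflows lives entirely in the developer's local branch, which is the point being made: the shared branch itself does not need to know or care.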

Got it; thanks for the clarification

webchick's picture

I think I must've been thinking of converting independent branches or something. Ignore me. :D

Yep

cha0s's picture

All you have to do is bzr unbind.

Concepts

chx's picture

Of course you can get by on the basic "I just copy-paste commands" approach, but this won't always cut it, and people wanting to do real work with the system especially need to understand it a bit.

To me at least, once the basic concepts are in place, the rest is always easy. This is where the git learning curve is much more steep. While bzr has only the concept of branch which is also the repository and is simply a directory (with shared repo being just a convenient storage method for many similar repos/branches/directories and you are free to not use it and once it's set up it does not affect your work in any way or form), in git a repository can contain multiple branches and you are even encouraged to do so.

Also, git has an index / staging area which you need to manually add things to before committing or just commit with commit -a but that will just sidestep the issue of this staging area. Of course, there are countless other tricks you can do with the index (stash for example).

So to learn bzr you need to understand how to branch a remote repository which literally is just copying over, and then you can immediately work with it as it is. There are exactly two players: your local branch and the remote branch (whether you used the checkout or the branch command does not change this, checkout only means you commit directly to the remote branch but the number of players did not change). Git has more players:

[Image not available in this archive.]

staging is optional

fago's picture

As you said, you can easily "bypass" the staging area by using commit -a. But still people comfortable with it can make use of it. I recently switched to git and in the beginning I just used commit -a, but with the time I discovered the more sophisticated features like stashing and now I really love that great stuff :)

In the end I don't think staging is an argument against git, it's an argument for git. Just use it when you like it.
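
For anyone who has not met the staging area yet, here is a toy Python model of the idea. It is a sketch only, not how git stores anything; `ToyRepo` and its methods are invented names. It assumes a "tracked" file is simply one that has been added at least once, and it mimics the difference between a plain commit (which snapshots only what was staged) and `commit -a` (which first restages every tracked file).

```python
class ToyRepo:
    """Toy model of a working tree, a staging area (index), and commits."""

    def __init__(self):
        self.working = {}   # path -> current content in the working tree
        self.index = {}     # path -> content staged for the next commit
        self.commits = []   # each commit is a snapshot (dict) of the index

    def edit(self, path, content):
        self.working[path] = content

    def add(self, path):
        # Like `git add`: copy the current content into the staging area.
        self.index[path] = self.working[path]

    def commit(self, all_tracked=False):
        if all_tracked:
            # Like `git commit -a`: restage every file that is already
            # tracked. Files never added stay out, as they do in git.
            for path in self.index:
                if path in self.working:
                    self.index[path] = self.working[path]
        snapshot = dict(self.index)
        self.commits.append(snapshot)
        return snapshot


repo = ToyRepo()
repo.edit("a.module", "v1")
repo.edit("b.module", "v1")
repo.add("a.module")
repo.add("b.module")
repo.commit()

repo.edit("a.module", "v2")
repo.edit("b.module", "v2")
repo.add("a.module")
partial = repo.commit()                    # only a.module's change is recorded
everything = repo.commit(all_tracked=True) # b.module's change is picked up too
```

Whether the partial commit is a feature or a trap is exactly the disagreement in this sub-thread; the model only shows that both styles coexist.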

Actually I have found that

gordon's picture

Actually I have found that with git I have been quite successful with a short training session, teaching people only 7 git commands, which gets them started very quickly.

Basically...

  1. git clone
  2. git add
  3. git commit
  4. git push
  5. git pull
  6. git status
  7. git diff

With these commands they can get started quickly and be productive without being overwhelmed by how git works. I can generally train a client in 1-2 hours and they can get back to coding.

This was basically how I trained Simon and his guys at emSpace and from that with a couple of questions they were able to find out a lot more themselves.

--
Gordon Heydon

I started using a VCS a few months ago

lavamind's picture

And it was to contribute to the Moodle project, which uses SVN, but provides a Git mirror. I didn't know much about version control systems before, but because I wanted to learn something that would be useful for many projects and for a long time, I went with Git instead of SVN. Like Webchick, I found the documentation to be well-written and the online resources abundant and very helpful. Furthermore, it's quite true what she said about copy-pasting lines of code from the Drupal CVS handbook. I do that all the time: even though I've become reasonably comfortable with Git, I still know close to nothing about CVS!

Git is not THAT hard

reglogge's picture

I don't know which one is the easiest of these but I heard bzr is supposed to be really intuitive and easy?

I'm no super-brainiac myself but got my head wrapped around using git pretty fast. One reason for the common perception that git is harder to use than bzr might be that git is very very powerful with lots of features and is used in many different and sometimes really arcane situations (hmm... reminds me of Drupal).
But for 99% of my work with git I use maybe 8 to 10 different commands (pull, push, add, commit, log, status, branch, merge) and that's it. I also think that people who have mastered Drupal should really have no problem with any of the systems discussed here. And I don't only use git for versioning but also for deploying and maintaining local dev, staging and live servers for my customers. Git is even available on the really cheap shared hosting servers most of my clients use.

See also This StackOverflow

mcrittenden's picture

See also this Stack Overflow thread for a lot of good info.

Most of that data is woefully

David Strauss's picture

Most of that data is woefully out of date now.

Mollom won't let me add this

mcrittenden's picture

Mollom won't let me add this but here's a CVS to HG link: http://mercurial.selenic.com/wiki/CvsConcepts

Infrastructure has planned to move to Bazaar

David Strauss's picture

While this doesn't have direct bearing on public choices for Drupal, the infrastructure team decided recently (well before this wiki page existed) to move to Bazaar for deployment and vendor branch management. This was partly because Narayan, Josh K., and I have more experience using Bazaar. Gerhard also reviewed the proposed workflows and agreed that they're streamlined.

Going with Bazaar (or even git because of bzr-git support) for Drupal work in general would make our vendor branch management considerably easier, as we can directly merge from the source branches.

Is this documented anywhere..?

webchick's picture

Like in the infrastructure queue or the mailing list? I usually follow that stuff pretty closely, but this is the first I'm hearing of this. Then again, I've been pretty swamped the last few months for obvious reasons. :P

But if it's not, that's bad. The community needs to be kept in the loop on this kind of stuff, otherwise we can't help, and we can't build critical mass required to help with the training requirement. :(

And in any case, where is our "bzr brigade"? So far I only see your name up above. :(

The Bazaar on infrastructure

David Strauss's picture

The Bazaar on infrastructure decision was made on an email thread just among the people whose brains bleed every time we need to upgrade modules on *.drupal.org.

People aren't really clear what joining the brigade means. Is it a voicing of agreement? Of ability to help others in IRC? Of willingness to volunteer on infrastructure consideration? (Granted, it says IRC help, but I think people anticipate a larger commitment than that.)

commitment

anarcat's picture

I have added myself as willing to help with the infrastructure migration. I have experimented numerous times with converting modules from CVS to git and can contribute to that process.

Wisdom of the crowd

BartVB's picture

I pondered this DSCM choice myself a few years back. At that time I compared the different Bazaar versions, Mercurial, Git (1.4 IIRC) and Darcs. After countless hours of research and trying out the different systems I selected Git and never looked back. Sure, Git does have a learning curve if you want to do more than just checkout/edit/diff/commit, but that's true for all DSCMs if you want to go beyond CVS. It's also a one-time learning experience that has great ROI :)

About the 'wisdom of the crowd':
http://www.google.com/search?q="git+to+bzr" (approx 9000 results, quite a few are academic questions)
http://www.google.com/search?q="bzr+to+git" (approx 30.000 results)

The community around Git seems to be much larger than the other systems combined, it already has quite a bit of traction in the Drupal community. It's not perfect but if you ask me this is where most of the DSCM energy (documentation, GUI tools, web interfaces, core development) is going to be focussed.

I for one would be really, really happy if Drupal could move to something a bit more usable than CVS. I use Git for all our own projects but any DSCM system out there would make development quite a bit easier and more pleasant.

Anyway, going to see if I can do something to help d.o implement something new.

Your other points are good,

David Strauss's picture

Your other points are good, but making any serious decision based on Google result counts is absurd. Plus, the "wisdom of the crowd" clearly points to using Subversion. You've just restricted the set based on other criteria (being distributed), and we should do the same until we end up with one option.

Choices for the immediate future

BartVB's picture

That's why I added the quotes :) I'm certainly not saying that a decision should be based on this but it does say something that a nontrivial number of people have switched from Bazaar to Git for numerous reasons. But most (D)SCM discussions tend to go towards a rather religious direction at one point or another. Git and Bazaar are both very capable, some minor differences but it mostly comes down to personal preference and priorities.

For the coming months it doesn't really matter that much what system is going to be picked, but it would certainly make things easier. I for one hate working on a project (or prototype) that gets ditched for one reason or another. I know it's sometimes necessary but it's not something I like doing.

Some hard choices for the immediate future:
- Is a DSCM going to be implemented in steps (i.e. first with the current patch based workflow)?
- Is Project* going to be dropped and are we moving to Launchpad, Gitalicious or the likes?

trends on drupal.org

scor's picture

I did a keyword search on drupal.org for both git and bzr...

git 58 result pages
bzr 15 result pages

Filtering by projects specifically:
git 19 projects
bzr 1 project (the Temporary placeholder for once we get the vcs api backend for bzr going).

Yes, it's something we can't solely base our decision on, but it seems the community is clearly going into one direction.

Let's use Wordpress then

chx's picture

There are a lot more Wordpress sites than Drupal. So what? I would like to base our decision on merits.

Ruling out Mercurial...

webchick's picture

At this stage I think it is safe to rule out Mercurial.

Grrr. Hit enter too soon...

webchick's picture

...because we seem to lack community support (no one signed up for the "Hg brigade", almost all blanks on the wiki page), and drupal.org's existing infrastructure team doesn't have experience with it. Sorry, Hg!

Summary

webchick's picture

I also added a summary of the existing contenders. I don't really think we can make a bad choice here. My "gut" gives Git a slight edge due to the enthusiasm and general community momentum behind it, but having people actually lined up to do the work on Bzr, along with the learning curve advantages place them about even-keeled, at least from where I sit.

Info on mozilla on migrating

Use case

jpetso's picture

I added web admins as "Users" in the use case section. Actually, I see no reason why drush shouldn't ship with the capability of cloning project repos instead of downloading tarballs. You can even re-apply existing private patches (or preliminary cherry-picked ones from upstream) by rebasing, instead of fuzzing with the regular patch tool over and over again. Pretty cool stuff can happen if you keep your site under version control!

Oooo, great point!

webchick's picture

Yes, we did indeed forget that important use case. :)

For this use case, Bazaar is

David Strauss's picture

For this use case, Bazaar is highly portable Python that runs quite well without a build environment or root access. To my knowledge, it is not possible to run git without compilation. This is part of why we use Bazaar at Four Kitchens. We need to work on sites internally and deploy to client servers, which are often either VPS-like or locked down administratively from package installation. We can untar Bazaar, add it to our user's path, and run it.

Planet Strauss

adrinux's picture

To my knowledge, it is not possible to run git without compilation.
Utter nonsense.
There are packages available for every major linux distro, installable packages for OSX, and even Windows (although the Windows ones are labelled beta).

All are linked from the git download page: http://git-scm.com/download

Still true...

chx's picture

git is a binary. bzr is a script. On restricted hosts that David is talking about you might not be able to run a binary but you can run a script. That'd be the difference. (Obviously?) he did not mean you need to compile it but someone needs to.

Comment titles are silly :+

BartVB's picture

You need root access to install a package, so having packages is not really an advantage for these use cases :)
Git can be installed locally (if your hosting provider allows you to install executables) so there it's sort of on par with Bazaar. One advantage of Git is that it is quite a bit more widespread than Bazaar. Your average provider is more likely to have Git installed than Bazaar.

But all this is not that relevant, drush and the install process can't depend on the availability of a DSCM tool.

Launchpad vs. Gitorious

David Strauss's picture

Launchpad

  • Based on Zope, Python, and PostgreSQL
  • Tightly integrated with Bazaar but can import/mirror most other systems
  • Focus on full development and release workflow, including blueprints, sprint planning, bug tracking (including cross-project and external)
  • Questions and answers system
  • Support for development/release series, in the same way we currently have them for 5.x, 6.x, and 7.x module versions
  • No wiki system
  • Tools for creating releases, managing tarballs, and linking bugs/blueprints/etc. to each release
  • Merge proposal tools that clearly interleave comments, reviews, and updates to the code. A net diff is visible from the merge proposal; viewing detailed commit and branch content requires hopping to Loggerhead (linked from the branch).
  • API that supports external testing tools
  • OAuth for external app integration
  • Supports translations, including committing the changes from the web-based work back to the Bazaar branches
  • Integrates with Mailman to provide lists for projects that desire one
  • Announcement tools with RSS/Atom/something feeds
  • Branches link to Loggerhead (a tool not fully Launchpad-integrated) for branch and commit browsing
  • Rich user profiles: https://launchpad.net/~jonobacon

Gitorious

  • Based on Rails
  • Tightly integrated with git, and imports are handled outside of Gitorious
  • Mostly focused on merge workflow
  • No bug tracking
  • Wiki system
  • Allows downloads of branch content, but not release-style tarballs
  • Merge proposal tools that allow comments and multiple changes to the code, but without a clear relationship between comments and code changes other than both being timestamped. Each commit's diff is individually browsable directly from the merge proposal.
  • Native branch and commit browsing
  • Basic user profiles: http://gitorious.org/~mvo

migrating committers, releases, issues

Gábor Hojtsy's picture

From what I'm reading above, it seems like your suggestion is to stop using our current release and issue system and freeze it at that point, and then start over in the chosen tool, right? So we'd fade out tracking the ongoing issues and open new issues in a new system? I've not seen a hint of migration of releases and issues to any new system, so that is why I'm asking. I assume migrating the list of committers for a given project's main branch would probably be the easiest to do of these.

Also, there are certain very valuable components to our issue tracker like issue tagging, cross-project move of issues to delegate responsibilities (which is a huge win in using drupal.org), the ability to set per-project components, etc. It would be good to know how these proposed tools stack up to our current feature set and which features you propose to lose in favor of gains on other grounds.

Here you go...

David Strauss's picture

It's quite possible to migrate our data. Issues would probably be migrated through the Launchpad API. It's possible for Launchpad to automatically discover releases based on file patterns on publicly hosted HTTP or FTP, and it automatically binds discovered releases to the proper development series (5.x, 6.x, etc.) based on the filename.
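
As a rough illustration of filename-based release discovery, here is a small Python sketch. It is not Launchpad's implementation. It assumes contrib tarballs named like `views-6.x-2.8.tar.gz` (project, core series, version) and deliberately ignores core tarballs such as `drupal-6.15.tar.gz`; the regular expression and the `discover_release` helper are hypothetical.

```python
import re

# Assumed naming scheme: <project>-<N.x>-<version>.tar.gz
RELEASE_RE = re.compile(
    r"^(?P<project>[a-z0-9_]+)"
    r"-(?P<series>\d+\.x)"
    r"-(?P<version>[0-9A-Za-z.\-]+)"
    r"\.tar\.gz$"
)


def discover_release(filename):
    """Return project, series and version for a release tarball, or None."""
    match = RELEASE_RE.match(filename)
    if match is None:
        return None
    return match.groupdict()


def group_by_series(filenames):
    """Bucket recognised release files under their development series."""
    series = {}
    for name in filenames:
        info = discover_release(name)
        if info is not None:
            series.setdefault(info["series"], []).append(name)
    return series


files = ["views-6.x-2.8.tar.gz", "views-5.x-1.6.tar.gz", "README.txt"]
by_series = group_by_series(files)
```

Anything that does not match the pattern is simply skipped, so a real import would want a report of unmatched files before trusting the result.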

Every project on Drupal.org would migrate to a corresponding project and group on Launchpad. The migrated branches and project administration would be assigned to that group. After this point, projects could customize permissions to take advantage of Launchpad's considerably richer access controls (for example, different maintainers for different major versions).

Tagging bugs (including display of tag clouds) is built into Launchpad, and Launchpad goes quite a bit further than merely supporting cross-project moves. Bugs can be simultaneously assigned to any number of projects and development series. This means it's possible to mark a bug as affecting Drupal 6.x and 7.x and track severity and progress for them separately. Launchpad also has the concept of bugs in packaged distributions, allowing a bundle like Acquia Drupal to watch and independently prioritize issues affecting the bundle.

While I have no difficulty

sdboyer's picture

While I have no difficulty believing that the projects & releases could be adequately handled, this part:

Issues would probably be migrated through the Launchpad API.

is vastly oversimplified. What's being migrated - just issues and the raw comments made on them? No status changes? Would a migration capture issue/branch associations for the old issues? How about issue assignments? How about all the input filter-based links to other issues, or comments in other issues? Do all the testbot results come through nicely? Moreover, how about the incalculable number of static links that are scattered across the internet to some issue(+comment) - do we put into place some kind of routing system to resolve those properly to their LP equivalents?

I'm not saying all these things necessarily need to be captured - more just pointing out what's at risk in a migration. As I said elsewhere, the issue queue is the heart, soul and history of Drupal, and I'm hesitant to do anything that threatens that.

What's being migrated - just

David Strauss's picture

What's being migrated - just issues and the raw comments made on them? No status changes?

There's no reason we couldn't migrate status changes.

Would a migration capture issue/branch associations for the old issues?

Optional, but supported. Launchpad supports zero to infinite branch (in Launchpad "development series") associations for each bug. Each association can independently track resolution status and assignment, too. We would probably import every issue with exactly one "development series" association. We could use heuristics to identify things like issues that were resolved for 7.x but need a backport for 6.x and associate them with two dev series: resolved for 7.x and open for 6.x.
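
One possible heuristic, sketched in Python. Everything here is an assumption for illustration: the issue is a plain dict with `version` and `status` fields, and the rule is that an issue whose status is "patch (to be ported)" was fixed in the next major series up and is still open in the one named by its version field. It does not call the Launchpad API.

```python
def series_associations(issue):
    """Guess development-series associations for one imported issue.

    Returns a dict mapping a series name such as "6.x" to a status.
    """
    # "6.x-dev", "6.x-1.x-dev" and "6.15" all yield major version 6.
    major = int(issue["version"].split(".")[0])
    series = f"{major}.x"

    if issue["status"] == "patch (to be ported)":
        # Assumed meaning: committed to the newer series, backport pending.
        return {f"{major + 1}.x": "resolved", series: "open"}

    # Default: exactly one association, carrying the issue's own status.
    return {series: issue["status"]}


backport = {"version": "6.x-dev", "status": "patch (to be ported)"}
associations = series_associations(backport)
```

A real migration would need to verify the guess against commit history rather than trust the status field alone.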

How about issue assignments?

Optional, but supported. Just as on Drupal.org, issues can be unassigned or assigned specifically.

How about all the input filter-based links to other issues, or comments in other issues?

Like Drupal.org, Launchpad has a globally unique ID for each bug, so we should be able to keep the IDs the same. There is a basic filter for identifying bugs mentioned in text, including bug comments. I don't believe it shows status. Additionally, Launchpad has real support for handling inter-issue concerns like duplicates instead of what we do: change the issue status and mention the other issue in the text. That means mentioning bugs in text would be less important to the workflow overall.

Do all the testbot results come through nicely?

Testbots on Launchpad are commonly handled through the merge proposal system, not patches on bugs. They typically serve as a "reviewer" for the merge proposal and can change the status and post a comment about why. This is the preferred way because the testbot can be considerably dumber: you're giving it a branch with a defined merge target, so it doesn't have to play guessing games based on the patch name. Launchpad automatically marks merge proposals that no longer merge cleanly (produce conflicts). Merge requests typically get linked to bugs.

However, we could still run the test bot on patches. Launchpad recognizes patches as a special kind of attachment to bugs, and bugs can have target information about which development series they affect. It gets complex, though, if a bug is targeted at more than one series, which is something that we can't do on Drupal.org now. As with merge requests, a testbot can use the Launchpad API to change issue status. If we wanted to automatically test patches posted to issues (and not just merge requests), I would rather build a bot that creates merge requests from patches instead of a bot to test patches directly.

Moreover, how about the incalculable number of static links that are scattered across the internet to some issue(+comment) - do we put into place some kind of routing system to resolve those properly to their LP equivalents?

Most likely, we would need to write a module for Drupal.org to redirect to the appropriate Launchpad page from obsoleted nodes (project pages and issue pages). Because issues = bugs and projects = projects, it should be rather easy to direct users, especially if we ensure the Launchpad import keeps the same IDs.

Launchpad's bug router is smart: you can route to a bug ID at a URL that only has the ID in it, and Launchpad will redirect the user to the bug URL that includes the project name. For example, if we had an issue at drupal.org/node/422516, we could redirect to dev.drupal.org/+bug/422516, and Launchpad would redirect to dev.drupal.org/PROJECT/+bug/422516. You can even go to dev.drupal.org/WRONG-PROJECT/+bug/422516 and it will redirect as necessary.

As on Drupal.org, Launchpad primarily identifies projects with a short, URL-friendly name like "alpha-beta-gamma." We would route drupal.org/project/alpha-beta-gamma to dev.drupal.org/alpha-beta-gamma.
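
The routing described above could be sketched like this in Python. It is illustrative only: `redirect_target` is a hypothetical helper, it assumes IDs and project short names are carried over unchanged, and it treats every /node/ID path as an issue, which a real Drupal.org module could not do without checking the node type first.

```python
import re

NEW_HOST = "dev.drupal.org"  # hypothetical Launchpad host from the example above


def redirect_target(path):
    """Map an old drupal.org path to its new location, or None if unknown."""
    issue = re.fullmatch(r"/node/(\d+)", path)
    if issue:
        # Launchpad's bug router fills in the project name from the ID alone.
        return f"{NEW_HOST}/+bug/{issue.group(1)}"

    project = re.fullmatch(r"/project/([a-z0-9_-]+)", path)
    if project:
        return f"{NEW_HOST}/{project.group(1)}"

    return None


print(redirect_target("/node/422516"))               # dev.drupal.org/+bug/422516
print(redirect_target("/project/alpha-beta-gamma"))  # dev.drupal.org/alpha-beta-gamma
```

Links that point at a specific comment within an issue are not covered by this sketch.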

Other stuff...

Launchpad supports releases bound to development series, release notes, marking releases as security-related, and most of the other stuff we do.

Less is more

yhager's picture

I am on the border between 'let's just use gitorious/launchpad' and 'let's continue to use project*'. Most of the time I favor using the best tools that exist over eating your own dog food, but all this Launchpad talk makes me a bit wary. There is no doubt that LP is very much capable of doing everything we need, and then some. David does tremendous work here describing its features and migration path.
However, I must admit that I can't help thinking that LP might be an over-designed, over implemented project. Our own issue queue is simple enough for newcomers, and at the same time, provides good enough tracking for core devs too. Its simplified interface and LACK OF features is what keeps the community together. If we will add releases, teams, blueprints, hidden issues, hidden branches and whatnot, plus all the UI complexity, I am not sure what will happen to all those simplistic/copy-paste/non-devs/non-technical users in our community.

I think that moving to LP should cause a bigger concern in migrating users than the selection of bzr/git as tool. Everybody in the community is using the issue queue, but only a small portion of that community is actually committing code. I get emails from non technical clients with links to projects and issues - I am not sure the zealous amount of features in LP will not scare them away, leaving that to "the pros".

I don't count myself as an example of a web user (hey, I can never find anything in facebook UI), but I must admit that when I first saw the LP page for Pressflow - I was puzzled for quite a few minutes and found myself clicking around almost randomly. I can't remember what I was looking for exactly - I think I wanted to view the diff of Pressflow from its parent Drupal project. After a few minutes I ended up downloading the tarball and diffing manually. Again - not an average usage example - just a note to demonstrate my point above.

We already have those

David Strauss's picture

However, I must admit that I can't help thinking that LP might be an over-designed, over implemented project.

It's basically impossible for us to consider non-homegrown systems if our standard is that the off-the-shelf system must simultaneously support everything we already do (as I've been asked to explain for Launchpad) but not add anything we don't want (as you're worried about).

If we will add releases, teams, blueprints, hidden issues, hidden branches and whatnot, plus all the UI complexity, I am not sure what will happen to all those simplistic/copy-paste/non-devs/non-technical users in our community.

  • Releases: This is definitely part of our system right now.
  • Teams: We manage "teams" in the form of authorized maintainers for a module. Drupal.org supports exactly one team per project, while Launchpad allows a single team to be assigned to multiple projects. I don't consider that feature bloat because we often see projects with several modules maintained and packaged together, partly to make it simple for the same team to work on them all.
  • Blueprints: These exist now as meta-issues. Blueprints are just designed to fulfill that purpose better.
  • Hidden issues and branches: We have a horrible equivalent of these in the form of the issue queue on security.drupal.org and patches being emailed to security team members. Launchpad's capability here would be a giant improvement.

Its simplified interface and LACK OF features is what keeps the community together.

In that case, I guess I don't understand your support of git.

It's basically impossible for

yhager's picture

It's basically impossible for us to consider non-homegrown systems if our standard is that the off-the-shelf system must simultaneously support everything we already do (as I've been asked to explain for Launchpad) but not add anything we don't want (as you're worried about).

You're right, it's a contradiction.

This thread started as ways to replace CVS. Nobody has mentioned the issue queue as a problem at first. The logic you are leading is: We need to replace CVS => We need to integrate $vcs with drupal.org => scrap project*, and replace with gitorious/LP => LP is superior => LP requires bzr => $vcs = bzr

I am convinced by your comments in this thread that LP is superior to gitorious. However, I do not accept your claim that replacing CVS should replace the issue queue too.

I am a strong believer in baby stepping, so replacing CVS and adapting the current project* and workflow to that $vcs is the minimal effort with maximal gain possible IMO. Of course, if we choose git, we can't use LP, and vice versa, but as I mentioned, our existing system is good enough, and "good enough" is a lot to achieve (and very hard to).

I understand the need for hidden issues for the security team - it might be handled using node access somehow - definitely not something that will cause such a massive replacement of the platform.

In that case, I guess I don't understand your support of git.

I've expressed in this thread that I don't really care which system gets chosen. I know git, and I can be part of the effort (thus I volunteered for the git brigade), but I will be happy to learn bzr, and add another tool to my arsenal. If I vote git, I think of the people I can help, if I vote bzr, I think of myself lurking at the bzr team.

You mentioned in this thread that version control API is generalizing something it should not. I find it somehow analogous that you generalize the subject of version control infrastructure with issue management. I haven't seen a line of code from project* modules, but I am sure it will not be so much of a pain to adapt it to git/bzr without adding new features.

Let's not try to take the moon. Going up the next hill will make me much happier, and much sooner.

I resonate best with this comment

Elijah Lynn's picture

I have been using Drupal for over a year now. I could be classified as a user who submits issues but not patches. I know how to use SVN but just barely and with the help of somebody else who knows more than I do and helps me when I get stuck.

I switched from Joomla to Drupal because I would always go to #Joomla on IRC and no one was there to help. I would jump into #drupal-support every now and then and was shocked there were more people in #drupal-support, and the people I met in there were very helpful. One thing that really, really stuck in my mind and impressed me was that I was trying to do a one-click install on Media Temple's gridserver and it was on 4.7 and I was having issues with something. Someone on IRC promptly submitted a ticket/issue on D.O. At the time I had no clue you could even do that. They linked me to the issue and I was like, "Woah! You can submit issues right on the site and not have to use a forum!?"

Then it all clicked that everything on D.O. was very tight and these ticket issues were amazing. It was all integrated into one "thing", something Joomla lacked big time but I never knew it because I didn't know it existed.

My point is this. The seamless integration of everything on D.O. made a memorable impact for me. I realized why the Drupal community was better and I didn't even switch then. It took me many more months to even get Drupal installed. I gradually got more and more pulled into the awesome Drupal community. The issue queue was a very enlightening aspect and was THE big thing for me.

If we were to have a separate site that had a different look and feel to it (i.e. Launchpad, Gitorious), the user experience would be mixed, and while I think that from a programmatic perspective it may be ideal, the shocking user experience could destroy the community. Can Launchpad be made to look like D.O. and exist on the same domain?

The community is Drupal. Without the community there is no Drupal. Community is #1 priority.

I think all these solutions can be made to work and I am not saying yay or nay for any of them but I just wanted to emphasize my experience with the Drupal Community and why I think that is the most important yet fragile aspect of all of this.

Change is good. Although it is bittersweet and this is kind of scary, I think everyone will make a great decision that builds the community, and it is because of open discussions like this.

More clarification

Elijah Lynn's picture

If anyone else was thinking that people were talking about migrating to a hosted issue tracking system and the user being taken to a separate website then this is for you (because that is what I was thinking).

Can somebody please give us... (Comment Thread title)

Garrett Albright says... "So the discussion is about if we should run our own install of Gitorious/Launchpad on drupal.org as opposed to updating the Project module. Nobody is talking about actually moving Drupal development to the non-self-hosted Gitorious or Launchpad services."

Better than tags

David Strauss's picture

While mentioning that Launchpad has solid tag support, I neglected to mention that some of our leading uses for tags are handled, fundamentally, better in Launchpad. Blueprints can link to any number of bugs as a way to organize work for a sprint or a major architectural project. Launchpad also supports watching (via email notification) and subscribing (via RSS) many things we use tags to group and follow right now.

any good examples for launchpad blueprints?

drifter's picture

Do you know of any good examples for Launchpad projects using blueprints, tags etc? I opened a few randomly and most aren't using these features. It would be useful to see...

Here's one that's well

For the security team...

David Strauss's picture

Launchpad solves our biggest workflow issues for security. Bugs can be posted and marked "security sensitive" so that they're hidden from everyone except the security team, the maintainer, and the reporter. Branches to work on solving security issues can be hidden while work proceeds and is reviewed by the same restricted group as the bug itself.

Moving to LP/gitorious is

sdboyer's picture

Moving to LP/gitorious is something that's been brought up several times in the thread, and I could probably reply at a more opportune spot, but I'll put it here:

When you (David) and I initially talked about migrating to use a different system entirely back in September, I had a mixed reaction. And it's still mixed. I very much agree with you that WHATEVER system we use, we should go for deep integration - however, that does not eliminate VCS API as a useful tool (and btw, given my agreement on that, I suspect your earlier comment that "you disagree with the entire VCS API approach" is not based on a thorough understanding of what using it actually entails).

On the upside, I hate our NIH complex. A lot. And though I've found LP confusing in the past, when I take an honest look at your listing of features here, and click around a bit through LP again, there's a lot of stuff in there that surrounds project work - sprint planning, the Q&A system, Mailman integration, etc. - that REALLY would enhance our collective experience to have. And though we could do them in Drupal, it's been the case that we could do them for quite a while, and they don't happen. This might be a case where delegating the responsibility for feature development to a group that's not in the community - and therefore, not subject to protracted & ultimately unproductive community processes - could really be beneficial.

OTOH, it does feel like we're trading one problem for another by switching from Project* - a system that most people don't understand by virtue of its crustiness - to LP - a system that only some people will understand because, at minimum, we can't rely on Python expertise in our community. I'd also feel a lot more comfortable if we were talking about switching in an external tool for something that was less the heart and soul of the community's interaction. Our project management system needs to be thoroughly interlinked and seamlessly integrated with the rest of the *.d.o experience, which means really high demands on the LP API. One example that gave me shivers from your list - LP offers rich profiles. Great. Except I don't want my d.o profile handled by LP. I'd like the data in it to be presented by an API, and shifted into a drupal-managed profile. Point being, I think that switching to an external tool for such a _core_ system that so many other parts of our web experience want to interact with raises the difficulty and expertise requirements for allowing our web presence to flourish as an interlinked whole.

Oh, and - my own looking at gitorious etc. had been to get ideas for features to implement, never to actually use gitorious itself. I don't know Ruby, and I wouldn't be comfortable even considering gitorious as an option unless we had someone with real expertise on it to lead the way. So it looks to me like we're talking about three choices here: git+project, bzr+project, or bzr+launchpad.

I am not sure

chx's picture

As the number of developers relative to non-developers shrinks, some separation is already inevitable, and the redesign will do that. There are many (all?) open source projects where the bugzilla/trac/jira is separate from the main site.

And none of those projects...

webchick's picture

Have the thriving development community we do, nor as much cross-over from mere "user" of the project to contributor.

We are not moving our developers off into some silo separate from the "rest" of the community. This is not something I'm willing to compromise on.

Yes.

sun's picture

Couldn't have expressed it better.

Daniel F. Kudwien
unleashed mind

Not a silo

David Strauss's picture

That's a good point, but I don't see this as creating a silo, at least not any more than the documentation and localization subsites we have/want. With all the feeds and API support in Launchpad (and presumably Gitorious), we can easily surface development activity to our front page. The single sign-on tool we developed and deployed -- which we would incorporate into any development infrastructure -- has done wonders for breaking down walls between different systems on our infrastructure.

I agree with your wariness

David Strauss's picture

I agree with your wariness around rich profiles. We don't want or need another "people management" system. In my comparison, I wasn't suggesting they are an advantage for us choosing Launchpad, just that they're a point of contrast with Gitorious.

And, yes, Launchpad is big and imposing. However, that's somewhat inevitable for such a technical and complex tool. Canonical has spent considerable time doing usability studies and improvements that we probably wouldn't do if we rolled our own, equivalent system, so I doubt we would end up with something better.

Email integration

David Strauss's picture

Another killer Launchpad feature we've always wanted on d.o: email integration for bugs. Replying to bug and merge request email on Launchpad adds your reply as a comment and even hides most quotation and footer garbage.

Project hierarchy

David Strauss's picture

On Drupal.org, we have projects, which are loosely categorized, each with branches for Drupal core versions and major release. On Launchpad, there are:

Each can have independent permissions or be unified with the same groups controlling them. Project groups aggregate data about the projects under them: issues, Q&A, etc. On a migration from Drupal.org to Launchpad, we would probably try to group projects under things like "CCK/Fields," "Views," and other groups that interoperate or leverage each other's work.

The downside is that each project can associate with, at most, one project group.

In contrast, it seems that Gitorious has a flat project space with projects self-tagging by category.

What I'd like to do, for a

Dries's picture

What I'd like to do, for a while, is to pull changes directly from git and bzr and use them concurrently for a while. During this initial phase, we should continue to share patches in the issues too, but it gives people an opportunity to compare both. Would that be helpful?

--dry-run

cha0s's picture

Sounds like a great idea, though I think most people would probably end up not changing their minds too much (except for people just trying out DVCS). Still, it's probably a good idea because using it in the regular maintenance workflow could help to identify strengths/weaknesses in the given platforms.

Not much help

chx's picture

The problem here is clearly education so what exactly would this help ...?

Strongest support team would "win"...

webchick's picture

I have a feeling that new users, not sure which one to try, would go in search of which one had the best documentation and which one they were able to get support with the easiest. This would light a fire under the asses of the respective Git/Bzr fans to get our collective house in order, and we could see first-hand which one our community is actually best equipped to deal with. (Personally, my money's on Git, but I remain interested to see how this goes.)

Although I don't quite understand how we could do this "trial run" procedure for anything other than core... opening up contrib to Git/Bzr would mean a whole pile of work on our package management scripts, etc. as far as I can tell. And it's not really the core developers I'm worried about, but new module developers and especially themers.

Clarification...

webchick's picture

chx called me out on the "it's not really the core developers I'm worried about" part of my comment, noting that core has a very long tail of contributors, and this is very true.

What I meant by that is that core has baked into it (by nature of the peer review requirement) a collaborative, "buddy system" support network thing where people can (and do) get help by more senior/experienced people (if nothing else, core maintainers themselves). I'm not worried about this mentorship aspect carrying over to version control; heck, we have people tutoring others on how to create/apply patches periodically in the core issue queue right now.

In contrib, though, people generally work in isolation on their own stuff, and only if they've written something extremely useful to a wide number of people (Views, CCK, Drush, etc.) do they get this same sort of "community" in their issue queue. This is where we really stand to lose a lot of people if they can't make the jump on their own, and if they can't get support when they get stuck/frustrated.

Can already be done

BartVB's picture

People can already try both. There are public repositories for both Git and Bazaar that are kept up to speed with d.o CVS. It's a matter of installing the tools, pulling in a copy of HEAD and playing with it. But it's hard to really test it because you can't (easily) use the more advanced and DSCM-specific functions like collaborating on an 'issue branch'. That would only work if everyone in an issue queue were working with the same system.

For that it's more useful to create some scenarios and show some examples of what the workflow would look like with the different systems (although there are not a lot of big differences).

In the end, not sure about this...

webchick's picture

On the surface it seems reasonable to try both on and see how we like them, but in practice I think it will just terribly confuse new people on what they should do (and so they'll stick to CVS which is what they know), and we'll still end up with staunch proponents of option A vs. option B by the end of the evaluation period, and still need to make the same choice.

But if we approach this differently -- what kinds of data are we trying to gather from such an evaluation that would help us with this decision? -- we might be able to gather that in other ways, without raising the support burden on the infrastructure team.

For example, one thought is to get the $vcs brigades to work out some documentation for how to use their preferred system with a Git/Bzr mirror, gather up a bunch of module and theme developers who've only ever really used CVS (and/or SVN), and have "usability testing" with them. Throw them at the existing docs, have an IRC channel for them to ask questions to the $vcs brigade, and observe what kinds of questions they ask, and what sort of things they find challenging, and basically determine how painful this is going to be altogether.

I'd totally be willing to volunteer as a Guinea pig for this, and also to act as someone running around with a "clipboard" during these sessions. We just need to work out exactly what we're looking for.

Great idea

David Strauss's picture

That sounds like something we could do at one of the Drupalcon code sprints. However, "how to use their preferred system with a Git/Bzr mirror," while representative of contributing to core, is not representative of using Bazaar or git on a contributed project where the person is expected to push/commit code back. Some of the biggest workflow differences between git and Bazaar emerge for getting code upstream following local development.

Drupalcon code sprints++ !! YES!

webchick's picture

I didn't even think of that idea, but that makes tons of sense. That also helps time expectations appropriately, since we can't even begin to think about doing this migration until after Drupal 7 is out and some infrastructure things fall into place. Plus, we'll have all the major decision-makers in one place where we can hash this decision out in real-time. YEAH! :D

Ok, so then let's start gathering a list of stuff like:

  • Drupalcon SF attendees who are willing and able to lead these kinds of "info" sessions, both for Bzr and for Git.
  • Drupalcon SF attendees who are willing and able to be Guinea pigs, for each of the defined use cases above.
  • Drupalcon SF attendees who are willing and able to be "clipboard" folks.
  • What criteria are the clipboard folks looking for?
  • What documentation needs to be prepped ahead of time?
  • Who's going to prep it?
  • What infrastructure needs to be prepared to perform this testing?
  • Who's going to prep it?
  • What sort of data are we hoping to gather from this session?
  • What sort of infrastructure do we need to capture it?
  • Who's going to prep it?
  • Other...?

(I'll go add this to the wiki page above shortly ;))

We should still make sure we conduct some of these in "virtual" space too, because obviously 99.9% of our users will never have the opportunity to talk to people like David Strauss and Sam Boyer about any problems they come across in "real life" and so the results will be skewed. But the ability to gather feedback in "real time" is just too good to pass up.

I'm available to play the git

anarcat's picture

I'm available to play the git master, even though i'm not a jedi yet.

I am not sure I can volunteer for bzr, for lack of time and interest.

I am willing to help using git at the next montreal D7 codesprint, however. :) Koumbit has a mirror of common modules available that we could use, and there's the drupalfr.org one too.

volunteering for ..something..

chachasikes's picture

maybe for taking some notes or helping with documentation. i know basic git, wouldn't mind doing more research. though i am definitely not the best person to speak about doing complicated merges & big collaborations. i can definitely help w/ the newbies though.

Remotely help?

DamienMcKenna's picture

This is where not being able to get to DrupalCon really sucks :( I'll try to help via IRC as much as I can.

I think this is a great idea,

sdboyer's picture

I think this is a great idea, and what we need to focus on moving forward. When I was originally trying to spark this discussion back in the fall, my goal was that we'd pull together brigades whose responsibility it would be to actually hammer out "proposals" for what switching to their preferred VCS would look like, then debut them at something like a Drupalcon. Such a proposal should include:

  • A mapping of current features/systems to their equivalents in the new system.
  • An inventory of the under-the-hood/architectural changes that would occur if that VCS is selected.
  • A detailed plan for a migration process to the new system. This includes the migration of CVS data itself, and whatever additional migration work will be required.
  • Updates to our existing documentation (fully written, wherever possible), and a concrete outline for additional documentation/resources that cannot or should not be written until a VCS has been selected.
  • A breakdown of which community members are stepping up as volunteers to work on the different segments (documentation, IRC-helpsquad, architecture work).

I think the use cases we've begun to map out in the wiki entry are a really good start to this. And I'm of the opinion that the teams really ought to deal with all these bullet items if the thing that's ultimately produced for the smackdown in SF is to actually be representative of the final system that gets implemented.

The first bullet in that list, though, still requires that we compile a clear list of just what "features" we have, so that the teams can both work from the same list.

To me this sounds like a more

BartVB's picture

To me this sounds like a more efficient way to spend time than 'usability testing' actual endusers. A usability test of casual users will test the documentation, not the tools. As has been said earlier, 99% of the users/developers are going to stick with copy/pasting what's in the Drupal documentation. They don't want to dive into the theory about remote branches, three way merges, cherrypicking, etc.

Time during DrupalCon would be better spent on deciding how $vcs is going to be implemented. When that's clear it will either become apparent that one of the two systems is more complex to use/less robust or that there is not a real difference (more likely outcome).

We're not trying to optimize a system here or define usability bottlenecks but we're trying to pick the best tool for the job.

Mercurial

ogi's picture

I added the following to the pro-Mercurial points for shifting Drupal.org, but it's worth adding here too. Remember that the choice of DVCS affects not only Drupal core but themes and modules too. The latter are where most contribution goes, and Drupal has always been proud of its community and module ecosystem.

Drupal (module and theme) developers are mostly not hard-core developers and the joy factor is very important. Mercurial delivers that. Git was created for Linux, and developing for the kernel automatically makes you a hard-core developer who uses Linux. But web developers are very different and good Windows/Mac support is a must.

You've gotta be kidding me.

mikey_p's picture

You've gotta be kidding me.

bzr rebase

cha0s's picture

Just wanted to point out that bzr also has a rebase functionality, though it is not core but available as a plugin. A link to the docs: http://doc.bazaar.canonical.com/plugins/en/rebase-plugin.html#rebase

I am also going to find a way to edit this into the wiki page. :)

...just not interactive

David Strauss's picture

Bazaar's rebase plugin does not support interactive rebasing, which is a favored workflow among git users.

Really?

jpetso's picture

Like, seriously? I thought this was a standard feature that every DVCS is bound to have in one way or another. Interactive rebasing is a killer feature, it makes cleaning up commits (in a feature branch that is to be merged back into upstream) a breeze. It's pretty unimportant for the actual upstream branch (i.e. Dries and core committers) but it's a huge help for people that want to make sense out of their quite sizeable dbtng layer, for instance, and clean up commit history before it's actually pushed to Drupal core.

Unlike what most git users

David Strauss's picture

Unlike what most git users seem to think, there's an entire world that not only uses workflows other than rebasing, but thinks rebasing is a terrible idea.

Those people include me:
http://fourkitchens.com/blog/2009/04/20/alternatives-rebasing-bazaar

Really, the rebase obsession seems limited to the git community. The only useful thing rebasing seems to do, overall, is provide a cleaner (and fictitious) set of commits to arrive at the current code. If this is the goal, I think there could be better commit-cleanup approaches that don't falsify the branch history.

More importantly, rebasing can't be part of our collaborative workflow, anyway. Whatever system we choose, branches for working on issues will be shared. People who rebase shared branches are especially evil.

Hey...

cha0s's picture

I use bzr and I'm wondering, how do you remove all the 'sync with trunk' commits that you get from merge->commit that you don't get when you rebase?

That's why I rebase and if you know how to do it without adding that crap, I'd like to know...

You don't

David Strauss's picture

I'm wondering, how do you remove all the 'sync with trunk' commits

You don't. Merging from trunk modifies the branch and must be captured in a commit (this workflow) or sprinkled into existing commits (rebase workflow). Personally, I don't understand the obsession with a pretty commit history. I want a clean mainline, but I'm perfectly happy to see lots of silly bugfixes and refactoring in the history nested under the merges to the mainline. A version-control repository isn't a temple. We don't go back and delete all the rejected patches when working on a Drupal.org issue on our current system, and I don't see why we'd want to remove false starts once we have a version-control tool that supports the equivalent.

I'll also reiterate what I said elsewhere: the focus of moving Drupal.org to DVCS is to open collaborative branches, and those aren't compatible with a rebase workflow -- in any DVCS.

I want a clean mainline, but

fago's picture

I want a clean mainline, but I'm perfectly happy to see lots of silly bugfixes and refactoring in the history nested under the merges to the mainline.

When working on small chunks or pieces, little features or bug fixes, I'd have to create a feature branch for everything that should be a single commit in the end. So you'd have to think about where something belongs every time before you commit something, switch to the right branch and then commit it - that takes my time, and I can't concentrate on hacking because I have to think about where to commit.

Of course interactive rebasing is only useful before you push the stuff to the open collaborative branch, but it can help to keep the history of this open collaborative branch clean and meaningful, which still helps to improve the collaborative workflow.

When working on small chunks

David Strauss's picture

When working on small chunks or pieces, little features or bug fixes, I'd have to create a feature branch for everything that should be a single commit in the end.

But we disagree on the "should" here. You think commits should be a way to organize and present your set of changes. I think commits should be immutable snapshots of the working tree, and that's why I dislike the git stage and rebase functions.

Bazaar's workflows and defaults strongly support the "immutable snapshot" perspective on commits. If you want a partial commit, you generally shelve (in git, "stash") the content you don't want, removing it from the working copy.

There's an important advantage to the snapshot perspective: you can run unit tests, see them pass, commit, and have confidence that you've just captured a working tree that passes all tests. With git's stage and rebase model, the only way to know commits pass tests is to have a bot keep traversing the history, retesting rebased commits and ensuring that what actually got committed from stage represented a working tree. It's possible to have a working tree that passes all tests and still commit a broken tree, assuming you use the git stage.
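The last point can be illustrated with a minimal sketch. This is a toy Python model, not git itself: the file names, the `run_tests` check and the dictionaries standing in for the working tree and the stage are all hypothetical. It only shows how a commit built from a partial stage can differ from the tree that was actually tested.

```python
# Toy model (not git itself): a commit made from the stage records the
# staged content, not the working tree, so the two can differ.

def run_tests(tree):
    """Hypothetical test suite: passes unless main.py calls a helper
    that the tree does not define in util.py."""
    calls_helper = "helper()" in tree.get("main.py", "")
    defines_helper = "def helper" in tree.get("util.py", "")
    return (not calls_helper) or defines_helper

# The working tree on disk: both files are present, so tests pass here.
working_tree = {
    "main.py": "from util import helper\nhelper()\n",
    "util.py": "def helper():\n    return 42\n",
}

# Only main.py was staged; util.py exists on disk but was never added.
index = {"main.py": working_tree["main.py"]}

# What a commit from the stage would record.
committed_tree = dict(index)

working_tree_passes = run_tests(working_tree)
committed_tree_passes = run_tests(committed_tree)
print(working_tree_passes, committed_tree_passes)
```

In this sketch the tests pass against the working tree but would fail against the committed snapshot, which is the gap the "immutable snapshot" view is meant to avoid.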

I think commits should be

Pisco's picture

I think commits should be immutable snapshots of the working tree

That's your personal opinion and there's nothing to add to that. In fact it's not important whether someone wants to use Git's interactive rebase feature or not. At work however, we try to make patches that actually work and make sense. When you implement a new feature it's not always clear from the beginning how you're going to get there. You start, you try, you end up in dead ends, you retry and eventually you have your feature implemented and working. That's when rebase time has come! You look through the history of the feature and group things together, so that each commit makes sense on its own. Changes that belong together might go in one commit. Useless whitespace changes are dismissed and so on. Now you have cleaned up the history of this particular feature or bug fix. Now you run your unit tests and hopefully they will pass. Now that your feature is complete and the unit tests pass, you'll merge your feature branch back into your mainline or whatever. Only after having finished with all that, do you push your changes to the central repo. What you then have is a history that actually makes sense, and a state that passes unit tests. Done.

This is not a discussion on specific features of a VCS and personal opinions concerning them.

What you wrote in your last paragraph is not true, at least not more for Git than for Bazaar.

No one has really bothered giving solutions that satisfy the use case described by webchick. I suspect that a very similar solution could be achieved with Bazaar, is that the case?

What you wrote in your last

David Strauss's picture

What you wrote in your last paragraph is not true, at least not more for Git than for Bazaar.

What I've said is completely true, though you're right that rebasing on Bazaar causes the same problems as rebasing on git. The difference is that Bazaar has designed itself around workflows that don't involve rebasing.

With git's stage and rebase

fago's picture

With git's stage and rebase model, the only way to know commits pass tests is to have a bot keep traversing the history, retesting rebased commits and ensuring that what actually got committed from stage represented a working tree.

You assume that the commits being rebased have already been published on a collaborative repository. But the use case I described didn't do so, and in fact the git manual advises you not to do so (for obvious reasons). As I said previously, rebasing makes sense to clean up things before you publish your work.

But it's true that it would be important to avoid non-fast-forward pushes in our repositories, which can easily be configured with the option "receive.denyNonFastForwards". That way the history always stays the same. Still users have the freedom to make use of interactive rebase to polish things before publishing their work.
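For readers unfamiliar with the term, a push is a fast-forward when the branch tip already on the server is an ancestor of the tip being pushed. In git the check is typically switched on in the server-side repository with `git config receive.denyNonFastForwards true`. The following is only a toy Python model of that ancestry test; the commit names and the `parents` map are made up for illustration.

```python
# Toy model of a fast-forward check; commit ids and the parent map are
# hypothetical and stand in for a real commit graph.

def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False

def is_fast_forward(parents, old_tip, new_tip):
    """A push is a fast-forward if the published tip is kept in the new history."""
    return is_ancestor(parents, old_tip, new_tip)

# Published history: A <- B <- C. D builds on C; B2 is a rewrite that
# branches off A and no longer contains B or C.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "B2": ["A"]}

accept = is_fast_forward(parents, "C", "D")   # new work on top of C
reject = is_fast_forward(parents, "C", "B2")  # rewritten history
print(accept, reject)
```

A server enforcing this rule would take the first push and refuse the second, so published history is never replaced, while contributors remain free to rewrite commits they have not pushed yet.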

You assume that commits are

David Strauss's picture

You assume that commits are rebased that have been published on collaborative repository.

No, I'm assuming you're running unit tests on your local development branch, which probably is being rebased.

But it's true that it would be important to avoid non-fast-forward pushes in our repositories, which can easily be configured with the option "receive.denyNonFastForwards".

While I've clarified that I'm not talking about collaborative repositories, I may as well take this opportunity to mention append_revisions_only, an identical setting in Bazaar.

No, I'm assuming you're

fago's picture

No, I'm assuming you're running unit tests on your local development branch, which probably is being rebased.

The code in HEAD won't change when you rebase (as long as you don't drop commits). So if the tests passed previously, they still pass. If you really need any single commit to pass, of course you have to go and test each commit of the polished history - there is no way out of that regardless of the dvcs or method used to create the commits.
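To make "test each commit of the polished history" concrete, here is a rough sketch of a loop that checks out every commit after a base revision and runs a test command on it. The run-tests.sh script, the file names and the commit messages are all invented for the demo; a real project would substitute its own test runner.

```shell
set -eu
work=$(mktemp -d)
git init --quiet "$work/repo"
cd "$work/repo"
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical test script: fails whenever code.txt contains BROKEN.
printf '#!/bin/sh\n! grep -q BROKEN code.txt\n' > run-tests.sh
echo "ok" > code.txt
git add run-tests.sh code.txt
git commit --quiet -m "Base"
base=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)

echo "BROKEN" > code.txt
git commit --quiet -am "Work in progress"
echo "ok again" > code.txt
git commit --quiet -am "Finish feature"

# Walk the history oldest-first and test each commit in turn.
failed_count=0
for commit in $(git rev-list --reverse "$base..HEAD"); do
  git checkout --quiet "$commit"
  if ! sh ./run-tests.sh; then
    failed_count=$((failed_count + 1))
    echo "tests fail at: $(git log -1 --format=%s "$commit")"
  fi
done
git checkout --quiet "$branch"
echo "$failed_count commit(s) failed"
```

Here the tip passes but the intermediate commit does not, which is the case both sides of this exchange are arguing about.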

Still interactive rebasing is a powerful tool that helps me to concentrate on coding, not on organizing the changes I'm going to make beforehand in the right commits or feature branches used for creating clean commits in the mainline.

Anyway, if you don't like it, nobody forces you to use it.

I frequently make use of

fago's picture

I frequently make use of interactive rebasing to clean up the commit history. Thus I can hack on something and just commit when I feel I have something remarkable - that way I can do so frequently and won't lose anything. I don't have to bother whether the commit history will be clean - it won't. At the end of the day/work I go through the commits, merge commits that belong together into one and so clean up the commit history before I push.

That way I can concentrate on hacking and still end up with a polished commit history - so we could avoid spamming others with meaningless commits.
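A small sketch of that clean-up step, scripted so it can run unattended. It is an assumption that fixup commits plus --autosquash are a fair stand-in for picking "squash" by hand in the interactive editor; file names and messages are invented. It also checks the earlier claim that the tree at HEAD is identical before and after the rebase.

```shell
set -eu
work=$(mktemp -d)
git init --quiet "$work/repo"
cd "$work/repo"
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > README.txt
git add README.txt
git commit --quiet -m "Initial commit"
base=$(git rev-parse HEAD)

# Hack away, committing early and often.
echo "step 1" > feature.txt
git add feature.txt
git commit --quiet -m "Add feature"
feature=$(git rev-parse HEAD)
echo "step 2" >> feature.txt
git commit --quiet -a --fixup "$feature"
echo "step 3" >> feature.txt
git commit --quiet -a --fixup "$feature"

tree_before=$(git rev-parse 'HEAD^{tree}')
commits_before=$(git rev-list --count "$base..HEAD")

# Fold the fixups into "Add feature"; 'true' accepts the proposed todo list as-is.
GIT_SEQUENCE_EDITOR=true git rebase --quiet -i --autosquash "$base"

tree_after=$(git rev-parse 'HEAD^{tree}')
commits_after=$(git rev-list --count "$base..HEAD")
echo "commits: $commits_before -> $commits_after"
```

Run by hand, "git rebase -i" opens the same todo list in an editor, where commits can be reordered or marked as squash before anything is pushed.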

Licenses

adrinux's picture

One thing that hasn't been mentioned yet, and that people round here take an interest in is Licensing.
Git, Bzr and hg are all GPL.

But issue/patch management:
Launchpad: Affero General Public License
Gitorious: GNU Affero General Public License
Project.module: GPLv2

See: http://en.wikipedia.org/wiki/Affero_General_Public_License

AGPL is a good thing

yhager's picture

The AGPL is an extension to the GPL that aims to close a loophole in it. A GPL web app can effectively be kept closed source, since the software is not distributed to customers, and thus the source code does not have to be shared with them. The AGPL extends the requirement to provide the source code to users of a web app too.
I don't see how this affects us in choosing a source management and issue tracking tool.

a comparison of issue trackers

drifter's picture

I'd say battling over git or bzr is missing the point - either one would be fine, I'm more familiar with git but would have no problems using bzr. Workflow is more interesting, and how it integrates with issue tracking and project planning. I've created a comparison of several open source projects using git or bzr and added it to the wiki page:

http://groups.drupal.org/node/48818#trackers

Most projects are using legacy issue trackers (bugzilla or trac), which have no direct integration with DVCS systems. Most projects use a patch based workflow, often by using email / a mailing list and then linking to it, or uploading a patch. Even Ruby on Rails is mostly patch based.

Launchpad offers both branching and patches. It also has tools for planning, decision making and process (see blueprints):

https://help.launchpad.net/Code/Review
https://help.launchpad.net/Blueprint
https://help.launchpad.net/Code/BugAndBlueprintLinks
https://blueprints.launchpad.net/sprints/uds-jaunty

It's an interesting approach, but I don't know whether it's a good fit for the Drupal process. It's often noted that the drupal.org issue queue isn't a good place for X (where X is architectural decisions, user experience, large patches, etc. etc.); Launchpad tries to address these.

Github has a simplistic issue tracker, fast and simple but only adequate for small projects:

http://github.com/blog/411-github-issue-tracker
http://github.com/defunkt/github-issues/issues

Lighthouse is similar to github (proprietary code, beautiful interface, free for open source use), and integrates with github.

SproutCore is a javascript framework and it has an eat-your-own-dogfood fully javascript bling task system. I haven't been able to figure it out though :)

To summarize: except for Launchpad, no other repo and issue tracking system is well integrated. Personally I find Launchpad ugly and confusing; I'm not pushing it, I'm just saying it's integrated with bzr. All the others are either based on patches or link to forks / changesets. There aren't any good examples of issue tracking taking advantage of DVCS.

So... in my opinion, Phase 1 as described above, keeping the patch based workflow while switching to a DVCS is fine. I think git has better patch management (format-patch, cherry picking) simply because that's how Linux kernel development works.
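For readers who have not seen the patch commands mentioned here, a rough sketch of the round trip: a contributor exports a commit with format-patch and a maintainer applies it with git am. The repository layout, file names and identities are made up for the demo.

```shell
set -eu
work=$(mktemp -d)

# Maintainer's repository.
git init --quiet "$work/upstream"
cd "$work/upstream"
git config user.name "Example Maintainer"
git config user.email "maintainer@example.com"
echo "v1" > module.txt
git add module.txt
git commit --quiet -m "Initial import"

# Contributor clones, commits a fix, and exports it as a patch file.
git clone --quiet "$work/upstream" "$work/contrib"
cd "$work/contrib"
git config user.name "Example Contributor"
git config user.email "contributor@example.com"
echo "v2" > module.txt
git commit --quiet -am "Fix the thing"
mkdir -p "$work/patches"
git format-patch --quiet -1 -o "$work/patches" HEAD

# Maintainer applies the patch; author and message travel with it.
cd "$work/upstream"
git am --quiet "$work/patches"/*.patch
applied_subject=$(git log -1 --format=%s)
applied_author=$(git log -1 --format=%an)
echo "applied '$applied_subject' by $applied_author"
```

The patch file is plain text, so it could be attached to an issue exactly as patches are today.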

There should definitely be a Phase 2 to take advantage of forking, but since there are no truly good examples, rolling our own with Project could be a good way to go; it might even be a model for other projects.

git rocks for patch workflow

adrian's picture

I've also very successfully used stacked git before for managing large changes that resolve to multiple interconnected patches - http://drupal.org/node/337933

this way you can test the interaction of multiple patches on the same system, and merge them in as they get committed. The moment one of the patches gets merged in you can regenerate and upload all the other patches in the series immediately.

I am a bit late here, but I

killes@www.drop.org's picture

I am a bit late here, but I need to point out that people shouldn't forget the guesstimated 95% of our CVS users who never needed more than "cvs add", "cvs ci" and the occasional branch and tag command. In other words: whatever gets implemented also needs to be usable without wrapping one's head around the complicated concepts of DRCSs.

I'd have to agree with

pwolanin's picture

I'd have to agree with launchpad being a bit ugly and confusing - so I think there is a tradeoff between maintaining (and coding) our own project* monster with bzr in place of cvs (and the workflows we are already familiar with) versus installing, configuring, and explaining something like launchpad.

I spent some time trying to find clear docs on setting up a multi-project, multi-user bzr server, and I found the bzr docs themselves totally lacking. One nice bit about bzr is that branches can be available over http as well as bzr+ssh, so even people behind a firewall can always do a checkout.

both bzr and git are firewall friendly

scor's picture

One nice bit about bzr is that branches can be available over http as well as bzr+ssh, so even people behind a firewall can always do a checkout.

both bzr and git are firewall friendly and allow access through http(s) and ssh when their own protocol (bzr:// and git://) does not work. Definitely another win vs. CVS.

I spent some time trying to

David Strauss's picture

I spent some time trying to find clear docs on setting up a multi-project, multi-user bzr server, and I found the bzr docs themselves totally lacking.

I think you're misapplying the Subversion (or any centralized VCS) model here. There isn't a "server" in the same sense. Bazaar just has branches, and branches can talk to each other. This can happen locally or over HTTP, FTP, SSH, or SFTP -- in either direction. If a branch happens to be available over public internet SSH, be writable to a few people (SSH), and maybe be readable anonymously (HTTP), that's about as close as you get to a server. Any branch/checkout hierarchy is purely by convention.

Yep, same for git. Achieving

sdboyer's picture

Yep, same for git. Achieving a multi-project, multi-user "server" is much more about the right set of interlocking configuration options.

"More developers are already

Garrett Albright's picture

"More developers are already familiar with Git" is a very poor reason to choose Git in Drupal's case. If we want the developer base to grow, we should choose the system which is the easiest for newbies to use - people who are new to using DVCSes, who are new developers or maybe aren't developers at all (themers or translators). It's going to be a lot harder for these people to learn their first DVCS than it will be for an experienced DVCS user to learn how to use their second or third. Given this, a should choose the system with a proven record of ease of use and "it just works"-ness, and that would not be Git (shall I "go there" on the Windows issue?).

Please do :)

BartVB's picture

I've read several comments from Bzr proponents stating that Git is just too confusing, difficult to use, etc. It took me some time to wrap my head around some DVCS concepts, actually using the tool was not the problem.

But I'm curious what the perceived problems with Git are (or what the perceived advantages of Bzr are) when it comes to usability/learnability.

I already did

chx's picture

But I am glad to repeat. bzr has the following concepts to learn: repository and working copy. And this is explained as "the history of all the files in a directory is stored by bzr in what we call a repository, it's actually stored in .bzr inside the directory". Now comes git with its index/staging area and things get complicated. As I showed in the illustration I included on the first page, once you increase the number of players, the number of possible interactions becomes an awful lot.

But yet...

webchick's picture

...multiple Git people have chimed in and said "You don't need to use all that fancy crap if you don't want to." So to some extent this argument feels to me (as someone without huge familiarity in either one) like parroting hearsay.

Here's what I think would actually satisfy this once and for all: someone from each team make a slimmed-down copy of http://drupal.org/handbook/cvs/quickstart that includes all the commands needed to support this workflow.

I'm pretty sure this will be easy to translate into Bazaar. I am not sure about Git. But if it is, then I think this argument gets rendered moot.

first git version

marvil07's picture

Here is a first version of a git drupal handbook ;-) .. there are some language issues I think ..but it's a start :-)

Please notice that I'm assuming a gitosis like workflow .. and that can influence the "add new project" part

PS: I do not know why I use asciidoc for that :-p, the source is also uploaded

Update: I made a github repository for collaboration, based on cweagans' willingness to help :-) and btw people can try git :-p

Needs some work

David Strauss's picture

It seems to be missing many of the "push" operations back to Drupal.org. For example, the "Saving changes to your module" and "Creating an official release" sections wouldn't result in changes leaving your desktop.

And a Bazaar version

David Strauss's picture

For Bazaar:

http://drupal.org/node/710906

My favorite part of the update:

Take some time to figure out the directory structure for your module. Renaming files in Bazaar is easy, and moving around directories is really easy. If you have your directories structured and files named how you want them before your first Bazaar commit, you will save yourself very little time.

Excellent!

webchick's picture

Great work, folks!! So glad to see this documentation coming together, as it really helps "visualize" how this would work.

Marvil, can you get yours over on d.o too as soon as is practical so we can toss some newbie victims at both of them? :)

For Git

jpetso's picture

I started a clone of David's Bazaar thing for Git: http://drupal.org/node/711070
Not done, but running out of battery.

Er, whoops! I thought that

jpetso's picture

Er, whoops! I thought that marvil07's link was pointing to the other page ("Git for core contributors"). Sorry.

On the other hand, I think my version has stuff that marvil's is still lacking. I'll try to incorporate the rest of his page when I've got more time and a power cable.

In any case, we always want to do rebase and never pull for updating. Rebase is safe as in "doesn't update a dirty working tree", safe as in "never produces history that can't be merged back", and also does regular updates without any kind of local patches involved.
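A minimal sketch of updating by rebase instead of pull, under the assumption that "rebase" here means fetching and then replaying local commits on top of the remote branch. Repository paths, file names and identities are invented for the demo.

```shell
set -eu
work=$(mktemp -d)

# Upstream project repository.
git init --quiet "$work/upstream"
cd "$work/upstream"
git config user.name "Example Maintainer"
git config user.email "maintainer@example.com"
echo "v1" > module.txt
git add module.txt
git commit --quiet -m "Initial import"
branch=$(git symbolic-ref --short HEAD)

# A contributor clones and commits something locally.
git clone --quiet "$work/upstream" "$work/local"
cd "$work/local"
git config user.name "Example Contributor"
git config user.email "contributor@example.com"
echo "my change" > local.txt
git add local.txt
git commit --quiet -m "Local work"

# Meanwhile, upstream moves on.
cd "$work/upstream"
echo "v2" > module.txt
git commit --quiet -am "Upstream change"

# Update: fetch, then replay local commits on top of the new upstream tip.
cd "$work/local"
git fetch --quiet origin
git rebase --quiet "origin/$branch"

merge_commits=$(git rev-list --merges --count HEAD)
top_subject=$(git log -1 --format=%s)
echo "merge commits: $merge_commits, newest commit: $top_subject"
```

The result is a linear history with the local commit on top and no merge commit; with no local commits at all, the same two commands simply move the branch forward.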

Ok, I think it's done now

jpetso's picture

The above page got a couple of improvements over the course of the day; I now ported the rest of the guide and incorporated some further improvements (or what I think are improvements, such as not using the "master" branch in favor of Drupal-version-based branches).

I think it's complete now! Please review and feed back. (That goes out to webchick in particular, as I'd be interested whether there's stuff in there that's too detailed or too complicated.)

Thanks jpetso! It looks good,

Pisco's picture

Thanks jpetso! It looks good, I'll go over it more thoroughly this evening!

Can somebody please give us

mcrittenden's picture

Can somebody please give us semi-newbs a little explanation as to what is meant when you talk about using Gitorious, Launchpad, etc., instead of project*? Would we still be hosting the repo at d.o via a launchpad/gitorious install or would the repo just move to launchpad.com, etc.? Would we lose some integration with the rest of d.o? What would something like launchpad handle (e.g., will we move documentation and issues there as well) and what would be left for the Drupal part of d.o to handle?

Or are these all questions that we're still trying to decide upon?

Sorry, I'm just trying to keep up.

Both Gitorious and Launchpad

Garrett Albright's picture

Both Gitorious and Launchpad are hosting services, but they also have opened up their code base, so if you want to build your own hosting service using the Gitorious or Launchpad code, you can do so. So the discussion is about if we should run our own install of Gitorious/Launchpad on drupal.org as opposed to updating the Project module. Nobody is talking about actually moving Drupal development to the non-self-hosted Gitorious or Launchpad services.

The rest of your questions are basically the same questions which are being currently discussed.

We would install our own

David Strauss's picture

We would install our own instance and would integrate the Drupal.org and project management user bases, probably with single sign-on. What each tool would "take over" differs. Launchpad is quite a bit more comprehensive than Gitorious, but neither would take over documentation, forums, and general content management. They would mostly supplant Project*.

launchpad is apparently done

killes@www.drop.org's picture

launchpad is apparently done in python, gitorious in ruby. I don't think that this would really bode well for any integration efforts.

Integration would likely be via

pwolanin's picture

Integration would likely be via XML-RPC or other remote API calls - not direct DB integration. That might make it easier to manage and upgrade drupal.org - since the tight integration of project* means that drupal.org cannot track the latest version of Drupal without waiting for project* to be upgraded.

If I had to guess, I'd say the time it would take to install, configure, and integrate something like an instance of launchpad is going to be at least equal to and likely greater than the time to convert the existing project* functionality to use some other RCS - but that leaves us with just the same functionality we have now.

Yeah, I think I agree...

webchick's picture

Not to mention the data import and training component, which are quite substantial.

On the surface "We wouldn't ever have to upgrade project* again! We can let someone else worry about making it better!" sounds extremely pragmatic. But replacing cvslog with gitlog/bzrlog will be a whole pile less work, and will still keep our "core" community collaboration tools tightly integrated into our community, which is key to our project's success.

In fact, why not write a Git integration piece into Launchpad instead? :) Then individual sprint teams could use it for in-person collaboration, but we could still keep the benefits that Project* brings, not only to Drupal.org, but to all other places that use it (jQuery, etc.). And we have "in-house" expertise to extend it to meet our needs.

Would looove to see cvs go

whatdoesitwant's picture

Though I am a long time lurker, I'm still a relatively newbie themer. Half of what you are discussing here goes over my head.
My development machines are ubuntu or windows based. I do my theming on windows.
Just as many programmers prefer to diff from the shell, and many OS X themers do everything in TextMate, I like to work from my IDE (Eclipse/Aptana).
These are links to the Eclipse plugin repositories & their documentation.

For Git:
Name - Egit
Update site - http://download.eclipse.org/egit/updates/
Documentation - http://wiki.eclipse.org/EGit/User_Guide

For Bazaar:
Name - bzr-eclipse
Update site - http://verterok.com.ar/bzr-eclipse/update-site/ (check http://wiki.bazaar.canonical.com/BzrEclipse/Installation for current link and additional requirements)
Documentation - http://wiki.bazaar.canonical.com/BzrEclipse

DVCS yes, but hell NO to replacing project*

sun's picture

As already written elsewhere, I do not understand why everyone is bashing on project* modules and d.o infrastructure. Having worked with plenty of open-source systems, their issue trackers, hosted and custom issue trackers and ticketing systems, our collaboration, issue tracking, and project management solution on drupal.org by far outreaches all other systems I've worked with so far -- in terms of usability, day-to-day work, ease of use, controlling, transparency, scalability, and whatnot.

  • Looking at Launchpad, I see a system that loves to throw weird terminology at me, blueprints, drivers, WTF? Especially its hard separation between features and bugs, notion and treatment of "delivery" milestones, approvals and assignments, busts the hell out of me. That may be suitable for your company's business product development and perhaps understood by you and your co-workers after having had a workshop or completing the Launchpad certificate™, but how does that remotely apply to Drupal's flexible and successful usage of issues with various transitioning states...? Implementation: Good progress? Of course.
  • Trac has been mentioned before, but please show that to an average developer or even users... totally unusable and I bet your user won't even find the issue queue or how to report a bug. I don't even want to see this total abuse of wiki-functionality in action when it comes to managing 4,000+ projects.
  • ...

I fear that everyone on this page is some sort of in-depth developer. We forget about the Drupal community -- which consists of users, consultants, translators, too.

Daniel F. Kudwien
unleashed mind

let's not drift in the project management bikeshed

anarcat's picture

I do not share your frustrations with trac and launchpad. I find them excellent and miles beyond what project_issue offers.

Thing is: we are talking about version control here, we shouldn't drift into the project management bikeshed, it's already complex enough as it is.

webchick's picture

Because tools like Gitorious and Launchpad are inexorably tied to either one VCS or the other (granted, this is not so for Trac). So if we do indeed decide that moving off of Project* module is part of our future infrastructure plans, it influences choice of VCS considerably.

But so far, I think sdboyer's done a great job of shining a very harsh light on exactly what would be involved in such a migration, and it makes it a lot less attractive. Especially when you consider how central the issue queue is to how our community functions.

This line from one of his posts resonates strongly with me:

the issue queue is the heart, soul and history of Drupal, and I'm hesitant to do anything that threatens that.

If you take moving from Project* off the table, I agree that the choice of VCS becomes a lot less important, and needs to be evaluated on other factors.

git really has everything we need, bzr does not

anarcat's picture

My personal experience with git has been interesting. I was originally put off by git's intricate command-line interface (about a year or two ago), laughing at stuff like "fsck" or "gc" being a possibility in a VCS. Since then, git has greatly matured, especially on the documentation side. There are still some oddities with the interface, as git is strongly model-driven (compare drupal nodes with git blobs), but those are smoothing out over time. To a newcomer, git is very usable, and Koumbit is switching to git (from svn and mercurial) everywhere. There's a windows client with a TortoiseGit interface. The recent clients have very good inline help, so the documentation side of git is fixed, in my opinion. And the fsck/gc commands are internal commands that are used to fix catastrophes or optimize the repo (can your VCS do that? ;).

I believe the project_issue part is also almost fixed. A lot of work has been done to make version_control_api work with git, and the only manifestation of that critical piece of infrastructure for bzr is this project, a "placeholder" without any time frame for development, and nobody here, as far as I could read, has committed to developing that.

I remember a very important post in the handbook that webchick brought up outlining very basic requirements for leaving CVS: windows/mac support (both fulfill that requirement), Project module migration (only git fits the bill), documentation (both seem to have adequate documentation), release scripts have to be re-written (that needs to be done for both, but I already wrote a hook that creates tarballs automatically out of git tags).
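The hook itself is not shown in this thread, so the following is only a sketch of the core step such a hook would presumably run: turning a tag into a release tarball with git archive. The module name "mymodule" and the tag "6.x-1.0" are hypothetical.

```shell
set -eu
work=$(mktemp -d)
git init --quiet "$work/mymodule"
cd "$work/mymodule"
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "name = My module" > mymodule.info
git add mymodule.info
git commit --quiet -m "Initial release"
git tag 6.x-1.0

# Export the tagged tree under a versioned top-level directory and compress it.
release=mymodule-6.x-1.0
git archive --format=tar --prefix="$release/" 6.x-1.0 | gzip > "$work/$release.tar.gz"

contents=$(tar -tzf "$work/$release.tar.gz")
echo "$contents"
```

A real server-side hook would additionally need to detect which ref was pushed and decide where to publish the file; those parts are left out here.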

Basically, I feel that there are key points lacking for bzr, while git has everything we need.

Not quite.

David Strauss's picture

And the fsck/gc commands are internal commands that are used to fix catastrophes or optimize the repo (can your VCS do that? ;).

Yes, it can.

...but I already wrote a hook that creates tarballs automatically out of git tags

Bazaar can do that out of the box.

I feel that there are key points lacking for bzr

It seems like the only "point" actually lacking is updating Bazaar's Version Control API integration, which is neither a criticism of Bazaar itself nor a huge amount of remaining work, especially relative to everything else to be done.

User experience

emmajane's picture

So this is perhaps an obvious statement to make...but: I am in favour of Bazaar. I first needed to use Bazaar for the Ubuntu Desktop Course project (i.e. documentation) in 2007ish. I currently use Git for StatusNet documentation.

I found the very basic documentation for git (i.e. how do I clone a project to start working on it) to be lacking within the tool itself. When I asked git users about the system, the response was generally that it was "trivial" and I should find it easy to use. And did you know how powerful Git is? As a result Git made me angry inside. It made me feel stupid that I couldn't understand the very basics of how the system worked and as a result it made me not want to contribute because it was too embarrassing to not understand.

The culture of Bazaar users is quite different. The internal help system has been tested by new contributors to Ubuntu. It gives immediate hints on how to do the basic steps that most "casual" contributors will need. Yes, I have been working with Bazaar for longer. I have never felt bad about asking for Bazaar help because of the way the system has been designed. It encourages people to ask more questions.

Compare:
git --help <-- listed alphabetically, assuming you know what you need
bzr --help <-- listed by step-by-step task usage and grouped according to functionality

I've always received very patient help from Bazaar users and core developers. I find the channel to be very accessible. The first time I watched a bzr presentation it was by MySQL community manager Lenz Grimmer at DrupalCon, Szeged. He was enthusiastic and really liked showing off how bzr had made it much easier for their team to collaborate on the development of the database. It was as much about the "code" as it was the community of people. I like that. Having contributed some documentation to Bazaar as well as several (unpaid) conference presentations I then had the pleasure of working with the Bazaar team on the redesign of the front page of their Web site. I found the contributors to be passionate and interested in making the system accessible to all sorts of new people. Not just developers and coders.

For our purposes Git and Bazaar are probably about equal from a functionality point of view. They will both have some things they handle more naturally. The workflow of either system can be adapted to be centralized. I can't comment on how easy it is to script either system because I don't automate my tasks. My guess is that most casual contributors won't either. I also can't comment on how easy it is to work on more than two or three versions of a project or rebasing because I've found my contributions to be very task-oriented. I have no problems working with teams using Bazaar. I have never found it to be lacking in some functionality which prevents me from working and contributing on a project.

The version control system of choice will not change how I participate in Drupal; however, I feel that the Bazaar culture will be welcoming to casual and non-traditional contributors within our Drupal community such as documenters and designers.

Thanks, this perspective is really valuable.

webchick's picture

We definitely can't under-estimate the learning curve here, and it's wonderful to know that bzr goes out of its way to make this easier, and has a supportive community behind it. Your --help comparisons were indeed interesting.

I'd be curious to hear too from some of the designers who are currently working on the core themes, all of which are being developed in GitHub atm since CVS is such a royal pain in the ass to make patches that include new files.

Just curious: is the 2007-ish timeframe also the first/last time you tried Git? I ask because I've been passively monitoring the conversations going on in #bzr and #git since this conversation started, and have basically seen both communities being extremely helpful to people asking questions, which is definitely not exactly what I expected given Git's ... "reputation." But I also don't have a concept as to whether those questions are really 'newbie' or whether they're more advanced; it's indeed possible the threshold for "dumb" questions is much lower in the Git community.

Then again, I'm fully expecting our own "$vcs brigade" to provide first-line support for the vast majority of our contributors (people have long been asking Git questions in #drupal already, for example), so we'd really only be turning "upstream" for particularly gnarly questions, asked by someone who already grasps the fundamentals.

To help make the question of learning curve easier, I almost wonder if making that $vcs version of http://drupal.org/handbook/cvs/quickstart should be a pre-requisite to making the choice...

Timelines

emmajane's picture

I started working with StatusNet fall of 2009. My real Git experience starts there. I'm not finding Git people to be unfriendly. It's just more of a, "oh it's so easy and you're so smart you won't need any help so I'll just continue working on what I'm doing over here in the corner." Which makes you feel worse for not understanding. Whereas in Bazaar saying, "I don't get it" or "I'm confused" seems to trigger very different responses. This is both in project-specific IRC channels and also on microblogging platforms. The knee jerk response with Git people seems to have a period of initiation where you fight for yourself; in Bazaar the knee jerk response seems to be to explain (even when it's not asked for). I wonder if this is handed down from the initial dvcs project teams?

I'll save you the effort

adrinux's picture

Here's a screen shot of what emmajane is talking about:
http://emberapp.com/adrinux/images/git-help-vs-bzr-help/sizes/o

"No extra plugins required"

David Strauss's picture

I can't honestly remove the "no extra plugins required" statement from the wiki page because it's technically true. But, honestly, it's deceptive. Bazaar enables and ships for Mac and Windows with all the major plugins (including rebase), and on Linux systems they're available via packages. It's like criticizing Drupal for having menu navigation as "only a module" -- in both cases, the chunk of functionality is both shipped and enabled for almost all users.

GIT page is bizarre

peterx's picture

There are a lot of times when site owners need to track, install, and test the dev version of a module. With TortoiseCVS I could show a non programmer how to get an update from CVS. Not completely easy as in teaching a kid to eat an apple but easier than teaching people how to eat a durian without breathing*. How do Git and Bazaar shape up for that purpose?

A quick visit to Bazaar produced: http://doc.bazaar.canonical.com/explorer/en/visual-tour-windows.html
The Git equivalent page appears to be: http://code.google.com/p/msysgit/wiki/GitCheetah
Does anyone know of better documentation for GitCheetah in Windows or will we have to write our own? Will we have to wait years for the code to mature then years for documentation to appear? Does anyone know anyone involved in GitCheetah development?

I can understand an experienced developer wanting Git. Every time someone recommends distributed version control to me, they use Git or something not under consideration here. The Git promoters have never shown me a user interface suitable for every user. Bazaar appears to have the interface. I have not used the Bazaar interface but I can understand it from the documentation page. If we convert today, Bazaar would be the best choice. If the conversion is aligned with D8, then GitCheetah might be ready when you want beta testing of D8.

* Durian lovers understand.

there's tortoise git

anarcat's picture

As I mentioned earlier, there's tortoise for git too: http://code.google.com/p/tortoisegit/

Tortoise for git looks good until you get to the install

peterx's picture

Tortoise for git looks good until you get to the install:
Please install msysgit 1.6.1 or above before install tortoisegit http://code.google.com/p/msysgit

Bazaar looks like the complete package while Git appears to be scattered, haphazard. On two major projects where I suggested replacing CVS with SVN or something else, I included Git because some of the developers were using Git when working on open source projects. On both occasions the Git people put forward proposals that looked like multipage regular expressions and we went with SVN. The first time around I did not look at Bazaar because it looked like Git. After the second project I looked at Bazaar again and it looked usable by everyone involved in the project.

As a comparison, Debian is used on about 0.000000 desktop computers while Ubuntu has several percent. Git keeps looking like Debian support pages and forums where everyone talks command line and there are always prerequisites you have to install before you can install the missing prerequisites. Bazaar looks more like the Ubuntu forums where most people speak English. If you are converting today and you want to attract more people to development and testing, Bazaar looks like the best choice.

Of the two contenders, Bazaar also looks like the one more likely to be adopted by the organisations not currently using open source version control. If I was proposing a switch from Coldfusion to Drupal and a complementary change in the development process, I would show them the Bazaar web site but not the Git site+scattered pages.

Every time I look at Bazaar v Git, I see something similar to emmajane's comment:
The version control system of choice will not change how I participate in Drupal; however, I feel that the Bazaar culture will be welcoming to casual and non-traditional contributors within our Drupal community such as documenters and designers.

And there's also Smart Git,

sdboyer's picture

And there's also Smart Git, which is quite nice, though not FOSS.

Also Drupal and Git are a

gordon's picture

Also Drupal and Git are a perfect match, and there is a good reason for this.

Git's internal commands are referred to as the plumbing, and Drupal is community plumbing, so the two are undoubtedly a perfect match.

I rest my case!

;)

--
Gordon Heydon

Provides the benefits of a

Pisco's picture

Provides the benefits of a distributed version control system, while also supporting a traditional centralized workflow, which will make transition easier for people ...

From what I read about Bazaar, Git allows for a centralized workflow just as well as Bazaar does. In fact I use Git at work in a centralized manner. I'd remove that point as a pro for Bazaar.

I'd recommend using Git because it suits perfectly for the workflow we already have with projects at Drupal.org. Each project should have its own Git repository. Each project can be given very fine-grained access rights using Gitosis or any other standard way of controlling access to a Linux account over ssh or HTTP/S. With Gitosis you can give read-only or write access to groups or individuals. It works with ssh using public keys, which is easy and secure. Have a look at the Gitosis example conf. And a very nice detail about Gitosis is that it uses a Git repository itself for storing its configuration ... eat your own dogfood. :-)

As for Git guis I'd recommend GitX for Mac users, or Gitk on Linux (and Windows?).

From what I hear Git can do everything Bazaar can ... and more. As far as the documentation is concerned, I find it very good, but I understand that the initial learning curve might be perceived as steep for a newbie coming from Subversion. But there I must say: if the available documentation is not good enough, why not make it better or write our own custom-tailored documentation that describes the common use cases found when working with Drupal?

In fact I use Git at work in

David Strauss's picture

In fact I use Git at work in a centralized manner. I'd remove that point as a pro for Bazaar.

Sorry, but git does not support anything close to the Bazaar checkout/update/commit workflow. You can bless a "central" branch, but the similarity ends there.

I'd recommend using Git because it suits perfectly for the workflow we already have with projects at Drupal.org. [...]

You seem to be recommending git as a "perfect" fit here with reasons that are not unique to git or Bazaar.

From what I hear Git can do everything Bazaar can ... and more.

Vague hearsay about capability doesn't contribute to making a decision here.

Sorry, but git does not

Pisco's picture

Sorry, but git does not support anything close to the Bazaar checkout/update/commit workflow. You can bless a "central" branch, but the similarity ends there.

As I said, I don't know what the possibilities with Bazaar are, but as a matter of fact I check out a repository, I work with it using all the branches I want, merging, (interactive) rebasing and all of the Git wickedry, and push back to the central repository whenever I want. Maybe you want to explain how and why Bazaar is better at that. Or if I've overlooked your comment on exactly that matter, please give me a pointer. I'm not a Git evangelist after all ...

You seem to be recommending git as a "perfect" fit here with reasons that are not unique to git or Bazaar.

That may be true, but isn't the important part that it is a perfect fit?

Vague hearsay about capability doesn't contribute to making a decision here.

You're right, I apologize. A Google search for "git version control" gives me 694,000 results, compared to 459,000 results for "bazaar version control". My impression might not be all that wrong ... Also have a look at the Google trending tool Git vs. Bazaar.

But hey, I really don't want to start a flame war here!

Please read the rest of the

yhager's picture

Please read the rest of the comments. We've discussed the popularity route before - we are not deciding based on that - but based on what is best for our case. Both systems are pretty on par in terms of capabilities, and the discussion focuses on migration issues, documentation, training etc.

As I said I don't know what

webchick's picture

As I said, I don't know what the possibilities with Bazaar are, but as a matter of fact I check out a repository, I work with it using all the branches I want, merging, (interactive) rebasing and all of the Git wickedry, and push back to the central repository whenever I want.

Right, what David is talking about is a very simple, centralized workflow that doesn't involve any such wizardry. :) Bear in mind something killes said earlier: about 95% of our contributors never use anything other than cvs checkout, cvs up, and cvs commit. Then cvs add/remove once in a while, and very occasionally, cvs tag (-b).

Right, what David is talking

Pisco's picture

Right, what David is talking about is a very simple, centralized workflow that doesn't involve any such wizardry.

Maybe I should have put it differently: having powerful features at hand doesn't mean you have to use them. What you want is this:

  • git clone git://github.com/jquery/jquery.git (to get the repository)
  • git pull (to get the newest changes from the central repository)
  • make your changes
  • git commit -a; git push (to send your commits to the central repository)

wasn't that easy? Again, Git having these powerful features doesn't mean you have to use them. (You won't have rights to push to the jquery repo!)
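
For anyone who wants to try those four steps without push rights anywhere, here is a minimal sketch that runs them against a throwaway local repository. The bare repository central.git, the file README.txt and the "Example Dev" identity are placeholders made up for this illustration; nothing here touches the jQuery repository or drupal.org.

```shell
#!/bin/sh
# Sketch: the clone / pull / commit / push cycle against a local stand-in
# for a central repository. All names below are illustrative placeholders.
set -e

# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the central repository (bare: it has no working tree).
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# Seed it with one commit so there is something to clone.
git init -q seed
(
  cd seed
  git symbolic-ref HEAD refs/heads/master
  echo "first line" > README.txt
  git add README.txt
  git commit -q -m "Initial commit"
  git push -q ../central.git master
)

# 1. git clone (roughly: cvs checkout)
git clone -q central.git checkout
cd checkout

# 2. git pull (roughly: cvs update)
git pull -q

# 3. make your changes
echo "second line" >> README.txt

# 4. git commit -a; git push (roughly: cvs commit, split into two steps)
git commit -q -a -m "Add a second line"
git push -q

central_count=$(git --git-dir="$work/central.git" rev-list --count master)
echo "commits now in the central repository: $central_count"
```

Until the final git push, the new commit exists only in the local clone, which is the one step that has no direct counterpart in the cvs commands listed earlier.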

Thanks, this helps a lot.

webchick's picture

Seems to me that clone == checkout, pull == update, commit == commit (for incremental changes), and then there's a new extra step push, which basically says "Ok, these changes are good enough for Drupal.org."

Now could one of the Bzr folks respond and say why this isn't exactly as easy as the equivalent in bzr?

Of course...

chx's picture

... if you want the most basic workflow then that's it. However, that you are already using a switch in git is a glimpse behind the curtain where the uglier things are. And of course, you can alias that ugliness away and try to use git as you would bzr -- keeping things simple.

However, that's not what git is made for. What we debate here is this: do we want the most powerful, highest-performing version control system out there or the friendliest? That's it. Because there is no doubt that bzr in documentation, help and mindset is about friendliness while the primary drive for git is being powerful. Decide.

Edit: Before someone twists my words -- this does not mean that bzr is slow and not powerful, nor does it mean that git is unfriendly. Neither is true any more. That does not change the primary drives, however.

However, that you are already

Pisco's picture

However, that you are already using a switch in git is a glimpse behind the curtain where the uglier things are. And of course, you can alias that ugliness away and try to use git as you would bzr -- keeping things simple.

What exactly are you talking about?

The example I gave works as is, no alias, no special setup, just plain git. You can try it with the given URL for the jQuery repository.

The question was, how does one do the same with Bazaar.

you can alias that ugliness

scor's picture

you can alias that ugliness away and try to use git as you would bzr -- keeping things simple.

oh, on the topic of the aliases, can you tell me how you get the function names to be displayed in patches without using some sort of alias in bzr? my ~/.bashrc still contains alias bdiff='bzr diff --diff-options -up' back from when I tried bzr back last year. I eventually dropped bzr when it refused to merge and threw errors about KnitPackRepository and rich-root support (didn't find anything useful online at the time). Later some files would not get updated with bzr pull (but I'm not blaming bzr, it's easy to use and I must be a real noob)... anyways, the point here is that it's hard to expect a tool to do 100% of what you want, and that's why you use aliases.

Is this one of those games

yhager's picture

Is this one of those games where the last man standing wins? we are going in circles. All this has been said and handled in this thread.
There is no consensus that one system is better than the others (can there ever be?). I am convinced that both systems support any workflow that we need, and staging this/centralized that will not lead us to a decision. I am taking a guess here, that most people are thinking in similar terms.

There is a lot of info gathered up in this wiki page - it's time to move on.

What I suggest we do now that will help us decide is start careful planning of the transition - list out the tasks, assign them to people, and get time & effort estimates. Just making this list will help us decide. Or maybe we can decide that now based on the brigades assigned (not their size, their strength).

webchick, you probably have the toughest part of all, to represent the quiet majority, who is not commenting anywhere in this thread. It's much easier to take a side :).

My goal...

webchick's picture

...is to understand the fundamental issues that are going to confront our community when we make this move, so that I can help lay down some infrastructure to deal with it. I'm coming at this squarely with my "community manager" hat on.

Part of that is cutting past the "fanboy" and "hearsay" BS and evaluating both options on their actual merits. This means figuring out exactly what real, actual challenges we will have, rather than unhelpful and vague statements like "Git is too complex" / "Bzr is more user-friendly." Posts like the above, which lay out exactly what the workflow would be like for module/theme developers, help with that.

So, no. It's not time to move on yet, at least from where I sit. I don't feel comfortable enough as an evaluator that we've adequately addressed the learning curve aspect, which is the main criterion blocking a decision.

Simple example

cha0s's picture

Well, the equivalent in bazaar is

bzr checkout http://drupal.org/whatever

make changes

bzr commit

Now this is if you want it just like CVS, where commits go right to the master repo. If you wanna do it more distributed you do this:

bzr checkout http://drupal.org/whatever

bzr unbind

changes
bzr commit

changes
bzr commit

changes
bzr commit

bzr bind

bzr commit

Really, it's that simple...

Or you can do

bzr branch http://drupal.org/whatever

changes
bzr commit

changes
bzr commit

changes
bzr commit

bzr push

To work in a more wholly distributed way. You can choose.

Whoa, we can do a lot better than that

David Strauss's picture

bzr checkout immediately followed by bzr unbind is the same as just running bzr branch. And if you want to use the centralized-style workflow, it's easier to use bzr commit --local to sneak in local commits instead of binding and unbinding. Everything committed with bzr commit --local gets queued up for the next bzr commit.

Yep, you're right.

cha0s's picture

You are correct, but the --local commits are undone by a revert iirc. If that's (still) the case, IMO it's better to unbind first.

You are correct, but the

David Strauss's picture

You are correct, but the --local commits are undone by a revert iirc.

Just tried. It doesn't.

Curium:Sandbox straussd$ bzr init junk
Created a standalone tree (format: 2a)                                        
Curium:Sandbox straussd$ cd junk
Curium:junk straussd$ touch A
Curium:junk straussd$ bzr add
adding A
Curium:junk straussd$ bzr commit -m"A"
Committing to: /Users/straussd/Sandbox/junk/                                  
added A
Committed revision 1.
Curium:junk straussd$ cd ../
Curium:Sandbox straussd$ bzr co junk junk2
Curium:Sandbox straussd$ cd junk2                                             
Curium:junk2 straussd$ touch B
Curium:junk2 straussd$ bzr add
adding B
Curium:junk2 straussd$ bzr commit --local -m"B"
Committing to: /Users/straussd/Sandbox/junk2/                                 
added B
Committed revision 2.
Curium:junk2 straussd$ bzr stat
Curium:junk2 straussd$ bzr log
------------------------------------------------------------
revno: 2
committer: David Strauss <david@example.com>
branch nick: junk2
timestamp: Thu 2010-02-11 13:58:37 +0000
message:
  B
------------------------------------------------------------
revno: 1
committer: David Strauss <david@example.com>
branch nick: junk
timestamp: Thu 2010-02-11 13:58:19 +0000
message:
  A
Curium:junk2 straussd$ bzr revert
Curium:junk2 straussd$ bzr log                                                
------------------------------------------------------------
revno: 2
committer: David Strauss <david@example.com>
branch nick: junk2
timestamp: Thu 2010-02-11 13:58:37 +0000
message:
  B
------------------------------------------------------------
revno: 1
committer: David Strauss <david@example.com>
branch nick: junk
timestamp: Thu 2010-02-11 13:58:19 +0000
message:
  A
Curium:junk2 straussd$ bzr commit -m"B2"
bzr: ERROR: Bound branch BzrBranch7('file:///Users/straussd/Sandbox/junk2/') is out of date with master branch BzrBranch7('file:///Users/straussd/Sandbox/junk/').
To commit to master branch, run update and then commit.
You can also pass --local to commit to continue working disconnected.
Curium:junk2 straussd$ bzr up
-D  B                                                                         
All changes applied successfully.                                             
+N  B                                                                         
All changes applied successfully.                                             
Updated to revision 1.
Your local commits will now show as pending merges with 'bzr status', and can be committed with 'bzr commit'.
Curium:junk2 straussd$ bzr stat
added:
  B
pending merge tips: (use -v to see all merge revisions)
  David Strauss 2010-02-11 B
Curium:junk2 straussd$ bzr commit -m "B2"
Committing to: /Users/straussd/Sandbox/junk/                                  
added B
Committed revision 2.                                                         
Curium:junk2 straussd$

Popularity!

fago's picture

Things change: Bzr became faster, git more user-friendly. So I don't think there is that much of a difference. However, my feeling is that git is already more popular than bzr - not only in the drupal community.

It would be interesting to have actual popularity statistics here. With google I found some vcs-usage numbers gathered by debian through their source package definitions:
git: 2570
bzr: 212
-> http://upsilon.cc/~zack/stuff/vcs-usage/

Also quite interesting are the gnome survey results:
http://blogs.gnome.org/newren/2009/01/03/gnome-dvcs-survey-results/

So wouldn't it be nice for new developers to just work with the system they know?

As has been mentioned

Garrett Albright's picture

As has been mentioned elsewhere in this thread: This is not a popularity contest, so usage statistics are worthless. We're looking for the most appropriate tool, not the most popular. And considering what new coders and non-coders will have to learn in the future is far more important than what current coders know now.

I disagree. When the

fago's picture

I disagree. When the comparison shows that both tools are an appropriate choice, the more popular one is the better choice. Popularity isn't all, but for sure it matters.

Okay, then, let's let the

Garrett Albright's picture

Okay, then, let's let the comparison show that both tools are an appropriate choice first. I don't think that's happening now. David Strauss is making some very strong arguments for selecting Bazaar, whereas most of the arguments for Git seem to be "I/a lot of people already use it" and "It's not/doesn't have to be that hard," neither of which are helping its case.

Still I disagree

fago's picture

When the main argument for bazaar is that it's supposed to be easier to learn, then it matters whether fewer or more users need to learn the dvcs of choice.

No

garyvdm's picture

No. A git clone == bzr branch. There isn't anything in git that works like a bzr checkout (aka bound branch.)
This may be helpful: http://wiki.bazaar.canonical.com/CheckoutTutorial

To be honest I only use bzr checkouts when I'm working in a corporate environment and never really when working in open source projects.

I use Bazaar checkouts as a

David Strauss's picture

I use Bazaar checkouts as a place to stage and test changes going into the mainline:

[branch to merge] ---merge--> [my checkout of trunk] ---commit--> [trunk]

The git equivalent isn't far off, though.

Simple git workflow with aliases

scor's picture

Having the right aliases set up for git can give you a pretty simple workflow which ignores the staging area (index), pushes the commits right away and is similar to what cvs/svn users are used to. I've posted these aliases and instructions for newbies at Simple git workflow for creating core patches. In the end the only commands you need to use are:
- git clone (cvs checkout)
- git df (cvs diff)
- git clear (cvs update -C)
- git ci (cvs commit)
- git pull (cvs update)
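
To give a feel for how such aliases behave, here is a small sketch. The three definitions below are guesses made for this illustration (the real ones live on the linked page and may well differ): df as a diff without path prefixes, ci as commit -a, and clear as reset --hard. They are stored in a throwaway repository's own config, so nothing global is changed.

```shell
#!/bin/sh
# Sketch: cvs-like git aliases in a scratch repository.
# The alias bodies are assumptions for illustration, not the ones from the linked page.
set -e

# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "one" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Hypothetical alias definitions, local to this repository.
git config alias.df 'diff --no-prefix'
git config alias.ci 'commit -a'
git config alias.clear 'reset --hard'

# Make a change and roll a patch without the a/ and b/ path prefixes.
echo "two" >> file.txt
patch_text=$(git df)
printf '%s\n' "$patch_text"

# Throw the local change away again (roughly: cvs update -C).
git clear -q
lines_after_clear=$(wc -l < file.txt | tr -d ' ')
```

Note that reset --hard only restores tracked files; files that were never added stay where they are, so this is only an approximation of cvs update -C.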

git diff --no-prefix && git diff --staged --no-prefix

webchick's picture

That is simply horrifying. :)

Is there a similarly horrifying command for bzr, or is it closer to cvs diff -up?

That alias page is very nicely written, but aliases are definitely not something that our contributors have ever had to deal with before, so it does count against Git in the learning curve department.

Still kinda long for us lazy ones

cha0s's picture

bzr diff --diff-options=-p (u is implied)

Edit: should be noted too you'll need to be bound to the master repo for this to work. That is, if you commit locally then do 'diff', it won't find any changes. So if you're talking about an unbind'd repo, it looks more like this:

bzr diff --diff-options=-p --old http://drupal.org/whatever

bzr diff --diff-options -up

scor's picture

for bzr you would type bzr diff --diff-options -up unless there is a shorter version I'm not aware of?

bzr can alias interactively

pwolanin's picture

bzr can alias interactively like git, so you can save typing with:

bzr alias diff="diff --diff-options=-p"

The need to add --no-prefix when rolling a git diff to post to d.o has been my regular bane (though one I've also been too lazy to alias away).

I've always used clean 'git

yhager's picture

I've always used clean 'git diff'. not sure why I should bother with such an alias.

I have been working with git for ~7 months and my only alias is 'co=checkout', saves me some typing :).
I've used 'git diff --staged' a handful of times in all that time.

we don't need that kludge

anarcat's picture

--no-prefix is required only for CVS interoperation at this point. We can integrate regular patches made with diff without any trouble:

anarcat@desktop002:provision$ git diff | git apply -R
anarcat@desktop002:provision$

And even now, someone that receives a patch made from git without --no-prefix can just use patch -p1 to skip the a/b thing.
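
A quick way to check this on your own machine: the sketch below rolls a default git diff (with the a/ and b/ prefixes), reverses it with the same git diff | git apply -R pipeline shown above, and then applies the saved patch file again. The scratch repository, file.txt and change.patch are placeholder names for this illustration.

```shell
#!/bin/sh
# Sketch: a default (prefixed) git diff round-trips without --no-prefix.
# Repository and file names are illustrative placeholders.
set -e

# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
echo "one" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# An uncommitted change, saved as a regular patch with a/ and b/ prefixes.
echo "two" >> file.txt
git diff > "$work/change.patch"
header=$(grep '^+++ ' "$work/change.patch")
echo "patch header: $header"

# The pipeline from the comment: reverse-apply the diff, undoing the change.
git diff | git apply -R
after_revert=$(cat file.txt)

# Apply the saved patch again; git apply strips one leading path component by default.
git apply "$work/change.patch"
line_count=$(wc -l < file.txt | tr -d ' ')
```

Outside of git, the same change.patch is what one would feed to patch -p1, as mentioned above; the -p1 plays the role of git apply's default prefix stripping.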

Not a good workflow

David Strauss's picture

There are serious flaws with that workflow:

  • It doesn't allow local commits. It forces the user to keep all the changes floating in either the stage or completely uncommitted. That isn't good for a user checkpointing her own work.
  • As a side-effect of not committing, there's no mechanism to share ongoing work with other users except sending classic patch files and having each user apply them.
  • When shared in patch form, there's no way to integrate another person's work without trashing your own.
  • The way git handles stage, things added to it aren't removed even if the work is reverted in the working copy. From the way the diff gets built, it seems that would probably produce a patch that (1) does a change and (2) reverts it later. Or, worse, it would make a patch that quietly applies the staged change even though the user reverted it.
  • While not a regression from CVS, pulling from upstream to update can produce nasty conflicts in the working copy that are hard to back out.

A git workflow that solves those issues, in my experience, requires:

  • Maintaining the local "master" branch as a pristine copy of upstream. (The user "git pull"s into that.)
  • Branching from the local "master" to a new branch for work.
  • Committing to the new local branch.
  • Diffing from the local "master" to the new branch.
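
Spelled out as commands, those four points might look like the sketch below. It is only an illustration under assumptions: the local directory upstream stands in for the real upstream, and the branch name issue-123 and the file names are made up.

```shell
#!/bin/sh
# Sketch: pristine local master, work on a separate branch, diff between the two.
# "upstream", "issue-123" and the file names are illustrative placeholders.
set -e

# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for upstream, with a single commit.
git init -q upstream
(
  cd upstream
  git symbolic-ref HEAD refs/heads/master
  echo "one" > file.txt
  git add file.txt
  git commit -q -m "Initial commit"
)

git clone -q upstream mine
cd mine

# Branch from the local master and checkpoint work there with local commits.
git checkout -q -b issue-123
echo "two" >> file.txt
git commit -q -a -m "Checkpoint: add second line"
echo "three" >> file.txt
git commit -q -a -m "Checkpoint: add third line"

# Refresh the pristine master from upstream; no local work lives on it.
git checkout -q master
git pull -q
master_lines=$(wc -l < file.txt | tr -d ' ')

# Diff from the local master to the work branch to roll the patch.
git diff --no-prefix master issue-123 > "$work/issue-123.patch"
added=$(grep -c '^+[^+]' "$work/issue-123.patch")
echo "lines added by the patch: $added"
```

Both checkpoints end up in one patch, while master still matches upstream and can be pulled into without conflicts from local work.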

To partially duplicate some earlier posts, I have written some on Bazaar workflows. These both need cleanup but give a basic idea:

Bazaar doesn't have the flaws I highlighted above for the git workflow and doesn't add the complexity of dealing with a local "master" and development branch.

Basically the core operations are:

  • bzr branch [source]
  • bzr merge <-- assumes by default you want to merge from where you branched
  • bzr add
  • bzr commit
  • bzr diff --old=[source]

That workflow allows:

  • Local commits
  • Pushing or merging the branch elsewhere for centralized collaboration
  • Using "bzr send" to make merge directives (read: super-patches) that other people can merge to integrate changes with their own
  • Generating normal patches without including "staged" changes that have been since reverted
  • A clean way to merge changes from upstream with an easy way to back out ("bzr revert")

I agree with David in that

Pisco's picture

I agree with David that the workflow described here is not a good one, and that's certainly not how you're supposed to be using Git. I suppose the reason they described it this way was that they wanted to be able to issue git reset --hard to undo all local changes and then create new patches. This doesn't make a lot of sense: who wants to throw away the improvement he just made, given that it might take quite long for the patch to be applied by the maintainer?

What does bzr merge do if after an initial bzr branch [source] you already issued various bzr add and bzr commit commands?

If I do as you described and I created a patch using bzr diff --old=[source], I then move on to work on another issue in the same project. All this happens in a day, no chance that the previous patch is already included upstream. I repeatedly issue bzr add and bzr commit. Is it true that when I then issue bzr diff --old=[source] I will end up with a patch that contains the changes for the first and the second issue?

Anticipating your answer I suspect that you need to work with local branches in Bazaar too.

Here we're delving into advanced usage of DVCS. With Git the best practice is to create a new local branch for every feature you want to implement (or fix). This is not a flaw, it's the way Git is intended to be used; that's why branches are so unbelievably cheap, and that's why merging is so flexible (with git you can merge more than two commits/branches in one go).

What you described for Bazaar holds true for Git too:

  • git clone [source]
  • git pull
  • git add
  • git commit
  • git diff origin

and this workflow too allows:

  • Local commits
  • Pushing or merging the branch elsewhere for centralized collaboration
  • Using "git format-patch" to make merge directives (read: super-patches) that other people can merge to integrate changes with their own
  • Generating normal patches without including "staged" changes that have been since reverted
  • A clean way to merge changes from upstream with an easy way to back out (git reset)
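
As a concrete look at the git format-patch item, the sketch below has one clone export its local commits as patch files and a second clone import them with git am, which keeps each commit's message and author. Whether that is a fair counterpart to bzr send is the claim being made above, not something this sketch settles. The clones alice and bob, the outbox directory and the local upstream are placeholders made up for the illustration.

```shell
#!/bin/sh
# Sketch: sharing local commits as files with git format-patch and git am.
# "upstream", "alice", "bob" and "outbox" are illustrative placeholders.
set -e

# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for upstream, with a single commit.
git init -q upstream
(
  cd upstream
  git symbolic-ref HEAD refs/heads/master
  echo "one" > file.txt
  git add file.txt
  git commit -q -m "Initial commit"
)

# First contributor: two local commits, exported as one file per commit.
git clone -q upstream alice
(
  cd alice
  echo "two" >> file.txt
  git commit -q -a -m "Add second line"
  echo "three" >> file.txt
  git commit -q -a -m "Add third line"
  git format-patch -q -o "$work/outbox" origin/master
)
patch_count=$(ls "$work/outbox" | wc -l | tr -d ' ')

# Second contributor: import those commits on top of a fresh clone.
git clone -q upstream bob
cd bob
git am -q "$work/outbox"/*.patch
bob_commits=$(git rev-list --count HEAD)
bob_last_subject=$(git log -1 --format=%s)
echo "bob now has $bob_commits commits, last one: $bob_last_subject"
```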

What does bzr merge do if

David Strauss's picture

What does bzr merge do if after an initial bzr branch [source] you already issued various bzr add and bzr commit commands?

Well, bzr add creates an uncommitted change. Merges are not permitted (unless you pass --force) when there are uncommitted changes. So, you'd first want to commit the changes from bzr add.

If you have local commits and the upstream branch has changed, too, the two branches are "divergent." A merge, by default, will perform a three-way merge using the most recent common ancestor between the branches, the latest remote commit, and the latest local commit. Bazaar supports criss-cross merges and multiple merge algorithms, but I won't get into those here.

From there, you get a branch with the uncommitted merge. If there are no conflicts, you're free to commit the merge. If there are, you must resolve them first.

Unlike git:

  • A merge will always have a special merge revision with its parents being the revisions gained in the merge (shown as nested in bzr log). In git, to contrast, if either branch has no unique commits, git performs a fast-forward merge by default. If you want a fast-forward "merge" in Bazaar, you use push or pull.
  • Merges are never automatically committed, even if there aren't conflicts.
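
The git half of that comparison can be seen in a scratch repository. The sketch below merges a branch into a master that has no unique commits, first with git's default behaviour (a fast-forward, so no merge revision appears) and then with --no-ff, which forces a merge commit with two parents. The repository and the branch name topic are placeholders; the Bazaar side is not exercised here.

```shell
#!/bin/sh
# Sketch: git's default fast-forward merge versus a forced merge commit (--no-ff).
# Repository, file and branch names are illustrative placeholders.
set -e

# Placeholder identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# One commit on a side branch; master itself gets no new commits.
git checkout -q -b topic
echo "two" >> file.txt
git commit -q -a -m "Topic change"
git checkout -q master

# Default merge: master just moves forward to the topic commit.
git merge -q topic
parents_ff=$(git log -1 --format=%P | wc -w | tr -d ' ')

# Rewind master and merge again, this time forcing a merge commit.
git reset -q --hard HEAD~1
git merge -q --no-ff -m "Merge topic" topic
parents_noff=$(git log -1 --format=%P | wc -w | tr -d ' ')

echo "parents of HEAD after fast-forward: $parents_ff"
echo "parents of HEAD after --no-ff:      $parents_noff"
```

Adding --no-commit to the second merge would additionally stop before committing, which is closer to the second bullet above.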

Git does have different merge

Pisco's picture

Git does have different merge strategies too, and it supports merging of more than two commits/branches in one go. All that is of absolutely no importance for the overall workflow.

You didn't answer the question concerning the patch resulting from the situation I described.

I just installed Bazaar 2.0.3 using MacPorts ... I'll try to find it out myself ... looking forward to learning something new :-)

All this happens in a day, no

David Strauss's picture

All this happens in a day, no chance that the previous patch is already included upstream. I repeatedly issue bzr add and bzr commit. Is it true that when I then issue bzr diff --old=[source] I will end up with a patch that contains the changes for the first and the second issue?

Yes. If you want to work on two separate issues, use separate branches.

Read the label

scor's picture

The title of the page says clearly "Simple git workflow for creating core patches" and does not pretend to be an exhaustive documentation on how to use git in general or how to use git for maintaining modules. It's simply about mimicking the CVS workflow people are used to today, and provide a workaround for adding new files in patches. You don't do (local) commits, you just create patches. You throw away all your changes (which are supposed to be in a patch) the same way you do with cvs up -C. Most the arguments above don't apply here because you don't commit anything locally and keep the master branch clean. It's just a patch machine. For core development, local commits are worthless, what matter is the patches sitting in the issue queue. There are more sophisticated ways of doing similar things like Stacked Git but it's out of the scope for this page. Further feedback should go as comments at the bottom of http://drupal.org/node/707484

GUI Support

Pisco's picture

For me as someone who doesn't know Bazaar, it seems like Bazaar has better GUI support on Windows. This is clearly a big pro for Bazaar, but don't be fooled by the current state of matters. Git is moving very, very fast and has proven to take criticism seriously; take a look at this list of Git flaws I found in the Bazaar documentation. From this site:

Update: Since this page was created Git has come a long way. All of the issues listed have been addressed and this page should probably be retired.

Personally I expect (good) Git GUIs to show up very soon ... I don't know if it's true, but I get the impression that Git is more widely used, and from what webchick has written it's especially true for Drupal.

I work on OS X (and Linux (Ubuntu)). I use either the shell or GitX, which is a very handy and easy-to-use GUI; I really recommend it.

A friend just told me that there is a TortoiseGit which seems to be a very handy tool for those working on Windows.

How much does TortoiseGit

David Strauss's picture

How much does TortoiseGit differ from TortoiseBzr?

See the screenshots here:

reglogge's picture

http://code.google.com/p/tortoisegit/

To me it looks quite fully featured, with built-in diffing, a nice graphical log viewer, and tons of other features. Having worked with Tortoise svn for a long time, this seems quite up to par. I haven't used Tortoise Bazaar however, so I can't really compare those two.

Edit: You still have to install Msysgit, though, on Windows.

Frustrated by Bazaar installation

reglogge's picture

Having "outed" myself earlier as a git-user and having read all the comments about Bazaar being easier to use, I just went ahead and tried it out. This is what happened:

As I'm working on a Mac I installed Bazaar via Macports (which is listed as one of the options on Bazaar's homepage) since I already had it running and use it for installing all kinds of stuff.
- The install took about 10 minutes with downloading, building and installing all the dependencies (Python took longest). That's not unusual.
- I ended up with a functional Bazaar, but only functional from the command line :-(

Ok, guess I have to install the Bazaar explorer too:
- Heading over to Bazaar Explorer's homepage at https://launchpad.net/bzr-explorer, there are downloadable executables, but only for windows :-(. Mac users have to download a tarball and install from source :-(. No Macports package available :-(
- My repeated attempts to get this installed and running failed, mostly due to a lack of documentation (or my stupidity?).

Next try was installing the binaries for OS/X 10.6 from Canonical's website after cleaning out the Macports install.
- Install went through without a hiccup, but: what next?
- There was no graphical client visible anywhere.
- Trying "bzr explore" from the command line in a bzr-repository failed miserably with ever more error messages piling up on my screen.

I gave up at this point and just want to add two more points:
1.) I know that this is a very subjective and probably unique experience. What I found however was that I was unable to find good documentation on how to resolve these problems. I guess I should have tried the binary installer from Canonical first, since my Python install seems to be messed up now, but then again, there were no indications as to how to proceed. At least it seems to me that we would have to provide some good documentation on installing Bazaar.
2.) Installing git about a year ago also wasn't without problems.

What I took away from this experiment however was the personal opinion that both systems have their pitfalls and general claims like "Bazaar is so much easier" or "Git is much more complicated" IMHO aren't really true.

And no, I don't want to restart the discussion on which system is "better" or "worse". Since I am at best a very marginal contributor in this community I will gladly defer to the really heavy users when they make their decision.

The problem is MacPorts. I

David Strauss's picture

The problem is MacPorts. I can't bring myself to use such a terrible service, and I've tried multiple times. I feel like it makes me compile the whole GNU toolchain every time I install a package. My experience with other Mac package managers has not been better, which is why I do development on Linux for anything non-trivial.

Prospective Mac users of Bazaar should simply install from the .dmg on the Bazaar site. I believe that bundle includes BzrExplorer. I haven't had trouble with those bundles for years.

Unfortunately, I can't vouch for how MacPorts has left your system, but I wouldn't be surprised if it jinxed your install from the .dmg.

MacPorts works without a

Pisco's picture

MacPorts works without a flaw. I have been using Ruby, Perl, Apache2, PHP, mod_perl and many more from MacPorts for years and it just works. If you want to get rid of MacPorts just delete the folder called /opt in the root of your HD and instantly everything MacPorts installed is removed.

But I think that MacPorts is not an option for the average user, as I hear Bazaar offers an easy to install .dmg and so does Git: Git for OS X.

Maybe I can help.

garyvdm's picture

Hi reglogge

I'm a bazaar (non Canonical) developer. I have successfully installed bzr on a mac once before (about a month ago), but I don't have access to that machine now, so unfortunately my ability to help is limited.

I think you may not have installed Qt. You can get that here: http://qt.nokia.com/products/platform/qt-for-mac (The bzr install page does mention that you need to do this, but there is no link, and it should stand out a bit more. I'll fix this shortly.)

If you have installed Qt, could you please post the error you get when you run bzr explore?

Gary

How about a vote?

ceege111's picture

I would simply like to vote for git because I'm familiar with it

That's probably not the best

mcrittenden's picture

That's probably not the best idea. Git will obviously win the vote, but just because more Drupal developers use git doesn't make it the right choice for this specific use case.

Twitter asked for my

perandre's picture

Twitter asked for my opinion!

I have become a fan of git, and handle all my Drupal projects with git. It's easy to use, and places like GitHub make it easy to take the step into using it for all kinds of things. If we go for git, I believe it will be easier for people to contribute, as they're already familiar with the concept.

Per André Rønsen | Front | Twitter: @perandre

Git on Windows rehashed

Heine's picture

A major TODO imo is to find out the future of Git for Windows.

October 22 was the last preview release / beta / very stable beta of the native Git implementation msysgit. (I don't consider Git on Cygwin a solution).

Will it get a release anytime soon? Will it chase the main git releases closely? Do the git maintainers take use on Windows into account when adding certain features?

On IRC there was a rumor that msysgit was lobbying to be merged back into the main project. If that happens, that's a boon for git, if not, will msysgit linger?

As to TortoiseBzr / TortoiseHg, contrary to TortoiseGit, these use (AFAIK) python as an Explorer shell extension. Is that true? I'm not really comfortable with that.

Unlike git, Bazaar has been

David Strauss's picture

Unlike git, Bazaar has been designed from the beginning to not rely on facilities unavailable (or poorly implemented) on Windows. git has a history of designing around POSIX-y things that don't port well to Windows.

That concerns me, but

Heine's picture

That concerns me, but apparently, I'm the only one :)

Not the only one, it

sdboyer's picture

Not the only one, it certainly is an important issue. My limited experience, and more importantly feedback that I've gotten from other people who've used it more, is that it does work well enough on Windows to be able to reliably use it as a vcs for development. Handling complex production systems may be a different story.

Confirmed

jpetso's picture

I use Git for Windows at work, and though it's not quite as fast as under Linux (not surprising, as its foundations have been optimized for Linux/Unix filesystem principles), it's still very usable and bug-free in the current mislabeled "-preview" versions.

In fact, the worst thing about Git on Windows is Windows' own lack of a proper terminal. That shortcoming can be fixed with Console2.

The other thing that has bitten us every now and then is the line-ending issue. Due to the local-repository approach of DVCS, you need to catch line endings on the client side, drupal.org can't really do anything about checked-in files with improper (i.e. non-Unix) line endings. Git has a config option called core.autocrlf, but I found this to be spotty in combination with diffs and Git's internal is-file-modified checks. At work, we turn off any automatic line-ending conversions so this is not an issue, but for a huge community with a good number of incompetent programmers, it would be desirable to have something better in place.

This is definitely one of the issues to investigate, if I'm not mistaken then Bazaar has more robust line-ending screw-up prevention in place. (Detailed report appreciated.) Git, like all DVCS, has of course capabilities to check commits on such issues with hooks, only if you create a new repository with git init, these hooks are not in place yet and would need to be fetched from d.o first.
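As a rough illustration of the kind of line-ending check discussed here, the following shell sketch detects carriage returns in a file. The function name has_crlf is made up for this example and is not part of Git or Bazaar; a real hook would still need to decide which files to scan and how to report failures.

```shell
#!/bin/sh
# has_crlf FILE
# Succeeds (exit status 0) if FILE contains at least one carriage return,
# i.e. it probably has Windows (CRLF) or old Mac (CR) line endings.
# Hypothetical helper for illustration only.
has_crlf() {
    # Command substitution strips trailing newlines but keeps the CR itself.
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Possible use inside a client-side hook (assumption: the files to check
# are passed as arguments; a real pre-commit hook would list staged files).
for f in "$@"; do
    if has_crlf "$f"; then
        echo "non-Unix line endings found in: $f" >&2
    fi
done
```

The same function could in principle be reused by a server-side check before accepting a push, which is why it is kept separate from the loop that reports problems.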

I can't really comment on the GUI tools as I prefer to always use version control systems through the command line. I can say that Git GUI (which ships with msysgit) is not the prettiest GUI ever in existence, but it does provide a working graphical interface with a sufficient feature set. I could well imagine that TortoiseSVN-like tools have potential for better usability. I still don't grasp why people would want to use a GUI for any VCS operation except history with branches and stuff. But hey, this is not my domain so I'll let other people handle that.

...if I'm not mistaken then

David Strauss's picture

...if I'm not mistaken then Bazaar has more robust line-ending screw-up prevention in place.

I would have to know more about the system in git to give Bazaar a relative evaluation of robustness here. However, I can say Bazaar's support is solid:

http://doc.bazaar.canonical.com/development/en/user-reference/eol-help.html

The safest configuration method we can do on our side is make a check just before applying a push/commit/whatever on the Drupal server site that verifies a lack of problem line-endings in the tip revision before approving the operation. And, if we reject the operation, giving the user instructions for fixing things. In Bazaar's case, "fixing things" would involve setting the client's EOL filters and committing once. There's work underway to make the setting branch-level and replicating it with the initial checkout or branching operation.

Or, we could allow the problem tip and provide a one-click, web-based utility for fixing it. That would add one revision on the server-side that does the magic.

This topic has really grown

voxpelli's picture

This topic has really grown quite large - I haven't had time to read through all arguments and I'm afraid I won't have time until it's too late.

Anyway - it seems like one difference between Git and Bazaar hasn't been mentioned - and that is how the integrity of the repositories is guaranteed. I like Git's approach of cryptographic authentication of history - by knowing the sha-1 hash of the latest commit in a branch you can be sure that it and all previous commits haven't been changed by errors or intruders.
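To make the hash-chaining idea concrete, here is a small sketch that can be run in a throwaway directory. It assumes git is installed; the identity values and commit messages are placeholders invented for the example. It shows that a commit object records its parent's hash, which is why knowing the tip hash pins down everything behind it.

```shell
#!/bin/sh
set -e

# Work in a temporary repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Identity is passed inline so the sketch does not depend on global config.
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "first"
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "second"

# Hash of the first commit, as git reports it.
first=$(git rev-parse HEAD~1)

# The raw commit object of HEAD contains a "parent <hash>" header line.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')

echo "first commit:   $first"
echo "parent of HEAD: $parent"
```

Because the parent hash is part of the data that gets hashed to name the second commit, altering the first commit would change its hash and therefore the hash of every commit after it.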

I don't know if Bazaar has something similar - they may have some kind of PGP-signing - but it seems like there's no automatic, simple, bullet proof way of knowing if the history of one repo really is exactly the same as another in Bazaar.

I would prefer Git and have added my name to the list. (I would also love to see a forkable future of Drupal.org that takes heavy inspiration from eg. GitHub. - but that would be a future feature request)

Bazaar has full support for

David Strauss's picture

Bazaar has full support for cryptographically signing commits. It's optional by default, but it can be made mandatory on a branch-by-branch basis.

more precisely: git signs tags

anarcat's picture

To be more accurate, git allows you to have signed tags, in addition to the checksum of each commit, which ensures the integrity of the archive. Signed tags allow you to tap into the PGP web of trust to authenticate releases, which can be very useful if we eventually want to establish a complete trust path between the developer and the user (a bit like Debian does with its packaging system).

please leave markdown alone

anarcat's picture

It would be nice if people would stop turning off markdown when editing this page, it breaks rendering of those nice nested lists.

Wherein I propose a solution: Don't choose.

adrinux's picture

This idea has the potential to keep everyone happy:

Why choose between git and bzr?
With git each project would need its own repo and bzr can work the same way (we don't need to have all projects in one big repo). If we stick with project.module for issue tracking and release management we could support both git and bzr. We could actually let project owners choose at the time of project creation.

Yes it creates extra work in initial development, migration and ongoing maintenance. But we have plenty of people on both sides that seem willing to make their preferred VCS happen, this would neatly put us into a 'put up or shut up' situation. If either side fail to come through then one becomes the de-facto choice. Everyone gets to work on integrating their favourite VCS without fear that their work will be cast aside.

As for documentation, well we already describe an array of VCS workflows in the handbook, it'd be business as usual.
Git and Bzr are similar enough that I think any one individual will be able to adapt easily enough when getting involved with individual projects, especially with the help of decent documentation and a cheat sheet.

It would also mean the proposed code sprint at Drupalcon could begin to focus on implementation right away instead of wasting time in heated debate.

It would however leave us with a slightly contentious choice of which VCS to use for Drupal core...

So is this idea completely crazy? Or might it actually just work?

My concerns with this idea...

webchick's picture

My concern with this is that it raises the barrier to contribution for those who want to contribute patches (which is a huge subset of our contributor base -- much larger than those who actually maintain modules/themes).

Right now, if I want to contribute a patch to any module, theme, or installation profile, on the entire site, or core itself, I have exactly one set of commands to remember that'll work universally, anywhere. Those commands aren't the prettiest:

cvs -d:pserver:anonymous:anonymous@cvs.drupal.org:/cvs/drupal co -r DRUPAL-X--Y -d XXXX contributions/modules/XXXX
cd XXXX
vi XXXX.module
cvs diff -up > awesomeness.patch

...but they work, universally, everywhere, all the time.

If project maintainers can pick and choose which VCS they want to use, then I need to learn both Bzr and Git (or Bzr and Git and Hg if we offered all three), and I need to do research every time I go to contribute a patch to find out which set of commands I need to copy and paste for one particular project. That's a pain in the ass, and I'm going to be less likely to do it, if I'm a "typical" developer; I'll just fix the bug for myself and move on with my day.

This also makes VCS-based site deployment much more difficult. You'd have to check out Views, Pathauto, and Token from Git, then CCK and Apache Solr, from Bzr, and God help you if the maintainer changes their mind on which system they want to use...

Basically in git the commands

gordon's picture

Basically in git the commands would not be much different. So to convert this into git it would be something like

git clone git://git.drupal.org/contributions/xxx.git XXXX
cd XXXX
git checkout -b DRUPAL-X--Y origin/DRUPAL-X--Y
vi XXXX.module
git diff -up > awesomeness.patch

This will work for casual bug fixing. If you are doing a much more in-depth change you would commit the changes so they are tracked and then use

git format-patch <hash>

which will produce a patch that can be uploaded to d.o and be committed. This would also retain all the descriptions and the author details so they are not lost. So in the case of me submitting a patch to core, I would still be listed as the author, and you would only be listed as the committer. This is unlike CVS, where you are listed as the author and I am only listed in the description.
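The author/committer split described above can be tried out locally. The following is only a sketch under assumptions: the repository names, the identities "Contributor" and "Maintainer", and the file fix.txt are all invented for the example, and the patch is applied with git am rather than through any drupal.org tooling.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the project repository (hypothetical name).
git init -q upstream
cd upstream
git -c user.name=Maintainer -c user.email=m@example.com \
    commit -q --allow-empty -m "initial"
cd ..

# The contributor clones, commits a fix, and exports it as a patch file.
git clone -q upstream contributor
cd contributor
echo "fix" > fix.txt
git add fix.txt
git -c user.name=Contributor -c user.email=c@example.com \
    commit -q -m "Fix a bug"
git format-patch -1 -o "$work/patches" > /dev/null

# The maintainer applies the patch; git am keeps the original author.
cd ../upstream
git -c user.name=Maintainer -c user.email=m@example.com \
    am -q "$work"/patches/*.patch

author=$(git log -1 --format=%an)
committer=$(git log -1 --format=%cn)
echo "author: $author, committer: $committer"
```

The patch file carries the author name, email, date and commit message in its headers, which is how that information survives the trip through an issue queue attachment.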

As for VCS deployment it is actually easier, as I would use git's submodules to do it. So when I am deploying a site I would clone drupal core and create a branch for my new site. This means I could just pull in any changes to core. Then I would do the following.

cd sites/all/modules
git submodule add -b DRUPAL-X--Y git://git.drupal.org/contributions/token.git token
git submodule add -b DRUPAL-X--Y git://git.drupal.org/contributions/pathauto.git pathauto

and so forth, then updating is just a "submodule update"

Gordon

--
Gordon Heydon

For site deployment, it's

David Strauss's picture

For site deployment, it's possible to just use Bazaar along with bzr-git. I still don't like the idea of two different systems, but this objection isn't the reason.

If project maintainers can

adrinux's picture

If project maintainers can pick and choose which VCS they want to use, then I need to learn both Bzr and Git (or Bzr and Git and Hg if we offered all three), and I need to do research every time I go to contribute a patch to find out which set of commands I need to copy and paste for one particular project.

This is certainly a bit of a flaw, but I don't think it's a deal killer.
Much of that is true of CVS – I know I have to go and look at docs every time I make a patch, and copy and paste commands. Ditto when I'm updating my one module and one theme in contrib.

We've already established that the differences between git and bzr are minimal, a single page cheat sheet would be plenty of space to detail both sets of commands.

That's a pain in the ass, and I'm going to be less likely to do it, if I'm a "typical" developer; I'll just fix the bug for myself and move on with my day.

It's already true because most people moved off CVS in work outside of d.org – and if we go with a single VCS it will continue to be true because so many people are already using the opposite (git/bzr), or svn, or hg. Hell, we're already in a position where contributing to some projects means developing entirely outside of the d.org infrastructure, and probably using a VCS you're not familiar with.

This also makes VCS-based site deployment much more difficult. You'd have to check out Views, Pathauto, and Token from Git, then CCK and Apache Solr, from Bzr

Well, best answer to that is drush_make.

and God help you if the maintainer changes their mind on which system they want to use...

You'd remove the existing clone and check out a new one in the other VCS? What's the big deal?

(Disappointed this idea hasn't received more comments, seems people are more interested in git vs bzr flame war.)

Eh, I think this is

webchick's picture

Eh, I think this is hand-waving and glossing over things quite a bit. About three times a week I walk someone through their first baby steps with patching, and trust me that one way to do it is amply confusing enough, let alone two. :\ And Drush Make is all well and good for people who've happened to have heard of it, but don't underestimate how big "tribal knowledge" factors in to success in our community. Versus checking things out and running the equivalent of 'cvs up' is pretty universal across all toolchains.

However, I gave this idea some more thought because it does have its merits. And I wonder, how easy it would be to auto-sync commits across to both repos? In other words, if I prefer Bazaar, and I commit stuff to d.o's Bazaar repository, but someone who wants to extend my module with Git would see the same code in the Git repo, and could patch from there. Also, someone who wanted to run their entire deployment through Aegir+Git could do so without being victim to my crazy Bazaar wiles. I'm picturing this being pretty easy, with some post-commit scripts or the like.

However, this entire idea is dead in the water if the Drupal.org infrastructure team feels they don't have the resources to pull it off. It's all well and good for you or me or anyone else to look at a logical argument, but until and unless we are willing to step up to the plate to take on this maintenance work long-term, it's in the infrastructure team's hands. And I know I don't have the knowledge, time, or interest in maintaining the infrastructure side of this (though I will definitely do my best to rock the hell out of the training stuff :D).

Oh, I guess one other huge disadvantage...

webchick's picture

One other huge disadvantage is that the lack of agility in drupal.org's various Project* / VCS integration tools, which David Strauss points to in his pro-Launchpad posts, would only be compounded, because any improvement pushed out would now need to work on both platforms.

In fact, that is probably a deal-breaker right there. We need our community development tools to improve more quickly, not less.

...

BartVB's picture

If a user is adamant about using their VCS of choice then they can use 'git-bzr' or 'bzr-git' locally. The systems are fairly similar so tools have been written to translate between the two. But IMO it's not the task of drupal.org to do this for developers themselves. Doing all that on Drupal.org and keeping everything in proper sync would be quite a headache and it wouldn't gain much.

All this is assuming that the Git/Bzr bridges are working properly, but if they don't I doubt that the sync scripts on Drupal.org will.

And I wonder, how easy it

sdboyer's picture

And I wonder, how easy it would be to auto-sync commits across to both repos? In other words, if I prefer Bazaar, and I commit stuff to d.o's Bazaar repository, but someone who wants to extend my module with Git would see the same code in the Git repo, and could patch from there.

It would be INSANE. Tracking it locally using the various bridges should be fine, but trying to keep everything in sync on d.o? Think of it like master-master db replication, except even scarier. Maybe like master-master replication between mysql and postgres? Not quite that bad, but note that one of the points our various 'experts' agree on is that we should go for one system, and go deep.

and in a DVCS flamewar completely unrelated to ours

drifter's picture

taw is making the point that if bzr/git/hg are functionally interchangeable, it makes all the more sense to choose one over the other. It's a bit of a darwinistic view, but good post nonetheless:

http://t-a-w.blogspot.com/2010/02/could-mercurial-please-die-already.html

yep

fago's picture

That's basically the same as I noted above in comment 131468. If the tools are functionally interchangeable we should use the most popular one. That way we can lower the barrier for new developers to participate.

May I ask why you're posting

David Strauss's picture

May I ask why you're posting this at all? I wrote a long response but deleted it before posting after realizing that I didn't want to carry that flame war here.

just that we should choose and not support both

drifter's picture

I replied to a thread where someone suggested that we could give a choice of either bzr or git. I'm not buying into the "git is more popular so it should win" argument, but I do think that we should definitely choose one over the other; supporting two would be a nightmare. And I thought the article was interesting. Sorry if my point wasn't clear.

supporting two would be a

adrinux's picture

supporting two would be a nightmare

That's the obvious knee-jerk response, but is it true? Yes it would be more work, but 'nightmare'? With git and bzr being very similar in function, I suspect there would be a lot of overlap.

.

BartVB's picture

Supporting two would mainly be a way not to choose. In Dutch we have the saying "zachte heelmeesters maken stinkende wonden" which would translate to "gentle physicians create smelly wounds", no idea what the English equivalent is. It's a nice solution to avoid a hard choice but a choice really should be made (see comments above :D).

gentle physicians create

adrinux's picture

gentle physicians create smelly wounds

A graphic and apt quote. Point taken.
Forget my crazy multi VCS idea ;)

Summary #2

webchick's picture

We recently crossed the 250-reply threshold mark on this monster, so time for another summary. :)

  • One thing that is universally agreed is that CVS sucks, and drupal.org's usage of it is leading directly to community and resource fragmentation, and moving code off-site that never gets contributed back. A group of concerned Drupal netizens want this to stop, and we're trying to hash through what exactly it would take to modernize d.o's infrastructure to stop the hemorrhaging, and who can come on board to help.

  • We've ruled out moving to a better centralized version control system, such as Subversion, because though it is unquestionably more popular and makes more sense than CVS, it still shares the same limitations as CVS in terms of workflow, and the feature trade-off simply isn't worth the pain it will take to migrate.

  • In terms of distributed version control systems, the big players are Git, Bazaar, and Mercurial. While Git and Bazaar users have turned out in droves to sign up for the tough grunt work to help with the migration, as well as handling the sizable documentation and training components (some of which has already started), we simply haven't seen that from the Mercurial crowd. Therefore, though it is undoubtedly a robust VCS with great features and a smooth learning curve, we've ruled it out as a contender. The only way it goes back is if a throng of Drupal community Hg fans pull off an organized effort on-par with what the Git and Bzr folks are doing, which feels doubtful at this point. Therefore, from a practical standpoint, we are basically looking at Git vs. Bazaar.

  • A lot of discussion has gone on as to which of these two systems is best, and lots of resources have been gathered and linked to form the wiki, and pro/contra points posted to the discussion. For the most part, though, for the purposes of our community it really seems like they're both functionally equivalent. So we're currently in the process of working out exactly what the workflow would be like in both systems for each of our identified use cases: "major" and "minor" patch authors, module/theme developers, patch reviewers, and those deploying their Drupal sites from $vcs. This will allow us to compare apples to apples, and help evaluate the learning curve question, which is currently the biggest stated difference between the two systems.

  • A wonderful idea was raised by David Strauss to have "real life" usability testing on each VCS at Drupalcon SF on one of the code sprint days, consisting of a group of guinea pigs who have no experience in either, and are primarily used to CVS/Subversion (or don't know what a version control system is at all). We're currently looking for volunteers to wrangle this "Bzr vs. Git Smackdown", including preparing documentation and exercises for victims to go through, determining the data to look for to judge the learning curve aspects and how to gather that data, etc. Please jump in and volunteer/help organize this!!

  • On the infrastructure side, "phase 1" will basically consist of the simplest thing that could possibly work -- switching CVS for $vcs, and making whatever adjustments need to be made to our existing tools to get them showing $vcs commit logs instead of CVS commit logs, and packaging up $vcs branches/tags instead of CVS branches/tags, etc. Changes will still be handled with patches, and our existing workflow will change minimally. Itemizing exactly what this work is going to entail is an ongoing process that we need help with from folks who have knowledge of how that's all set up.

  • We also need help determining what exactly phase 2 and phase 3.5 ;) might look like. Analyzing other providers like Launchpad and GitHub and envisioning a way to combine that with our community's tools would be a good initiative for some folks to take on.

  • Speaking of those, we've also explored the idea of replacing Project* modules (which currently power our issue tracker and download system) with something like Launchpad, thus off-setting a major chunk of our infrastructure to folks who specialize in building these tools. This discussion is necessary, since a decision here would lock us in to picking one VCS vs the other. Overall though, this idea has been met with lukewarm reception at best, since the migration work would be quite significant, and it would add even more training requirement not only to our developers but to our entire community. There is also intense disagreement from prominent members of the drupal.org infrastructure team. Therefore, I also think we can safely rule out consideration of other project management systems from factoring into this decision, which brings us squarely back to Git vs. Bazaar on their own merits.

  • Timing-wise, this whole shebang will not be happening before Drupal 7 is out the door, because there are too many potential wildcards. So if you want to help scratch this itch, but don't have the chops/time to actually help the infrastructure team pull it off, a great place to dive in is the Drupal 7 critical issue queue.

Thanks for the summary; I'm

David Strauss's picture

Thanks for the summary; I'm sure this thread is quite imposing without efforts like yours to keep things accessible.

...and the feature-set is not a huge gain from what we have right now.

That is wildly inaccurate, for reasons I've stated regarding the security team's workflow, email notification, branch management, code review, translation and everything else that differentiates a platform Canonical has spent millions developing relative to the products of our mostly volunteer effort. They have full-time engineers and even UX people improving it with major updates every quarter. We're looking at the opportunity to largely retire security.drupal.org, the security team patch process, localization.drupal.org, and the Project module while gaining intense workflow integration in their replacement. From any realistic perspective, those features and that level of integration will never happen if we continue using Project*.

How long have we been linking "subscribe" in issue replies to proposals for better "follow" systems? How long did Drupal.org wait, with Project* as the final thing holding back the upgrade, to move to Drupal 6?

It's OK to say, "Launchpad is confusing" or "we're worried about maintaining a Zope system" or "the migration effort would not preserve our data well" or "we like our existing system more," but it's ridiculous to disqualify Launchpad on the basis of it not having enough features that it would add to our workflow, especially when I've had to defend Launchpad as a system that does too much.

I'm concerned that people who aren't doing the work are influencing this too much. Very few people maintain the current Project* system or worked on the last upgrade of Project* to Drupal 6. In the spirit of our "do-ocracy," I don't like the idea of people outside that group deciding what's "worth the effort" here any more than equivalent people deciding whether adding transactions to core is "worth the effort" or the Field API conversion is "worth the effort."

The decision for what happens with our systems needs to largely lie with the small team of people who aren't just committing to be on IRC for questions, but who are willing to devote the extensive time to travel and work on the conversion and build-out. Choosing a toolset that people like overall is important, but less important than picking something that will actually happen.

"Github on Project" or "Launchpad on Project" will never happen. The choice is between keeping our existing patch-based workflow on top of a DVCS or dropping Project*. We can postpone that decision, but it's one we realistically have to make unless we suddenly get a full-time team working on Project.

Note: I've already promised to webchick that I'll help in whatever transition the community chooses, whether git or Bazaar and Launchpad/Gitorious/Project*.

One thing I thought that may

gordon's picture

One thing I thought might make the security workflow a lot better (I am not sure whether you can do this with bzr) is that, instead of having to release a new revision containing the security fix plus any other development that has been done, we could just branch off the tag, e.g.

git checkout -b DRUPAL-X--Y-Z-SECURITY DRUPAL-X--Y-Z

Then the changes can be done just to resolve the security issue, and then tag the release like so

git tag DRUPAL-X--Y-Z-SECURITY-1 DRUPAL-X--Y-Z-SECURITY

You would only have to create the branch once as the tags would just move out along the branch. Then you can just cherry-pick the commits that fix the security patch onto the development branch like so

git checkout -b DRUPAL-X--Y origin/DRUPAL-X--Y
git cherry-pick {sha1}

You could move the patch very quickly through multiple updates. This would give a much cleaner security release: the stable release plus the security fix, not the current development release plus the security fix.
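The whole flow can be sketched end to end in a scratch repository. Everything named here is illustrative: the tag and branch names (DRUPAL-6--1-0 and friends), the file names, and the identity are placeholders, and the sketch assumes the cherry-picked fix applies without conflicts.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name Example
git config user.email example@example.com

# A stable release, tagged.
echo "v1" > module.txt
git add module.txt
git commit -q -m "Release 1.0"
git tag DRUPAL-6--1-0

# Remember the development branch (its default name varies by git version).
dev=$(git symbolic-ref --short HEAD)

# Unreleased development continues after the release.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Unfinished feature"

# Branch off the release tag and commit only the security fix there.
git checkout -q -b DRUPAL-6--1-0-SECURITY DRUPAL-6--1-0
echo "patched" > fix.txt
git add fix.txt
git commit -q -m "Security fix"
fix=$(git rev-parse HEAD)
git tag DRUPAL-6--1-0-SECURITY-1

# Bring the same fix onto the development branch.
git checkout -q "$dev"
git cherry-pick "$fix" > /dev/null
```

After this runs, the security tag contains the stable release plus the fix and nothing else, while the development branch has both the unfinished feature and the fix.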

Gordon.

--
Gordon Heydon

I'm concerned that people who

merlinofchaos's picture

I'm concerned that people who aren't doing the work are influencing this too much. Very few people maintain the current Project* system or worked on the last upgrade of Project* to Drupal 6. In the spirit of our "do-ocracy," I don't like the idea of people outside that group deciding what's "worth the effort" here any more than equivalent people deciding whether adding transactions to core is "worth the effort" or the Field API conversion is "worth the effort."

Not that I've given any input into this as I don't know enough about any of the systems to be able to offer anything substantive, but this attitude is very dangerous.

Sure, I'm not doing the work, but the entirety of my professional life revolves around the system we have in place on drupal.org. I helped do some of the early work on this, and no, I don't have time to work on the system because I'm too busy working on the stuff I keep in the system. But let me tell you, this kind of attitude is not too far from telling everyone but the 7 or 8 people who work on this to screw off and butt out. This is the same attitude you pulled on me when I objected to fundamental architecture in Field API that we are now stuck with.

But this time it's thousands of people who are tied to this system whose opinions you are trying to discount (at least, if they disagree with you), including some people who are very important to this system. Personally, I'm terrified of launchpad. What happens if, after you do all the work, it turns out that it isn't adequate to support my work? Where am I stuck then? And of course, since I didn't do the work, you get to feel comfortable telling me where to go. The answer will be "to my own sandbox", of course, if launchpad isn't adequate. I don't know that it will or won't be. I do know that my minimal experiences with launchpad with Views have led me to believe that its UI leaves a whole hell of a lot to be desired, though. So it's going to take a lot to convince me that integration will be positive, and it's difficult for me to believe that the effort of integration will be less than the effort that goes into project module right now.

This thing has to be thought through. I, personally, have a gigantic stake in what happens here. And no, I'm not going to be one of the people doing the work, but please do not try and discount people's opinions just because they aren't going to be doing the work. Remember that this decision is going to affect the fundamental tools that every person who contributes to Drupal MUST use, and there isn't a lot of room for regression.

This is the same attitude you

David Strauss's picture

This is the same attitude you pulled on me when I objected to fundamental architecture in Field API that we are now stuck with.

No, I took that attitude with you because you initially turned it into an emotional disagreement and stormed out of IRC (even accusing me of "only being there because Acquia was paying me to" when I was not only volunteering and spending my own money but had covered some expenses for other participants) instead of keeping the discussion professional, followed by weeks of the Field API team being on the defense in response to criticism that you and others seeded in the whole Views and CCK community. The Field API team promptly addressed and fixed every major criticism, yet the post-sprint fallout felt like being pursued by villagers with pitchforks and torches. That's not productive, nor is it a good way to encourage people like me to take a week off paying work to contribute.

Not everyone needs to participate in the work to implement something, but I'm wary of people expressing veto rights to design decisions without commitment to at least the design process itself. It can be especially damaging for an influential community member like you to use "veto rights" on something, so I think you have a duty to use them cautiously.

I'm glad to see you taking this opportunity to participate in this decision. I certainly don't want this to turn into the Field API architecture scenario again, where we decide something and only then see aggressive resistance.

Personally, I'm terrified of launchpad. What happens if, after you do all the work, it turns out that it isn't adequate to support my work? Where am I stuck then? And of course, since I didn't do the work, you get to feel comfortable telling me where to go. The answer will be "to my own sandbox", of course, if launchpad isn't adequate.

Well, we know a few things:

  • People are already moving to their own sandboxes off Drupal.org right now
  • The current system is inadequate for supporting our current needs well, yet doesn't seem to be improving
  • Launchpad (and other tools) fix those specific issues
  • Your objection, at this point, is purely FUD, and I mean that in the sense that you have "fear, uncertainty, and doubt" about it, and in a completely genuine way (not the disingenuous marketing FUD sort of way). There's nothing wrong with having initial FUD about a major proposal, but I've spent extensive time in this thread explaining any detail people have questioned of the approach I would propose. At some point, FUD has to have a limit. Feel free to ask questions and voice your concerns, but we can't base decisions on fear of the unknown.

But let me tell you, this kind of attitude is not too far from telling everyone but the 7 or 8 people who work on this to screw off and butt out.

You've taken one part of my concern about this decision-making process and painted it as the primary decision criterion I'm pushing for; that isn't fair. And if you want to push this stakeholder influence argument, am I entitled to demand rights to control how you run the Views project? I'm a major stakeholder: between me and my clients, we use it every day, probably as much as you use Drupal.org's project tools. Thousands of other people use Views, probably more people than project tools on Drupal.org, yet I don't see you designing by referendum or having forums for stakeholders. That is your right as someone volunteering in this community, but I'm not even arguing for that extreme here.

So it's going to take a lot to convince me that integration will be positive, and it's difficult for me to believe that the effort of integration will be less than the effort that goes into project module right now.

I wouldn't dare to argue transitioning to another tool is less work than the basic work to add git or Bazaar support to Project, but where do we go from there?

merlinofchaos's picture

No, I took that attitude with you because you initially turned it into an emotional disagreement and stormed out of IRC instead of keeping the discussion professional, followed by weeks putting the Field API team on the defense responding to criticism that you (and others) seeded in the whole Views and CCK community. Not everyone needs to participate in the work to implement something, but I'm wary of people expressing veto rights to design decisions without commitment to at least the design process itself.

Well that's an interesting perspective you have on what happened there. It's safe to say mine differs significantly, so I won't go into it here.

You've taken one part of my concern about this decision-making process and painted it as the primary decision criterion I'm pushing for; that isn't fair. And if you want to push this stakeholder influence argument, am I entitled to demand rights to control how you run the Views project? I'm a major stakeholder: between me and my clients, we use it every day, probably as much as you use Drupal.org's project tools. Thousands of other people use Views, probably more people than project tools on Drupal.org, yet I don't see you designing by referendum or having forums for stakeholders. That is your right as someone volunteering in this community, but I'm not even arguing for that extreme here.

For the analogy to be even close, you'd have to have created project.module and maintained it for the last several years. You've been a major participator in the drupal.org infrastructure, but you didn't create it, you aren't the main voice behind it. It has a very clearly defined ownership structure. The analogy doesn't work very well.

A better analogy is with Drupal itself. I've been very comfortable being the autocrat of Views, in the same way that Dries is the autocrat of Drupal. And this is admitted -- Dries calls it a benevolent dictatorship. And yet, the community does, in fact, tell Dries what to do with Drupal all the time. With major contributions, major conversations, and major arguments. Dries gets to say no a lot, and he does. But the community has as much control over the direction of Drupal as Dries does.

Keep in mind that my whole objection here is that your phrasing of the concern leads to the easy interpretation that "Those who aren't doing should not get much influence". It's true that those who are doing get the most influence, but it does not allow discounting arguments out of hand based upon the fact that the arguer isn't doing the work. The arguments still are generally accepted for their merits.

David Strauss's picture

For the analogy to be even close, you'd have to have created project.module and maintained it for the last several years. You've been a major participator in the drupal.org infrastructure, but you didn't create it, you aren't the main voice behind it. It has a very clearly defined ownership structure. The analogy doesn't work very well.

It seems like you're re-shaping your original argument to justify your sole control of Views while discounting my influence on the direction of infrastructure. It's true: you're the creator of Views, and I'm not the creator of Drupal.org infrastructure, but the substance of the rebuttal stops there. I never claimed to have sole influence over infrastructure; I only ask for a proportional one. To build on your analogy, I have around the same influence on Drupal.org infrastructure as webchick has on Drupal 7: I have a duty to subject major decisions to discussion -- and there is a person who can tell me "no" -- but I'm mostly free to go about my work with the trust of the community.

Your original argument, more accurately, was that people's stake in the system -- in your case, "entirety of [your] professional life" -- entitles them to proportional influence over that system, which is the same for my usage of Views and your usage of Project. My argument, which is that we link some influence to do-ership, doesn't mean disregard of non-participatory stakeholders, but it does mean five non-participants don't necessarily get to overrule one person who will actually do the work. (That ratio is just an example; I don't support counting and blindly weighing votes.)

At some point, stakeholder interest should outweigh participant interest, and then the participants have the somewhat selfless task of working to move the community forward anyway. That's why I noted that I'll help with any decision we come to, but it doesn't mean I'll push any less for what I think is best.

It's true that those who are doing get the most influence, but it does not allow discounting arguments out of hand based upon the fact that the arguer isn't doing the work. The arguments still are generally accepted for their merits.

Arguments may be accepted on merit, but merit isn't always the decider of what happens. I could successfully argue that approach A is twice as good as approach B, but if the only people doing the work implement B, and it's an improvement, it will go in. My arguments for A don't block B. If A and B both get implemented, then A will go in. And, of course, we might use arguments for A to decide to work on A instead of B.

What I'm arguing here is related. We have people willing to implement basic bzr/git on Project or work on a transition to Launchpad, but there are quite a few posts arguing the merit of adding GitHub- or Launchpad-like features to Project. Those arguments can win on merit: it's all of the features with none of the community-dividing or migration overhead. But: no one exists to implement it. I have enough project experience to know that picking an unrealistically optimistic design (GitHub for Project!) often results in something worse (staying on the current patch workflow forever) than a more realistic but less flashy option (moving to something like Gitorious).

sdboyer's picture

The rest of this mini-discussion has nothing to do with me, as far as I'm concerned, but this part does:

We have people willing to implement basic bzr/git on Project or work on a transition to Launchpad, but there are quite a few posts arguing the merit of adding GitHub- or Launchpad-like features to Project. Those arguments can win on merit: it's all of the features with none of the community-dividing or migration overhead. But: no one exists to implement it.

Um. Seriously?

I've both explicitly and implicitly put myself on the line to be the one doing since long before this thread began. And I'm not the only one. I don't know how you missed that.
