Friday, March 22, 2013

Still Not Switching To Distributed Version Control

If Joel Spolsky says it, I (almost always) believe it. He's a brilliant programmer and a terrific writer. But just because he tells me I should love Distributed Version Control doesn't mean I can. In his most recent article he says:

For those of you that have been living under a rock, the single biggest change in developers’ lives in the last decade (besides Stack Overflow, natch) is Distributed Version Control. DVCS is such an important improvement over the previous generation of centralized version control (Subversion, CVS, etc.) that it’s a required upgrade, even though it’s honestly a bit harder to use.

Digging deeper, back in 2010, he said:

With distributed version control, the distributed part is actually not the most interesting part.

The interesting part is that these systems think in terms of changes, not in terms of versions.

That’s a very zen-like thing to say, I know. Traditional version control thinks: OK, I have version 1. And now I have version 2. And now I have version 3.

And distributed version control thinks, I had nothing. And then I got these changes. And then I got these other changes.
...
When you manage changes instead of managing versions, merging works better, and therefore, you can branch any time your organizational goals require it, because merging back will be a piece of cake.
...
This is too important to miss out on. This is possibly the biggest advance in software development technology in the ten years I’ve been writing articles here.

Or, to put it another way, I’d go back to C++ before I gave up on Mercurial.

If you are using Subversion, stop it. Just stop. Subversion = Leeches. Mercurial and Git = Antibiotics. We have better technology now.

Seriously, how can I keep using Subversion after reading the above? The only catch is, I've experimented with git and I'm completely underwhelmed. Yes, it technically works, but:

  1. I have to give up the Unix style notion of managing files in version control. Using Subversion I can cat, mv and cp files just like I would from the command line.
  2. I have to give up the notion of partial checkouts. When someone has committed 3 gigs of videos to a directory, it sure is nice not to have to check them out.
  3. I have to give up the simple everything's a file model, where tags and branches are managed by creating copies of files and directories.
  4. I have to give up my incremental version numbers. I love the simplicity of being able to attach a version number to a software release, and being able to roll back to that anytime to see what's what.
  5. I've yet to run into a nightmare merging scenario alluded to above. Before I start work on any customer's project I create and often use a bugfix branch. And before I deploy, I merge that bugfix branch back in. It's a system that's been working well for years, and I wouldn't develop without it.

I'm not saying that git and other DVCS aren't fantastic for some applications. I'm usually operating in lone-wolf mode, with only a few other folks accessing a repository. With a larger team, or better yet, multiple teams, DVCS's benefits may be a no-brainer.

Using git reminded me of the early days of Enterprise JavaBeans. The technology was so cool, and so fancy. Just the word "Enterprise" made it sound like you had hit the big time. So we opted for the whole kit and caboodle: session beans, entity beans, WebLogic container, multiple servers; the whole deal. What we found out, very much the hard way, was that the system was terribly slow and terribly bloated. The only true benefit of our solution was that we were buzzword compliant. PHP + MySQL would have run circles around that ridiculous setup. We had successfully solved all sorts of problems we didn't have (ones that a large enterprise might have, perhaps?), and neglected the ones that really mattered (delivering pages quickly and querying a small database efficiently).

I suppose it all boils down to: use the right tool for the right job.

8 comments:

  1. The Other Ben Simon 10:30 AM

    After extensive research and prototype setups with Git last year, I wound up in the exact same place you are. For my team's current workflow SVN continues to be a better choice for us. (Not too many branches and not too much experimentation happening)

    Agree the right tool for the right job. The next thing you need to rant about is the cool factor fascination with NOSQL databases...

  2. Ben -

    Glad to hear I'm not totally crazy. I definitely lag on some things technically, but I like to think I choose my development tools for being efficient and reliable (emacs, anyone?). I hate thinking that I'm latched onto Subversion just because I don't want to try something new.

    As for NoSQL databases - I was burned there, too :-). On paper Amazon's SimpleDB rocks, yet I found the reality to be quite different.

    I still hold out some hope for NoSQL. Heck, I love using Lawnchair in a mobile development context. But for web apps, MySQL is going to be my go-to solution for quite some time.

  3. We're using git with eclipse, seems to work alright. I still use emacs for editing most of the time. You can do partial checkouts and rollbacks.

    I suppose you could make scripts for the mv and cp commands, if you like. I just drag them around to where they need to go with eclipse's file browser.

    And, if you need cool version numbers, have I got the script for you. :)

    As for a db, mysql is pretty good and available everywhere, so why bother with anything else?

    Change is generally difficult, so if it's working for you, why change?


  4. > We're using git with eclipse, seems to work alright

    That's pretty much my point. Git is cool. It does everything a version control system should do: it allows for simultaneous development, gives you diffs, lets you roll back to a sane version when you break something, and more.

    And for large distributed teams, like say, the Linux Kernel project, it must be a godsend. The notion of having multiple repositories, and the ability to push stuff around to these repositories must have been a game changer.

    But for simple team development, I don't see how that buys you all that much. At least I can't justify the increase in complexity for benefits that don't really apply to the work I do.

    Which is why I'm baffled that folks suggest git is a generational leap over subversion. It seems to be a different tool to solve different problems.

  5. > 1. I have to give up the Unix style notion of managing files in version control. Using Subversion I can cat, mv and cp files just like I would from the command line.

    On the other hand, you *have to* use specialized Subversion commands, i.e. `svn mv` and `svn cp`, to manage files, instead of using the CLI or a file manager, or managing files from an IDE, etc.

    Also `svn cat FILE[@REV]` is `git show [REV:]FILE`, and with git alias you can have `git cat FILE` too.

    > 2. I have to give up notion of partial checkouts. When someone has committed 3 gigs of videos to a directory, it sure is nice not to have to check them out.

    Well, git _has_ partial checkouts, but that doesn't buy much, as it doesn't yet have (unless something changed) partial cloning. You can always fiddle with submodules, and there are third-party tools like *git-annex* or git-slave, but I admit that in this situation centralized version control systems have the better out-of-the-box solution.

    OTOH, with a distributed version control system you get the full advantages of version control in the "on the plane" case, i.e. when you don't have a connection to your central server.

    > 3. I have to give up the simple _everything's a file_ model, where tags and branches are managed by creating copies of files and directories.

    It is a *too simple* model (borrowed from Perforce AFAIK), which relies on unenforced convention to find the boundaries between project, branch and path in project, and which led to tags being practically useless in Subversion.

    I very much like the simple but not too simple Git model, where revisions (versions) form a directed graph (from revision to parent), and branches and tags are pointers to revisions (nodes in this graph), tags being immutable.

    > 4. I have to give up my incremental version numbers. I love the simplicity of being able to attach a version number to a software release, and being able to roll back to that anytime to see what's what.

    First, a technical reason why there are no incremental version numbers in Git: for global version numbers you need a centralized numbering server, which goes against the whole idea of having a _distributed_ version control system.

    So what do you use in Git for releases? You use tags (usually cryptographically signed tags), like e.g. `v1.8.1-rc2`. Those actually work in Git (as opposed to Subversion). You can roll back to a tagged version easily, see where the tagged versions sit in the history, etc.

    Besides, in Subversion, from what I remember, revision numbering is common for the whole svnroot (server), which leads to using e.g. r1336830 as a revision identifier...

    > 5. I've yet to run into a nightmare merging scenario alluded to above. Before I start work on any customer's project I create and often use a bugfix branch. And before I deploy, I merge that bugfix branch back in. It's a system that's been working well for years, and wouldn't develop without it.

    Subversion has had (relatively) sane and easy merging only since (IIRC) version 1.5.x. Git and other DVCSs had it from the very beginning.

    Besides, try a merge with renames...

  6. Jakub -

    Thanks for taking the time to lay out such a thorough response. I will mention that in the Apache module install of svn, you're able to have multiple projects with their own version numbers (I use that feature all the time).

    Clearly git is a more than capable version control solution. And for teams that are considering winging it, or using git, there's no question that git is the way to go.

    I just don't see anything in your description above that shows git to be what I might consider a "generation" ahead of subversion. Clearly RCS was a massive improvement over no version control. And CVS was a huge improvement over RCS (concurrent development? Wow, that's huge!). And subversion is a leap over CVS (renaming and deleting files, for example). But, I don't see that leap in git.

    Yes, the "airplane mode" of git is sweet. But, personally, I'd rather have the more concrete version semantics than a feature I might use twice a year.

    Git is great. So is subversion.

    Replies
    1. Well, "commit before merge" concurrency[1,2] is a leap over "merge before commit" used by default by Subversion, though the difference is not as large as between locking and non-locking models.

      [1]: http://www.ericsink.com/vcbe/html/history_of_version_control.html
      [2]: http://www.catb.org/~esr/writings/version-control/version-control.html

    2. > Well, "commit before merge" concurrency[1,2] is a leap over "merge before commit" used by default by Subversion

      Hadn't thought about it that way.

      Still not a game changer for me or the teams I've worked on in the past, but an important note nonetheless.
