The thing about Git is that it’s oddly liberal with how and when you use it. Version control systems have traditionally required a lot of up-front planning followed by constant interaction to get changes to the right place at the right time and in the right order. And woe unto thee if a rule is broken somewhere along the way, or you change your mind about something, or you just want to fix this one thing real quick before having to commit all the other crap in your working copy.

Git is quite different in this regard. You can work on five separate logical changes in your working copy — without interacting with the VCS at all — and then build up a series of commits in one fell swoop. Or, you can take the opposite extreme and commit really frequently and mindlessly, returning later to rearrange commits, annotate log messages, squash commits together, tease them apart, or rip stuff out completely. It’s up to you, really. Git doesn’t have an opinion on the matter.

Remember a long time ago, at the dinner table, when your kid brother mashed together a bunch of food that really should not have been mashed together — chicken, jello, gravy, condiments, corn, milk, peas, pudding, all that stuff — and proceeded to eat it? And loved it! And then your crazy uncle, having seen the look of disgust on your face, said: “it all goes to the same place!” Remember that? No? Then you were probably the one shoving nasty shit into your face, but the important thing to understand here is that your uncle is crazy. And so is Git.

I’ve personally settled into a development style where coding and interacting with version control are distinctly separate activities. I no longer find myself constantly weaving in and out due to the finicky workflow rules demanded by the VCS. When I’m coding, I’m coding. Period. Version control – out of my head. When I feel the need to organize code into logical pieces and write about it, I switch into version control mode and go at it.

I’m not saying this is the Right Way to use Git: in the end, it all goes to the same place. I’m saying that this is the way I seem naturally inclined to develop software, and Git is the first VCS I’ve used that accommodates the style.

I’d like to run through a short example — on the off chance that my extreme hyperbole has left you unconvinced — that shows how one might first stumble onto some of Git’s more advanced features, and that hopefully also brings to light how easily one could then develop a strong addiction to such features.

The Tangled Working Copy Problem

Suppose that, last night, I start work on some enhancements to the “Leave Comment” forms on this site. I figure this will take all of maybe ten minutes, so I begin pounding away in my working copy. After screwing around for an hour or so, I give up and go to bed, leaving the half-baked changes in my working copy.

The next morning, I coffee up and find my del.icio.us bookmarks not being sucked into the site properly and so I start playing with that mess (this is completely unrelated to what I was doing the night before, mind).

After working out the small problem with sucking in bookmarks, I take a peek at git status to see where my working copy is at:

$ git status
# On branch master
# Changed but not updated:
# 
#     modified: models.rb
#     modified: views/entry.haml
#     modified: bin/synchronize-bookmarks
#     modified: js/tomayko.js
#     modified: stylesheets/tomayko.css

I realize, for the first time, that I have two unrelated changes in my working copy:

  1. The experimental comment form tweaks: models.rb, entry.haml, tomayko.js, and tomayko.css. I’m not ready to push this into the live site yet so I don’t want these changes on the master branch.

  2. Bookmark synching fixes: models.rb and synchronize-bookmarks. This needs a commit on master and should be shipped up to the live site, immediately.

The big problem here is models.rb – it’s “tangled” in the sense that it includes modifications from two different logical changes. I need to tease these changes apart into two separate commits, somehow.

This is the type of situation that occurs fairly regularly (to me, at least) and that very few VCS’s are capable of helping out with. We’ll call it, “The Tangled Working Copy Problem.”

Git means never having to say, “you should have”

If you took The Tangled Working Copy Problem to the mailing lists of each of the VCS’s and solicited proposals for how best to untangle it, I think it’s safe to say that most of the solutions would be of the form: “You should have XXX before YYY.”

  • Subversion: You should have committed the experimental changes to a separate branch before working on the bookmark stuff.

  • Bazaar: You should have shelved your experimental changes before working on the bookmark stuff. EDIT: My mistake. bzr shelve solves exactly this problem.

  • CVS: You should have RTFM before wasting everyone’s time with such a lame question.

Here’s a general principle I would like my VCS to acknowledge: moving from the present point B to some desired point C should not require a change in behavior at point A in the past. More simply, the phrase: “you should have,” ought to set off alarm bells. These are precisely the types of problems I want my VCS to solve, not throw back in my face with rules for how to structure workflow the next time.

(To be fair, Mercurial [originally “Darcs”] handles The Tangled Working Copy Problem without breaking a sweat and others do as well. UPDATE: see comments below for discussion on this.)

Solving The Tangled Working Copy Problem When Your VCS Won’t

I run into The Tangled Working Copy Problem so often that I’ve devised a manual process for dealing with it under VCS’s that punt on the problem. For instance, if I were using Subversion, I might go at it like this:

  1. Run svn diff over the files with changes I don’t want to commit (the comment related stuff), piping the output into vim.
  2. Remove hunks from the diff corresponding to those changes I want to commit (the bookmark related hunks) and write the diff out to comment-stuff.diff.
  3. Run patch -p0 -R < comment-stuff.diff. This removes the comment related changes from my working copy (-R = “apply diff in reverse”).
  4. Commit the bookmark related fixes sitting in my working copy to the repository.
  5. Run patch -p0 < comment-stuff.diff to reapply the comment related changes to my working copy.
  6. Forget to create branch for comment stuff, again.
  7. Hack on comment stuff for a while.
  8. Find more unrelated brokenness and fix it.
  9. Oops! GOTO 1.

This works well enough when there are no changes to binary files, when the diff doesn’t mind being teased apart, and when there are only two or three changes tangled up, but it raises the question: what am I paying my VCS for?

The idea of manually managing sets of patches to coerce my patch management program into managing patches is literally absurd.

Viva La Index

Git has this alien thing between the working copy and the repository called The Index. I was entirely annoyed by the concept when starting out – you have no idea why you’re forced to deal with it and you’re always dealing with it. Even after reading multiple accounts of what The Index supposedly was, I continued to be baffled by it, wondering how it could possibly serve any useful purpose at all. That is, until the first time I ran into The Tangled Working Copy Problem.

The Index is also sometimes referred to as The Staging Area, which makes for a much better conceptual label in this case. I tend to think of it as the next patch: you build it up interactively with changes from your working copy and can later review and revise it. When you’re happy with what you have lined up in the staging area, which basically amounts to a diff, you commit it. And because your commits are no longer bound directly to what’s in your working copy, you’re free to stage individual pieces on a file-by-file, hunk-by-hunk basis.

Once you’ve wrapped your head around it, this seemingly simple and poorly named layer of goo between your working copy and the next commit can have some really magnificent implications on the way you develop software.

Solving The Tangled Working Copy Problem With Git’s Index

Let’s review the status of our working copy:

$ git status
# On branch master
# Changed but not updated:
# 
#     modified: models.rb
#     modified: views/entry.haml
#     modified: bin/synchronize-bookmarks
#     modified: js/tomayko.js
#     modified: stylesheets/tomayko.css

We want to commit all of the changes to synchronize-bookmarks and some of the changes to models.rb, so let’s add them to the staging area:

$ git add bin/synchronize-bookmarks
$ git add --patch models.rb
diff --git a/models.rb b/models.rb
index be4159d..3efd4ce 100644
--- a/models.rb
+++ b/models.rb
@@ -256,7 +256,7 @@
     class Bookmark < Entry
       next unless source[:shared]
       bookmark = find_or_create(:slug => source[:hash])
-      bookmark.update_attributes(
+      bookmark.attributes = {
         :url        => source[:href],
         :title      => source[:description],
         :summary    => source[:extended],
Stage this hunk [y/n/a/d/j/J/?]?

The magic is in the --patch argument to git-add(1). This instructs Git to display all changes to the files specified on a hunk-by-hunk basis and lets you choose one of the following options for each hunk:

  • y – stage this hunk
  • n – do not stage this hunk
  • a – stage this and all the remaining hunks in the file
  • d – do not stage this hunk nor any of the remaining hunks in the file
  • j – leave this hunk undecided, see next undecided hunk
  • J – leave this hunk undecided, see next hunk
  • k – leave this hunk undecided, see previous undecided hunk
  • K – leave this hunk undecided, see previous hunk
  • s – split the current hunk into smaller hunks

In this case, I staged (y) about half of the hunks (the ones that were bookmark related) and left the other hunks unstaged (n). Now my index has all of the changes to synchronize-bookmarks plus half of the changes made to models.rb.

I like to review that the changes in the staging area match my expectations before committing:

$ git diff --cached
[diff of changes in staging area]

I also like to verify that my unstaged / working copy changes are as I expect:

$ git diff
[diff of changes in working copy that are not in the staging area]

Everything looks good, so I commit the staged changes:

$ git commit -m "fix bookmark sucking problems"

I’m left with only the experimental comment enhancements in my working copy and am free to move them onto a topic branch, or maybe I’ll just let them sit in my working copy for a while. Git doesn’t care.
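
Moving leftover changes onto a topic branch is less ceremony than it sounds, because uncommitted edits travel with you when you create a branch. A minimal sketch, assuming a throwaway repository; the branch name comment-form and the one-line entry.haml are hypothetical:

```shell
#!/bin/sh
set -eu

# Throwaway repository with a made-up identity and file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "stable comment form" > entry.haml
git add entry.haml
git commit -q -m "initial import"

# Remember the starting branch ("master" or whatever the local default is).
main_branch=$(git rev-parse --abbrev-ref HEAD)

# An uncommitted experiment sitting in the working copy.
echo "experimental comment form" > entry.haml

# Creating a branch moves HEAD but leaves the working copy edits in place,
# so the experiment can be committed there instead of on the main branch.
git checkout -q -b comment-form
git commit -q -a -m "WIP: comment form tweaks"

# Back on the original branch, the file is in its committed state again.
git checkout -q "$main_branch"
cat entry.haml
```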

Taking Control of Your Local Workflow

We’ve seen how to use git add --patch to pluck specific changes out of the working copy and stage them for the next commit, a nice feature that elegantly solves a once-tedious problem and that makes possible a previously forbidden style of development. There’s more where that came from, though. Here are some related concepts that you will want to also introduce yourself to:

  • git add --patch is actually a shortcut to features in git add --interactive, a powerful front-end for managing all aspects of the staging area. The git-add(1) manual page is a treasure trove of worthwhile information that’s often passed over due to the traditional semantics of VCS “add” commands. Remember that git-add(1) does a lot more than just add stuff – it’s your interface for modifying the staging area.

  • git commit --amend takes the changes staged in the index and squashes them into the previous commit. This lets you fix a problem with the last commit, which is almost always where you see the technique prescribed, but it also opens up the option of a commit-heavy workflow where you continuously revise and annotate whatever it is you’re working on. See the git-commit(1) manual page for more on this.

  • And then there’s git rebase --interactive, which is a bit like git commit --amend hopped up on acid and holding a chainsaw – completely insane and quite dangerous but capable of exposing entirely new states of mind. Here you can edit, squash, reorder, tease apart, and annotate existing commits in a way that’s easier and more intuitive than it ought to be. The “INTERACTIVE MODE” section of the git-rebase(1) manual page is instructive but Pierre Habouzit’s demonstration is what flipped the light on for me.
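
To make the last two items a little more concrete, here is a hedged sketch of a commit-heavy session cleaned up after the fact. The file and messages are invented, and the rebase is scripted only so it can run unattended: GIT_SEQUENCE_EDITOR replaces the editor that would normally open the todo list, and sed changes the second pick line to fixup. In real use you would edit that list by hand.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a made-up identity and file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "initial import"

# Commit early and carelessly...
echo "two" >> notes.txt
git commit -q -a -m "wip"

# ...then fold a follow-up change into that same commit and reword it.
echo "three" >> notes.txt
git add notes.txt
git commit -q --amend -m "add notes two and three"
after_amend=$(git rev-list --count HEAD)

# One more sloppy commit, to be absorbed into its parent.
echo "four" >> notes.txt
git commit -q -a -m "oops, forgot four"

# Scripted stand-in for editing the todo list: turn the second "pick" into
# "fixup" so the last commit is merged into the one before it.
GIT_EDITOR=true GIT_SEQUENCE_EDITOR='sed -i.bak "2s/^pick/fixup/"' \
  git rebase -q -i HEAD~2
after_rebase=$(git rev-list --count HEAD)

git log --oneline
```

Both counters end up at two commits: the amend rewrote the "wip" commit rather than adding a new one, and the fixup absorbed the "oops" commit while keeping the earlier message.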

That’s really all you need to know above and beyond Git’s fundamentals to start dominating your local workflow. From here, you may want to explore some of the various other concepts and utilities specifically designed to augment your local workflow:

  • People seem to get a lot of utility out of git-stash(1), which lets you move changes from your working copy into a lightweight holding area to be reintroduced some time later. I personally haven’t used it much in practice, and I used Bazaar’s rough equivalent of git-stash(1) (bzr shelve) frequently. I find that the staging area removes the need for stashing in a bunch of cases and when I really do need to get stuff out of my working copy and somewhere safe, I just create a topic branch.

  • I haven’t played with it yet but StGIT (“Stacked Git”) looks seriously interesting from the examples in the tutorial. I tend to visualize version control concepts as series of patch operations so I’d probably feel more at home with this style of front-end.

  • There’s a section of the Git User’s Manual called The Workflow that describes, at a fairly low level, the various interactions between the working copy, the index, and the object database.


Attribution

Baby Eats Camera is Copyright © 2007 by mahalie.

Comments

  1. Thank you for explaining git add --patch. Wonderful!

    Ashwin on Tuesday, April 08, 2008 at 06:37 AM #

  2. Ryan, Thanks for the great & informative article on git… I sorely needed it yesterday when I was diff'n & patch'n my working copy (for git :—)

    Jim Holt on Tuesday, April 08, 2008 at 06:56 AM #

  3. I think you mean “woe unto thee”.

    JR on Tuesday, April 08, 2008 at 07:17 AM #

  4. Ha! <blush>

    I suppose “whoa unto thee” is something you’d expect to find in the screenplay to “Bill and Ted’s Excellent Adventure.”

    Fixed. Thanks.

    Ryan Tomayko on Tuesday, April 08, 2008 at 07:22 AM #

  5. Just to let you know, git borrowed the git-add user interaction from darcs which has had it for years. For small projects, you might find you like the simplicity of darcs much better than git.

    -Jim

    PS If you are forced to use CVS for some reason, you might like cvs-commit-patch which allows the same style of work (especially if your editor helps you edit patches easily)

    Jim on Tuesday, April 08, 2008 at 07:25 AM #

  6. Interesting. Thanks, Jim.

    People: I’d much appreciate further comments comparing how other VCS’s stack up to Git in this regard. I know there’s some nifty features for crafting local workflows in other systems and I’m very interested in exploring the general concept further.

    Ryan Tomayko on Tuesday, April 08, 2008 at 07:32 AM #

  7. I, too, code a lot and then commit a bunch of stuff all at once. A friend (who I just noticed is comment #5!) and I wrote a tool to help us do this back in the CVS days (it basically codified your manual svn patching steps). It’s been modified since to use several different vcs-es including CVS, Mercurial, and Darcs. Check it out at http://porkrind.org/commit-patch.

    I use it on a daily basis with darcs, even though darcs has the same sort of hunk choosing UI as git because I find that editing patches in emacs allows me one last step of editorial control as I review hunks for check-in. I’ll add comments or delete debug prints in the patch but leave them in my working copy. It’s also a life saver if you have 2 unrelated commits that end up being on the same line or bump into each other and cause the hunks to merge (which happens to me more often than you’d expect).

    David on Tuesday, April 08, 2008 at 07:39 AM #

  8. With bazaar, you don’t have to shelve BEFORE working on a separate piece. In your case you can shelve your experimental stuff and it will chunk your models.rb file so you can pick pieces of it to shelve. Once that is done then commit and unshelve. That really isn’t “to go from B to C you should have done A first”.

    Fizz on Tuesday, April 08, 2008 at 08:16 AM #

  9. Index in Spanish is used with the article El so the proper subtitle is “Viva el Index”.

    Nice post.

    Jorge on Tuesday, April 08, 2008 at 08:20 AM #

  10. Git solution with index is wrong. You can’t actually test the “index state” before committing, without extra work later.

    stgit seems like a cleaner solution for this problem.

    One builds the current patch in the working copy, tests it and commits it. That is the normal scenario and should be the default. Unfortunately, not in git.

    Mark on Tuesday, April 08, 2008 at 08:25 AM #

  11. bzr shelve allows you to cherry-pick hunks in a fashion similar to git-add. By shelving the parts you don’t want to commit, you can commit the parts that you do want, and then unshelve to continue working on your experimental changes.

    Please consider editing your post to reflect the fact that bzr does handle the problem you were facing. Bzr does not deserve to be lumped in with svn and cvs.

    John James on Tuesday, April 08, 2008 at 08:28 AM #

  12. Fizz / John: good points. It turns out I frequently used bzr shelve --all and was under the mistaken assumption that this limited mode of operation was the extent of its capabilities. The text will be corrected shortly. My apologies.

    If anyone’s interested in more detail on this, see this example of shelving from the Bazaar wiki.

    Ryan Tomayko on Tuesday, April 08, 2008 at 08:31 AM #

  13. Perhaps I misunderstood your explanation, but I find bzr’s shelve command to be both easier to understand and provide essentially the same thing without introducing a staging area concept. Regardless of the VCS, you need some way to differentiate the changes you want (bookmarks) and those you don’t (comments). With bzr’s shelve you are saying put aside the comment changes and come back to them later. With git, it seems you’re just saying what you want to keep. I’d also note that with bzr you can do shelve at any point — once you discover you have multiple changes in your branch is fine — and mark “hunks” in a similar fashion as you described for git. The unshelve command lets you come back to the changes later.

    I suppose this is just about how much complexity you want to manage at once. Shelve certainly feels easier to use to me.

    Tim on Tuesday, April 08, 2008 at 08:41 AM #

  14. While it does not provide exactly the same work-flow, the record extension to Mercurial enables one to cherry-pick changes of the changed files in your working copy. Enable it by editing your .hgrc:

    [extensions]
    record=
    

    and invoke it with:

    hg record
    

    Eivind Uggedal on Tuesday, April 08, 2008 at 08:45 AM #

  15. Sure would be nice to be able to link to the bzr shelve --help output to clear up any misconceptions I may have caused. Oh well, maybe sullying my Git article with Bazaar help text will count as penance enough?

    $ bzr shelve --help
    Purpose: Temporarily set aside some changes from the current tree.
    Usage:   bzr shelve [FILE...]
    
    Options:
      --all                 Shelve all changes without prompting.
      -v, --verbose         Display more information.
      -h, --help            Show help message.
      -q, --quiet           Only display errors and warnings.
      -m ARG, --message=ARG
                            A message to associate with the shelved changes.
      --no-color            Never display changes in color.
      -r ARG, --revision=ARG
                            See "help revisionspec" for details.
    
    
    Description:
      Shelve allows you to temporarily put changes you've made "on the shelf",
      ie. out of the way, until a later time when you can bring them back from
      the shelf with the 'unshelve' command.
      
      Shelve is intended to help separate several sets of text changes that have
      been inappropriately mingled.  If you just want to get rid of all changes
      (text and otherwise) and you don't need to restore them later, use revert.
      If you want to shelve all text changes at once, use shelve --all.
      
      By default shelve asks you what you want to shelve, press '?' at the
      prompt to get help. To shelve everything run shelve --all.
      
      If filenames are specified, only the changes to those files will be
      shelved, other files will be left untouched.
      
      If a revision is specified, changes since that revision will be shelved.
      
      You can put multiple items on the shelf. Normally each time you run
      unshelve the most recently shelved changes will be reinstated. However,
      you can also unshelve changes in a different order by explicitly
      specifiying which changes to unshelve. This works best when the changes
      don't depend on each other.
      
      While you have patches on the shelf you can view and manipulate them with
      the 'shelf' command. Run 'bzr shelf -h' for more info.

    Ryan Tomayko on Tuesday, April 08, 2008 at 08:49 AM #

  16. What the guy at #10 said: by using the index in this way you are committing changes you have not compiled (if needed) or tested. ‘make test’ looks at the changes in your working directory, not the changes in the git index. That you’re really excited by this functionality shows that git has led you astray.

    Folks have mentioned shelve and other set-aside mechanisms, but it occurs to me that the index, set-aside and patch queues (stgit, hg queues) form a continuum, with patch queues being a superset of the others.

    For example, with mercurial (because I’m not familiar with stgit) you would use the qrecord command to pluck changes out of your working directory to define a new patch. Then you would use the usual qnew and qrefresh to create another patch for your bookmark changes. Then, you would qpop the bookmark patch and you’re all set to eyeball, compile, test. At each stage, applied patches appear in the repo so all the normal repo commands work, unlike the index which is an entity unto itself.

    The folks who say “you should have done…”, while being a bit unhelpful because, well, you didn’t, are still basically right. You need to keep changes separate. The git index limbo state makes you think you don’t need to do this, but you do and you will be sorry. Mercurial queues (and stgit, I guess), are an easy way to do what you should have done in the first place.

    Hmmm on Tuesday, April 08, 2008 at 09:20 AM #

  17. Thanks! This is very enlightening.

    no on Tuesday, April 08, 2008 at 11:19 AM #

  18. I actually like the Subversion way of doing things better:

    cp -r working one-feature
    cd one-feature
    files.select { !committed? }.each { revert }
    while !test
      fix
    end
    commit

    It gets around the all too frequent “good intentions but broken trunk” problem. Like others said, test before you commit!

    Not to be confused with my current dislike for SVN. The Git/Mercurial tools are so much better, I prefer to use them any day, but I’ve yet to find a better algorithm.

    Assaf on Tuesday, April 08, 2008 at 11:38 AM #

  19. You say darcs handles the tangled working copy problem without breaking a sweat, but unless I’m doing something wrong, I haven’t found that to be the case. It’s fine, right up until the point that diff decides that changes from two separate logical changes are in the same hunk. And then you’re screwed. From what you’ve written, it looks like Git might be able to handle this by splitting the current hunk into smaller hunks. But as far as I’m aware, darcs can’t.

    Tet on Tuesday, April 08, 2008 at 12:21 PM #

  20. #10 – you are only committing to your local copy. Now you can stash or branch your experimental stuff. Go back to your master, compile, test, or whatever to your heart’s content. Then push to the central repo, possibly rewriting your history before you do.

    StevenA on Tuesday, April 08, 2008 at 12:39 PM #

  21. That is a champion article. I had definitely wondered about committing bits of a file. Thx.

    Dr Nic on Tuesday, April 08, 2008 at 12:47 PM #

  22. I’ve noticed that http://gitforum.net just showed up…

    mike on Tuesday, April 08, 2008 at 12:58 PM #

  23. I find myself drawn to git more and more while putting it off since I won’t be able to immediately use it at work and solo projects are more manageable with svk (though nothing like git). It sounds so much like the “dream SCM” I pretty much expected when I started using Subversion only to find out the hard way…

    Baron on Tuesday, April 08, 2008 at 03:03 PM #

  24. What do you do when the hunks themselves are tangled?

    Keong on Tuesday, April 08, 2008 at 07:32 PM #

  25. Mercurial anyone?

    Mercurial has both a record extension that works like darcs record or git add --patch (an interactive commit) and a shelve extension which is pretty much the same as bzr shelve (an interactive shelving)

    But as far as I’m aware, darcs can’t.

    Indeed it can’t. It’s “theoretically possible”, but the UI for darcs record doesn’t give that possibility, which is a shame.

    Masklinn on Tuesday, April 08, 2008 at 07:48 PM #

  26. #20 – the problem with git is the abuse of terminology. The problem is that in most VCSs the ‘commit’ is something more permanent (almost never thrown away), but in git it isn’t so.

    What I’d like is for git to have something like ‘publish’ (push would do it automatically) and git tools would be careful (by default) not to modify part of published history (either by rebase or reset, etc..) It would also check that unpublished changes were not available for pulling.

    Mark on Wednesday, April 09, 2008 at 12:06 AM #

  27. Great article. It may be just what I need to leave SVN. Thanks!

    Ryan on Wednesday, April 09, 2008 at 01:48 AM #

  28. Thanks for posting this! I recently started using Git, and find new, wonderful goodies every day. The git add --patch shortcut is particularly nice.

    Hez on Wednesday, April 09, 2008 at 04:02 AM #

  29. Excellent article!

    Thanks for the --patch option. Wasn’t aware of it.

    Bryan Ray on Wednesday, April 09, 2008 at 04:30 AM #

  30. Good to hear about git commit --amend. Originally from darcs (the amend-record command), I believe. This is one thing that I miss in Mercurial, which otherwise simply handles my day-to-day version control needs.

    Emil Sit on Wednesday, April 09, 2008 at 06:01 AM #

  31. I had about the same problems as you struggling with git. I was mockingly saying for months now that git needed a new ‘porcelain’ that more resembled darcs. After getting pushed by various colleagues I finally started writing one to do exactly that about 2 months ago. The various perl/whatever wrappers out there just seem to wrap the existing UI and don’t really give you the nice feature richness that darcs has.

    Please do check out the early alpha version of my tool called ‘vng’. I’d love to have as much feedback as possible ;) http://repo.or.cz/w/vng.git

    I think you’ll like the way that you indeed use very small patches that you can use or skip as you want.

    Thomas Zander on Wednesday, April 09, 2008 at 06:47 AM #

  32. Git has a ‘stash’ command similar to bzr’s shelf discussed here. If you want to test before doing a commit, put everything you’re considering in the index, run ‘git stash’ to temporarily hide away the rest of your local changes, and test away. When you’re done, git commit, git stash apply.

    Stephen Weeks on Wednesday, April 09, 2008 at 10:10 AM #

  33. How about 2 distinct workspaces? If you’re working on 2 different things then you need to check out the source in 2 different directories. This works fine for me when I’m using either Perforce (although each client costs IIRC) or Subversion.

    Cristian on Wednesday, April 09, 2008 at 12:26 PM #

  34. Did you ever know that you’re my hero?

    And then there’s git rebase --interactive, which is a bit like git commit --amend hopped up on acid and holding a chainsaw – completely insane and quite dangerous but capable of exposing entirely new states of mind.

    Bill Burcham on Wednesday, April 09, 2008 at 12:51 PM #

  35. Cristian: If managing multiple workspaces is such a fundamental aspect of source control, doesn’t it make sense to bring those concepts into the VCS?

    This works fine for me when I’m using either Perforce (although each client costs IIRC) or Subversion.

    A lot of things work fine: horse and buggies, Encyclopedia Britannica, purchasing music on physical media, Windows — all work fine. If I was shooting for fine, I’d be using RCS.

    Ryan Tomayko on Wednesday, April 09, 2008 at 12:53 PM #

  36. Thanks for this write-up!

    Choosing between mercurial and git was a very difficult decision for me, but now after the fact I’m extremely grateful I embraced git! You can really do just about anything with it. I love the rebase command, when used properly, it’s awesome. Up until this point I’ve been afraid to try git rebase --interactive… but I suppose I should go give it a shot because it sounds awesome. (like Brondo sounds awesome)

    @mike (if that is your real name) – your comment is a shameless self-promotion in disguise. I’m not against self-promotion, but just when you try to hide it, it’s a tad silly.

    Tim Harper on Thursday, April 10, 2008 at 08:42 AM #

  37. Do you know if there is a way to get this same behavior out of patch or diff? I would like to be able to select only certain differences into a patch, or patch only certain differences into a file, but I can find nothing on the subject. Is git the only solution to this problem?

    James Aguilar on Thursday, April 10, 2008 at 02:07 PM #

  38. This is WHY I’m switching to git! You convinced me! Thank you Ryan!

    Karim on Thursday, April 10, 2008 at 02:26 PM #

  39. darcs' “split hunk” problem is issue 126, btw, and is entirely stuck on “how should we do this?” and implementation.

    Nikolas Coukouma on Friday, April 11, 2008 at 03:22 AM #

  40. Awesome. It’s hard to find the time to try new stuff like git, but this makes it seem rather easy. Thanks.

    Greg Donald on Friday, April 11, 2008 at 04:54 AM #

  41. Nice! Thanks for opening my eyes.

    Stefan Naewe on Sunday, April 13, 2008 at 09:02 PM #

  42. I’m afraid your “how to do it when your vcs won’t” is really a huge strawman argument. Here’s a much simpler way that I frequently use at work with Tortoise SVN.

    1. Start the Commit process. (right click on the project folder and hit Commit.)
    2. Right click on each file listed in the commit. Look through the Diff and comment on it in the comment box.
    3. If I see an entire file I don’t want to commit right now, I simply uncheck it.
    4. Occasionally I’ll entangle something and not want to commit the whole file. An example is a file I have that has one line NOPLC=TRUE. In the repository it’s NOPLC=FALSE. This is a pure testing variable. If I edit that file, all I do is slide a copy of that one file into another folder, then right click on the offending blocks of the file and say “Revert.”
    5. Now properly commented, I hit the button and the changes go into the repository.
    6. Now that I’m done with that, I simply move those files back into place from my temporary holding folder. The changes that I’ve committed no longer show up on my svn diff, leaving only those that are yet to commit.

    It’s nice that git and bzr do this automatically, but it’s hardly as much of a problem as you imply it is and hardly intrinsic or an unsurmountable problem in SVN. As for other VCS front ends than TortoiseSVN, I’m sure there’s 2-way diff/merge programs that you can slide into anything that would do the same thing without all the superfluous patches. I haven’t needed to do partial commits in linux yet, but my workflow is much the same with multiple terminal windows.

    I’m planning on playing with Bzr eventually and there are many great reasons to pick Bzr and Git over Svn, but this isn’t really a reason to hang your hat on one or the other.

    Now being able to replicate the repository, edit the sources on a non-network-connected machine, commit, then carry that repository on a memory stick and put it on a network connected machine and commit it again to the main repository… That’s something that Svn can’t really do right now that we want in our VCS.

    Kazriko on Monday, April 28, 2008 at 04:51 PM #

  43. @James Aguilar

    Look at patchutils ( http://cyberelk.net/tim/software/patchutils/ ) and in particular, filterdiff.

    Eli on Saturday, May 10, 2008 at 04:40 AM #

  44. And then there’s git rebase --interactive, which is a bit like git commit --amend hopped up on acid and holding a chainsaw – completely insane and quite dangerous but capable of exposing entirely new states of mind.

    But while being dangerous, git has this general behaviour of immortality, ahem, immutability… Everything in the object store is immutable and can’t be broken once it is in there (until you get to know git prune…). The only mutable things in your git repository are the heads, which are just references to a specific commit.

    So if you accidentally do anything wrong to your repository (a bad rebase or commit, say), nothing is lost. Everything is still there, just not referenced anymore. You can look in .git/logs to see how the references have changed, then check out an older commit and work from it (or git reset your branch to that commit).
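
    The recovery described here can be sketched as a short shell session. This is a minimal illustration in a throwaway repository, not something from the original comment: the file name, commit messages and demo identity are made up for the example, and it uses git reflog as the front end to the logs kept under .git/logs.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo one > file.txt
git add file.txt
git commit -q -m "first"

echo two >> file.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)

# "Accidentally" throw away the second commit.
git reset -q --hard HEAD~1

# The branch no longer points at the commit, but the object is still stored.
git cat-file -t "$lost"

# The reflog recorded where HEAD used to point; HEAD@{1} is its previous position.
git reflog -n 3
git reset -q --hard "HEAD@{1}"

recovered=$(git rev-parse HEAD)
echo "recovered $recovered"
```

    The same approach should work after a botched rebase, as long as the unreferenced objects have not been pruned yet.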

    Johannes on Tuesday, May 20, 2008 at 11:48 PM #

  45. I want to say that having used Git for a while, I have gotten myself into and out of a few hairy situations using rebase -i. :)

    What is awesome is that git gives you enough rope to hang yourself (and I have, borking my tree 3 times now!), and a knife to cut yourself down! If you haven’t pruned in a while, you can immediately go back and fix your mess.

    I can’t count the number of times I’ve clobbered myself with Subversion’s brain-dead merging, or tried to revert a file and fix it, and pretty much had to resort to all kinds of painful contortions to try and get my branch back into shape.

    Dan on Friday, July 11, 2008 at 12:07 PM #

  46. Best explanation of the index for beginners that I’ve seen so far

    Wesley on Saturday, January 10, 2009 at 06:41 AM #

  47. Ryan, you never discuss branching which I thought was a huge advantage to Git? Branching is cheap, so do it often. Can you elaborate?

    Taylor on Friday, March 06, 2009 at 10:53 AM #

  48. Very useful explanation of git’s index. I like your writing style. Especially the part “I figure this will take all of maybe ten minutes, so I begin pounding away in my working copy. After screwing around for an hour or so, …” sounds quite familiar to me. ;–)

    Rainer on Friday, March 13, 2009 at 08:52 AM #

  49. #24: Exactly. What git is still missing is a way to split hunks manually. Sometimes what looks to git like one unsplittable hunk is actually two separate changes, and as far as I can tell there’s no way to convince it otherwise.

    Katherine on Tuesday, April 21, 2009 at 02:27 PM #

  50. git was driving me mad: now it all makes sense! A wonderful article in a sea of low quality “information”: clear, informative, well written and educational: you’re my hero of the day! Thanks :)

    Tushar on Friday, May 01, 2009 at 01:36 PM #

  51. Thanks a lot for this useful article. It helped me a bunch when I needed to untangle my working copy, to break a gigantic patch down into more easily digestible pieces!

    Fred on Wednesday, May 06, 2009 at 06:33 PM #

  52. Another thank you! A coworker pointed me at this, turns out that “git add --patch” works fine with git-svn, too, and it really does give you more freedom to fix bugs right when you notice them (I’d been doing git-stash/fix/git-stash-pop but this is much smoother.) Best of both worlds…

    _Mark_ on Thursday, July 23, 2009 at 11:55 PM #