— anon
The other day, I overheard two developers discussing the pros and cons of various version control systems. I only caught this fragment: “… What sucks about Git is that when you look at a merge commit, you can’t really see what changed!” It wasn’t the first time I had heard such complaints. Time to demystify Git merge commits.
I’ll use this repository which contains two merge commits (labelled ‘M’):
    M   7b224b1 [master] Merge branch 'add_author'
    |\
    | o 7fee5d6 [add_author] Added author
    o | 41c338a Indented title
    |/
    M   fd63230 Merge branch 'add_title'
    |\
    | o 6be3af1 [add_title] Added title
    o | 69a7968 Replace comma with period
    |/
    o   ddfebcd [better_formatting] Inserted blank lines
    |
    o   f15c2b9 Inital version
In my initial commit I added a file called ‘song_of_the_bell.txt’ which contains the first stanza of an English translation of a famous poem by Friedrich Schiller:
    $ git show f15c2b9
    ...
    + Walled in fast within the earth
    + Stands the form burnt out of clay.
    + This must be the bell’s great birth!
    + Fellows, lend a hand to-day.
    + Sweat must trickle now
    + From the burning brow,
The next commit (ddfebcd) just added blank lines after every other line and was done on a branch called ‘better_formatting’. You don’t see the branch ‘better_formatting’ (graphically as a branch) because the merge to ‘master’ was a so-called fast-forward merge.
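To see why a fast-forward merge leaves no trace in the graph, the situation can be reproduced in a throwaway repository. This is a minimal sketch, not the repository used in this post: the temporary directory, the file name poem.txt, its contents and the commit identity are all made up for illustration.

```shell
#!/bin/sh
# Sketch: a fast-forward merge creates no merge commit.
# Everything here (scratch repo, poem.txt, identity) is hypothetical.
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Initial commit on the default branch.
printf 'line one\nline two\n' > poem.txt
git add poem.txt
git commit -q -m "Initial version"
main=$(git symbolic-ref --short HEAD)

# One commit on a side branch; the default branch does not move meanwhile.
git checkout -q -b better_formatting
printf 'line one\n\nline two\n' > poem.txt
git commit -q -am "Inserted blank lines"

# Merging back only advances the branch pointer (fast-forward).
git checkout -q "$main"
git merge -q --ff-only better_formatting

# Count the merge commits reachable from HEAD and show the graph.
merges=$(git rev-list --merges --count HEAD)
echo "merge commits: $merges"
git log --oneline --graph
```

Since the side branch is strictly ahead of the branch it is merged into, Git has nothing to combine, so the history stays a straight line.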
Then, I made a change on ‘master’ that replaced the comma on the last line with a period:
    $ git show 69a7968
    ...
    @@ -5,4 +5,4 @@ This must be the bell’s great birth!
     Fellows, lend a hand to-day.
     
     Sweat must trickle now
    -From the burning brow,
    +From the burning brow.
and a concurrent change on branch ‘add_title’:
    $ git show 6be3af1
    ...
    @@ -1,3 +1,5 @@
    +SONG OF THE BELL
    +
     Walled in fast within the earth
     Stands the form burnt out of clay.
Afterwards, ‘add_title’ was merged into ‘master’. Since there were modifications on both branches, a fast-forward merge was not possible, so there’s an explicit merge commit (fd63230). What do you think you’ll see when you show this merge commit?
Nothing, or rather, not much!
    $ git show fd63230
    commit fd6323029cf0b3aa380d013cbc6305db0e029687
    Merge: 69a7968 6be3af1

        Merge branch 'add_title'
You don’t see the typical diff-like line changes, and that’s what the developers were complaining about. Other version control systems would give them what they want, namely the delta between the merge commit and the previous commit on ‘master’. This would allow them to easily figure out what the merge changed on ‘master’. Git can do it as well, but for merge commits you have to be explicit; ‘git show’ doesn’t cut it:
    $ git diff 69a7968 fd63230   # variant 1
    $ git diff fd63230^ fd63230  # variant 2
    $ git diff fd63230^-         # variant 3
    ...
    @@ -1,3 +1,5 @@
    +SONG OF THE BELL
    +
     Walled in fast within the earth
     Stands the form burnt out of clay.
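fd63230^- is a shortcut that translates to “show the difference between the first parent of commit fd63230 and commit fd63230 itself”. In general, <rev>^-<n> stands for <rev>^<n>..<rev>, with <n> defaulting to 1 when omitted. To see what the merge changed relative to the second parent (the tip of ‘add_title’) instead, use: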
    $ git diff fd63230^-2
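The equivalence between the shorthand and the explicit two-commit form can be checked in a scratch repository. The following sketch rebuilds a merge similar to fd63230; it assumes a reasonably recent Git (very old releases do not know the ^- notation), and the temporary directory, the commit identity and the helper file tmp.txt are made up for illustration.

```shell
#!/bin/sh
# Sketch: <rev>^-<n> gives the same diff as comparing <rev>^<n> with <rev>.
# The scratch repo and identity are hypothetical, not the post's repository.
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

git init -q
git config user.name "Example"
git config user.email "example@example.com"

cat > song_of_the_bell.txt <<'EOF'
Walled in fast within the earth
Stands the form burnt out of clay.

This must be the bell's great birth!
Fellows, lend a hand to-day.

Sweat must trickle now
From the burning brow,
EOF
git add song_of_the_bell.txt
git commit -q -m "Initial version"
main=$(git symbolic-ref --short HEAD)

# Side branch: prepend a title and a blank line.
git checkout -q -b add_title
{ printf 'SONG OF THE BELL\n\n'; cat song_of_the_bell.txt; } > tmp.txt
mv tmp.txt song_of_the_bell.txt
git commit -q -am "Added title"

# Default branch: replace the trailing comma on the last line with a period.
git checkout -q "$main"
sed '$ s/,$/./' song_of_the_bell.txt > tmp.txt
mv tmp.txt song_of_the_bell.txt
git commit -q -am "Replace comma with period"

# Both sides changed, so this creates a real merge commit.
git merge -q --no-edit add_title >/dev/null
m=$(git rev-parse --short HEAD)

# Delta against the first parent: explicit form and shorthand.
to_explicit=$(git diff "${m}^1" "${m}")
to_short=$(git diff "${m}^-")

# Delta against the second parent: explicit form and shorthand.
from_explicit=$(git diff "${m}^2" "${m}")
from_short=$(git diff "${m}^-2")

if [ "$to_explicit" = "$to_short" ]; then
  echo "^-  matches the diff against the first parent"
fi
if [ "$from_explicit" = "$from_short" ]; then
  echo "^-2 matches the diff against the second parent"
fi
```

The diff against the first parent contains what arrived from ‘add_title’ (the title), while the diff against the second parent contains what had been done on the merge-to branch (the period).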
There’s a reason why a regular ‘git show’ doesn’t show much. In order to understand it, we need to talk about combined diffs first. A combined diff shows the delta between the merge commit and both of its parents in a single diff. Let’s produce a combined diff for our merge commit by using the -c option:
    $ git show -c fd63230
    commit fd6323029cf0b3aa380d013cbc6305db0e029687
    Merge: 69a7968 6be3af1
    ...
    diff --combined song_of_the_bell.txt
    index c26b26c,a52b0bf..14147a6
    --- a/song_of_the_bell.txt
    +++ b/song_of_the_bell.txt
    @@@ -1,3 -1,5 +1,5 @@@
    + SONG OF THE BELL
    + 
      Walled in fast within the earth
      Stands the form burnt out of clay.
    @@@ -5,4 -7,4 +7,4 @@@ This must be the bell’s great birth
      Fellows, lend a hand to-day.
      
      Sweat must trickle now
     -From the burning brow,
     +From the burning brow.
First of all, notice the line containing “Merge: 69a7968 6be3af1”. The first hash is the hash of the first parent of the merge commit (a.k.a. the merge-to parent, fd63230^1), while the second hash belongs to the second parent of the merge commit (a.k.a. the merge-from parent, fd63230^2). Next comes the diff output, in which the first column shows the delta between the first parent and the merge commit, whereas the second column shows the delta between the second parent and the merge commit:
    + SONG OF THE BELL
    + 
The +/- markers are in the first column (i.e. they are not indented), which means that “SONG OF THE BELL” and a single blank line were added relative to the first parent (on ‘master’). This change must have come from the merge-from commit (the branch ‘add_title’). Conversely,
     -From the burning brow,
     +From the burning brow.
shows the difference between the second parent (on ‘add_title’) and the merge commit, since the +/- markers are in the second column (i.e. they are indented by one space). These are the changes that were made on ‘master’.
There’s also a --cc option (think “compact combined”) that gives even tighter output than -c: it omits every hunk in which the merge result simply matches one of the parents and keeps only the hunks where the result differs from both. In practice, that makes it a combined diff showing only the places where conflicts had to be resolved. Since the changes in the merge commit fd63230 are non-conflicting (they’re on different lines), --cc produces no diff at all:
    $ git show --cc fd63230
    commit fd6323029cf0b3aa380d013cbc6305db0e029687
    Merge: 69a7968 6be3af1

        Merge branch 'add_title'
Wait a minute! Isn’t this the same output that we got above when we executed a plain ‘git show fd63230’ without the --cc option? Precisely! When showing merge commits, ‘git show’ defaults to the “compact combined” format, which displays only conflicts. That’s why most merge commits appear empty and that’s why there’s so much whining. On the other hand, this little feature makes the life of an integrator much easier, as they can focus on the parts of a merge commit that are critical: conflicts.
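The three behaviours described so far (plain ‘git show’, --cc and -c) can be compared side by side on a clean merge. The script below is a sketch under the same assumptions as the commands above: it builds its own scratch repository rather than using the one from this post, and the temporary directory, commit identity and helper file tmp.txt are invented for the example. The empty --format= only suppresses the commit header so that whatever remains is the diff itself.

```shell
#!/bin/sh
# Sketch: on a conflict-free merge, plain 'git show' and '--cc' print no diff,
# while '-c' prints the full combined diff. Scratch repo is hypothetical.
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

git init -q
git config user.name "Example"
git config user.email "example@example.com"

cat > song_of_the_bell.txt <<'EOF'
Walled in fast within the earth
Stands the form burnt out of clay.

This must be the bell's great birth!
Fellows, lend a hand to-day.

Sweat must trickle now
From the burning brow,
EOF
git add song_of_the_bell.txt
git commit -q -m "Initial version"
main=$(git symbolic-ref --short HEAD)

# Side branch: add a title at the top of the file.
git checkout -q -b add_title
{ printf 'SONG OF THE BELL\n\n'; cat song_of_the_bell.txt; } > tmp.txt
mv tmp.txt song_of_the_bell.txt
git commit -q -am "Added title"

# Default branch: change the last line, far away from the title.
git checkout -q "$main"
sed '$ s/,$/./' song_of_the_bell.txt > tmp.txt
mv tmp.txt song_of_the_bell.txt
git commit -q -am "Replace comma with period"

git merge -q --no-edit add_title >/dev/null

# Capture only the diff part of each variant.
plain=$(git show --format= HEAD)
dense=$(git show --format= --cc HEAD)
combined=$(git show --format= -c HEAD)

echo "plain 'git show' diff bytes: ${#plain}"
echo "'git show --cc' diff bytes:  ${#dense}"
echo "'git show -c' diff:"
printf '%s\n' "$combined"
```

Each hunk of this merge matches one of the two parents exactly, which is why only the -c variant has anything to print.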
Now let’s take a look at the other merge commit, the one at the top of the history:
    $ git show 7b224b1
    commit 7b224b1c15b583369fd939ff83167a92fbc586ad (HEAD -> master)
    Merge: 41c338a 7fee5d6
    ...
    diff --cc song_of_the_bell.txt
    index dde3e9d,c50a1c9..d0e5bd1
    --- a/song_of_the_bell.txt
    +++ b/song_of_the_bell.txt
    @@@ -1,4 -1,5 +1,5 @@@
     -SONG OF THE BELL
     -(Friedrich Schiller)
     + SONG OF THE BELL
    ++ (Friedrich Schiller)
      
      Walled in fast within the earth
      Stands the form burnt out of clay.
Here you do have some output, which means there was a conflict. Again, the first column shows what changed between the first parent (merge-to parent) and the merged version, which is just the addition of the author name “Friedrich Schiller”. Obviously, this change originated from the ‘add_author’ branch. The second column shows what changed between the second parent (merge-from parent) and the merged version. Clearly, the title “SONG OF THE BELL” was indented on ‘master’.

But why is the author name “Friedrich Schiller” marked as a change in the second column as well? It shouldn’t appear, because this is the change that was made in the merge-from parent, right? As always, Git is right: it should. During the merge, as part of the conflict resolution, the author name “Friedrich Schiller” was indented (in the spirit of the change in the merge-to parent, which indented the title). It’s this indentation that changed in the merge commit compared to the merge-from parent.
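A merge like this one can be replayed in a scratch repository to watch the ‘++’ line appear. The sketch below is an approximation of the scenario, not the post’s actual history: the four-space indentation, the temporary directory and the commit identity are assumptions. It passes --no-commit so that the merge always stops before committing, whether or not Git reports a conflict, and the resolution is then written by hand.

```shell
#!/bin/sh
# Sketch: a hand-resolved merge whose result differs from BOTH parents,
# so plain 'git show' prints a dense combined diff with a '++' line.
# Scratch repo, identity and the 4-space indent are hypothetical.
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Base version: title, blank line, two lines of the poem.
cat > song_of_the_bell.txt <<'EOF'
SONG OF THE BELL

Walled in fast within the earth
Stands the form burnt out of clay.
EOF
git add song_of_the_bell.txt
git commit -q -m "Added title"
main=$(git symbolic-ref --short HEAD)

# Side branch: add the author right below the title.
git checkout -q -b add_author
cat > song_of_the_bell.txt <<'EOF'
SONG OF THE BELL
(Friedrich Schiller)

Walled in fast within the earth
Stands the form burnt out of clay.
EOF
git commit -q -am "Added author"

# Default branch: indent the title.
git checkout -q "$main"
cat > song_of_the_bell.txt <<'EOF'
    SONG OF THE BELL

Walled in fast within the earth
Stands the form burnt out of clay.
EOF
git commit -q -am "Indented title"

# Start the merge but never commit automatically.
if git merge -q --no-ff --no-commit add_author >/dev/null 2>&1; then
  echo "merge applied cleanly (stopped before committing)"
else
  echo "merge stopped with a conflict"
fi

# Manual resolution: keep the indented title AND indent the author too.
cat > song_of_the_bell.txt <<'EOF'
    SONG OF THE BELL
    (Friedrich Schiller)

Walled in fast within the earth
Stands the form burnt out of clay.
EOF
git add song_of_the_bell.txt
git commit -q -m "Merge branch 'add_author'"

# Plain 'git show' now has something to display.
out=$(git show --format= HEAD)
printf '%s\n' "$out"
```

The indented author line exists in neither parent, so it is marked in both columns; the indented title already existed in the first parent and is therefore marked in the second column only.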
Understanding combined diffs definitely takes a little getting used to. That’s the reason why most people only care about what changed between the merge-to commit and the merge commit. You already know how to obtain these changes painlessly:
    $ git diff 7b224b1^-
    ...
    @@ -1,4 +1,5 @@
      SONG OF THE BELL
    + (Friedrich Schiller)
     
     Walled in fast within the earth
     Stands the form burnt out of clay.