When the PostgreSQL project decided to migrate to git, we also chose not to allow merge commits. A number of people commented, in a number of different fora, that we weren't following "the git workflow". While a few commentators seemed to think that was reasonable, many thought that it demonstrated our ignorance and stupidity, and some felt it was outright heresy.
So I noted with some interest Junio Hamano's blog post about the forthcoming release of git 1.7.10, which is slated to include a change to the way that merge commits work: users will now be prompted to edit the commit message, rather than just accepting a default one. Actually, what I found most interesting were Linus Torvalds' comments on this change, particularly where he says this: "This change hopefully makes people write merge messages to explain their merges, and maybe even decide not to merge at all when it's not necessary." His comments are quoted more fully in the above-linked blog article; unfortunately I don't know how to link directly to his Google+ post. And Junio Hamano makes this remark: "Merging updated upstream into your work-in-progress topic without having a good reason is generally considered a bad practice. [...] Otherwise, your topic branch will stop being about any particular topic but just a garbage heap that absorbs commits from many sources, both from you to work on a specific goal, and also from the upstream that contains work by others made for random other unfocused purposes."
In other words, if you merge more often than really necessary, the commit log will become a cluttered mess. So don't merge into your topic branches.
But, of course, there's a pretty obvious reason why people might want to merge into their topic branches regularly: if you develop your patch based on an older version of the source code, you might not find out until much later that you have merge conflicts. You might write a bunch of code that seems OK, and then only discover later that it can't be merged cleanly. Or, worse, the merge might succeed even though the code is wrong. We've had a number of cases, during PostgreSQL development, where a developer uses one piece of code as a model for a new one; the original code gets fixed before the new code is committed, but the new code still contains the bug. Developing against an older version of the source code increases the probability of errors of this type.
So, there's a trade-off here. In the comments from Linus and Junio, there's a tacit admission that too many merges are not a good thing, and yet there are obvious advantages to frequent merging. I understand and appreciate the critiques that have been made about the PostgreSQL process: rebasing and squashing commits loses history that someone might want to see; on the other hand, it also squeezes out irrelevant history that no one cares about, and makes the development history easier to follow. Part of the reason we can get away with this is that Linux has had 50,071 commits in the last year, whereas we have 1,849. Had we twenty-five times as much activity as we do, it's likely that our workflow would have to change in a variety of ways, and this might well be one of them. Nevertheless, I can't help thinking that it's a mistake to believe that there is one perfect solution here. As good as git is, maybe there's a better technological solution; or maybe this is just a hard problem.