Oct. 29th, 2014

jack: (Default)
Following on from the previous post, I think I have a clearer idea of what I was imagining, but I'm still not sure if it makes sense or if I have the implications right.

Premises

You own a repository, imposing the convention that commits should use "first parent" dependencies to represent "history on the same branch" and "second parent" dependencies to represent changes added to that branch.

That gives a notion of "the" history of a branch, distinct from "all commits which contributed to it". That contrasts with the traditional git practice of considering all branches equal, but I don't think it contradicts it. If two branches exchange commits, each can consider its own history primary, and the other as "additions being merged in". That holds even if it means you have one commit "merged branch B into branch A" and another commit "merged branch A into branch B", identical except for having their parents the other way round, where traditional git would just have one commit.
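To make the distinction concrete, here is a toy sketch in Python. It is not real git: the commit names and the `parents` table are invented for illustration, and the only assumption is the convention above, that a commit's first parent is "the same branch" and any later parent is "something merged in".

```python
# Toy commit graph: each commit maps to its ordered list of parents.
# All commit names here are hypothetical.
parents = {
    "a1": [],
    "a2": ["a1"],
    "b1": ["a1"],
    "b2": ["b1"],
    "a3": ["a2", "b2"],  # "merged branch B into branch A"
    "b3": ["b2", "a2"],  # "merged branch A into branch B"
}

def first_parent_history(commit):
    """Follow only first-parent links: 'the' history of the branch."""
    history = []
    while commit is not None:
        history.append(commit)
        ps = parents[commit]
        commit = ps[0] if ps else None
    return history

def all_ancestors(commit):
    """Every commit that contributed, via any parent link (including itself)."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

print(first_parent_history("a3"))  # ['a3', 'a2', 'a1']
print(first_parent_history("b3"))  # ['b3', 'b2', 'b1', 'a1']
# Apart from the merge commits themselves, both branches contain the same commits:
print(all_ancestors("a3") - {"a3"} == all_ancestors("b3") - {"b3"})  # True
```

The two merge commits bring in exactly the same set of changes, but each branch gets its own first-parent chain, which is the "official history" idea.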

This is basically putting back the notion that branches have an official history, which some revision control systems have but which git rejected. I hadn't realised the distinction until I heard people talk about it.

This also assumes that it's possible to implement "git log --first-parent", "git bisect --first-parent", "git annotate --first-parent" and so on, to mean "which commit made the change on this branch", wherever they don't already exist. I realise this may not be plausible, but I don't think there's any technical reason why it should be any harder than the normal git versions?
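As a rough sanity check on the "no harder than normal" claim, here is a sketch of what a first-parent bisect might look like in Python. This is a guess at the idea, not how git implements anything: `bisect_first_parent`, `chain` and `is_bad` are hypothetical names, and it assumes that once the bug appears on the first-parent chain it stays there.

```python
def bisect_first_parent(chain, is_bad):
    """Find the first bad commit on a first-parent chain.

    chain  : commits on the branch, oldest first (hypothetical names);
             chain[0] is assumed good and chain[-1] is assumed bad.
    is_bad : function returning True if the commit shows the bug.
    """
    lo, hi = 0, len(chain) - 1  # invariant: chain[lo] is good, chain[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(chain[mid]):
            hi = mid
        else:
            lo = mid
    return chain[hi]

# Only commits on the branch itself are ever tested. Here "a3" stands for a
# merge commit whose second parent brought the bug in.
chain = ["a1", "a2", "a3", "a4", "a5"]
bad_commits = {"a3", "a4", "a5"}
print(bisect_first_parent(chain, lambda c: c in bad_commits))  # a3
```

The answer is "the commit on this branch where things went wrong", which may be a merge. Commits reachable only through second parents are never visited, so extra second-parent links cannot confuse the search.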

Controversial Corollary

If first-parent history is already your default view, you may be able to introduce extra second-parent links in places where they would currently fill the log history with garbage, break git-bisect, etc.

Specifically, suppose that whenever you change history (primarily when you change the order/content/collation etc of your commits, but also when you rebase your changes onto a new upstream), the new commits are shown as having "first parent" dependencies as normal, but also have "second parent" dependencies to the previous HEAD (which in traditional git rebasing would become orphaned on no branch and eventually be garbage collected).
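A toy model of that proposal, again in Python and again with made-up commit names: `traditional` is what a squash or rebase leaves behind today, and `versioned` is the same rewrite with the extra second-parent link back to the old HEAD. Attaching the link to a single tidied commit is my assumption about the simplest case.

```python
def first_parent_history(commit, graph):
    """Follow only first-parent links from commit back to the root."""
    history = []
    while commit is not None:
        history.append(commit)
        ps = graph[commit]
        commit = ps[0] if ps else None
    return history

def reachable(commit, graph):
    """All commits reachable from commit via any parent link."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(graph[c])
    return seen

# Old history: base <- wip1 <- wip2 (old HEAD). "tidy" squashes the two wip commits.
traditional = {"base": [], "wip1": ["base"], "wip2": ["wip1"], "tidy": ["base"]}
# Same rewrite, but "tidy" also records the old HEAD as its second parent.
versioned = dict(traditional, tidy=["base", "wip2"])

print(first_parent_history("tidy", versioned))  # ['tidy', 'base']
print(sorted(reachable("tidy", traditional)))   # ['base', 'tidy']
print(sorted(reachable("tidy", versioned)))     # ['base', 'tidy', 'wip1', 'wip2']
```

In the traditional graph the old commits are no longer reachable from the branch, so they are the ones that would eventually be garbage collected. In the versioned graph they stay reachable, while the first-parent view still shows only the tidied history.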

Are there any benefits?

This fits conversations I had with simont that rewriting history should itself be versioned, but I'm not sure if it actually provides practical benefits.

Would it be the case that if you were making a complicated rebase, you would automatically be assured your previous state was preserved, without having to manually create a temporary branch?

Would it be the case that if you were making a complicated rebase and made a mistake, you wouldn't have to rely on there being no garbage collection between making the mistake and recovering from it?

Would it be the case that if you were making a complicated rebase, it would be possible to push and pull those processes into other repositories and share the process between multiple people, rather than assuming it all has to be completed in one repository?

Would it be the case that if you rewrite history on a branch, other branches/repos downstream can do "git pull --rebase" and it will just magically work? This is what I was looking for -- it feels like "tidying up history" should be a thing which is embraced everywhere, not a dirty secret that you have to keep private...

I'm not sure it works at all like that? But I feel like there's _something_ right about the idea?
