github branch workflow: moving old branches to an "attic"

Brian Warner warner at lothar.com
Sat Mar 21 23:18:16 UTC 2015


Hey folks.. I'm hoping to do some branch weeding in the next week or
two, and just wanted to mention what my plans were.

The official repo at https://github.com/tahoe-lafs/tahoe-lafs is kind of
a mess of stale branches right now. I count 64 branches, of which:

* 2 are pending pull-requests
* 1 is "master"
* 2 are unavoidable release branches (rel-1.9.1 and rel-1.9.2, both of
  which came from darcs and do not live in the ancestry of "master")

leaving 59 branches which are either stale (merged) pull-requests, or
superseded development branches.

I'd really like it if, when people fork/clone our repo, they only wind
up with useful things. Also, I'd like to be able to glance at the github
branch displays and know that everything there is relevant and active.
As it is, most of those branches are useless for anything other than
historical research.

So my plan is to:

* create a new "attic" account
* fork the tahoe-lafs repo into the attic, including all 59 stale
  branches
* remove those stale branches from the main tahoe-lafs repo

In the future, as new branches are added (and merged or abandoned, and
rendered stale), I'll copy those branches over to the attic, and remove
them from the main repo. I may even build some automation for this some
day, but it'll be a manual process to start with.

In the weekly Nuts+Bolts meeting, we've discussed how to balance several
conflicting goals:

* keeping the main repo tidy
* maintaining valid URLs for branches (mainly in ticket comments, but
  also on the mailing list)
* archiving development work for future code-archaeologists

With the "attic" account, if you see a URL to the main repo in an old
ticket, and that link 404s, you should be able to insert the word
"attic" into the account name of the URL and get a working link. Or, if
you really care about forward-validity of the links, you can
preemptively cite "attic" URLs instead of the real ones (assuming my
automation pushes branches to the attic as soon as they're created,
instead of just before they're removed). I don't think we came up with a
way to make accessing historical branches as easy as accessing current
ones.
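
The URL rewrite described above could be sketched like this. The attic
account name is not settled in this message, so the "-attic" suffix
below is purely an assumption, as is the example branch name.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the attic account is the main account name plus "-attic".
ATTIC_SUFFIX = "-attic"


def attic_url(url: str, account: str = "tahoe-lafs") -> str:
    """Rewrite a main-repo GitHub URL to point at the attic account."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # For "/tahoe-lafs/tahoe-lafs/tree/x", segments[1] is the account.
    if len(segments) > 1 and segments[1] == account:
        segments[1] = account + ATTIC_SUFFIX
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    # Hypothetical stale-branch link from an old ticket comment.
    print(attic_url("https://github.com/tahoe-lafs/tahoe-lafs/tree/1234-example"))
```

Only the account segment of the path is touched, so the repo name,
branch name, and any query string or fragment pass through unchanged.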


There are two other issues worth discussing. One is whether to recommend
a practice of pushing pull-requestable branches into the official repo,
or to push them into a personal one. I prefer to use a personal one, and
of course you only have the option of using the official repo if you're
a member of the core team. If we used personal-repo branches, then we
wouldn't be accumulating the stale branches in the official one, and the
owner of the repo (the developer in question) could decide to keep the
stale branches for as long as they like.

The second is how to name branches which have been superseded by later
versions, e.g. as a pull-request evolves (responding to review feedback,
test failures, etc). If the PR gets pretty long, with lots of minor
tweaks, or if the trunk has moved on a lot since the PR was filed, I
prefer to rebase the branch and rewrite the patches to "tell a better
story". It's probably best to close the original PR and open a new one,
rather than force-push to the PR branch (which loses/confuses a lot of
the commentary). Some folks have gotten into a habit of incrementing a
branch name with each such new PR, giving branch names like
"TICKETNUM-description-words.COUNTER". That practice seems fine to me,
although I'd suggest 1: keeping the branch name short, maybe putting the
ticketnum at the end instead of the beginning, and 2: leaving out the
COUNTER until the second PR is needed.
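
To make the suggested naming concrete, here is one possible reading of
it as a tiny helper. The exact layout ("description-TICKETNUM", with
".COUNTER" appended only from the second PR onward) is my
interpretation of the suggestion, and the description and ticket number
in the example are made up.

```python
def branch_name(description: str, ticket: int, attempt: int = 1) -> str:
    """Build a short branch name with the ticket number at the end."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    name = f"{description}-{ticket}"
    # Leave the counter off the first PR; add it from the second one on.
    if attempt > 1:
        name += f".{attempt}"
    return name


if __name__ == "__main__":
    print(branch_name("attic-docs", 1234))     # first PR, no counter
    print(branch_name("attic-docs", 1234, 2))  # reworked PR
```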

My overall goal is to keep the repo and the history clean: I've worked
on other projects with a really tangled DVCS history, where the github
"Merge Pull Request" button was the only way to land code, and it's a
real nuisance to figure out what happened when. My metric is how wide
the "railroad" diagram of history is (the kind displayed by gitk or
gitx): if that takes more than 3 or 4 columns to show, then things are
too complicated. Single-patch changes can be applied as a fast-forward
merge (--ff-only). Multi-patch changes should use --no-ff but get
rebased first, and each patch should make a sensible change.
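
That landing policy can be written down as a small helper that picks
the merge command from the number of patches on the branch. It uses
git merge's --ff-only and --no-ff options and assumes the branch has
already been rebased onto the current trunk; the function name and the
example branch name are illustrative only.

```python
from typing import List


def merge_command(branch: str, patch_count: int) -> List[str]:
    """Pick the git merge command for landing an already-rebased branch."""
    if patch_count < 1:
        raise ValueError("a branch needs at least one patch to land")
    if patch_count == 1:
        # A single patch lands directly on trunk, with no merge commit.
        return ["git", "merge", "--ff-only", branch]
    # Several patches get one explicit merge commit that groups them.
    return ["git", "merge", "--no-ff", branch]


if __name__ == "__main__":
    # Hypothetical branch name.
    print(" ".join(merge_command("attic-docs-1234", 3)))
```

With rebased branches, every multi-patch change shows up in gitk as a
single short side-loop off trunk, which keeps the diagram narrow.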



Anyways, just wanted to get feedback from folks and let you know how to
find old branches if you were so inclined.

cheers,
 -Brian


