Making sure we have a complete authors.txt file

People want us to export our bk repos to github. This requires a complete authors.txt attribution file, and, if I understand it correctly, while that file may acquire additions over time, it must not contain any changes to previously-used attributions.

We get the list of authors (user names) by running:

bk changes -and:USER: | sort -u

and then we manually fill in the full name and email address.

I think I can ‘diff -u’ the new list and the old list and scream if there are any lines that begin with an initial ‘-’, which would indicate that a previously-defined and possibly previously-used author has been changed.
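A minimal sketch of that check, assuming the previous and regenerated lists are two plain files (the function name `check_authors` is hypothetical). It drops the two-line `---`/`+++` header that `diff -u` prints and then treats any remaining line starting with `-` as a changed or removed entry:

```shell
#!/bin/sh
# check_authors OLD NEW
# Succeeds (returns 0) if NEW only adds lines relative to OLD.
# Returns 1 and prints the offending lines if any line from OLD was
# removed or changed (diff -u marks those with a leading '-').
check_authors() {
    old=$1
    new=$2
    # diff exits non-zero when the files differ, so its status is ignored.
    # sed drops the "--- old" / "+++ new" header lines before grep looks
    # for removals.
    removed=$(diff -u "$old" "$new" | sed '1,2d' | grep '^-')
    if [ -n "$removed" ]; then
        echo "authors file: existing entries changed or removed:" >&2
        echo "$removed" >&2
        return 1
    fi
    return 0
}
```

Calling it as `check_authors authors.txt.old authors.txt || exit 1` (file names assumed) would make a wrapper script fail loudly when an existing attribution has been touched.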

While there might be unused authors in that file who could be deleted, I’m not yet sure how to detect this case.

We want to detect entries we need to add to authors.txt as soon as possible.

If the USER information is added to the ChangeSet(?) file during checkin, that means I could use a pre-commit or possibly a pre-apply trigger to make sure we don’t need to add any entries to authors.txt.

And that’s my question: when will ‘bk changes -and:USER:’ contain the USER name? After the checkin, which means I can use pre-commit or pre-apply?

Is the USER name visible in a way that I could check it in a pre-delta trigger?

I suspect pre-resolve is not ideal, and might not even work.

Where can I learn more about this? Of course, if somebody has the answer for me that would be swell, as it would save me a bunch of time :slight_smile:

I would suggest you test this, but I believe that as long as you keep doing incremental exports into the same git repository, now and in the future, changing the authors file for existing users is OK. The incremental export looks at the ‘bk: MD5KEY’ annotations in the git repository to identify already-exported csets, so those don’t get regenerated. And it is OK for new csets to have an updated email address.

And in that case, you would look only at the new stuff: ‘bk changes -and:USER: -rTIP.. | sort -u’, where TIP is the top bk rev on the git tree.

In a trigger, you want to use ‘bk getuser’ to get the username and complain if it doesn’t appear in the BitKeeper/etc/authors file. That will use the same rules that will be used to make the commit itself. Having a trigger in pre-commit makes sense, but I don’t think you want one in pre-resolve because resolving that failure is difficult. pre-resolve was generally designed for code to automatically resolve conflicts in certain files.
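A rough sketch of what such a pre-commit check might look like. Several things here are assumptions rather than facts from this thread: that the authors file uses one `user = Full Name <email>` entry per line, that it is checked out at BitKeeper/etc/authors when the trigger runs, and that a pre-commit trigger blocks the commit by exiting non-zero (check ‘bk help trigger’ for the exact rules). The helper name `author_known` is made up for illustration:

```shell
#!/bin/sh
# author_known USER FILE
# Succeeds if FILE has an entry whose key (the text before the first '=')
# is exactly USER. The "user = Name <email>" layout is an assumption.
author_known() {
    user=$1
    file=$2
    awk -F= -v u="$user" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }  # trim blanks around the key
        key == u { found = 1 }
        END { exit !found }                             # 0 if found, 1 otherwise
    ' "$file"
}

# Hypothetical trigger body (needs bk, so it is left commented out here):
#   user=$(bk getuser)
#   if ! author_known "$user" BitKeeper/etc/authors; then
#       echo "No entry for '$user' in BitKeeper/etc/authors" >&2
#       exit 1
#   fi
```

Matching on the exact key rather than a substring avoids a user such as `al` being accepted just because `alice` is listed.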

Then your git export script can double-check that nothing is missing and you might require an occasional addition to hit the odd case where someone created a merge commit but never did any commits themselves.
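The double-check in the export script could be sketched along these lines. As before, the `user = Full Name <email>` layout of the authors file is an assumption, and `missing_authors` is a hypothetical helper name; the only bk command involved is the ‘bk changes’ invocation already shown above:

```shell
#!/bin/sh
# missing_authors FILE
# Reads user names on stdin (one per line) and prints those that have no
# "user = ..." entry in FILE. The file layout is assumed, not confirmed.
missing_authors() {
    file=$1
    keys=$(mktemp)
    users=$(mktemp)
    # Keys: the text before the first '=', with surrounding blanks trimmed.
    sed -e 's/=.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$file" |
        sort -u > "$keys"
    sort -u > "$users"
    # comm -23 prints lines that appear only in the first (users) list.
    comm -23 "$users" "$keys"
    rm -f "$keys" "$users"
}

# Usage inside an export script, where TIP is the top bk rev already on
# the git tree:
#   bk changes -and:USER: -rTIP.. | missing_authors BitKeeper/etc/authors
```

If the function prints anything, the export script can stop and ask for the missing entries before running the export.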

manpages:

  • bk help trigger
  • bk help prs
  • bk help fast-export

Thanks, Wayne!

I think I mostly understand :slight_smile:

Any hints on how we might test this, and the range of tests we might want to do?

Good to know about the updated email address.

In the repo where we’re doing the initial bk commit, that’s probably where it’s safe to check ‘bk getuser’ to make sure it’s in the authors.txt file? My initial concern is that this would likely only find a problem with a new committer. Since that will happen rarely, and developers new to the project might not be familiar with bk, I think we’d want to script adding a new entry; I wouldn’t want a newbie to mess up the authors.txt file out of ignorance. Hmmm, now to figure out a not-too-intrusive way for existing developers to be reminded of the entry we have for them in authors.txt.

Is there a way to see what the default is for the -A/--authors file on ‘bk fast-export’? For some reason, I have a recollection that at one time we had to specify that option. And am I correct in assuming that as long as the author file is committed, fast-export will use it even if it’s not checked out in the repo?

Thanks again…

H

Yes, you have to specify --authors to fast-export every time. By convention, we keep this file at BitKeeper/etc/authors.

You want to always do the export from the same master repo in bitkeeper to the same git repository and then don’t roll either backwards.

When I originally did the export work, I was planning on supporting a bidirectional sync, which would allow new csets made in the git repository to get pulled into the bk repository. Unfortunately, I was never able to solve all the problems with that code.

The script we use: https://github.com/bitkeeper-scm/bitkeeper/blob/master/src/doGitExport.sh