The more I think about making this change the less I am sure about it.
Current Implementation
Above @ob shows a sample SCCS output, but that is not the actual file format. Currently, we save a single string per file delta to record the user and host. This string looks like this:
$ echo hi > foo
$ BK_USER=user BK_HOST=host.com bk new foo
foo revision 1.1: +1 -0 = 1
$ bk log -nd:FULLUSERHOST: -r+ foo
user/wscott@host.com/x99.wscott.bitkeeper.com
Without BK_USER & BK_HOST, just wscott@x99.wscott.bitkeeper.com would have been saved.
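To make the layout of that saved string concrete, here is a minimal Python sketch that splits it into its four parts. It assumes the two sides of the "@" are overridden independently and that neither usernames nor hostnames contain "/" or "@"; only the two forms shown above are confirmed by the transcript. The names Identity and parse_fulluserhost are made up for illustration and are not part of bk.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    user: str      # requested user (BK_USER override, else the local user)
    host: str      # requested host (BK_HOST override, else the local host)
    realuser: str  # actual local username
    realhost: str  # actual local hostname


def parse_fulluserhost(s: str) -> Identity:
    """Split 'USER/REALUSER@HOST/REALHOST' or plain 'USER@HOST'.

    Assumption: each side of the '@' carries an optional '/real' suffix,
    and the real value equals the requested one when the suffix is absent.
    """
    user_part, sep, host_part = s.partition("@")
    if not sep:
        raise ValueError(f"missing '@' in {s!r}")
    user, _, realuser = user_part.partition("/")
    host, _, realhost = host_part.partition("/")
    return Identity(user, host, realuser or user, realhost or host)


if __name__ == "__main__":
    print(parse_fulluserhost("user/wscott@host.com/x99.wscott.bitkeeper.com"))
    print(parse_fulluserhost("wscott@x99.wscott.bitkeeper.com"))
```

With no overrides, the requested and real values come out identical, which matches the shorter form of the string.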
So BitKeeper records both the actual local username and hostname where the commit was made and the requested user@host from the environment overrides. The BK_USER name is the one that is usually displayed when the history is being browsed.
We normally use BK_USER=name when creating a commit that was actually written by another person. So in a way this is like the committer/author split used by git, but not really, as I hope to explain.
Internally these are called HOST and REALHOST (same for USER):
$ bk log -nd:HOST: -r+ foo
host.com
$ bk log -nd:REALHOST: -r+ foo
x99.wscott.bitkeeper.com
The REALUSER@REALHOST is very important internally because the delta uniqueness guarantee is based on the assumption that a hostname is a unique name for the current machine and that the user’s home directory is the same for all csets made by this user in any repository with this hostname.
So while USER@HOST could be a valid email address and corresponds to git’s ‘author’, the REALUSER@REALHOST is unlikely to be a valid email address and certainly not the canonical name for the ‘committer’ field.
Proposal
Above @ob proposed we add a new email field in addition to the existing :FULLUSERHOST: field. I don’t think that is really necessary. Just embrace the fact that we already store two names and use USER@HOST as the email address of the user who created the cset.
We don’t need to save the user’s name with each cset since the email is a unique key for that user. We can have a BitKeeper/etc/authors file that is automatically maintained, giving the mapping from email addresses to names, if we want to include the user’s name in some reports or have a place to import/export data from git repositories. (Yes, there will be some inaccuracy, as git could have multiple names for the same person.)
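As a rough sketch of how such an authors mapping might behave, the Python below keeps one name per email and ignores later, conflicting names, which is where the git import inaccuracy would show up. The on-disk layout (one tab-separated "email, name" pair per line) and the function names are assumptions for illustration only; no format for BitKeeper/etc/authors has been defined.

```python
def load_authors(text: str) -> dict[str, str]:
    """Parse the hypothetical 'email<TAB>name' lines into a dict."""
    authors: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        email, _, name = line.partition("\t")
        authors[email] = name
    return authors


def record_author(authors: dict[str, str], email: str, name: str) -> bool:
    """Add email -> name if the email is new; return True if it was added.

    Assumption: the first name seen for an email wins, so a git history
    with several spellings of one person's name keeps only the first.
    """
    if email in authors:
        return False
    authors[email] = name
    return True


def dump_authors(authors: dict[str, str]) -> str:
    """Serialize sorted by email so the file stays stable across updates."""
    return "".join(f"{email}\t{name}\n" for email, name in sorted(authors.items()))


if __name__ == "__main__":
    authors = load_authors("")
    record_author(authors, "user@host.com", "Some User")
    record_author(authors, "user@host.com", "S. User")  # ignored: email already known
    print(dump_authors(authors), end="")
```

Sorting on output is just one way to keep the file from changing when nothing was added.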
So in the $HOME/.bk/config file we can record the Name/email for csets that are created on this machine. And perhaps make bk require that this be set in normal operation. The existing BK_USER and BK_HOST could also be set in the environment, but I would probably extend BK_USER so it can take the whole email address if needed.
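One possible precedence for these settings, sketched in Python under stated assumptions: a BK_USER containing "@" is taken as the whole address, otherwise BK_USER and BK_HOST each override the matching half of the configured email, and a missing identity is an error. The config key "email" and the function resolve_email are hypothetical; the proposal does not fix either.

```python
from typing import Mapping


def resolve_email(env: Mapping[str, str], config: Mapping[str, str]) -> str:
    """Pick the author email for a new cset.

    env    : environment variables (e.g. os.environ)
    config : parsed $HOME/.bk/config; the 'email' key is an assumption
    """
    bk_user = env.get("BK_USER", "")
    bk_host = env.get("BK_HOST", "")

    # Extended form: BK_USER carries the whole address, BK_HOST is ignored.
    if "@" in bk_user:
        return bk_user

    cfg_user, _, cfg_host = config.get("email", "").partition("@")
    user = bk_user or cfg_user
    host = bk_host or cfg_host
    if not user or not host:
        # Mirrors the idea that bk could require this to be set.
        raise ValueError("no author email: set it in the config or via BK_USER/BK_HOST")
    return f"{user}@{host}"


if __name__ == "__main__":
    import os

    # The fallback address here is only a placeholder for the demo.
    print(resolve_email(os.environ, {"email": "user@host.com"}))
```

Passing the environment and config in as plain mappings keeps the rule easy to test without touching the real $HOME/.bk/config.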
Questions
- Do we need a separate committer email identifier other than just the unix user and hostname?
- I think that is from Linus’ model of committing email patches.
- What about dual credit for csets developed by multiple people? In the past we have done stuff like BK_USER=ob+wscott, but that only works if all the tools expect that.
- Unlike git, in BitKeeper you can have a different author for each file in the cset, and that works pretty well. File changes are owned by the person who made most of those changes, and the overall cset has a single owner.