Bk: fslayer.c:504: fslayer_utime: Assertion `0' failed

(From our-ref D130681.)

martind@swiftboat:~/tmp/D130647$ bk cp /home/martind/work/misc/tools/scripts/mean /home/martind/work/misc/tools/scripts/variance
bk: fslayer.c:504: fslayer_utime: Assertion `0' failed.
Aborted (core dumped)
martind@swiftboat:~/tmp/D130647$ 

(gdb) bt
#0  0xf76e1d70 in __kernel_vsyscall ()
#1  0xf7538367 in raise () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
#2  0xf7539a23 in abort () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
#3  0xf75316c7 in ?? () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
#4  0xf7531777 in __assert_fail () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
#5  0x080b9d0c in fslayer_utime (path=0x92de8b0 "/home/martind/work/misc/tools/scripts/SCCS/s.variance", buf=0xff8edb14) at fslayer.c:504
#6  0x08145d1f in sccs_setStime (s=0x92cce50, newest=0) at slib.c:1182
#7  0x08150fcc in sccs_finishWrite (s=0x92cce50) at slib.c:2413
#8  0x08168072 in sccs_admin (sc=0x92cce50, p=0, flags=8, f=0x0, z=0x0, u=0x0, s=0x0, mode=0x0, text=0x0) at slib.c:12589
#9  0x081682e2 in sccs_adminFlag (sc=0x92cce50, flags=8) at slib.c:12222
#10 0x0816830a in sccs_newchksum (s=0x92cce50) at slib.c:12162
#11 0x0809620d in cp (quiet=, force=, to=0xff8f4839 "/home/martind/work/misc/tools/scripts/variance", from=0xff8f480e "/home/martind/work/misc/tools/scripts/mean") at cp.c:140
#12 cp_main (ac=3, av=0xff8f2554) at cp.c:45
#13 0x0804dd3d in cmd_run (av=, ac=, is_bk=, prog=0xff8f480b "cp") at bk.c:982
#14 main (ac=-1983893163, av=0xff8f2554, env=0xff8f2568) at bk.c:741
(gdb) 

Oh dear, that's left me with some detritus:

martind@swiftboat:~/work/misc$ bk -r check -ac
===============================================================================
check: tools/scripts/variance has pending deltas but no d.file, repairing.
===============================================================================
martind@swiftboat:~/work/misc$ bk -r check -ac
martind@swiftboat:~/work/misc$ bk sfiles -p
tools/scripts/SCCS/s.variance
martind@swiftboat:~/work/misc$ 

I was running bk-7.3.1ce+hnas0 at the time of the original occurrence. Lazily, I haven’t checked that it still happens with 7.3.2. The problem was reproducible but, and I imagine this is the key, it didn’t happen if my current directory was ~/work/misc, ie in the repo tree. That gave me an obvious and easy work-around, so this wasn’t even really an inconvenience. Nor was it a regression, in that the assertion failure happened with the commercial bk too (7.1), though that didn’t leave me with a repository that needed mending. My searches found no match for the rock and roll legend fslayer_utime.

Ah! You were running bk cp. My guess this is from your evil backport workflow and you were copying sfiles between repositories.

The fslayer_utime() is a function in the remap layer that handles common filesystem operations for files in SCCS directories. The code is written assuming the original pathnames repo/dir/SCCS/s.file and the remap.c code maps that to the new location at repo/.bk/dir/SCCS/file,s.

The utime() function is called on a sfile to change the modtime. This is needed for traditional make(1) where it can auto-checkout missing gfiles. The new code knows not to bother calling utime() for sfiles in a remapped repository with a .bk directory and so I added that assertion to make sure I didn’t miss any calls.

In this case, I guess you are doing a bk cp repo1/file1 repo2/file1 and the two repositories are running in different remap modes. And the code opens the file in repo1 for reading and then writes a new file in repo2. It uses the sfile write code so that format changes happen automatically. However in sccs_setStime() we are don’t notice that s->proj contains information for the source repository, but we are actually calling utime() in the destination repository.

The fix is going to be to make sure we reload s->proj with the new repository when writing to a different repository.

you were copying sfiles between repositories.

I don’t think this is an important quibble but it might lead to a simpler test case: my first preformatted line, above, should show that I was copying a file inside a single repository, but doing so from a directory outside the repository, using absolute paths.

Hmmm. OK, different bug but in a very similar area. I thought I fixed all of these. Oh well, should be easy enough to reproduce. Then I will test the other bug I was expecting.

The fix for this has been pushed to bk://bkbits.net/bk/dev

This was somewhat interesting. The initial problem was a very surprising strchr() whenstrrchr() was intended. I am surprised that didn’t break somewhere else.

After that, the locking was broken as it assumed the command was run from the repository to be locked. It also didn’t handle locking correctly when copying files between repositories.