Get rid of the "locked..." and "...unlocked!" messages by default, since
they're not usually interesting. But add a new option to bring them back in
case we run into trouble debugging the locking stuff. (I don't really
100% trust it yet, although I haven't had a problem for a while now.)
Now 'redo test' runs the tests, but 'redo t' just builds the programs.
Also removed wvtest stuff; we're not really using it properly anyway and
it's not helping our testing right now. It might come back later.
dirty_deps() changed its meaning now that we also have to check
state.isbuilt(). Just because dirty_deps() returns true doesn't mean
the file should be unstamped (which forces a rebuild); it might be true
because of state.isbuilt(), which means someone already *did* do a
rebuild.
If we get past state.isbuilt() and into looking at the children, however,
and one of the children is dirty, then we should definitely unstamp the
current target.
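Roughly, the new rule might look like this. This is an illustrative sketch only: State, dirty_deps(), and should_unstamp() here are stand-ins for redo's real internals, not its actual API.

```python
class State:
    def __init__(self):
        self.built = set()     # targets already rebuilt this session
        self.dirty = set()     # files known to have changed on disk

    def isbuilt(self, f):
        return f in self.built

def dirty_deps(f, state, deps):
    # True if f needs attention *or* was already rebuilt this session:
    # dependents still need to know f changed, even though f itself
    # needn't be rebuilt again.
    if state.isbuilt(f):
        return True
    return f in state.dirty or any(
        dirty_deps(d, state, deps) for d in deps.get(f, ()))

def should_unstamp(f, state, deps):
    if not dirty_deps(f, state, deps):
        return False           # clean: nothing to do
    if state.isbuilt(f):
        return False           # someone already *did* do the rebuild
    # Past isbuilt(), a dirty child definitely forces the unstamp.
    return f in state.dirty or any(
        dirty_deps(d, state, deps) for d in deps.get(f, ()))
```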
If someone else built and marked one of our dependencies, then that
dependency would show up as *clean* in a later redo-ifchange, so other
dependents of that file wouldn't be rebuilt.
We actually have to track two session-specific variables: whether the file
has been checked, and whether it was rebuilt. (Or alternatively, whether it
was dirty when we checked it the first time. But we store the former.)
If a depends on b, which depends on c, then when we consider building a,
we have to check b and c. If we're then asked about a2, which also depends
on b, there's no reason to re-check b and its dependencies; we already know
they're done.
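A minimal sketch of that memoization, with illustrative names only (redo's real bookkeeping lives in its state tracking, not these exact functions; the visits list is just instrumentation for the sketch):

```python
def ensure_checked(f, deps, dirty, checked, built, visits):
    """Check f and its dependencies at most once per session."""
    if f in checked:
        return                  # already verified: skip the whole subtree
    visits.append(f)            # instrumentation only, to show the skip
    for d in deps.get(f, ()):
        ensure_checked(d, deps, dirty, checked, built, visits)
    if f in dirty:
        built.add(f)            # stands in for actually rebuilding f
        dirty.discard(f)
    checked.add(f)
```

With a → b → c and a2 → b, asking about a2 after a never descends into b again.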
This takes the time to do 'redo t/curse/all' the *second* time down from
1.0s to 0.13s. (make can still do it in 0.07s.)
'redo t/curse/all' the first time is down from 5.4s to 4.6s. With -j4,
from 3.0s to 2.5s.
redo: 5.4s
redo -j4: 3.0s
make: 2.3s
make -j4: 1.4s
make SHELL=/bin/dash: 1.2s
make SHELL=/bin/dash -j4: 0.83s
We have some distance to go yet. Of course, redo is still written in
python, not C, so it's very expensive, and the on-disk dependency store is
very inefficient.
This greatly reduces the number of fork+exec calls, so in particular,
t/curse/all.do now runs much faster:
/bin/sh (bash): was 5.9s, now 2.2s
/bin/dash: was 3.2s, now 1.1s
Obviously improving the speed of minimal/do doesn't really matter, except
that it makes a good benchmark to compare the "real" redo against. So far
it's losing badly: 5.4s.
That way, if everything is locked, we can determine that with a single
token, reducing context switches.
But mostly this is good because the code is simpler.
The 'redo' command is supposed to *always* rebuild, not just if nobody else
rebuilt it. (If you want "rebuild sometimes" behaviour, use redo-ifchange.)
Thus, we shouldn't be short-circuiting it just because a file was previously
locked and then built okay.
However, there's still a race condition in parallel builds, because
redo-ifchange only checks the build stamp of each file once, then passes it
to redo. Thus, we end up trying to build the same stuff over and over.
This change actually makes it build *more* times, which seems dumb, but is
one step closer to right.
Doing this broke 'make test', however, because we were unlinking the target
right before building it, rather than replacing it atomically as djb's
original design suggested we should do. Thus, because of the combination of
the above two bugs, CC would appear and then disappear even as people were
trying to actually use it. Now it gets replaced atomically so it should
at least work at all times... even though we're still building it more than
once, which is incorrect.
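The atomic-replace idea can be sketched like this. The helper name is hypothetical and redo's real implementation differs in detail, but the core is the same: write to a temp file in the target's own directory, then rename() it into place.

```python
import os, tempfile

def build_atomically(target, build):
    # Write the output to a temp file in the target's directory, then
    # rename() over the target.  rename() is atomic on POSIX, so other
    # processes always see either the complete old file or the complete
    # new one -- never a missing or half-written target.
    d = os.path.dirname(target) or '.'
    fd, tmp = tempfile.mkstemp(prefix='.redo-tmp.', dir=d)
    try:
        with os.fdopen(fd, 'w') as f:
            build(f)               # stands in for running the .do script
        os.rename(tmp, target)     # atomic replace: no unlink-first gap
    except BaseException:
        os.unlink(tmp)
        raise
```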
Now t/curse passes again when parallelized (except for the countall
mismatch, since we haven't fixed the source of that problem yet). At least
it's consistent now.
There's a bunch of stuff rearranged in here, but the actual important
problem was that we were doing unlink() on the lock fifo even when open()
returned ENXIO, which meant a reader could connect between the ENXIO and
the unlink(), and thus never get notified of the disconnection. This would
cause the build to randomly freeze.
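The essential point, don't unlink on ENXIO, sketched in modern Python. This is illustrative only; the real lock code has more moving parts, and the success path here is simplified.

```python
import errno, os

def notify_waiters(fifo_path):
    # Try to connect to the lock fifo as a writer without blocking.
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:
            # No reader is connected yet.  Crucially, do NOT unlink the
            # fifo here: a reader could open it between our ENXIO and
            # the unlink(), and would then wait forever for a close()
            # that can never reach it.
            return False
        raise
    os.close(fd)          # closing the write side wakes the readers
    os.unlink(fifo_path)  # simplified; real cleanup is more careful
    return True
```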
This doesn't really seem to change anything, but it's more correct and
should reveal weirdness (especially an incorrect .redo directory in a
sub-redo) sooner.
This makes 'redo -j1000' now run successfully in t/curse, except that we
foolishly generate the same files more than once. But at least not more
than once *in parallel*.
...because it seems my locking isn't very good. It exposes annoying
problems involving rebuilding the same files more than once, screwing up
stamp files with redo -j, and being unnecessarily slow when checking
dependencies. So it's a pretty good test considering how simple it is.
Didn't add it to t/all.do yet, because it would fail.
Now people waiting for a lock can wait for the fifo to be ready, which means
it's instant instead of polled. Very pretty. Probably doesn't work on
Windows though.
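The fifo-wait trick, sketched (illustrative only, not redo's exact code): the lock holder keeps a write side of the fifo open, and each waiter blocks in read(); when the holder closes its write side, every waiter's read() returns EOF at once.

```python
import os

def wait_for_lock(fifo_path):
    # The lock holder keeps the fifo open for writing.  A waiter opens
    # it for reading and then blocks in read(); when the holder closes
    # its write side, the read() returns EOF immediately -- no polling.
    fd = os.open(fifo_path, os.O_RDONLY)
    try:
        os.read(fd, 1)     # returns b'' (EOF) the instant the lock drops
    finally:
        os.close(fd)
```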
The problem is that redo-ifchange has a different $PWD than its
sub-dependencies, so as it's chasing them down, fixing up the relative paths
doesn't work at all.
There's probably a much smarter fix than this, but it's too late at night to
think of it right now.
atoi() was getting redundant, and unfortunately we can't easily load
helpers.py in some places where we'd want to, because it depends on vars.py.
So move it to its own module.
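The helper itself is tiny, something along these lines (a sketch of a C-style, forgiving atoi(); the real module may differ):

```python
# atoi.py -- standalone module with no dependency on vars.py, so it
# can be imported from anywhere.

def atoi(s):
    """C-style atoi: return 0 on bad input instead of raising."""
    try:
        return int(s)
    except (ValueError, TypeError):
        return 0
```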
The problem is that if someone accidentally creates a file called "test"
*before* .redo/gen^test got created, then 'redo test' would do nothing,
because redo would assume it's a source file instead of a destination,
according to djb's rule. But in this case, we know it's not, since test.do
exists, so let's build it anyway. This is similar to the problem .PHONY
rules address in make.
This workaround is kind of cheating, because we can't safely apply that rule
if foo and default.do exist, even though default.do can be used to build
foo.
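The rule plus workaround might look like this. This is a hypothetical sketch: should_build() and the was_generated flag stand in for redo's .redo/gen^* tracking, and the default.do caveat above is why the check is only for an *exact* .do file.

```python
import os

def should_build(target, was_generated):
    # djb's rule: an existing file that redo didn't generate is a source.
    # Workaround: if an exact <target>.do exists, we know it's really a
    # target, so build it anyway.  We can't safely extend this to
    # default.do, since default.do can also apply to genuine sources.
    if was_generated:
        return True                 # we made it; keep it up to date
    if not os.path.exists(target):
        return True                 # doesn't exist yet: must build
    if os.path.exists(target + '.do'):
        return True                 # exact .do file: clearly a target
    return False                    # existing file, no exact .do: source
```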
This probably won't happen very often... except with minimal/do, which
creates these empty files even when it shouldn't. I'm not sure if I should
try to fix that or not, though.