redo-log should log to stdout, because when you ask for the specific
logs from a run, the logs are the output you requested. redo-log's
stderr should be about any errors retrieving that output.
On the other hand, when you run redo, the logs are literally the stderr
of the build steps, which are incidental to the main job (building
things). So that should be send to stderr. Previously, we were
sending to stderr when --no-log, but stdout when --log, which is
totally wrong.
Also, adding redo-log had the unexpected result that if a .do script
redirected the stderr of a sub-redo or redo-ifchange to a file or pipe,
the output would be eaten by redo-log instead of the intended output.
So a test runner like this:
self.test:
redo self.runtest 2>&1 | grep ERROR
would not work; the self.runtest output would be sent to redo's log
buffer (and from there, probably printed to the toplevel redo's stderr)
rather than passed along to grep.
--no-log: don't capture logs or run redo-log (same as pre-redo-log redo)
--no-pretty: don't pretty-print logs, print @@REDO lines.
The latter is an option to both redo and redo-log.
redo now saves the stderr from every .do script, for every target, into
a file in the .redo directory. That means you can look up the logs
from the most recent build of any target using the new redo-log
command, for example:
redo-log -r all
The default is to show logs non-recursively, that is, it'll show when a
target does redo-ifchange on another target, but it won't recurse into
the logs for the latter target. With -r (recursive), it does. With -u
(unchanged), it does even if redo-ifchange discovered that the target
was already up-to-date; in that case, it prints the logs of the *most
recent* time the target was generated.
With --no-details, redo-log will show only the 'redo' lines, not the
other log messages. For very noisy build systems (like recursing into
a 'make' instance) this can be helpful to get an overview of what
happened, without all the cruft.
You can use the -f (follow) option like tail -f, to follow a build
that's currently in progress until it finishes. redo itself spins up a
copy of redo-log -r -f while it runs, so you can see what's going on.
Still broken in this version:
- No man page or new tests yet.
- ANSI colors don't yet work (unless you use --raw-logs, which gives
the old-style behaviour).
- You can't redirect the output of a sub-redo to a file or a
pipe right now, because redo-log is eating it.
- The regex for matching 'redo' lines in the log is very gross.
Instead, we should put the raw log files in a more machine-parseable
format, and redo-log should turn that into human-readable format.
- redo-log tries to "linearize" the logs, which makes them
comprehensible even for a large parallel build. It recursively shows
log messages for each target in depth-first tree order (by tracing
into a new target every time it sees a 'redo' line). This works
really well, but in some specific cases, the "topmost" redo instance
can get stuck waiting for a jwack token, which makes it look like the
whole build has stalled, when really redo-log is just waiting a long
time for a particular subprocess to be able to continue. We'll need to
add a specific workaround for that.
If a depends on b which depends on a, redo would just freeze. Now it
aborts with a somewhat helpful error message.
[Updated by apenwarr for coding style and to add a test.]
If you use "redo --old-args", it will switch back to the old
(apenwarr-style) arguments for now, to give you time to update your .do
scripts. This option will go away eventually.
Note: minimal/do doesn't understand the --old-args option. If you're using
minimal/do in your project, keep using the old one until you update your use
of $1/$2, and then update to the new one.
apenwarr-style default.o.do:
$1 foo
$2 .o
$3 whatever.tmp
djb-style default.o.do:
$1 foo.o
$2 foo
$3 whatever.tmp
apenwarr-style foo.o.do:
$1 foo.o
$2 ""
$3 whatever.tmp
djb-style foo.o.do:
$1 foo.o
$2 foo.o (I think?)
$3 whatever.tmp
We called 'redo' instead of 'redo-ifchange' on our indeterminate objects.
Since other instances of redo-oob might be running at the same time, this
could cause the same object to get rebuilt more than once unnecessarily.
The unit tests caught this, I just didn't notice earlier.
If a depends on b depends on c, and c is dirty but b uses redo-stamp
checksums, then 'redo-ifchange a' is indeterminate: we won't know if we need
to run a.do unless we first build b, but the script that *normally* runs
'redo-ifchange b' is a.do, and we don't want to run that yet, because we
don't know for sure if b is dirty, and we shouldn't build a unless one of
its dependencies is dirty. Eek!
Luckily, there's a safe solution. If we *know* a is dirty - eg. because
a.do or one of its children has definitely changed - then we can just run
a.do immediately and there's no problem, even if b is indeterminate, because
we were going to run a.do anyhow.
If a's dependencies are *not* definitely dirty, and all we have is
indeterminate ones like b, then that means a's build process *hasn't
changed*, which means its tree of dependencies still includes b, which means
we can deduce that if we *did* run a.do, it would end up running b.do.
Since we know that anyhow, we can safely just run b.do, which will either
b.set_checked() or b.set_changed(). Once that's done, we can re-parse a's
dependencies and this time conclusively tell if it needs to be redone or
not. Even if it does, b is already up-to-date, so the 'redo-ifchange b'
line in a.do will be fast.
...now take all the above and do it recursively to handle nested
dependencies, etc, and you're done.
It passes all tests when run serialized, but still gives weird errors
(OperationalError: database is locked) when run with -j5. sqlite3 shouldn't
be barfing just because the database is locked, since the default timeout is
5 seconds, and it's dying *way* faster than that.
Again, I forgot to make vars.py not crash if the variables aren't set, so we
can print a useful error message. But this time I have the right solution:
vars.py will do the checking for itself, and abort with a nice message.
Previously, the default was to *always* keep going, which is actually not
usually what you want. Now we actually exit correctly after an error. Of
course you still might have multiple errors before existing if you were
building in parallel.
Get rid of the "locked..." and "...unlocked!" messages by default, since
they're not usually interesting. But add a new option to bring them back in
case we end up with trouble debugging the locking stuff. (I don't really
100% trust it yet, although I haven't had a problem for a while now.)
atoi() was getting redundant, and unfortunately we can't easily load
helpers.py in some places where we'd want to, because it depends on vars.py.
So move it to its own module.
Previously, for testing, we were *always* randomizing the build order of
dependencies. That's annoying since it'll make build logs differ randomly
from one run to the next, which could make comparisons harder. However, the
feature is still useful for uncovering hidden dependencies between objects.
It used to say:
redo: t/all
redo: hello
and now it says:
redo t/all
redo t/hello
ie. there's no colon, and the path is intact. That means if the build
fails, you can cut-and-paste 'redo t/hello', add a -v, and try to debug
what went wrong.