If all.do runs and creates no output, we shouldn't create a file called
'all', but we should remember that 'all' has been run successfully. We do
this by creating 'all.did' during the build.
Since minimal/do always just wipes everything out every time it runs, we can
safely remove the .did files after minimal/do terminates, so this doesn't
clutter things too much in normal use.
This fixes some edge cases, particularly that 'minimal/do clean' no longer
leaves stupid files named "clean" lying around, and the redo-sh directory
can now be rebuilt correctly since we rebuild it as long as redo-sh.did
doesn't exist. (We don't want to "rm -rf redo-sh" because it makes me
nervous.)
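
A rough sketch of the .did idea, in Python for illustration only (minimal/do
itself is a shell script). The helper names record_result, already_done and
cleanup_markers are hypothetical, and so is the assumption that the .do
script's output lands in a temp file first:

```python
import os


def record_result(target, tmp_output):
    """Install the .do output, or leave a '.did' marker if there was none.

    Hypothetical helper: tmp_output is the temp file the .do script wrote to.
    """
    if os.path.exists(tmp_output) and os.path.getsize(tmp_output) > 0:
        os.rename(tmp_output, target)        # real output becomes the target
    else:
        if os.path.exists(tmp_output):
            os.unlink(tmp_output)            # no output: don't create 'target'
        open(target + '.did', 'w').close()   # but remember that it ran


def already_done(target):
    """A target counts as built if the file or its '.did' marker exists."""
    return os.path.exists(target) or os.path.exists(target + '.did')


def cleanup_markers(topdir):
    """Remove every '.did' marker once the whole run has terminated."""
    for dirpath, _dirs, files in os.walk(topdir):
        for name in files:
            if name.endswith('.did'):
                os.unlink(os.path.join(dirpath, name))
```

With this shape, a target like 'clean' never leaves a file named "clean"
behind: only clean.did exists, and only until the run ends.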
This includes a fairly detailed test of various known shell bugs from the
autoconf docs.
The idea here is that if redo works on your system, you should be able to
rely on a *good* shell to run your .do files; you shouldn't have to work
around zillions of bugs like autoconf does.
Previously, we would only search for default*.do in the same directory as
the target; now we search parent directories as well.
Let's say we're in a/b/ and trying to build foo.o. If we find
../../default.o.do, then we'll run
    cd ../..; sh default.o.do a/b/foo .o $TMPNAME
In other words, we still always chdir to the same directory as the .do file.
But now $1 might have a path in it, not just a basename.
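
The search can be sketched roughly like this. It's only an illustration: the
function name default_do_candidates is made up, the target is given as a path
relative to the top directory, and the order (nearest directory first, longest
extension first) is an assumption rather than a statement of what redo does:

```python
def default_do_candidates(target):
    """Yield (dodir, dofile, arg1, arg2) for each default*.do worth trying.

    target is relative to the top directory, e.g. 'a/b/foo.o'.
    dodir is where we would chdir; arg1 and arg2 are the $1 and $2 passed in.
    """
    dirname, _, filename = target.rpartition('/')
    parts = dirname.split('/') if dirname else []
    # Candidate extensions, longest first: 'foo.tar.gz' -> '.tar.gz', '.gz', ''
    pieces = filename.split('.')
    exts = ['.' + '.'.join(pieces[i:]) for i in range(1, len(pieces))] + ['']
    # Walk from the target's own directory up to the top.
    for depth in range(len(parts), -1, -1):
        dodir = '/'.join(parts[:depth]) or '.'
        subpath = '/'.join(parts[depth:] + [filename])
        for ext in exts:
            dofile = 'default%s.do' % ext
            # $1 keeps the path below dodir, minus the matched extension
            arg1 = subpath[:len(subpath) - len(ext)] if ext else subpath
            yield dodir, dofile, arg1, ext
```

For 'a/b/foo.o', the candidate in the top directory is default.o.do with
$1 set to a/b/foo and $2 set to .o, matching the command line above.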
Suggested by djb in personal email, and on the mailing list. redo-targets
lists all the targets in the database; redo-sources lists all the existing
sources (i.e. files that are referred to but which aren't targets).
redo-ifcreate filenames aren't included in the redo-sources list.
These export and import, respectively, the generated man pages to/from the
git branch called 'man'. You can use it to retrieve the .1 files if you
don't have a working pandoc.
We were hardcoding the absolute $LIBDIR location, which sounds smart, but not if
you're doing "make install" into a temp dir that will end up somewhere else
later.
Instead, look for ../lib/redo/ from wherever the binary is installed.
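
Something along these lines, as a sketch (find_libdir is a hypothetical name,
and resolving symlinks with realpath is an assumption, not necessarily what
the installed script does):

```python
import os


def find_libdir(exe_path):
    """Return ../lib/redo relative to the directory containing exe_path."""
    bindir = os.path.dirname(os.path.realpath(exe_path))
    return os.path.normpath(os.path.join(bindir, '..', 'lib', 'redo'))
```

Because the result depends only on where the binary currently lives, a tree
staged into a temp dir keeps working after it's moved to its final location.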
The previous method, using fcntl byterange locks, was very efficient and
avoided unnecessary filesystem metadata churn (i.e. creating/deleting
inodes). Unfortunately, MacOS X (at least version 10.6.5) apparently has a
race condition in its fcntl locking that makes it unusably unreliable
(http://apenwarr.ca/log/?m=201012#13).
My tests indicate that if you only ever lock a *single* byterange on a file,
the race condition doesn't cause a problem. So let's just use one lockfile
per target. Now "redo -j20 test" passes for me on both MacOS and Linux.
This doesn't measurably affect the speed on Linux, at least, in my tests.
The bad news: it's hard to safely *delete* those lockfiles when we're done
with them, so they tend to accumulate in the .redo dir.
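
A minimal sketch of the one-lockfile-per-target approach, assuming a
hypothetical TargetLock class and a made-up 'lock.<id>' naming scheme. It
uses Python's fcntl.lockf over the whole file, so there's only ever a single
byterange per file:

```python
import fcntl
import os


class TargetLock:
    """One lockfile per target, locked as a single whole-file byterange."""

    def __init__(self, lockdir, target_id):
        # Hypothetical naming scheme: '<lockdir>/lock.<id>'
        self.path = os.path.join(lockdir, 'lock.%d' % target_id)
        self.fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o666)
        self.owned = False

    def trylock(self):
        """Try to take the lock without blocking; return True on success."""
        try:
            # len=0, start=0 means "the whole file": exactly one byterange
            fcntl.lockf(self.fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0)
        except OSError:
            return False          # some other process holds it
        self.owned = True
        return True

    def unlock(self):
        fcntl.lockf(self.fd, fcntl.LOCK_UN, 0, 0)
        self.owned = False
        # The lockfile is deliberately left in place: deleting it safely
        # while other processes may be opening it is the hard part.
```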
This could happen if you did 'redo foo foo'. Which nobody ever did, I
think, but let's make sure we catch it if they do.
One problem with having multiple locks on the same file is that you then have
to remember not to *unlock* it until they're all done. But there are other
problems, such as: why the heck did we think it was a good idea to lock the
same file more than once? So just prevent it from happening for now,
unless/until we somehow come up with a reason it might be a good idea.
This comes down to the lack of a 'seq' command (what?!) and the fact that
BSD "wc -l" returns extra whitespace, while the GNU version doesn't. We
should be using numeric comparisons instead of string comparisons, and then
it's ok.
We can't just delete all the dependencies at the beginning and re-add them:
other people might be checking the same dependencies in parallel. Instead,
mark them as delete_me up front, and then after the build completes, remove
only the delete_me entries.
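
Roughly, the mark-then-sweep scheme looks like this. The sketch uses an
in-memory sqlite3 database; the table layout and the function names
(start_build, add_dep, finish_build, deps) are assumptions for illustration:

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('create table Deps (target, source, delete_me integer, '
           'primary key (target, source))')


def start_build(target):
    # Don't delete anything yet: parallel readers may still need these rows.
    db.execute('update Deps set delete_me=1 where target=?', [target])


def add_dep(target, source):
    # (Re)declaring a dependency stores it with the delete_me mark cleared.
    db.execute('insert or replace into Deps values (?,?,0)', [target, source])


def finish_build(target):
    # Only dependencies that were not re-declared during the build go away.
    db.execute('delete from Deps where target=? and delete_me=1', [target])


def deps(target):
    rows = db.execute('select source from Deps where target=? '
                      'order by source', [target])
    return [r[0] for r in rows]
```

Between start_build and finish_build, anyone checking the target in parallel
still sees the full old dependency list instead of an empty one.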
In redo-ifchange, this might be a good idea, since you might just want to
set a dependency on it, so we won't say anything from inside builder.py.
But if you're calling redo.py, that means you expect it to be rebuilt, since
there's no other reason to try. So print a warning.
(This is what make does, more or less.)
...only when running under minimal/do, of course.
The tests in question mostly fail because they're testing particular
dependency-related behaviour, and minimal/do doesn't support dependencies,
so naturally it doesn't work.
Just allow that sub-redo to return an error code. Also, parent redos should
return error code 1, not the same code as the child. That makes it easier
to figure out which file generated the "special" error code.
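
In sketch form (run_child is a hypothetical wrapper, not the real function
name), the parent reports the child's code but returns plain 1 itself:

```python
import subprocess
import sys


def run_child(argv):
    """Run a sub-redo and collapse any failure into exit code 1."""
    rv = subprocess.call(argv)
    if rv != 0:
        # Mention the child's "special" code here, but don't propagate it:
        # only the process that produced it should exit with that value.
        sys.stderr.write('%r: exit code %d\n' % (argv, rv))
        return 1
    return 0
```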
That makes it a little easier to tell, in a strace, what the process is
waiting on. If it's 100/101, then it's waiting on a token; 50+ means waiting
on a subtask.
Also, we weren't closing the read side of subtask fds on exec. This didn't
cause any problems, but did result in a wasted fd in subprocesses.
If a checksummed target A used to exist but is now missing, and we tried to
redo-ifchange that exact file, we would unnecessarily run 'redo-oob A A';
that is, we have to build A in order to determine if A needs to be built.
The sub-targets of redo-oob aren't run with REDO_UNLOCKED, so this would
deadlock instantly.
Add an assertion to redo-oob to ensure we never try to redo-ifchange the
primary target (thus converting the deadlock into an exception). And skip
doing redo-oob when the target is already the same as the thing we have to
check.
We called 'redo' instead of 'redo-ifchange' on our indeterminate objects.
Since other instances of redo-oob might be running at the same time, this
could cause the same object to get rebuilt more than once unnecessarily.
The unit tests caught this, I just didn't notice earlier.