55 commits

Author SHA1 Message Date
Avery Pennarun
2a936a7574 Print a nicer error message when asked to build an empty string ('').
This happens sometimes, for example, if you do
	whatever | while read x; do
		redo-ifchange "$x"
	done
and the input contains blank lines.

We could ignore the request for blankness, but it seems like that
situation might indicate a more serious bug in your parser, so it's
probably better to just abort with a meaningful error.
2018-11-03 22:02:26 -04:00
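
A minimal sketch of the check described above, in Python 3. The name check_targets and the choice of ValueError are assumptions for illustration, not redo's actual code.

```python
def check_targets(targets):
    """Return the targets as a list, or raise if any of them is empty.

    An empty target usually means the caller's input had a blank line,
    so we abort instead of silently skipping it.
    """
    for t in targets:
        if not t:
            raise ValueError("cannot build the empty target ('')")
    return list(targets)
```

A caller would run this once over its argument list before doing any other work.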
Avery Pennarun
711b05766f Print a better message when detecting pre-existing cyclic dependencies.
We already printed an error at build time, but added the broken
dependency anyway.  If the .do script decided to succeed despite
redo-ifchange aborting, the target would be successfully created
and we'd end up with an infinite loop when running isdirty() later.

The result was still "correct", because python helpfully aborted
the infinite loop after the recursion got too deep.  But let's
explicitly detect it and print a better error message.

(Thanks to Nils Dagsson Moskopp's redo-testcases repo for exposing this
problem.  Putting a #!/bin/sh header on your .do script means you
need to run 'set -e' yourself if you want the script to abort after an
error, which you almost always do.  Those testcases don't, which
exposed this bug if you ran the tests twice.)
2018-11-02 02:20:52 -04:00
Avery Pennarun
d143cca7da class Lock: set lockfile = None before trying to open a file.
In case the open causes a delay and we got a KeyboardInterrupt, this
prevents a weird error from the destructor about an uninitialized variable.
2018-11-02 02:20:52 -04:00
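
The ordering issue can be sketched as follows, in Python 3. The class body is a simplified assumption: it shows only the attribute initialization and the destructor, not real locking.

```python
import os


class Lock:
    def __init__(self, path):
        # Assign the attribute first, so __del__ always finds it even if
        # os.open() below is interrupted or raises.
        self.lockfile = None
        self.lockfile = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)

    def __del__(self):
        # Safe even for a half-constructed object.
        if self.lockfile is not None:
            os.close(self.lockfile)
            self.lockfile = None
```

Without the first assignment, a failure inside os.open() would leave the object with no lockfile attribute, and the destructor would complain about that instead of the real error.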
Daniele Varrazzo
adbaaf38ce Use py-setproctitle to clean up ps output in redo scripts
* Change the process title with a cleaned-up version of the script
* Document the use of setproctitle in the README
2018-10-11 03:52:20 -04:00
Robert L. Bocchino Jr
7dd63efb37 Add cyclic dependence detection.
If a depends on b which depends on a, redo would just freeze.  Now it
aborts with a somewhat helpful error message.

[Updated by apenwarr for coding style and to add a test.]
2018-10-11 03:28:05 -04:00
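
One way to detect such a cycle is a depth-first walk that remembers the chain of targets currently being visited. This is a sketch in Python 3 under that assumption; the dict-based graph, check_no_cycle, and CyclicDependencyError are hypothetical names, not redo's implementation.

```python
class CyclicDependencyError(Exception):
    pass


def check_no_cycle(deps, target, _stack=None):
    """Raise CyclicDependencyError if target (transitively) depends on itself.

    deps maps each target name to a list of the names it depends on.
    """
    stack = [] if _stack is None else _stack
    if target in stack:
        # Report only the part of the chain that forms the loop.
        cycle = stack[stack.index(target):] + [target]
        raise CyclicDependencyError(" -> ".join(cycle))
    stack.append(target)
    for dep in deps.get(target, ()):
        check_no_cycle(deps, dep, stack)
    stack.pop()
```

For the a/b example above, check_no_cycle({"a": ["b"], "b": ["a"]}, "a") raises with the message "a -> b -> a".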
Robert L. Bocchino Jr
99873775ba Revise file stamp in state.py
Removed information (ctime) that was causing spurious
"you modified it" warnings.

[apenwarr: This was not quite right.  ctime includes some things we
don't want to care about, such as link count, so we have to remove it.
But it also notices changes to st_uid, st_gid, and st_mode, which we do
care about, so now we have to include those explicitly.]
2018-10-11 03:28:05 -04:00
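
A stamp along these lines might look like the sketch below (Python 3). Only the exclusion of ctime and the inclusion of st_uid, st_gid, and st_mode come from the message above; the other fields, the "0" marker for a missing file, and the name file_stamp are assumptions.

```python
import os


def file_stamp(path):
    """Return a string that changes when the file changes in a way we care about."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return "0"  # hypothetical marker for a missing file
    # No st_ctime here: it also changes with the link count, which we
    # don't want to notice.  Owner, group, and mode are listed explicitly.
    fields = (st.st_mtime, st.st_size, st.st_ino,
              st.st_mode, st.st_uid, st.st_gid)
    return "-".join(str(f) for f in fields)
```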
Alan Falloon
67c1d4f7d8 We sometimes missed deps when more than one dep required a stamp check.
If must_build was nonempty when recursively calling isdirty() that
returned a list, we'd lose the original value of must_build.
2018-10-11 03:28:05 -04:00
Avery Pennarun
5156feae9d Switch sqlite3 journal mode to WAL.
WAL mode does make the deadlocking on MacOS go away.  This suggests
that pysqlite3 was leaving *read* transactions open for a long time; in
old-style sqlite journals, this prevents anyone from obtaining a write
lock, although it doesn't prevent other concurrent reads.

With WAL journals, writes can happen even while readers are holding a
lock, but the journal doesn't flush until the readers have released it.
This is not a "real" fix but it's fairly harmless, since all redo
instances will exit eventually, and when they do, the WAL journal can
be flushed.
2018-10-06 05:06:42 -04:00
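
Switching the journal mode is a single pragma. A minimal sketch with Python 3's sqlite3 module; the function name open_db is an assumption.

```python
import sqlite3


def open_db(path):
    """Open the database and ask sqlite for WAL journaling.

    Returns the connection and the journal mode sqlite reports back,
    which can differ from the request (for example on an in-memory db).
    """
    db = sqlite3.connect(path)
    mode = db.execute("pragma journal_mode = WAL").fetchone()[0]
    return db, mode
```

The mode is stored in the database file, so later connections to the same file see WAL as well.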
Avery Pennarun
613625b580 Add more assertions about uncommitted sqlite transactions.
I think we were sometimes leaving half-done sqlite transactions sitting
around for a long time (eg. across sub-calls to .do files).  This
seemed to be okay on Linux, but caused sqlite deadlocks on MacOS.  Most
likely it's not the operating system, but the sqlite version and
journal mode in use.

In any case, the correct thing to do is to actually commit or rollback
transactions, not leave them hanging around.

...unfortunately this doesn't actually fix my MacOS deadlocks, which
makes me rather nervous.
2018-10-06 05:06:19 -04:00
Avery Pennarun
34669fba65 Use os.lstat() instead of os.stat().
I think this aligns better with how redo works.  Otherwise, if a.do
creates a as a symlink, then changes to the symlink's *target* will
change a's stat/stamp information without re-running a.do, which looks
to redo like you modified a by hand, which causes it to stop running
a.do altogether.

With this change, modifications to a's target are okay, but they don't
trigger any redo dependency changes.  If you want that, then a.do
should redo-ifchange on its symlink target explicitly.
2018-10-06 00:14:02 -04:00
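
The difference between the two calls, sketched in Python 3. The helper names and the (mtime, size) tuple are assumptions chosen to keep the example short.

```python
import os


def link_stamp(path):
    """Stamp the path itself; for a symlink, that is the link, not its target."""
    st = os.lstat(path)
    return (st.st_mtime_ns, st.st_size)


def target_stamp(path):
    """Stamp whatever the path finally points at, following symlinks."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)
```

If a is a symlink and its target is rewritten, target_stamp("a") changes while link_stamp("a") stays the same, which is the behaviour the message above asks for.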
Avery Pennarun
4e285607f0 Don't use "insert ... default values" in sqlite3.
It isn't supported in older sqlite3 versions.  Let's just do something
equivalent instead.
2011-05-07 23:47:03 -04:00
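
One equivalent, assuming the table has an integer primary key, is to insert an explicit NULL and let sqlite assign the next id. This is a sketch in Python 3; the Runid table layout and the function names are assumptions, not necessarily what the commit does.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("create table Runid (id integer primary key autoincrement)")
    return db


def new_runid(db):
    """Allocate a fresh id without using 'insert ... default values'."""
    # A NULL in an integer primary key column makes sqlite pick the next id.
    cur = db.execute("insert into Runid (id) values (null)")
    return cur.lastrowid
```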
Avery Pennarun
705722d612 state.relpath: correct broken paths ending with /, etc.
Bug reported by Elliott Hird.  Based on a patch by Tim Allen, but I didn't
do it the same way.
2011-03-27 15:25:11 -04:00
Tim Allen
e27aaf01e7 Make redo read byte-strings from the database.
By default, the database redo uses to store file state returns filenames
as Unicode strings, and if redo tries to run a build-script whose
fully-qualified path contains non-ASCII characters then redo crashes
when trying to promote the path to a Unicode string.

This patch ensures that the database always returns byte-strings, not
Unicode strings. That way, the fully-qualified path and the target name
are both byte-strings and can be joined without issue.

(Fixes a bug reported by Zoran Zaric.)
2011-02-14 18:48:33 -08:00
Avery Pennarun
340aad1797 state.Lock: destructor: unlock before closing, not the other way around.
2011-01-18 00:48:51 -08:00
Avery Pennarun
642d60a7c9 state.File.set_static(): barf if we set_static() a nonexistent file.
If that ever happens, we probably got our paths mangled (like in the
previous commit) so we should die right away rather than allow weird things
to happen later.
2011-01-18 00:48:51 -08:00
Avery Pennarun
0dcc3f61b6 Search parent directories for default*.do.
Previously, we would only search for default*.do in the same directory as
the target; now we search parent directories as well.

Let's say we're in a/b/ and trying to build foo.o.  If we find
../../default.o.do, then we'll run

	cd ../..; sh default.o.do a/b/foo .o $TMPNAME

In other words, we still always chdir to the same directory as the .do file.
But now $1 might have a path in it, not just a basename.
2010-12-19 05:58:49 -08:00
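
The search can be sketched as a generator of candidates, in Python 3. The search order (nearest directory first, longest extension first) and the name default_do_candidates are assumptions; only the a/b/foo.o example comes from the message above.

```python
import os


def default_do_candidates(target_dir, target_name):
    """Yield (do_dir, do_file, arg1, arg2) for each default*.do to try.

    target_dir must be a relative path with no trailing slash, such as
    "a/b".  arg1 and arg2 are what the .do script would receive as $1
    and $2 when run from do_dir.
    """
    parts = target_name.split(".")
    do_dir, prefix = target_dir, ""
    while True:
        for i in range(1, len(parts) + 1):
            base = ".".join(parts[:i])
            ext = "".join("." + p for p in parts[i:])
            yield do_dir, "default" + ext + ".do", prefix + base, ext
        if not do_dir:
            break  # already tried the top of the relative path
        # Move up one level; $1 gains the directory we just left.
        prefix = os.path.basename(do_dir) + "/" + prefix
        do_dir = os.path.dirname(do_dir)
```

For target_dir "a/b" and target_name "foo.o", the candidate at the top level is ("", "default.o.do", "a/b/foo", ".o"), matching the command line shown above.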
Avery Pennarun
560f95fd77 Don't update the database during redo-ood.
Makes it slightly faster.
2010-12-19 03:50:55 -08:00
Avery Pennarun
df85b3d163 Move dependency checking from redo-ifchange into deps.py.
In preparation for sharing between multiple commands.
2010-12-19 03:50:38 -08:00
Avery Pennarun
f2d34fa685 New redo-sources and redo-targets commands.
Suggested by djb in personal email, and on the mailing list.  redo-targets
lists all the targets in the database; redo-sources lists all the existing
sources (ie. files that are referred to but which aren't targets).

redo-ifcreate filenames aren't included in the redo-sources list.
2010-12-19 03:50:38 -08:00
Avery Pennarun
95680ed7ef Switch to using a separate lockfile per target.
The previous method, using fcntl byterange locks, was very efficient and
avoided unnecessary filesystem metadata churn (ie. creating/deleting
inodes).  Unfortunately, MacOS X (at least version 10.6.5) apparently has a
race condition in its fcntl locking that makes it unusably unreliable
(http://apenwarr.ca/log/?m=201012#13).

My tests indicate that if you only ever lock a *single* byterange on a file,
the race condition doesn't cause a problem.  So let's just use one lockfile
per target.  Now "redo -j20 test" passes for me on both MacOS and Linux.

This doesn't measurably affect the speed on Linux, at least, in my tests.

The bad news: it's hard to safely *delete* those lockfiles when we're done
with them, so they tend to accumulate in the .redo dir.
2010-12-14 02:44:29 -08:00
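
A per-target lockfile can be sketched like this in Python 3 on a POSIX system. The class name TargetLock, the lock.N filename pattern, and the numeric target id are assumptions; fcntl.lockf() over the whole file is the only locking call used.

```python
import errno
import fcntl
import os


class TargetLock:
    def __init__(self, lockdir, target_id):
        self.owned = False
        self.fd = None
        # One file per target; it is never deleted here, so these accumulate.
        path = os.path.join(lockdir, "lock.%d" % target_id)
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)

    def trylock(self):
        """Try to take the lock without blocking; return True on success."""
        try:
            fcntl.lockf(self.fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0)
        except OSError as e:
            if e.errno in (errno.EAGAIN, errno.EACCES):
                return False  # some other process holds it
            raise
        self.owned = True
        return True

    def unlock(self):
        fcntl.lockf(self.fd, fcntl.LOCK_UN, 0, 0)
        self.owned = False

    def close(self):
        if self.fd is not None:
            if self.owned:
                self.unlock()  # unlock before closing
            os.close(self.fd)
            self.fd = None
```

Note that these locks are per process, so a second trylock() from the same process also succeeds; contention only shows up between separate redo processes.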
Avery Pennarun
294945bd0f Assert that one instance never holds multiple locks on the same file at once.
This could happen if you did 'redo foo foo'.  Which nobody ever did, I
think, but let's make sure we catch it if they do.

One problem with having multiple locks on the same file is then you have to
remember not to *unlock* it until they're all done.  But there are other
problems, such as: why the heck did we think it was a good idea to lock the
same file more than once?  So just prevent it from happening for now,
unless/until we somehow come up with a reason it might be a good idea.
2010-12-14 02:19:08 -08:00
Avery Pennarun
c64b8a3eb1 Fix a race condition caused by zap_deps().
We can't just delete all the dependencies at the beginning and re-add them:
other people might be checking the same dependencies in parallel.  Instead,
mark them as delete_me up front, and then after the build completes, remove
only the delete_me entries.
2010-12-11 22:59:55 -08:00
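
The mark-then-sweep approach, sketched in Python 3 with sqlite3. The Deps table layout and all function names are assumptions; only the delete_me flag comes from the message above.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("create table Deps (target, source, delete_me int, "
               "primary key (target, source))")
    return db


def begin_build(db, target):
    # Old deps stay visible to parallel readers; they are only marked.
    db.execute("update Deps set delete_me=1 where target=?", [target])


def add_dep(db, target, source):
    # Re-adding a dep clears its mark.
    db.execute("insert or replace into Deps (target, source, delete_me) "
               "values (?,?,0)", [target, source])


def finish_build(db, target):
    # Sweep only the deps that were not re-added during this build.
    db.execute("delete from Deps where target=? and delete_me=1", [target])


def deps_of(db, target):
    rows = db.execute("select source from Deps where target=?", [target])
    return sorted(r[0] for r in rows)
```

Between begin_build() and finish_build(), deps_of() still returns the old dependencies, so a concurrent checker never sees an empty list.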
Avery Pennarun
1cb000ece1 redo.py: report when you're trying to rebuild a static file.
In redo-ifchange, this might be a good idea, since you might just want to
set a dependency on it, so we won't say anything from inside builder.py.
But if you're calling redo.py, that means you expect it to be rebuilt, since
there's no other reason to try.  So print a warning.

(This is what make does, more or less.)
2010-12-11 21:19:15 -08:00
Avery Pennarun
e18fa85d58 The only thing in helpers.py that needed vars.py was the log stuff.
So put it in its own file.  Now it's safer to import and use helpers even if
you can't safely touch vars.
2010-12-11 18:34:02 -08:00
Avery Pennarun
0da5c7c082 Add a redo-always command: it adds an "always dirty" dependency to your target.
This is mostly useless except when combined with redo-stamp... I think.
2010-12-11 07:02:45 -08:00
Avery Pennarun
1d26d99e0c Fix a deadlock with redo-oob.
If a checksummed target A used to exist but is now missing, and we tried to
redo-ifchange that exact file, we would unnecessarily run 'redo-oob A A';
that is, we have to build A in order to determine if A needs to be built.

The sub-targets of redo-oob aren't run with REDO_UNLOCKED, so this would
deadlock instantly.

Add an assertion to redo-oob to ensure we never try to redo-ifchange the
primary target (thus converting the deadlock into an exception).  And skip
doing redo-oob when the target is already the same as the thing we have to
check.
2010-12-11 06:16:32 -08:00
Avery Pennarun
22617d335c Half-support for using file checksums instead of stamps.
A new redo-stamp program takes whatever you give it as stdin and uses it to
calculate a checksum for the current target.  If that checksum is the same
as last time, then we consider the target to be unchanged, and we set
checked_runid and stamp, but leave changed_runid alone.  That will make
future callers of redo-ifchange see this target as unmodified.

However, this is only "half" support because by the time we run the .do
script that calls redo-stamp, it's too late; the caller is a dependant of
the stamped program, which is already being rebuilt, even if redo-stamp
turns out to say that this target is unchanged.

The other half is coming up.
2010-12-11 05:54:37 -08:00
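
The bookkeeping described above can be sketched in Python 3. The dict-based record, the use of SHA-1, and the name apply_stamp are assumptions; checked_runid and changed_runid are the fields named in the message.

```python
import hashlib


def apply_stamp(record, data, runid):
    """Update record from the bytes given to redo-stamp; return True if changed.

    record is a dict standing in for the target's database row.
    """
    csum = hashlib.sha1(data).hexdigest()
    changed = csum != record.get("csum")
    record["csum"] = csum
    record["checked_runid"] = runid
    if changed:
        # Only a new checksum counts as a change for later redo-ifchange callers.
        record["changed_runid"] = runid
    return changed
```

When the same bytes come in on a later run, checked_runid advances but changed_runid keeps its old value, so the target looks unmodified.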
Avery Pennarun
f6d11d5411 If a user manually changes a generated file, don't ever overwrite it.
That way the user can modify an auto-generated 'compile' script, for
example, and it'll stay modified.

If they delete the file, we can then generate it for them again.

Also, we have to warn whenever we're doing this, or people might think it's
a bug.
2010-12-10 22:43:11 -08:00
Avery Pennarun
0126f6be1e Don't wipe the timestamp when a target fails to redo.
It's really a separate condition.  And since we're not removing the target
*file* in case of error - we update it atomically, and keeping it is better
than losing it - there's no reason to wipe the timestamp in that case
either.

However, we do need to know that the build failed, so that anybody else
(especially in a parallel build) who looks at that target knows that it
died.  So add a separate flag just for that.
2010-12-10 22:41:11 -08:00
Avery Pennarun
84169c5d27 Change locking stuff from fifos to fcntl.lockf().
This should reduce filesystem grinding a bit, and makes the code simpler.
It's also theoretically a bit more portable, since I'm guessing fifo
semantics aren't the same on win32 if we ever get there.

Also, a major problem with the old fifo-based system is that if a redo
process died without cleaning up after itself, it wouldn't delete its
lockfiles, so we had to wipe them all at the beginning of each build.  Now
we don't; in theory, you can now have multiple copies of redo poking at the
same tree at the same time and not stepping on each other.
2010-12-10 03:55:51 -08:00
Avery Pennarun
b5c02e410e state.py: reorder things so sqlite never does fdatasync().
It was briefly synchronous at data creation time, adding a few ms to
redo startup.
2010-12-10 00:50:53 -08:00
Avery Pennarun
e1a0fc9c12 state.File.is_checked() was being too paranoid.
It wasn't allowing us to short circuit a dependency if that dependency had
been built previously, but that was already being checked (more correctly)
in dirty_deps().
2010-12-10 00:50:52 -08:00
Avery Pennarun
94cecc240b Don't abort if 'insert into Files' gives an IntegrityError.
It can happen occasionally if some other parallel redo adds the same file at
the same time.
2010-12-10 00:50:52 -08:00
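
Tolerating the duplicate insert might look like this sketch in Python 3. The Files table is assumed to have a unique name column, and get_or_create_file_id is a hypothetical helper.

```python
import sqlite3


def get_or_create_file_id(db, name):
    """Return the rowid for name, inserting a row if none exists yet."""
    query = "select rowid from Files where name=?"
    row = db.execute(query, [name]).fetchone()
    if row is None:
        try:
            db.execute("insert into Files (name) values (?)", [name])
        except sqlite3.IntegrityError:
            # A parallel redo added the same file first; use its row.
            pass
        row = db.execute(query, [name]).fetchone()
    return row[0]
```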
Avery Pennarun
3ef2bd7300 Don't check as often whether the .redo directory exists.
Just check it once after running a subprocess: that's the only way it ought
to be able to disappear (ie. in a 'make clean' setup).
2010-12-10 00:50:52 -08:00
Avery Pennarun
29d6c9a746 Don't db.commit() so frequently.
Just commit when we're about to do something blocking.  sqlite goes a lot
faster with bigger transactions.  This change does show a small percentage
speedup in tests, but not as much as I'd like.
2010-12-10 00:50:52 -08:00
Avery Pennarun
fb79851530 Calculate dependencies with fewer sqlite queries.
2010-12-10 00:50:52 -08:00
Avery Pennarun
c339359f04 Schema cleanup.
2010-12-10 00:50:52 -08:00
Avery Pennarun
f4535be0cd Fix a deadlock.
We were holding a database open with a read lock while a child redo might
need to open it with a write lock.
2010-12-10 00:50:52 -08:00
Avery Pennarun
9e36106642 sqlite3: configure the timeout explicitly.
In flush-cache.sh, we have to do this, because the sqlite3 command-line tool
sets it to zero.  Inevitably during parallel testing, it'll end up
contending for a lock, and we really want it to wait a bit.

In state.py, it's not as important since the default is nonzero.  But
python-sqlite3's default of 5 seconds makes me a little too nervous; I can
imagine a disk write waiting for more than 5 seconds sometime.  So let's use
60 instead.
2010-12-10 00:50:52 -08:00
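
In Python 3's sqlite3 module the busy timeout is the timeout argument to connect(), in seconds. A minimal sketch; the wrapper name open_db is an assumption.

```python
import sqlite3


def open_db(path, timeout=60.0):
    """Open the database, waiting up to timeout seconds for a busy lock."""
    return sqlite3.connect(path, timeout=timeout)
```

With a short timeout, a statement that meets a lock held by another connection raises sqlite3.OperationalError ("database is locked") once the wait expires.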
Avery Pennarun
a62bd50d44 Switch state.py to use sqlite3 instead of filesystem-based stamps.
It passes all tests when run serialized, but still gives weird errors
(OperationalError: database is locked) when run with -j5.  sqlite3 shouldn't
be barfing just because the database is locked, since the default timeout is
5 seconds, and it's dying *way* faster than that.
2010-12-10 00:50:52 -08:00
Avery Pennarun
51bbdc6c5a If we can't find a .do file for a target, mark it as not is_generated.
This allows files to transition from generated to not-generated if the .do
file is ever removed (ie. the user is changing things and the file is now a
source file, not a target).
2010-12-06 03:12:53 -08:00
Avery Pennarun
c29de89051 Fix more trouble with .do scripts that cd to other directories.
The interaction of REDO_STARTDIR, REDO_PWD, and getcwd() is pretty
complicated.  In this case, we accidentally assumed that the current
instance of redo was running with getcwd() == REDO_STARTDIR+REDO_PWD, and so
the new target was REDO_STARTDIR+REDO_PWD+t, but this isn't the case if the
current .do script did chdir().

The correct answer is REDO_STARTDIR+getcwd()+t.
2010-11-25 06:37:24 -08:00
Avery Pennarun
f3413c0f7c doublestatic: fix dependencies if two files depend on one non-generated file.
If a and b both depend on c, and c is a static (non-generated) file that has
changed since the last successful build of a and b, we would try to redo
a, but would forget to redo b.  Now it does both.
2010-11-24 04:52:30 -08:00
Avery Pennarun
9fc5ae1b56 Optimization: don't getcwd() so often.
We never chdir() except just as we exec a subprocess, so it's okay to cache
this value.  This makes strace output look cleaner, and speeds things up a
little bit when checking a large number of dependencies.

Relatedly, take a debug2() message and put it in an additional if, so that
we don't have to do so much work to calculate it when we're just going to
throw it away anyhow.
2010-11-24 03:45:38 -08:00
Avery Pennarun
f337df463d state.stamp() can't imply state.built().
...because we deliberately stamp non-generated files as well, and that
doesn't need to imply that we rebuilt them just now.  In fact, we know for a
fact that we *didn't* rebuild them just now, but we still need to record the
timestamp for later.
2010-11-22 22:53:40 -08:00
Avery Pennarun
dce0076554 Print a useful message and exit when the .redo directory disappears.
2010-11-22 04:04:45 -08:00
Avery Pennarun
2dbd47100d state.py: reduce race condition between Lock.trylock() and unlock().
If 'redo clean' deletes the lockfile after trylock() succeeds but before
unlock(), then unlock() won't be able to open the pipe in order to release
readers, and any waiters might end up waiting forever.

We can't open the fifo for write until there's at least one reader, so let's
open a reader *just* to let us open a writer.  Then we'll leave them open
until the later unlock(), which can just close them both.
2010-11-22 04:04:45 -08:00
Avery Pennarun
7aa7c41e38 builder,jwack: slight cleanup to token passing.
In rare cases, one process could end up holding onto more than one token.
2010-11-21 22:46:20 -08:00
Avery Pennarun
47edb9527d state.py: remove all the ugly fromdir= stuff.
Instead, just change the target name to be more specific, in the one place
in redo-ifchange that actually needed it.
2010-11-21 04:57:04 -08:00
Avery Pennarun
0652bc9911 Oops, earlier state.mark() stuff was a little too radical.
If someone else built and marked one of our dependencies, then that
dependency would show up as *clean* in a later redo-ifchange, so other
dependents of that file wouldn't be rebuilt.

We actually have to track two session-specific variables: whether the file
has been checked, and whether it was rebuilt.  (Or alternatively, whether it
was dirty when we checked it the first time.  But we store the former.)
2010-11-21 04:39:28 -08:00