Suggested by djb in personal email, and on the mailing list. redo-targets
lists all the targets in the database; redo-sources lists all the existing
sources (ie. files that are referred to but which aren't targets).
redo-ifcreate filenames aren't included in the redo-sources list.
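The split can be sketched as a set difference (the helper name and argument shapes here are hypothetical; the real data comes out of the sqlite database in state.py):

```python
def split_targets_sources(targets, referenced, ifcreate=()):
    # redo-targets: everything the database knows was built by redo.
    # redo-sources: files referred to as dependencies that aren't targets,
    # excluding redo-ifcreate names (those are expected *not* to exist).
    targets = set(targets)
    sources = set(referenced) - targets - set(ifcreate)
    return sorted(targets), sorted(sources)
```

For example, a dependency on a nonexistent `config.h` recorded via redo-ifcreate would not show up in the sources list.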
The previous method, using fcntl byterange locks, was very efficient and
avoided unnecessary filesystem metadata churn (ie. creating/deleting
inodes). Unfortunately, MacOS X (at least version 10.6.5) apparently has a
race condition in its fcntl locking that makes it unusably unreliable
(http://apenwarr.ca/log/?m=201012#13).
My tests indicate that if you only ever lock a *single* byterange on a file,
the race condition doesn't cause a problem. So let's just use one lockfile
per target. Now "redo -j20 test" passes for me on both MacOS and Linux.
This doesn't measurably affect the speed on Linux, at least, in my tests.
The bad news: it's hard to safely *delete* those lockfiles when we're done
with them, so they tend to accumulate in the .redo dir.
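A minimal sketch of the one-lockfile-per-target scheme (lock_target is a hypothetical helper; the real locking code lives in state.py):

```python
import fcntl, os, tempfile

def lock_target(lockdir, target):
    # One lockfile per target, and we only ever lock a *single* byterange
    # (byte 0) on it, which sidesteps the MacOS fcntl race seen when
    # locking multiple ranges on one shared file. The downside: these
    # files accumulate in the .redo dir, since deleting one safely while
    # another process may be about to lock it is hard.
    path = os.path.join(lockdir, target.replace('/', '_'))
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
    fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0)  # lock exactly one byte
    return fd

lockdir = tempfile.mkdtemp()
fd = lock_target(lockdir, 't/all')   # exclusive lock held from here
fcntl.lockf(fd, fcntl.LOCK_UN)       # released when the build is done
os.close(fd)
```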
This could happen if you ran 'redo foo foo'. Nobody ever did that, I
think, but let's make sure we catch it if they do.
One problem with having multiple locks on the same file is that you then have to
remember not to *unlock* it until they're all done. But there are other
problems, such as: why the heck did we think it was a good idea to lock the
same file more than once? So just prevent it from happening for now,
unless/until we somehow come up with a reason it might be a good idea.
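The guard can be as simple as an order-preserving dedup of the target list before any locks are taken (a sketch; dedup_targets is a hypothetical name):

```python
def dedup_targets(targets):
    # Prevent 'redo foo foo' from locking the same target twice (and then
    # having to track when it's safe to unlock): keep only the first
    # occurrence of each target, preserving order.
    seen = set()
    out = []
    for t in targets:
        if t in seen:
            continue  # duplicate: would double-lock/double-unlock
        seen.add(t)
        out.append(t)
    return out
```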
We can't just delete all the dependencies at the beginning and re-add them:
other people might be checking the same dependencies in parallel. Instead,
mark them as delete_me up front, and then after the build completes, remove
only the delete_me entries.
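Sketched with an in-memory dict standing in for the sqlite deps table (the flag values and helper are hypothetical; the real table is in state.py):

```python
def rebuild_deps(deps, run_build):
    # deps maps dep_name -> flags. Don't delete rows up front: parallel
    # readers may be checking them. Instead, mark everything delete_me,
    # let the build re-add the deps it actually uses, then purge only
    # the rows still marked delete_me.
    for d in deps:
        deps[d] = 'delete_me'
    for d in run_build():          # build records its current deps
        deps[d] = 'ok'
    for d in [k for k, v in deps.items() if v == 'delete_me']:
        del deps[d]
    return deps

deps = {'a.h': 'ok', 'stale.h': 'ok'}
rebuild_deps(deps, lambda: ['a.h', 'b.h'])
```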
In redo-ifchange, this might be a good idea, since you might just want to
set a dependency on it, so we won't say anything from inside builder.py.
But if you're calling redo.py, that means you expect it to be rebuilt, since
there's no other reason to try. So print a warning.
(This is what make does, more or less.)
If a checksummed target A used to exist but is now missing, and we tried to
redo-ifchange that exact file, we would unnecessarily run 'redo-oob A A';
that is, we have to build A in order to determine if A needs to be built.
The sub-targets of redo-oob aren't run with REDO_UNLOCKED, so this would
deadlock instantly.
Add an assertion to redo-oob to ensure we never try to redo-ifchange the
primary target (thus converting the deadlock into an exception). And skip
doing redo-oob when the target is already the same as the thing we have to
check.
A new redo-stamp program takes whatever you give it as stdin and uses it to
calculate a checksum for the current target. If that checksum is the same
as last time, then we consider the target to be unchanged, and we set
checked_runid and stamp, but leave changed_runid alone. That will make
future callers of redo-ifchange see this target as unmodified.
However, this is only "half" support because by the time we run the .do
script that calls redo-stamp, it's too late; the caller is a dependent of
the stamped program, which is already being rebuilt, even if redo-stamp
turns out to say that this target is unchanged.
The other half is coming up.
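The core decision can be sketched like this, with a dict standing in for the target's database record (apply_stamp is a hypothetical helper; the field names follow the description above):

```python
import hashlib

def apply_stamp(stdin_data, record, runid):
    # Checksum whatever the .do script piped into redo-stamp.
    new = hashlib.sha1(stdin_data).hexdigest()
    record['checked_runid'] = runid   # we definitely checked it this run
    if new == record.get('stamp'):
        # Unchanged: leave changed_runid alone, so future callers of
        # redo-ifchange see this target as unmodified.
        return False
    record['stamp'] = new
    record['changed_runid'] = runid
    return True
```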
That way the user can modify an auto-generated 'compile' script, for
example, and it'll stay modified.
If they delete the file, we can then generate it for them again.
Also, we have to warn whenever we're doing this, or people might think it's
a bug.
It's really a separate condition. And since we're not removing the target
*file* in case of error (we update it atomically, and keeping it is better
than losing it), there's no reason to wipe the timestamp in that case
either.
However, we do need to know that the build failed, so that anybody else
(especially in a parallel build) who looks at that target knows that it
died. So add a separate flag just for that.
This should reduce filesystem grinding a bit, and makes the code simpler.
It's also theoretically a bit more portable, since I'm guessing fifo
semantics aren't the same on win32 if we ever get there.
Also, a major problem with the old fifo-based system is that if a redo
process died without cleaning up after itself, it wouldn't delete its
lockfiles, so we had to wipe them all at the beginning of each build. Now
we don't; in theory, you can now have multiple copies of redo poking at the
same tree at the same time and not stepping on each other.
It wasn't allowing us to short-circuit a dependency if that dependency had
been built previously, but that was already being checked (more correctly)
in dirty_deps().
Just commit when we're about to do something blocking. sqlite goes a lot
faster with bigger transactions. This change does show a small percentage
speedup in tests, but not as much as I'd like.
In flush-cache.sh, we have to do this, because the sqlite3 command-line tool
sets it to zero. Inevitably during parallel testing, it'll end up
contending for a lock, and we really want it to wait a bit.
In state.py, it's not as important since the default is nonzero. But
python-sqlite3's default of 5 seconds makes me a little too nervous; I can
imagine a disk write waiting for more than 5 seconds sometime. So let's use
60 instead.
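In python-sqlite3 terms, the state.py half of this amounts to overriding the default busy timeout when connecting (a sketch; the real connection setup is in state.py):

```python
import sqlite3

def connect(dbpath):
    # python-sqlite3's default timeout is 5 seconds; a disk write during
    # a parallel build could plausibly block longer than that, so wait up
    # to 60 seconds before giving up with "database is locked".
    return sqlite3.connect(dbpath, timeout=60)

db = connect(':memory:')
db.execute('create table t (x)')
db.execute('insert into t values (1)')
db.commit()   # commit lazily, just before blocking: sqlite goes a lot
              # faster with bigger transactions
```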
It passes all tests when run serialized, but still gives weird errors
(OperationalError: database is locked) when run with -j5. sqlite3 shouldn't
be barfing just because the database is locked, since the default timeout is
5 seconds, and it's dying *way* faster than that.
This allows files to transition from generated to not-generated if the .do
file is ever removed (ie. the user is changing things and the file is now a
source file, not a target).
The interaction of REDO_STARTDIR, REDO_PWD, and getcwd() is pretty
complicated. In this case, we accidentally assumed that the current
instance of redo was running with getcwd() == REDO_STARTDIR+REDO_PWD, and so
the new target was REDO_STARTDIR+REDO_PWD+t, but this isn't the case if the
current .do script did chdir().
The correct answer is REDO_STARTDIR+getcwd()+t.
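Roughly, in code (db_name is a hypothetical helper; here the REDO_STARTDIR-relative part is computed with relpath):

```python
import os, tempfile

def db_name(startdir, t):
    # The target's name relative to REDO_STARTDIR must be based on the
    # *actual* cwd, not REDO_PWD, since the current .do script may have
    # chdir()ed away from REDO_STARTDIR+REDO_PWD.
    rel = os.path.relpath(os.getcwd(), startdir)
    return os.path.normpath(os.path.join(rel, t))

startdir = os.path.realpath(tempfile.mkdtemp())
os.mkdir(os.path.join(startdir, 'sub'))
os.chdir(os.path.join(startdir, 'sub'))   # as if the .do script chdir()ed
```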
If a and b both depend on c, and c is a static (non-generated) file that has
changed since the last successful build of a and b, we would try to redo
a, but would forget to redo b. Now it does both.
We never chdir() except just as we exec a subprocess, so it's okay to cache
this value. This makes strace output look cleaner, and speeds things up a
little bit when checking a large number of dependencies.
Relatedly, take a debug2() message and put it in an additional if, so that
we don't have to do so much work to calculate it when we're just going to
throw it away anyhow.
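The cache is just a module-level memo (a sketch of the idea, not the exact code):

```python
import os

_cwd = None

def getcwd():
    # We never chdir() except right before exec()ing a subprocess, so the
    # value can't change out from under us; caching it saves a syscall
    # per dependency check and keeps strace output cleaner.
    global _cwd
    if _cwd is None:
        _cwd = os.getcwd()
    return _cwd
```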
...because we deliberately stamp non-generated files as well, and that
doesn't need to imply that we rebuilt them just now. In fact, we know for a
fact that we *didn't* rebuild them just now, but we still need to record the
timestamp for later.
If 'redo clean' deletes the lockfile after trylock() succeeds but before
unlock(), then unlock() won't be able to open the pipe in order to release
readers, and any waiters might end up waiting forever.
We can't open the fifo for write until there's at least one reader, so let's
open a reader *just* to let us open a writer. Then we'll leave them open
until the later unlock(), which can just close them both.
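The open-a-reader-first trick, sketched (open_fifo_for_unlock is a hypothetical name):

```python
import os, tempfile

def open_fifo_for_unlock(path):
    # Opening a fifo O_WRONLY blocks until a reader exists, so first open
    # a nonblocking reader of our own just so the writer open succeeds.
    # Keep both open until unlock(): closing the writer then releases
    # anyone blocked reading the fifo, even if the fifo file itself has
    # since been deleted (eg. by 'redo clean').
    rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    wfd = os.open(path, os.O_WRONLY)
    return rfd, wfd

path = os.path.join(tempfile.mkdtemp(), 'lock')
os.mkfifo(path)
rfd, wfd = open_fifo_for_unlock(path)
os.close(wfd)   # unlock(): wakes any waiting readers
os.close(rfd)
```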
If someone else built and marked one of our dependencies, then that
dependency would show up as *clean* in a later redo-ifchange, so other
dependents of that file wouldn't be rebuilt.
We actually have to track two session-specific variables: whether the file
has been checked, and whether it was rebuilt. (Or alternatively, whether it
was dirty when we checked it the first time. But we store the former.)
If a depends on b depends on c, then when we consider building a, we have
to check b and c. If we then are asked about a2 which depends on b, there
is no reason to re-check b and its dependencies; we already know it's done.
This takes the time to do 'redo t/curse/all' the *second* time down from
1.0s to 0.13s. (make can still do it in 0.07s.)
'redo t/curse/all' the first time is down from 5.4s to 4.6s. With -j4,
from 3.0s to 2.5s.
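The memoization can be sketched like this (is_dirty is a hypothetical stand-in for the real check in builder.py):

```python
def is_dirty(t, deps, dirty_leaves, checked):
    # 'checked' memoizes per-session results: once b and c are known to
    # be clean, asking about a2 (which also depends on b) is just a dict
    # lookup instead of another walk of b's dependencies.
    if t in checked:
        return checked[t]
    result = (t in dirty_leaves) or any(
        is_dirty(d, deps, dirty_leaves, checked) for d in deps.get(t, ()))
    checked[t] = result
    return result

deps = {'a': ['b'], 'a2': ['b'], 'b': ['c']}
checked = {}
assert not is_dirty('a', deps, set(), checked)
assert 'b' in checked                              # checked via a
assert not is_dirty('a2', deps, set(), checked)    # pure cache hit now
```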
Now t/curse passes again when parallelized (except for the countall
mismatch, since we haven't fixed the source of that problem yet). At least
it's consistent now.
There's a bunch of stuff rearranged in here, but the actual important
problem was that we were doing unlink() on the lock fifo even if ENXIO,
which meant a reader could connect in between ENXIO and unlink(), and thus
never get notified of the disconnection. This would cause the build to
randomly freeze.