apenwarr-redo

Author	SHA1	Message	Date
Avery Pennarun	3dd8d081be	minimal/do: MacOS has /usr/bin/true, not /bin/true.	2010-12-14 02:47:51 -08:00
Avery Pennarun	95680ed7ef	Switch to using a separate lockfile per target. The previous method, using fcntl byterange locks, was very efficient and avoided unnecessary filesystem metadata churn (ie. creating/deleting inodes). Unfortunately, MacOS X (at least version 10.6.5) apparently has a race condition in its fcntl locking that makes it unusably unreliable (http://apenwarr.ca/log/?m=201012#13). My tests indicate that if you only ever lock a single byterange on a file, the race condition doesn't cause a problem. So let's just use one lockfile per target. Now "redo -j20 test" passes for me on both MacOS and Linux. This doesn't measurably affect the speed on Linux, at least, in my tests. The bad news: it's hard to safely delete those lockfiles when we're done with them, so they tend to accumulate in the .redo dir.	2010-12-14 02:44:29 -08:00
Avery Pennarun	294945bd0f	Assert that one instance never holds multiple locks on the same file at once. This could happen if you did 'redo foo foo'. Which nobody ever did, I think, but let's make sure we catch it if they do. One problem with having multiple locks on the same file is then you have to remember not to unlock it until they're all done. But there are other problems, such as: why the heck did we think it was a good idea to lock the same file more than once? So just prevent it from happening for now, unless/until we somehow come up with a reason it might be a good idea.	2010-12-14 02:19:08 -08:00
Avery Pennarun	f5eabe61d2	install.do: don't crash when the manpages fail to build.	2010-12-12 05:42:20 -08:00
Avery Pennarun	8f9453a55d	Fix tests on MacOS. This comes down to the lack of a 'seq' command (what?!) and the fact that BSD "wc -l" returns extra whitespace, while the GNU version doesn't. We should be using numeric comparisons instead of string comparisons, and then it's ok.	2010-12-12 05:38:30 -08:00
Avery Pennarun	d21e6612e2	Answer a bunch more hopefully-FAQs.	2010-12-12 04:30:43 -08:00
Avery Pennarun	14456d5892	Generally clean up the README.	2010-12-12 03:50:56 -08:00
Avery Pennarun	6e5ec95c6a	Add an 'install' target. (usable as 'make install' or 'redo install', of course)	2010-12-12 02:48:42 -08:00
Avery Pennarun	4b48448233	Add a bunch of manpages.	2010-12-12 02:12:21 -08:00
Avery Pennarun	f16f0147b1	Add a redo-ifcreate test. Turns out we weren't testing this one at all, which is a shame, because it totally didn't work.	2010-12-11 23:50:12 -08:00
Avery Pennarun	e6f95521ae	redo-always/redo-ifcreate/redo-stamp: work inside chdir(). If someone cd's to another directory and then runs redo-always, we weren't adding to the right target.	2010-12-11 23:42:45 -08:00
Avery Pennarun	caea093519	Makefile: don't try to 'redo Makefile'. This eliminates the relatively new warning about trying to redo static files.	2010-12-11 23:41:40 -08:00
Avery Pennarun	c64b8a3eb1	Fix a race condition caused by zap_deps(). We can't just delete all the dependencies at the beginning and re-add them: other people might be checking the same dependencies in parallel. Instead, mark them as delete_me up front, and then after the build completes, remove only the delete_me entries.	2010-12-11 22:59:55 -08:00
Avery Pennarun	80fedc84fe	minimal/do: make redo-ifchange (etc) into subprograms in a temp dir. Using aliases for them was cute, but it didn't work with things like: find -name '*.c' | xargs redo-ifchange since xargs doesn't know about aliases.	2010-12-11 21:47:55 -08:00
Avery Pennarun	1cb000ece1	redo.py: report when you're trying to rebuild a static file. In redo-ifchange, this might be a good idea, since you might just want to set a dependency on it, so we won't say anything from inside builder.py. But if you're calling redo.py, that means you expect it to be rebuilt, since there's no other reason to try. So print a warning. (This is what make does, more or less.)	2010-12-11 21:19:15 -08:00
Avery Pennarun	49f0a041b2	clean.do: cleanup *.tmp files that might have been left lying around. ...and fix a bug where builder.py can't handle it if its temp file is deleted out from under it.	2010-12-11 21:10:57 -08:00
Avery Pennarun	b9987433c9	Disable the tests that don't work with minimal/do. ...only when running under minimal/do, of course. The tests in question mostly fail because they're testing particular dependency-related behaviour, and minimal/do doesn't support dependencies, so naturally it doesn't work.	2010-12-11 21:06:12 -08:00
Avery Pennarun	a75555e7a8	minimal/do: don't completely abort if a sub-redo fails. Just allow that sub-redo to return an error code. Also, parent redos should return error code 1, not the same code as the child. That makes it easier to figure out which file generated the "special" error code.	2010-12-11 20:37:27 -08:00
Avery Pennarun	fba684ee07	redo-ifchange can now be run even if there's no parent redo.	2010-12-11 19:08:53 -08:00
Avery Pennarun	e18fa85d58	The only thing in helpers.py that needed vars.py was the log stuff. So put it in its own file. Now it's safer to import and use helpers even if you can't safely touch vars.	2010-12-11 18:34:02 -08:00
Avery Pennarun	1abaf77d35	jwack: start waitfds around fd#50. That makes it a little easier to tell, in a strace, what the process is waiting on. If it's 100/101, then it's waiting on a token; 50+ means waiting on a subtask. Also, we weren't closing the read side of subtask fds on exec. This didn't cause any problems, but did result in a wasted fd in subprocesses.	2010-12-11 18:25:13 -08:00
Avery Pennarun	2706525fc0	redo-stamp: print a helpful message if stdin is a tty. Otherwise your redo process might just freeze in the middle, and you'll wonder why.	2010-12-11 18:13:58 -08:00
Avery Pennarun	0da5c7c082	Add a redo-always command: it adds an "always dirty" dependency to your target. This is mostly useless except when combined with redo-stamp... I think.	2010-12-11 07:02:45 -08:00
Avery Pennarun	1d26d99e0c	Fix a deadlock with redo-oob. If a checksummed target A used to exist but is now missing, and we tried to redo-ifchange that exact file, we would unnecessarily run 'redo-oob A A'; that is, we have to build A in order to determine if A needs to be built. The sub-targets of redo-oob aren't run with REDO_UNLOCKED, so this would deadlock instantly. Add an assertion to redo-oob to ensure we never try to redo-ifchange the primary target (thus converting the deadlock into an exception). And skip doing redo-oob when the target is already the same as the thing we have to check.	2010-12-11 06:16:32 -08:00
Avery Pennarun	91630a892a	Whoops, redo-oob was slightly wrong when used with -j. We called 'redo' instead of 'redo-ifchange' on our indeterminate objects. Since other instances of redo-oob might be running at the same time, this could cause the same object to get rebuilt more than once unnecessarily. The unit tests caught this, I just didn't notice earlier.	2010-12-11 05:54:39 -08:00
Avery Pennarun	e7f7119f2e	If a checksummed file is deleted, we should still use redo-oob. We were giving up and rebuilding the toplevel object, which did eventually rebuild our checksummed file, but then the file turned out to be identical to what it was before, so that nobody else who depended on it ended up getting rebuilt. So the results were indeterminate. Now we treat it as if its dirtiness is unknown, so we build it using redo-oob before building any of its dependencies.	2010-12-11 05:54:39 -08:00
Avery Pennarun	f702417ef3	The second half of redo-stamp: out-of-order building. If a depends on b depends on c, and c is dirty but b uses redo-stamp checksums, then 'redo-ifchange a' is indeterminate: we won't know if we need to run a.do unless we first build b, but the script that normally runs 'redo-ifchange b' is a.do, and we don't want to run that yet, because we don't know for sure if b is dirty, and we shouldn't build a unless one of its dependencies is dirty. Eek! Luckily, there's a safe solution. If we know a is dirty - eg. because a.do or one of its children has definitely changed - then we can just run a.do immediately and there's no problem, even if b is indeterminate, because we were going to run a.do anyhow. If a's dependencies are not definitely dirty, and all we have is indeterminate ones like b, then that means a's build process hasn't changed, which means its tree of dependencies still includes b, which means we can deduce that if we did run a.do, it would end up running b.do. Since we know that anyhow, we can safely just run b.do, which will either b.set_checked() or b.set_changed(). Once that's done, we can re-parse a's dependencies and this time conclusively tell if it needs to be redone or not. Even if it does, b is already up-to-date, so the 'redo-ifchange b' line in a.do will be fast. ...now take all the above and do it recursively to handle nested dependencies, etc, and you're done.	2010-12-11 05:54:39 -08:00
Avery Pennarun	1355ade7c7	Correctly handle a checksummed file that depends on a non-checksummed file. We were rebuilding the checksummed file every time because redo-ifchange was incorrectly assuming that a child's changed_runid that's greater than my changed_runid means I'm dirty. But if my checked_runid is >= the child's checked_runid, then I'm clean, because my checksum didn't change. Clear as mud?	2010-12-11 05:54:39 -08:00
Avery Pennarun	22617d335c	Half-support for using file checksums instead of stamps. A new redo-stamp program takes whatever you give it as stdin and uses it to calculate a checksum for the current target. If that checksum is the same as last time, then we consider the target to be unchanged, and we set checked_runid and stamp, but leave changed_runid alone. That will make future callers of redo-ifchange see this target as unmodified. However, this is only "half" support because by the time we run the .do script that calls redo-stamp, it's too late; the caller is a dependant of the stamped program, which is already being rebuilt, even if redo-stamp turns out to say that this target is unchanged. The other half is coming up.	2010-12-11 05:54:37 -08:00
Avery Pennarun	ca67f5e71a	redo-ifchange: fix relative pathnames printed in debug messages.	2010-12-11 02:15:42 -08:00
Avery Pennarun	59201dd7a0	$3 and stdout no longer refer to the same file. This is slightly inelegant, as the old style 'echo foo; echo blah; chmod a+x $3' doesn't work anymore; the stuff you wrote to stdout didn't end up in $3. You can rewrite it as: 'exec >$3; echo foo; echo blah; chmod a+x $3'. Anyway, it's better this way, because now we can tell the difference between a zero-length $3 and a nonexistent one. A .do script can thus produce either one and we'll either delete the target or move the empty $3 to replace it, whichever is right. As a bonus, this simplifies our detection of whether you did something weird with overlapping changes to stdout and $3.	2010-12-11 00:29:04 -08:00
Avery Pennarun	c4be0050f7	Release the jwack token when doing a synchronous lock wait. Although we were deadlock-free before, under some circumstances we'd end up holding a perfectly good token while in sync wait; that would reduce our parallelism for no good reason. So give back our tokens before waiting for anybody else.	2010-12-10 23:04:46 -08:00
Avery Pennarun	f6d11d5411	If a user manually changes a generated file, don't ever overwrite it. That way the user can modify an auto-generated 'compile' script, for example, and it'll stay modified. If they delete the file, we can then generate it for them again. Also, we have to warn whenever we're doing this, or people might think it's a bug.	2010-12-10 22:43:11 -08:00
Avery Pennarun	0126f6be1e	Don't wipe the timestamp when a target fails to redo. It's really a separate condition. And since we're not removing the target file in case of error - we update it atomically, and keeping it is better than losing it - there's no reason to wipe the timestamp in that case either. However, we do need to know that the build failed, so that anybody else (especially in a parallel build) who looks at that target knows that it died. So add a separate flag just for that.	2010-12-10 22:41:11 -08:00
Avery Pennarun	16bebd21b5	builder: the (WAITING) message from --debug-locks didn't print every time. This was misleading; we end up waiting synchronously for a lock more often than I thought, and it really does slow down builds.	2010-12-10 22:39:25 -08:00
Avery Pennarun	b1bb48a029	Merge branch 'sqlite' This replaces the .redo state directory with an sqlite database instead, improving correctness and sometimes performance.	2010-12-10 05:43:47 -08:00
Avery Pennarun	18b5263db7	jwack: fix a typo in the "wrong number of tokens on exit" error. Not that we ever see that error, except when I'm screwing around.	2010-12-10 05:19:49 -08:00
Avery Pennarun	49ebea445f	jwack: don't ever set the jobserver socket to O_NONBLOCK. It creates a race condition: GNU Make might try to read while the socket is O_NONBLOCK, get EAGAIN, and die; or else another redo might set it back to blocking in between our call to make it O_NONBLOCK and our call to read(). This method - setting an alarm() during the read - is hacky, but should work every time. Unfortunately you get a 1s delay - rarely - when this happens. The good news is it only happens when there are no tokens available anyhow, so it won't affect performance much in any situation I can imagine.	2010-12-10 04:57:13 -08:00
Avery Pennarun	f70c028a8a	With --debug-locks, print a message when we stop to wait on a lock. Helps in seeing why a particular process might be stopped, and in detecting potential reasons that parallelism might be reduced.	2010-12-10 04:31:22 -08:00
Avery Pennarun	675a5106d2	dup() the jobserver fds to 100,101 to make debugging a bit easier. Now if a process is stuck waiting on one of those fds, it'll be obvious from the strace.	2010-12-10 04:11:44 -08:00
Avery Pennarun	84169c5d27	Change locking stuff from fifos to fcntl.lockf(). This should reduce filesystem grinding a bit, and makes the code simpler. It's also theoretically a bit more portable, since I'm guessing fifo semantics aren't the same on win32 if we ever get there. Also, a major problem with the old fifo-based system is that if a redo process died without cleaning up after itself, it wouldn't delete its lockfiles, so we had to wipe them all at the beginning of each build. Now we don't; in theory, you can now have multiple copies of redo poking at the same tree at the same time and not stepping on each other.	2010-12-10 03:55:51 -08:00
Avery Pennarun	10afd9000f	Add some conditionals around some high-bandwidth debug statements. When you have lots of unmodified dependencies, building these printout strings (which aren't even printed unless you're using -d) ends up taking something like 5% of the runtime.	2010-12-10 00:50:53 -08:00
Avery Pennarun	6e6e453908	Some speedups for doing redo-ifchange on a large number of static files. Fix some wastage revealed by the (almost useless, sigh) python profiler.	2010-12-10 00:50:53 -08:00
Avery Pennarun	b5c02e410e	state.py: reorder things so sqlite never does fdatasync(). It was briefly synchronous at data creation time, adding a few ms to redo startup.	2010-12-10 00:50:53 -08:00
Avery Pennarun	e446d4dd04	builder.py: don't import the 'random' module unless we need it. Initializing the random number generator involves some pointless reading from /dev/urandom.	2010-12-10 00:50:53 -08:00
Avery Pennarun	e1a0fc9c12	state.File.is_checked() was being too paranoid. It wasn't allowing us to short circuit a dependency if that dependency had been built previously, but that was already being checked (more correctly) in dirty_deps().	2010-12-10 00:50:52 -08:00
Avery Pennarun	94cecc240b	Don't abort if 'insert into Files' gives an IntegrityError. It can happen occasionally if some other parallel redo adds the same file at the same time.	2010-12-10 00:50:52 -08:00
Avery Pennarun	3ef2bd7300	Don't check as often whether the .redo directory exists. Just check it once after running a subprocess: that's the only way it ought to be able to disappear (ie. in a 'make clean' setup).	2010-12-10 00:50:52 -08:00
Avery Pennarun	29d6c9a746	Don't db.commit() so frequently. Just commit when we're about to do something blocking. sqlite goes a lot faster with bigger transactions. This change does show a small percentage speedup in tests, but not as much as I'd like.	2010-12-10 00:50:52 -08:00
Avery Pennarun	fb79851530	Calculate dependencies with fewer sqlite queries.	2010-12-10 00:50:52 -08:00
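
Commits 95680ed7ef and 294945bd0f above describe one lockfile per target, with only a single byterange ever locked in each file, plus an assertion that one instance never locks the same file twice. A minimal sketch of that idea follows. The class name `TargetLock`, the `lock.<id>` file naming, and the demo at the bottom are assumptions made for illustration, not redo's actual code; only `fcntl.lockf()` and the one-file-per-target approach come from the log.

```python
import errno
import fcntl
import os
import tempfile


class TargetLock:
    """Hypothetical helper: one lockfile per target, one byte locked in it."""

    def __init__(self, lockdir, target_id):
        # Assumed naming scheme; the log only says lockfiles live in .redo.
        self.path = os.path.join(lockdir, "lock.%d" % target_id)
        self.fd = None
        self.owned = False

    def trylock(self):
        # Mirrors the assertion in 294945bd0f: never lock the same file twice.
        assert not self.owned, "lock already held by this instance"
        if self.fd is None:
            self.fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o666)
        try:
            # Lock exactly one byte at offset 0, without blocking.
            fcntl.lockf(self.fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0)
        except OSError as e:
            if e.errno in (errno.EAGAIN, errno.EACCES):
                return False  # another process holds the lock
            raise
        self.owned = True
        return True

    def unlock(self):
        assert self.owned, "unlock without a matching lock"
        fcntl.lockf(self.fd, fcntl.LOCK_UN, 1, 0)
        self.owned = False


# Usage: take the lock, do the build, always release it.
lockdir = tempfile.mkdtemp()
lock = TargetLock(lockdir, 7)
if lock.trylock():
    try:
        pass  # the target's .do script would run here
    finally:
        lock.unlock()
```

The sketch never deletes the lockfile, which matches the drawback noted in 95680ed7ef: the files accumulate because removing them safely is hard.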

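Commit c64b8a3eb1 above replaces "delete all dependencies, then re-add them" with a mark-and-sweep scheme: flag the old entries as delete_me up front, then remove only the still-flagged ones after the build. The sketch below shows that sequence against an in-memory sqlite database. The table layout and the function names are assumptions chosen for the example; only the delete_me idea is taken from the log.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Assumed schema for illustration only.
db.execute(
    "create table Deps ("
    " target text, source text, delete_me int,"
    " primary key (target, source))"
)


def start_build(db, target):
    # Mark existing deps instead of deleting them, so parallel readers
    # still see the old dependency list while the build runs.
    db.execute("update Deps set delete_me=1 where target=?", (target,))


def add_dep(db, target, source):
    # A dep that is declared again during the build gets its mark cleared.
    db.execute(
        "insert or replace into Deps (target, source, delete_me)"
        " values (?, ?, 0)",
        (target, source),
    )


def finish_build(db, target):
    # Sweep: only deps that were not re-declared are removed.
    db.execute("delete from Deps where target=? and delete_me=1", (target,))


def deps_of(db, target):
    rows = db.execute("select source from Deps where target=?", (target,))
    return sorted(r[0] for r in rows)
```

If a build previously recorded `b` and `c` for target `a` and now re-declares only `b`, then `deps_of(db, "a")` keeps returning both entries until `finish_build()` runs, and just `b` afterwards.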

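Commit 22617d335c above states the redo-stamp rule: if the checksum of the data matches the previous one, set checked_runid but leave changed_runid alone, so later redo-ifchange callers see the target as unmodified. A rough sketch of that bookkeeping follows. The `FileState` class, the `stamp()` function, and the choice of SHA-1 are assumptions for illustration; the log names only the two runid fields and the checksum comparison.

```python
import hashlib


class FileState:
    """Hypothetical per-target record; field names follow the commit log."""

    def __init__(self):
        self.csum = None
        self.changed_runid = None
        self.checked_runid = None


def stamp(f, data, runid):
    """Record a checksum of `data` for target `f` during run `runid`.

    Returns True if the target counts as changed.
    """
    # SHA-1 is an assumption; any stable content hash fits the sketch.
    csum = hashlib.sha1(data).hexdigest()
    changed = csum != f.csum
    f.csum = csum
    f.checked_runid = runid  # we looked at it during this run either way
    if changed:
        f.changed_runid = runid  # only bumped when the content differs
    return changed
```

Stamping the same bytes in two consecutive runs advances `checked_runid` while `changed_runid` stays at the earlier run, which is the condition commit 1355ade7c7 relies on when it treats such a target as clean.
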
187 commits