We can't just delete all the dependencies at the beginning and re-add them:
other people might be checking the same dependencies in parallel. Instead,
mark them as delete_me up front, and then after the build completes, remove
only the delete_me entries.
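A minimal sketch of the mark-and-sweep idea, assuming a simple sqlite deps
table (the table layout and the rebuild() helper here are hypothetical;
only the delete_me flag comes from the actual change):

```python
import sqlite3

# Hypothetical deps table; only the delete_me flag is from the real change.
db = sqlite3.connect(":memory:")
db.execute("create table deps (target text, source text, delete_me int)")
db.executemany("insert into deps values (?, ?, 0)",
               [("prog", "a.o"), ("prog", "b.o")])

def rebuild(target, new_sources):
    # Mark existing deps instead of deleting them, so parallel checkers
    # still see a complete dependency list while we build.
    db.execute("update deps set delete_me=1 where target=?", (target,))
    for src in new_sources:
        # Re-adding a dep replaces the marked entry with a live one.
        db.execute("delete from deps where target=? and source=?",
                   (target, src))
        db.execute("insert into deps values (?, ?, 0)", (target, src))
    # Only after the build completes, drop the stale entries.
    db.execute("delete from deps where target=? and delete_me=1", (target,))

rebuild("prog", ["a.o", "c.o"])
print(sorted(s for (s,) in db.execute(
    "select source from deps where target='prog'")))  # ['a.o', 'c.o']
```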
When called as redo-ifchange, this might be intentional, since you might
just want to set a dependency on it; so we won't say anything from inside
builder.py.
But if you're calling redo.py, that means you expect it to be rebuilt, since
there's no other reason to try. So print a warning.
(This is what make does, more or less.)
...only when running under minimal/do, of course.
The tests in question mostly fail because they're testing particular
dependency-related behaviour, and minimal/do doesn't support dependencies,
so naturally it doesn't work.
Just allow that sub-redo to return an error code. Also, parent redos should
return error code 1, not the same code as the child. That makes it easier
to figure out which file generated the "special" error code.
That makes it a little easier to tell, in a strace, what the process is
waiting on. If it's 100/101, then it's waiting on a token; 50+ means waiting
on a subtask.
Also, we weren't closing the read side of subtask fds on exec. This didn't
cause any problems, but did result in a wasted fd in subprocesses.
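The fd fix is just FD_CLOEXEC on the read side; a sketch (the real
builder.py code differs, and modern Python 3 pipes are already
non-inheritable per PEP 446):

```python
import fcntl, os

r, w = os.pipe()
# Mark the read side close-on-exec so exec'd subprocesses don't inherit
# (and waste) the fd.
flags = fcntl.fcntl(r, fcntl.F_GETFD)
fcntl.fcntl(r, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
assert fcntl.fcntl(r, fcntl.F_GETFD) & fcntl.FD_CLOEXEC
```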
If a checksummed target A used to exist but is now missing, and we tried to
redo-ifchange that exact file, we would unnecessarily run 'redo-oob A A';
that is, we have to build A in order to determine if A needs to be built.
The sub-targets of redo-oob aren't run with REDO_UNLOCKED, so this would
deadlock instantly.
Add an assertion to redo-oob to ensure we never try to redo-ifchange the
primary target (thus converting the deadlock into an exception). And skip
doing redo-oob when the target is already the same as the thing we have to
check.
We called 'redo' instead of 'redo-ifchange' on our indeterminate objects.
Since other instances of redo-oob might be running at the same time, this
could cause the same object to get rebuilt more than once unnecessarily.
The unit tests caught this; I just didn't notice earlier.
We were giving up and rebuilding the toplevel object, which did eventually
rebuild our checksummed file, but then the file turned out to be identical
to what it was before, so that nobody *else* who depended on it ended up
getting rebuilt. So the results were indeterminate.
Now we treat it as if its dirtiness is unknown, so we build it using
redo-oob before building any of its dependencies.
If a depends on b depends on c, and c is dirty but b uses redo-stamp
checksums, then 'redo-ifchange a' is indeterminate: we won't know if we need
to run a.do unless we first build b, but the script that *normally* runs
'redo-ifchange b' is a.do, and we don't want to run that yet, because we
don't know for sure if b is dirty, and we shouldn't build a unless one of
its dependencies is dirty. Eek!
Luckily, there's a safe solution. If we *know* a is dirty - eg. because
a.do or one of its children has definitely changed - then we can just run
a.do immediately and there's no problem, even if b is indeterminate, because
we were going to run a.do anyhow.
If a's dependencies are *not* definitely dirty, and all we have is
indeterminate ones like b, then that means a's build process *hasn't
changed*, which means its tree of dependencies still includes b, which means
we can deduce that if we *did* run a.do, it would end up running b.do.
Since we know that anyhow, we can safely just run b.do, which will call
either b.set_checked() or b.set_changed(). Once that's done, we can
re-parse a's
dependencies and this time conclusively tell if it needs to be redone or
not. Even if it does, b is already up-to-date, so the 'redo-ifchange b'
line in a.do will be fast.
...now take all the above and do it recursively to handle nested
dependencies, etc, and you're done.
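The whole recursive scheme can be sketched roughly like this (the states,
function names, and build callback are all illustrative; redo's real code
tracks this through its database, not in memory):

```python
CLEAN, DIRTY, INDETERMINATE = "clean", "dirty", "indeterminate"

def check(t, deps, leaf_state, stamped, build):
    # Recursively classify each dependency first.
    results = {d: check(d, deps, leaf_state, stamped, build)
               for d in deps.get(t, ())}
    if INDETERMINATE in results.values() and DIRTY not in results.values():
        # No dep is *definitely* dirty, so t's build process is unchanged
        # and its dep tree still includes the indeterminate ones: it's
        # safe to build those now (out of band), settling each of them.
        for d, r in list(results.items()):
            if r == INDETERMINATE:
                results[d] = build(d)      # returns CLEAN or DIRTY
    if DIRTY in results.values():
        # t must be rebuilt; but if t itself is checksummed, the rebuild
        # might leave it byte-identical, so callers can't tell yet.
        return INDETERMINATE if t in stamped else DIRTY
    return leaf_state.get(t, CLEAN)

# a depends on b depends on c; c is dirty, b uses redo-stamp checksums.
deps = {"a": ["b"], "b": ["c"]}
built = []
result = check("a", deps, {"c": DIRTY}, stamped={"b"},
               build=lambda d: built.append(d) or CLEAN)
print(result, built)  # clean ['b']: b was rebuilt out of band, a wasn't
```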
We were rebuilding the checksummed file every time because redo-ifchange was
incorrectly assuming that a child's changed_runid being greater than my
changed_runid means I'm dirty. But if my checked_runid is >= the child's
changed_runid, then I'm clean, because my checksum didn't change.
Clear as mud?
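In other words (field names from the text; a sketch of the comparison, not
redo's actual code):

```python
def is_dirty(my_checked_runid, child_changed_runid):
    # The bug: comparing the child's changed_runid against my *changed*
    # runid.  A checksummed target's own changed_runid stays old even
    # after it's re-verified, so that test claimed "dirty" every run.
    # The fix: I'm clean if I was checked at or after the run in which
    # the child last actually changed.
    return child_changed_runid > my_checked_runid

print(is_dirty(my_checked_runid=7, child_changed_runid=5))  # False: clean
print(is_dirty(my_checked_runid=4, child_changed_runid=5))  # True: dirty
```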
A new redo-stamp program takes whatever you give it as stdin and uses it to
calculate a checksum for the current target. If that checksum is the same
as last time, then we consider the target to be unchanged, and we set
checked_runid and stamp, but leave changed_runid alone. That will make
future callers of redo-ifchange see this target as unmodified.
However, this is only "half" support because by the time we run the .do
script that calls redo-stamp, it's too late; the caller is a dependant of
the stamped program, which is already being rebuilt, even if redo-stamp
turns out to say that this target is unchanged.
The other half is coming up.
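Conceptually, redo-stamp behaves something like this sketch (the state dict
and field handling here are illustrative; the real bookkeeping lives in
redo's sqlite database):

```python
import hashlib

def redo_stamp(target, stdin_data, state, runid):
    # Checksum whatever the .do script piped to our stdin.
    t = state.setdefault(target, {"stamp": None,
                                  "checked_runid": 0, "changed_runid": 0})
    stamp = hashlib.sha256(stdin_data).hexdigest()
    t["checked_runid"] = runid           # we verified the target this run
    if stamp != t["stamp"]:
        t["stamp"] = stamp
        t["changed_runid"] = runid       # ...and it actually changed
    # else: leave changed_runid alone, so future redo-ifchange callers
    # see the target as unmodified.

state = {}
redo_stamp("lib.list", b"a.o b.o", state, runid=1)
redo_stamp("lib.list", b"a.o b.o", state, runid=2)   # identical content
t = state["lib.list"]
print(t["changed_runid"], t["checked_runid"])  # 1 2
```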
This is slightly inelegant, as the old style

    echo foo
    echo blah
    chmod a+x $3

doesn't work anymore; the stuff you wrote to stdout won't end up in $3.
You can rewrite it as:

    exec >$3
    echo foo
    echo blah
    chmod a+x $3
Anyway, it's better this way, because now we can tell the difference between
a zero-length $3 and a nonexistent one. A .do script can thus produce
either one and we'll either delete the target or move the empty $3 to
replace it, whichever is right.
As a bonus, this simplifies our detection of whether you did something weird
with overlapping changes to stdout and $3.
Although we were deadlock-free before, under some circumstances we'd end up
holding a perfectly good token while in sync wait; that would reduce our
parallelism for no good reason. So give back our tokens before waiting for
anybody else.
That way the user can modify an auto-generated 'compile' script, for
example, and it'll stay modified.
If they delete the file, we can then generate it for them again.
Also, we have to warn whenever we're doing this, or people might think it's
a bug.
It's really a separate condition. And since we're not removing the target
*file* in case of error - we update it atomically, and keeping it is better
than losing it - there's no reason to wipe the timestamp in that case
either.
However, we do need to know that the build failed, so that anybody else
(especially in a parallel build) who looks at that target knows that it
died. So add a separate flag just for that.
It creates a race condition: GNU Make might try to read while the socket is
O_NONBLOCK, get EAGAIN, and die; or else another redo might set it back to
blocking in between our call to make it O_NONBLOCK and our call to read().
This method - setting an alarm() during the read - is hacky, but should work
every time. Unfortunately you get a 1s delay - rarely - when this happens.
The good news is it only happens when there are no tokens available anyhow,
so it won't affect performance much in any situation I can imagine.
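A sketch of the alarm()-during-read trick (simplified; the real code is the
jobserver token handling, and its error handling differs):

```python
import os, signal, time

def read_token(fd, timeout=1):
    # Interrupt a blocking read() after 'timeout' seconds, instead of
    # toggling O_NONBLOCK on a fd shared with GNU make (which races).
    def on_alarm(sig, frame):
        raise TimeoutError
    old = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(timeout)
    try:
        return os.read(fd, 1)
    except TimeoutError:
        return b""            # no token available right now
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old)

r, w = os.pipe()              # empty pipe: a plain read() would block
t0 = time.time()
print(read_token(r))          # b'' after about a second
assert time.time() - t0 >= 1
```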
This should reduce filesystem grinding a bit, and makes the code simpler.
It's also theoretically a bit more portable, since I'm guessing fifo
semantics aren't the same on win32 if we ever get there.
Also, a major problem with the old fifo-based system is that if a redo
process died without cleaning up after itself, it wouldn't delete its
lockfiles, so we had to wipe them all at the beginning of each build. Now
we don't; in theory, you can now have multiple copies of redo poking at the
same tree at the same time without stepping on each other.
When you have lots of unmodified dependencies, building these printout
strings (which aren't even printed unless you're using -d) ends up taking
something like 5% of the runtime.
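The fix is the usual lazy-logging pattern, sketched here (names are
illustrative):

```python
DEBUG = False   # i.e. running without -d

def debug(fmt, *args):
    # Skip the '%' formatting entirely when debugging is off; with many
    # unmodified deps, that formatting was the ~5% being wasted.
    if not DEBUG:
        return None
    msg = fmt % args
    print(msg)
    return msg

# Hot loop: with DEBUG off this builds no strings at all.
for name in ["a.o", "b.o"]:
    debug("checked %s: unmodified", name)
```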
It wasn't allowing us to short circuit a dependency if that dependency had
been built previously, but that was already being checked (more correctly)
in dirty_deps().
Just commit when we're about to do something blocking. sqlite goes a lot
faster with bigger transactions. This change does show a small percentage
speedup in tests, but not as much as I'd like.
In flush-cache.sh, we have to do this, because the sqlite3 command-line tool
sets it to zero. Inevitably during parallel testing, it'll end up
contending for a lock, and we really want it to wait a bit.
In state.py, it's not as important since the default is nonzero. But
python-sqlite3's default of 5 seconds makes me a little too nervous; I can
imagine a disk write waiting for more than 5 seconds sometime. So let's use
60 instead.
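With python-sqlite3 that's just the timeout parameter on connect() (a
sketch; the actual call in state.py may pass other options too):

```python
import sqlite3

# python-sqlite3 defaults to a 5-second busy timeout; a disk write
# could plausibly stall longer under a parallel build, so use 60.
db = sqlite3.connect(":memory:", timeout=60)
db.execute("create table t (x)")
db.execute("insert into t values (1)")
print(db.execute("select x from t").fetchone())  # (1,)
```

The sqlite3 command-line tool's equivalent is its .timeout setting (in
milliseconds).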
It passes all tests when run serialized, but still gives weird errors
(OperationalError: database is locked) when run with -j5. sqlite3 shouldn't
be barfing just because the database is locked, since the default timeout is
5 seconds, and it's dying *way* faster than that.
This allows files to transition from generated to not-generated if the .do
file is ever removed (ie. the user is changing things and the file is now a
source file, not a target).