Commit graph

13 commits

Author SHA1 Message Date
Avery Pennarun
670abbe305 jobserver.py: _try_read()'s alarm timeout needs to throw an exception.
In python3, os.read() automatically retries after EINTR, which breaks
our ability to interrupt on SIGALRM.

Instead, throw an exception from the SIGALRM handler, which should work
on both python2 and python3.

This fixes a rare deadlock during parallel builds on python3.

For background:
https://www.python.org/dev/peps/pep-0475/#backward-compatibility

"Applications relying on the fact that system calls are interrupted
with InterruptedError will hang. The authors of this PEP don't think
that such applications exist [...]"

Well, apparently they were mistaken :)
2020-06-15 02:20:02 -04:00
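
The approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from jobserver.py: the names ReadTimeout, _on_alarm and try_read are hypothetical, and the sketch assumes a Unix system and that it runs in the main thread (a requirement of signal.signal).

```python
import os
import signal


class ReadTimeout(Exception):
    """Hypothetical exception raised by the SIGALRM handler."""


def _on_alarm(signum, frame):
    # Raising here makes the interrupted os.read() fail instead of being
    # retried, which is what PEP 475 does when a handler returns normally.
    raise ReadTimeout()


def try_read(fd, nbytes, timeout=1):
    """Read up to nbytes from fd; return None if nothing arrives in time."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    result = None
    try:
        signal.alarm(timeout)       # whole seconds
        result = os.read(fd, nbytes)
        signal.alarm(0)             # cancel the pending alarm
    except ReadTimeout:
        pass                        # keep whatever result holds so far
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
    return result
```

A small window remains in which the alarm can fire just after os.read() returns but before its result is stored, so real code may need extra care around that case.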
Avery Pennarun
aa920f12ed jobserver.py: fix very rare python3 failure reported by a user.
Traceback (most recent call last):
  File "/nix/store/i0835myyhrfr13lh4y26r58406kk90xj-redo-apenwarr-0.42a/bin/../lib/redo/cmd_ifchange.py", line 54, in main
    jobserver.force_return_tokens()
  File "/nix/store/i0835myyhrfr13lh4y26r58406kk90xj-redo-apenwarr-0.42a/bin/../lib/redo/jobserver.py", line 482, in force_return_tokens
    os.write(_cheatfds[1], 't' * _cheats)
TypeError: a bytes-like object is required, not 'str'

Unfortunately I wasn't able to replicate it, but this is obviously the right fix.
2020-06-15 00:39:10 -04:00
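
For illustration, the traceback above comes from passing a str to os.write() on python3. A minimal sketch of the kind of change that avoids it is shown below; return_tokens is a hypothetical helper, not the function in jobserver.py, and the actual fix in the commit may differ in detail.

```python
import os


def return_tokens(write_fd, count):
    """Write `count` one-byte tokens to a pipe (hypothetical helper).

    The b't' literal is a byte string on both python2 and python3, so
    os.write() accepts it on either version.
    """
    if count > 0:
        os.write(write_fd, b't' * count)
```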
Moritz Lell
e239820afd Distinguish byte (python2 str type) and unicode strings (python 3 str type)
Python 3 strings are python 2 unicode strings. Therefore consistently mark
strings that are sent via pipes or written/read to file as byte strings.
2019-10-30 21:28:49 +01:00
Moritz Lell
491040ea72 Run 2to3 utility
2019-10-30 19:58:12 +01:00
Avery Pennarun
3dbdfbc06f Better handling if parent closes REDO_CHEATFDS or MAKEFLAGS fds.
Silently recover if REDO_CHEATFDS file descriptors are closed, because
they aren't completely essential and MAKEFLAGS-related warnings already
get printed if all file descriptors have been closed.

If MAKEFLAGS --jobserver-auth flags are closed, improve the error
message so that a) it's a normal error instead of an exception and b)
we link to documentation about why it happens.  Also write some more
detailed documentation about what's going on here.
2019-01-18 00:11:48 +00:00
Avery Pennarun
19049d52fc jobserver: allow overriding the parent jobserver in a subprocess.
Previously, if you passed a -j option to a redo process in a redo or
make process hierarchy with MAKEFLAGS already set, it would ignore the
-j option and continue using the jobserver provided by the parent.

With this change, we instead initialize a new jobserver with the
desired number of tokens, which is what GNU make does in the same
situation.  A typical use case for this is to force serialization of
build steps in a subtree (by using -j1).  In make, this is often useful
for "fixing" makefiles that haven't been written correctly for parallel
builds.  In redo, that happens much less often, but it's useful at
least in unit tests.

Passing -j1 is relatively harmless (the redo you are starting inherits
a token anyway, so it doesn't create any new tokens).  Passing -j > 1
is more risky, because it creates new tokens, thus increasing the level
of parallelism in the system.  Because this may not be what you wanted,
we print a warning when you pass -j > 1 to a sub-redo.  GNU make gives
a similar warning in this situation.
2018-12-31 19:24:27 -05:00
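
The token arithmetic in this commit can be sketched as below. This is an assumption-laden illustration, not redo's code: new_jobserver is a hypothetical name, and the exact MAKEFLAGS string is a guess at the general shape (GNU make advertises its pipe with --jobserver-auth=R,W). The sketch also overwrites MAKEFLAGS wholesale, which real code would likely not do.

```python
import os


def new_jobserver(maxjobs, environ):
    """Create a fresh token pipe for `maxjobs` parallel jobs (sketch)."""
    if maxjobs < 1:
        raise ValueError('maxjobs must be >= 1')
    rfd, wfd = os.pipe()
    # python3 creates non-inheritable fds; subprocesses need to inherit these.
    if hasattr(os, 'set_inheritable'):
        os.set_inheritable(rfd, True)
        os.set_inheritable(wfd, True)
    # The calling process already holds one token, so -j1 adds no new
    # tokens and -jN adds N-1 of them.
    os.write(wfd, b't' * (maxjobs - 1))
    # Assumed format; any existing MAKEFLAGS content is discarded here.
    environ['MAKEFLAGS'] = '-j --jobserver-auth=%d,%d' % (rfd, wfd)
    return rfd, wfd
```

With maxjobs=1 the pipe stays empty, which matches the observation above that -j1 creates no new tokens.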
Avery Pennarun
e247a72300 jobserver: don't release the very last token in wait_all().
After waiting for children to exit, we would release our own token, and
then the caller would immediately try to obtain a token again.  This
accounted for tokens correctly, but would pass tokens around the call
tree in unexpected ways.

For example, imagine we had only one token.  We call 'redo a1 a2', and
a1 calls 'redo b1 b2', and b1 calls 'redo c1'.  When c1 exits, it
releases its token, then tries to re-acquire it before exiting.  This
also includes 'redo b1 b2' and 'redo a1 a2' in the race for the token,
which means b1 might get suspended while *either* a2 or b2 starts
running.

This never caused a deadlock, even if a2 or b2 depends on b1, because
if they tried to build b1, they would notice it is locked, give up
their token, and wait for the lock.  c1 (and then b1) could then obtain
the token and immediately terminate, allowing progress to continue.

But this is not really the way we expect things to happen.  "Obviously"
what we want here is a straightforward stack unwinding: c1 should finish,
then b1, then b2, then a1, then a2.

The not-very-obvious symptom of this bug is that redo's unit tests
seemed to run in the wrong order when using -j1 --no-log.  (--log would
hide the problem by rearranging logs back into the right order!)
2018-12-31 19:02:55 -05:00
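
The difference between the old and new behavior can be modeled with a toy counter. This is only a sketch of the accounting described above: TokenHolder and its methods are hypothetical and do not correspond to names in jobserver.py, and no real pipe or child process is involved.

```python
class TokenHolder(object):
    """Toy model of the tokens held by one redo process."""

    def __init__(self, tokens=1):
        self.tokens = tokens    # tokens currently held by this process
        self.released = 0       # tokens handed back to the shared pool

    def _release(self, n):
        n = min(n, self.tokens)
        self.tokens -= n
        self.released += n

    def wait_all_old(self):
        # Old behavior: give back everything, including our own token,
        # so the caller has to race its siblings to get one back.
        self._release(self.tokens)

    def wait_all_new(self):
        # New behavior: give back only the extras and keep the last
        # token, so the caller continues without re-entering the race.
        if self.tokens > 1:
            self._release(self.tokens - 1)
```

Both variants account for tokens correctly; only the new one keeps the unwinding order predictable when a single token is shared.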
Avery Pennarun
29f939013e Add a bunch of missing python docstrings.
This appeases pylint, so un-disable its docstring warning.
2018-12-14 09:03:53 +00:00
Avery Pennarun
2b4fe812e2 Some renaming and comments to try to clarify builder and jobserver.
The code is still a bit spaghetti-like, especially when it comes to
redo-unlocked, but at least the new names are slightly more
comprehensible.
2018-12-11 04:17:27 +00:00
Avery Pennarun
bd8dbfb487 Switch to module-relative import syntax.
Now that the python scripts are all in a "redo" python module, we can
use the "new style" (ahem) package-relative imports.  This appeases
pylint, plus avoids confusion in case more than one package has
similarly-named modules.
2018-12-05 02:34:36 -05:00
Avery Pennarun
9b6d1eeb6e env and env_init: Eliminate weird auto-initialization of globals.
Merge the two files into env, and make each command explicitly call the
function that sets it up in the way that's needed for that command.

This means we can finally just import all the modules at the top of
each file, without worrying about import order.  Phew.

While we're here, remove the weird auto-appending-'all'-to-targets
feature in env.init().  Instead, do it explicitly, and only from redo and
redo-ifchange, only if is_toplevel and no other targets are given.
2018-12-05 02:27:04 -05:00
Avery Pennarun
ded14507b0 Rename vars{,_init}.py -> env{,_init}.py.
This fixes some pylint 'redefined builtins' warnings.  While I was
here, I fixed the others too by renaming a few local variables.
2018-12-05 02:26:49 -05:00
Avery Pennarun
65cf1c9854 Rename jwack.py -> jobserver.py.
I'm not really sure why I called it jwack.  I think it was kind of a
wack jobserver(tm).  But nowadays most of the wack-ness is gone.
2018-12-05 00:22:10 -05:00
Renamed from redo/jwack.py