2010-11-12 20:08:38 -08:00
#
redo-log: prioritize the "foreground" process.
When running a parallel build, redo-log -f (which is auto-started by
redo) tries to traverse through the logs depth first, in the order
parent processes started subprocesses. This works pretty well, but if
its dependencies are locked, a process might have to give up its
jobserver token while other stuff builds its dependencies. After the
dependency finishes, the parent might not be able to get a token for
quite some time, and the logs will appear to stop.
To prevent this from happening, we can instantiate up to one "cheater"
token, only in the foreground process (the one locked by redo-log -f),
which will allow it to continue running, albeit a bit slowly (since it
only has one token out of possibly many). When the process finishes,
we then destroy the fake token. It gets a little complicated; see
explanation at the top of jwack.py.
2018-11-17 04:32:09 -05:00
# Implementation of a GNU make-compatible jobserver.
#
# The basic idea is that both ends of a pipe (tokenfds) are shared with all
# subprocesses. At startup, we write one "token" into the pipe for each
# configured job. (So eg. redo -j20 will put 20 tokens in the pipe.) In
# order to do work, you must first obtain a token, by reading the other
# end of the pipe. When you're done working, you write the token back into
# the pipe so that someone else can grab it.
#
# The toplevel process in the hierarchy is what creates the pipes in the
# first place. Then it puts the pipe file descriptor numbers into MAKEFLAGS,
# so that subprocesses can pull them back out.
#
# As usual, edge cases make all this a bit tricky:
#
# - Every process is defined as owning a token at startup time. This makes
#   sense because it's backward compatible with single-process make: if a
#   subprocess neither reads nor writes the pipe, then it has exactly one
#   token, so it's allowed to do one thread of work.
#
# - Thus, for symmetry, processes also must own a token at exit time.
#
# - In turn, to make *that* work, a parent process must destroy *its* token
#   upon launching a subprocess. (Destroy, not release, because the
#   subprocess has created its own token.) It can try to obtain another
#   token, but if none are available, it has to stop work until one of its
#   subprocesses finishes. When the subprocess finishes, its token is
#   destroyed, so the parent creates a new one.
#
# - If our process is going to stop and wait for a lock (eg. because we
#   depend on a target and someone else is already building that target),
#   we must give up our token. Otherwise, we're sucking up a "thread" (a
#   unit of parallelism) just to do nothing. If enough processes are waiting
#   on a particular lock, then the process building that target might end up
#   with only a single token, and everything gets serialized.
#
# - Unfortunately this leads to a problem: if we give up our token, we then
#   have to re-acquire a token before exiting, even if we want to exit with
#   an error code.
#
# - redo-log wants to linearize output so that it always prints log messages
#   in the order jobs were started; but because of the above, a job being
#   logged might end up with no tokens for a long time, waiting for some
#   other branch of the build to complete.
#
# As a result, we extend beyond GNU make's model and make things even more
# complicated. We add a second pipe, cheatfds, which we use to "cheat" on
# tokens if our particular job is in the foreground (ie. is the one
# currently being tailed by redo-log -f). We add at most one token per
# redo-log instance. If we are the foreground task, and we need a token,
# and we don't have a token, and we don't have any subtasks (because if we
# had a subtask, then we're not in the foreground), we synthesize our own
# token by incrementing _mytokens and _cheats, but we don't read from
# tokenfds. Then, when it's time to give up our token again, we also won't
# write back to tokenfds, so the synthesized token disappears.
#
# Of course, all that then leads to *another* problem: every process must
# hold a *real* token when it exits, because its parent has given up a
# *real* token in order to start this subprocess. If we're holding a cheat
# token when it's time to exit, then we can't meet this requirement. The
# obvious thing to do would be to give up the cheat token and wait for a
# real token, but that might take a very long time, and if we're the last
# thing preventing our parent from exiting, then redo-log will sit around
# following our parent until we finally get a token so we can exit,
# defeating the whole purpose of cheating. Instead of waiting, we write our
# "cheater" token to cheatfds. Then, any task, upon noticing one of its
# subprocesses has finished, will check to see if there are any tokens on
# cheatfds; if so, it will remove one of them and *not* re-create its
# child's token, thus destroying the cheater token from earlier, and restoring
# balance.
#
# Sorry this is so complicated. I couldn't think of a way to make it
# simpler :)
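# Illustration only (not part of jwack.py): a minimal single-process sketch of
# the basic token pipe described above. The helper names make_token_pipe,
# acquire and release are hypothetical, and the sketch assumes Python 3 and
# ignores MAKEFLAGS, locks and the cheatfds extension entirely.

```python
import os


def make_token_pipe(maxjobs):
    """Create a pipe preloaded with maxjobs - 1 tokens (hypothetical helper)."""
    rfd, wfd = os.pipe()
    # The creating process implicitly owns one token already, so only the
    # remaining maxjobs - 1 tokens are written into the pipe.
    os.write(wfd, b't' * (maxjobs - 1))
    return rfd, wfd


def acquire(rfd):
    """Block until a token is available, then take it out of the pipe."""
    return os.read(rfd, 1)


def release(wfd, token=b't'):
    """Write a token back so that some other worker can grab it."""
    os.write(wfd, token)


rfd, wfd = make_token_pipe(4)
held = [acquire(rfd) for _ in range(3)]  # the pipe is now empty
release(wfd)                             # hand one token back
again = acquire(rfd)                     # and take it again
```

# With -j4 the pipe holds three tokens; the fourth is the one the process is
# defined as owning at startup.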
#
import sys, os, errno, select, fcntl, signal
from atoi import atoi
from helpers import close_on_exec
import state, vars

_toplevel = 0
_mytokens = 1
_cheats = 0
_tokenfds = None
_cheatfds = None
_waitfds = {}


def _debug(s):
    if 0:
        sys.stderr.write('jwack#%d: %s' % (os.getpid(), s))


def _create_tokens(n):
    global _mytokens, _cheats
    assert n >= 0
    assert _cheats >= 0
    for _ in xrange(n):
        if _cheats > 0:
            _cheats -= 1
        else:
            _mytokens += 1


def _destroy_tokens(n):
    global _mytokens
    assert _mytokens >= n
    _mytokens -= n


def _release(n):
    global _mytokens, _cheats
    assert n >= 0
    assert _mytokens >= n
    _debug('%d,%d -> release(%d)\n' % (_mytokens, _cheats, n))
    n_to_share = 0
    for _ in xrange(n):
        _mytokens -= 1
        if _cheats > 0:
            _cheats -= 1
        else:
            n_to_share += 1
    assert _mytokens >= 0
    assert _cheats >= 0
    if n_to_share:
        _debug('PUT tokenfds %d\n' % n_to_share)
        os.write(_tokenfds[1], 't' * n_to_share)


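# Illustration only: a small in-memory model of the _mytokens/_cheats
# bookkeeping done by _release(). The TokenBook class and its pipe counter are
# hypothetical stand-ins (no real file descriptors are involved), written for
# Python 3.

```python
class TokenBook(object):
    """Hypothetical model of the per-process token counters."""

    def __init__(self):
        self.mytokens = 1  # every process starts out owning one token
        self.cheats = 0    # how many held tokens were synthesized
        self.pipe = 0      # stand-in for the number of bytes in tokenfds

    def cheat(self):
        # Synthesize a token without reading anything from the pipe.
        self.mytokens += 1
        self.cheats += 1

    def release(self, n):
        assert 0 <= n <= self.mytokens
        for _ in range(n):
            self.mytokens -= 1
            if self.cheats > 0:
                self.cheats -= 1  # a cheat token simply vanishes
            else:
                self.pipe += 1    # a real token goes back to the pipe


book = TokenBook()
book.release(1)  # give up the real token while waiting on a lock
book.cheat()     # foreground job synthesizes a replacement
book.release(1)  # the cheat disappears instead of reaching the pipe
```

# After these three steps only the original real token has reached the pipe,
# so the total number of real tokens in circulation is unchanged.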
def _release_except_mine():
    assert _mytokens > 0
    _release(_mytokens - 1)


def release_mine():
    assert _mytokens >= 1
    _debug('%d,%d -> release_mine()\n' % (_mytokens, _cheats))
    _release(1)


def _timeout(sig, frame):
    pass


redo-log: capture and linearize the output of redo builds.

redo now saves the stderr from every .do script, for every target, into
a file in the .redo directory. That means you can look up the logs
from the most recent build of any target using the new redo-log
command, for example:

    redo-log -r all

The default is to show logs non-recursively, that is, it'll show when a
target does redo-ifchange on another target, but it won't recurse into
the logs for the latter target. With -r (recursive), it does. With -u
(unchanged), it does even if redo-ifchange discovered that the target
was already up-to-date; in that case, it prints the logs of the *most
recent* time the target was generated.

With --no-details, redo-log will show only the 'redo' lines, not the
other log messages. For very noisy build systems (like recursing into
a 'make' instance) this can be helpful to get an overview of what
happened, without all the cruft.

You can use the -f (follow) option like tail -f, to follow a build
that's currently in progress until it finishes. redo itself spins up a
copy of redo-log -r -f while it runs, so you can see what's going on.

Still broken in this version:

- No man page or new tests yet.

- ANSI colors don't yet work (unless you use --raw-logs, which gives
  the old-style behaviour).

- You can't redirect the output of a sub-redo to a file or a
  pipe right now, because redo-log is eating it.

- The regex for matching 'redo' lines in the log is very gross.
  Instead, we should put the raw log files in a more machine-parseable
  format, and redo-log should turn that into human-readable format.

- redo-log tries to "linearize" the logs, which makes them
  comprehensible even for a large parallel build. It recursively shows
  log messages for each target in depth-first tree order (by tracing
  into a new target every time it sees a 'redo' line). This works
  really well, but in some specific cases, the "topmost" redo instance
  can get stuck waiting for a jwack token, which makes it look like the
  whole build has stalled, when really redo-log is just waiting a long
  time for a particular subprocess to be able to continue. We'll need to
  add a specific workaround for that.
2018-11-03 22:09:18 -04:00
# We make the pipes use the first available fd numbers starting at startfd.
# This makes it easier to differentiate different kinds of pipes when using
# strace.
def _make_pipe(startfd):
    (a, b) = os.pipe()
    fds = (fcntl.fcntl(a, fcntl.F_DUPFD, startfd),
           fcntl.fcntl(b, fcntl.F_DUPFD, startfd + 1))
    os.close(a)
    os.close(b)
    return fds


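# Illustration only: a standalone Python 3 sketch of the F_DUPFD trick used by
# _make_pipe(). The name make_pipe is hypothetical. It assumes the process's
# open-file limit is comfortably above startfd; F_DUPFD returns the lowest
# free descriptor that is greater than or equal to its argument.

```python
import fcntl
import os


def make_pipe(startfd):
    """Create a pipe whose two ends are numbered startfd or higher."""
    a, b = os.pipe()
    fds = (fcntl.fcntl(a, fcntl.F_DUPFD, startfd),
           fcntl.fcntl(b, fcntl.F_DUPFD, startfd + 1))
    # The low-numbered originals are no longer needed.
    os.close(a)
    os.close(b)
    return fds


rfd, wfd = make_pipe(100)
os.write(wfd, b'x')
data = os.read(rfd, 1)
```

# The relocated descriptors still refer to the same pipe, so a byte written to
# wfd can be read back from rfd.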
def _try_read(fd, n):
    """Try to read n bytes from fd. Returns: '' on EOF, None if EAGAIN."""
    assert state.is_flushed()

    # using djb's suggested way of doing non-blocking reads from a blocking
    # socket: http://cr.yp.to/unix/nonblock.html
    # We can't just make the socket non-blocking, because we want to be
    # compatible with GNU Make, and they can't handle it.
    r, w, x = select.select([fd], [], [], 0)
    if not r:
        return None  # try again
    # ok, the socket is readable - but some other process might get there
    # first. We have to set an alarm() in case our read() gets stuck.
    oldh = signal.signal(signal.SIGALRM, _timeout)
    try:
        signal.setitimer(signal.ITIMER_REAL, 0.01, 0.01)  # emergency fallback
        try:
            # Note: this reads a single byte per call, regardless of n.
            b = os.read(fd, 1)
        except OSError, e:
            if e.errno in (errno.EAGAIN, errno.EINTR):
                # interrupted or it was nonblocking
                return None  # try again
            else:
                raise
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0, 0)
        signal.signal(signal.SIGALRM, oldh)
    return b


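# Illustration only: a Python 3 sketch of just the first half of the approach
# in _try_read(), namely polling with a zero-timeout select() before reading.
# The name poll_read_byte is hypothetical. The sketch deliberately omits the
# SIGALRM fallback, so with several competing readers the read() could still
# block. (On Python 3.5 and later, os.read() is also retried automatically
# after EINTR when the signal handler returns normally, so a do-nothing
# handler like _timeout would not interrupt it there.)

```python
import os
import select


def poll_read_byte(fd):
    """Return one byte, b'' on EOF, or None if nothing is readable yet."""
    r, _, _ = select.select([fd], [], [], 0)
    if not r:
        return None  # nothing available right now; caller may try again
    # Readable, or at EOF. Another reader could still win the race here.
    return os.read(fd, 1)


rfd, wfd = os.pipe()
empty = poll_read_byte(rfd)  # no data has been written yet
os.write(wfd, b't')
got = poll_read_byte(rfd)    # one token is available
os.close(wfd)
eof = poll_read_byte(rfd)    # write end closed: select reports EOF
```

# The three results correspond to the three outcomes the docstring of
# _try_read() distinguishes: try again, data, and EOF.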
def _try_read_all(fd, n):
    bb = ''
    while 1:
        b = _try_read(fd, n)
        if not b:
            break
        bb += b
    return bb


def setup(maxjobs):
    global _tokenfds, _cheatfds, _toplevel
    assert maxjobs > 0
    assert not _tokenfds
    _debug('setup(%d)\n' % maxjobs)

    flags = ' ' + os.getenv('MAKEFLAGS', '') + ' '
    FIND1 = ' --jobserver-auth='  # renamed in GNU make 4.2
    FIND2 = ' --jobserver-fds='  # fallback syntax
    FIND = FIND1
    ofs = flags.find(FIND1)
    if ofs < 0:
        FIND = FIND2
        ofs = flags.find(FIND2)
    if ofs >= 0:
        s = flags[ofs+len(FIND):]
        (arg, junk) = s.split(' ', 1)
        (a, b) = arg.split(',', 1)
        a = atoi(a)
        b = atoi(b)
        if a <= 0 or b <= 0:
            raise ValueError('invalid --jobserver-auth: %r' % arg)
        try:
            fcntl.fcntl(a, fcntl.F_GETFL)
            fcntl.fcntl(b, fcntl.F_GETFL)
        except IOError, e:
            if e.errno == errno.EBADF:
                raise ValueError('broken --jobserver-auth from make; ' +
                                 'prefix your Makefile rule with a "+"')
            else:
                raise
        _tokenfds = (a, b)

    cheats = os.getenv('REDO_CHEATFDS', '')
    if cheats:
        (a, b) = cheats.split(',', 1)
        a = atoi(a)
        b = atoi(b)
        if a <= 0 or b <= 0:
            raise ValueError('invalid REDO_CHEATFDS: %r' % cheats)
        _cheatfds = (a, b)
    else:
        _cheatfds = _make_pipe(102)
        os.putenv('REDO_CHEATFDS', '%d,%d' % (_cheatfds[0], _cheatfds[1]))

    if not _tokenfds:
        # need to start a new server
        _toplevel = maxjobs
        _tokenfds = _make_pipe(100)
        _create_tokens(maxjobs - 1)
        _release_except_mine()
        os.putenv('MAKEFLAGS',
                  '%s -j --jobserver-auth=%d,%d --jobserver-fds=%d,%d' %
                  (os.getenv('MAKEFLAGS', ''),
                   _tokenfds[0], _tokenfds[1],
                   _tokenfds[0], _tokenfds[1]))


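# Illustration only: a Python 3 sketch of the MAKEFLAGS lookup that setup()
# performs. The function name find_jobserver_fds is hypothetical, int() stands
# in for the project's atoi helper, and the sketch assumes the numeric
# "R,W" file descriptor form of the option; any other form is not handled.

```python
def find_jobserver_fds(makeflags):
    """Return (read_fd, write_fd) from a MAKEFLAGS string, or None."""
    # Pad with spaces so the option is matched only as a whole word.
    flags = ' ' + makeflags + ' '
    # Try the newer option name first, then the older fallback.
    for key in (' --jobserver-auth=', ' --jobserver-fds='):
        ofs = flags.find(key)
        if ofs >= 0:
            arg = flags[ofs + len(key):].split(' ', 1)[0]
            a, b = arg.split(',', 1)
            return int(a), int(b)
    return None


new_style = find_jobserver_fds('-j --jobserver-auth=100,101')
old_style = find_jobserver_fds('-j --jobserver-fds=3,4 -k')
absent = find_jobserver_fds('-k')
```

# Unlike setup(), this sketch does not check that the descriptors are actually
# open (the fcntl F_GETFL probe), which is what detects a Makefile rule that
# lacks the "+" prefix.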
redo-log: prioritize the "foreground" process.
When running a parallel build, redo-log -f (which is auto-started by
redo) tries to traverse through the logs depth first, in the order
parent processes started subprocesses. This works pretty well, but if
its dependencies are locked, a process might have to give up its
jobserver token while other stuff builds its dependencies. After the
dependency finishes, the parent might not be able to get a token for
quite some time, and the logs will appear to stop.
To prevent this from happening, we can instantiate up to one "cheater"
token, only in the foreground process (the one locked by redo-log -f),
which will allow it to continue running, albeit a bit slowly (since it
only has one token out of possibly many). When the process finishes,
we then destroy the fake token. It gets a little complicated; see
explanation at the top of jwack.py.
2018-11-17 04:32:09 -05:00
|
|
|
def _wait(want_token, max_delay):
|
2010-11-12 20:08:38 -08:00
|
|
|
rfds = _waitfds.keys()
|
redo-log: prioritize the "foreground" process.
When running a parallel build, redo-log -f (which is auto-started by
redo) tries to traverse through the logs depth first, in the order
parent processes started subprocesses. This works pretty well, but if
its dependencies are locked, a process might have to give up its
jobserver token while other stuff builds its dependencies. After the
dependency finishes, the parent might not be able to get a token for
quite some time, and the logs will appear to stop.
To prevent this from happening, we can instantiate up to one "cheater"
token, only in the foreground process (the one locked by redo-log -f),
which will allow it to continue running, albeit a bit slowly (since it
only has one token out of possibly many). When the process finishes,
we then destroy the fake token. It gets a little complicated; see
explanation at the top of jwack.py.
2018-11-17 04:32:09 -05:00
|
|
|
if want_token:
|
|
|
|
|
rfds.append(_tokenfds[0])
|
|
|
|
|
assert rfds
|
|
|
|
|
assert state.is_flushed()
|
2018-12-02 23:15:37 -05:00
|
|
|
r, w, x = select.select(rfds, [], [], max_delay)
|
2018-11-17 04:32:09 -05:00
|
|
|
_debug('_tokenfds=%r; wfds=%r; readable: %r\n' % (_tokenfds, _waitfds, r))
|
2010-11-12 20:08:38 -08:00
|
|
|
for fd in r:
|
2018-11-17 04:32:09 -05:00
|
|
|
if fd == _tokenfds[0]:
|
2010-11-12 20:08:38 -08:00
|
|
|
pass
|
|
|
|
|
else:
|
2010-11-13 04:36:44 -08:00
|
|
|
pd = _waitfds[fd]
|
|
|
|
|
_debug("done: %r\n" % pd.name)
|
2018-11-17 04:32:09 -05:00
|
|
|
# redo subprocesses are expected to die without releasing their
|
|
|
|
|
# tokens, so things are less likely to get confused if they
|
|
|
|
|
# die abnormally. That means a token has 'disappeared' and we
|
|
|
|
|
# now need to recreate it.
|
|
|
|
|
b = _try_read(_cheatfds[0], 1)
|
|
|
|
|
_debug('GOT cheatfd\n')
|
2018-12-02 23:15:37 -05:00
|
|
|
if b is None:
|
2018-11-17 04:32:09 -05:00
|
|
|
_create_tokens(1)
|
|
|
|
|
if has_token():
|
|
|
|
|
_release_except_mine()
|
|
|
|
|
else:
|
|
|
|
|
# someone exited with _cheats > 0, so we need to compensate
|
|
|
|
|
# by *not* re-creating a token now.
|
|
|
|
|
pass
|
2010-11-13 04:36:44 -08:00
|
|
|
os.close(fd)
|
|
|
|
|
del _waitfds[fd]
|
|
|
|
|
rv = os.waitpid(pd.pid, 0)
|
2018-11-17 04:32:09 -05:00
|
|
|
assert rv[0] == pd.pid
|
2010-11-21 07:09:47 -08:00
|
|
|
_debug("done1: rv=%r\n" % (rv,))
|
2010-11-13 04:36:44 -08:00
|
|
|
rv = rv[1]
|
|
|
|
|
if os.WIFEXITED(rv):
|
|
|
|
|
pd.rv = os.WEXITSTATUS(rv)
|
2010-11-12 20:08:38 -08:00
|
|
|
else:
|
2010-11-13 04:36:44 -08:00
|
|
|
pd.rv = -os.WTERMSIG(rv)
|
2010-11-21 07:09:47 -08:00
|
|
|
_debug("done2: rv=%d\n" % pd.rv)
|
2010-11-19 06:04:45 -08:00
|
|
|
pd.donefunc(pd.name, pd.rv)
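The tail of `_wait()` turns the raw `os.waitpid()` status into a single integer: the exit code for a normal exit, or the negated signal number otherwise. A minimal standalone sketch of that convention follows; `decode_wait_status` and `run_child` are hypothetical helper names that do not appear in jwack.py, and the sketch assumes a POSIX system with `os.fork()`.

```python
import os
import signal


def decode_wait_status(status):
    """Map a raw waitpid() status to an exit code, or -signal if killed."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def run_child(action):
    """Fork, run action() in the child, and return the decoded status."""
    pid = os.fork()
    if pid == 0:
        # Child: never fall back into the parent's code path.
        try:
            action()
        finally:
            os._exit(0)
    _, status = os.waitpid(pid, 0)
    return decode_wait_status(status)
```

For instance, `run_child(lambda: os._exit(3))` yields the child's exit code, while a child that kills itself with `signal.SIGKILL` yields a negative value.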
|
2010-11-13 04:36:44 -08:00
|
|
|
|
|
|
|
|
|
2010-12-09 05:53:30 -08:00
|
|
|
def has_token():
|
2018-11-17 04:32:09 -05:00
|
|
|
assert _mytokens >= 0
|
2010-12-09 05:53:30 -08:00
|
|
|
if _mytokens >= 1:
|
|
|
|
|
return True
|
|
|
|
|
|
|
|
|
|
|
2018-11-17 04:32:09 -05:00
|
|
|
def ensure_token(reason, max_delay=None):
|
2010-11-13 04:36:44 -08:00
|
|
|
global _mytokens
|
2018-11-17 04:32:09 -05:00
|
|
|
assert state.is_flushed()
|
|
|
|
|
assert _mytokens <= 1
|
2010-11-12 20:08:38 -08:00
|
|
|
while 1:
|
2010-11-13 04:36:44 -08:00
|
|
|
if _mytokens >= 1:
|
2010-11-21 22:46:20 -08:00
|
|
|
_debug("_mytokens is %d\n" % _mytokens)
|
2018-11-17 04:32:09 -05:00
|
|
|
assert _mytokens == 1
|
2010-11-13 04:36:44 -08:00
|
|
|
_debug('(%r) used my own token...\n' % reason)
|
2010-11-21 22:46:20 -08:00
|
|
|
break
|
2018-11-17 04:32:09 -05:00
|
|
|
assert _mytokens < 1
|
2010-11-13 04:36:44 -08:00
|
|
|
_debug('(%r) waiting for tokens...\n' % reason)
|
2018-11-17 04:32:09 -05:00
|
|
|
_wait(want_token=1, max_delay=max_delay)
|
2010-11-21 22:46:20 -08:00
|
|
|
if _mytokens >= 1:
|
|
|
|
|
break
|
2018-11-17 04:32:09 -05:00
|
|
|
assert _mytokens < 1
|
|
|
|
|
b = _try_read(_tokenfds[0], 1)
|
|
|
|
|
_debug('GOT tokenfd\n')
|
|
|
|
|
if b == '':
|
|
|
|
|
raise Exception('unexpected EOF on token read')
|
|
|
|
|
if b:
|
|
|
|
|
_mytokens += 1
|
|
|
|
|
_debug('(%r) got a token (%r).\n' % (reason, b))
|
|
|
|
|
break
|
|
|
|
|
if max_delay is not None:
|
|
|
|
|
break
|
|
|
|
|
assert _mytokens <= 1
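The core of `ensure_token()` is "wait until the token pipe is readable, then try a non-blocking one-byte read". The sketch below shows that idea in isolation, in Python 3. `TokenPipe` is a hypothetical class invented for illustration, and its non-blocking read only approximates whatever `_try_read()` does in the real module, so treat it as an assumption rather than the actual implementation.

```python
import os
import select


class TokenPipe(object):
    """A pipe preloaded with n one-byte tokens (illustrative only)."""

    def __init__(self, n):
        self.r, self.w = os.pipe()
        os.set_blocking(self.r, False)
        os.write(self.w, b't' * n)

    def try_acquire(self, max_delay=None):
        """Return True if a token was obtained within max_delay seconds."""
        readable, _, _ = select.select([self.r], [], [], max_delay)
        if not readable:
            return False
        try:
            b = os.read(self.r, 1)
        except BlockingIOError:
            # Another reader grabbed the token between select() and read().
            return False
        if b == b'':
            raise Exception('unexpected EOF on token read')
        return True

    def release(self):
        """Put one token back so another process can take it."""
        os.write(self.w, b't')
```

The race handled in `try_acquire()` is why a readable pipe does not guarantee a token: several processes can wake up from `select()` for the same byte, and only one read succeeds.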
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def ensure_token_or_cheat(reason, cheatfunc):
|
|
|
|
|
global _mytokens, _cheats
|
|
|
|
|
backoff = 0.01
|
|
|
|
|
while not has_token():
|
|
|
|
|
while running() and not has_token():
|
|
|
|
|
# If we already have a subproc running, then effectively we
|
|
|
|
|
# already have a token. Don't create a cheater token unless
|
|
|
|
|
# we're completely idle.
|
|
|
|
|
ensure_token(reason, max_delay=None)
|
|
|
|
|
ensure_token(reason, max_delay=min(1.0, backoff))
|
|
|
|
|
backoff *= 2
|
|
|
|
|
if not has_token():
|
|
|
|
|
assert _mytokens == 0
|
|
|
|
|
n = cheatfunc()
|
|
|
|
|
_debug('%s: %s: cheat = %d\n' % (vars.TARGET, reason, n))
|
|
|
|
|
if n > 0:
|
|
|
|
|
_mytokens += n
|
|
|
|
|
_cheats += n
|
2010-11-12 20:08:38 -08:00
|
|
|
break
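Each pass of `ensure_token_or_cheat()` waits `min(1.0, backoff)` seconds for a real token and then doubles `backoff`, starting from 0.01. The sequence of waits can be sketched on its own; `backoff_delays` is a hypothetical helper written for this illustration, not a function in jwack.py.

```python
def backoff_delays(count, start=0.01, cap=1.0):
    """Return the first `count` wait times: doubling from start, capped."""
    delays = []
    backoff = start
    for _ in range(count):
        delays.append(min(cap, backoff))
        backoff *= 2
    return delays
```

With the defaults, the wait reaches the one-second cap on the eighth attempt (0.01 * 2**7 = 1.28), so a process that keeps failing to cheat retries once per second from then on.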
|
|
|
|
|
|
|
|
|
|
|
2010-11-19 06:04:45 -08:00
|
|
|
def running():
|
|
|
|
|
return len(_waitfds)
|
|
|
|
|
|
|
|
|
|
|
2010-11-12 20:08:38 -08:00
|
|
|
def wait_all():
|
2018-11-17 04:32:09 -05:00
|
|
|
_debug("%d,%d -> wait_all\n" % (_mytokens, _cheats))
|
|
|
|
|
assert state.is_flushed()
|
|
|
|
|
while 1:
|
2010-12-10 23:04:46 -08:00
|
|
|
while _mytokens >= 1:
|
|
|
|
|
release_mine()
|
2018-11-17 04:32:09 -05:00
|
|
|
if not running():
|
|
|
|
|
break
|
2010-11-13 04:36:44 -08:00
|
|
|
_debug("wait_all: wait()\n")
|
2018-11-17 04:32:09 -05:00
|
|
|
_wait(want_token=0, max_delay=None)
|
2010-11-13 04:36:44 -08:00
|
|
|
_debug("wait_all: empty list\n")
|
|
|
|
|
if _toplevel:
|
2018-11-17 04:32:09 -05:00
|
|
|
# If we're the toplevel and we're sure no child processes remain,
|
|
|
|
|
# then we know we're totally idle. Self-test to ensure no tokens
|
|
|
|
|
# mysteriously got created/destroyed.
|
|
|
|
|
tokens = _try_read_all(_tokenfds[0], 8192)
|
|
|
|
|
cheats = _try_read_all(_cheatfds[0], 8192)
|
|
|
|
|
_debug('toplevel: GOT %d tokens and %d cheats\n'
|
|
|
|
|
% (len(tokens), len(cheats)))
|
|
|
|
|
if len(tokens) - len(cheats) != _toplevel:
|
|
|
|
|
raise Exception('on exit: expected %d tokens; found %r-%r'
|
|
|
|
|
% (_toplevel, len(tokens), len(cheats)))
|
|
|
|
|
os.write(_tokenfds[1], tokens)
|
|
|
|
|
# note: when we return, we have *no* tokens, not even our own!
|
|
|
|
|
# If caller wants to continue, they have to obtain one right away.
|
2010-11-12 20:08:38 -08:00
|
|
|
|
|
|
|
|
|
|
|
|
|
def force_return_tokens():
|
2010-11-13 04:36:44 -08:00
|
|
|
n = len(_waitfds)
|
2018-11-17 04:32:09 -05:00
|
|
|
_debug('%d,%d -> %d jobs left in force_return_tokens\n'
|
|
|
|
|
% (_mytokens, _cheats, n))
|
2018-12-02 23:15:37 -05:00
|
|
|
for k in list(_waitfds):
|
2010-11-13 04:36:44 -08:00
|
|
|
del _waitfds[k]
|
2018-11-17 04:32:09 -05:00
|
|
|
_create_tokens(n)
|
|
|
|
|
if has_token():
|
|
|
|
|
_release_except_mine()
|
|
|
|
|
assert _mytokens == 1, 'mytokens=%d' % _mytokens
|
|
|
|
|
assert _cheats <= _mytokens, 'mytokens=%d cheats=%d' % (_mytokens, _cheats)
|
|
|
|
|
assert _cheats in (0, 1), 'cheats=%d' % _cheats
|
|
|
|
|
if _cheats:
|
|
|
|
|
_debug('%d,%d -> force_return_tokens: recovering final token\n'
|
|
|
|
|
% (_mytokens, _cheats))
|
|
|
|
|
_destroy_tokens(_cheats)
|
|
|
|
|
os.write(_cheatfds[1], 't' * _cheats)
|
|
|
|
|
assert state.is_flushed()
|
2010-11-12 20:08:38 -08:00
|
|
|
|
|
|
|
|
|
2010-11-13 04:36:44 -08:00
|
|
|
def _pre_job(r, w, pfn):
|
2010-11-12 20:08:38 -08:00
|
|
|
os.close(r)
|
2010-11-13 04:36:44 -08:00
|
|
|
if pfn:
|
|
|
|
|
pfn()
|
2010-11-12 20:08:38 -08:00
|
|
|
|
2010-11-13 04:36:44 -08:00
|
|
|
|
2018-12-02 23:15:37 -05:00
|
|
|
class Job(object):
|
2010-11-19 06:04:45 -08:00
|
|
|
def __init__(self, name, pid, donefunc):
|
2010-11-13 04:36:44 -08:00
|
|
|
self.name = name
|
|
|
|
|
self.pid = pid
|
|
|
|
|
self.rv = None
|
2010-11-19 06:04:45 -08:00
|
|
|
self.donefunc = donefunc
|
2018-12-02 23:15:37 -05:00
|
|
|
|
2010-11-19 06:04:45 -08:00
|
|
|
def __repr__(self):
|
|
|
|
|
return 'Job(%s,%d)' % (self.name, self.pid)
|
2010-11-13 04:36:44 -08:00
|
|
|
|
2018-12-02 23:15:37 -05:00
|
|
|
|
2010-11-22 00:03:43 -08:00
|
|
|
def start_job(reason, jobfunc, donefunc):
|
2018-11-17 04:32:09 -05:00
|
|
|
assert state.is_flushed()
|
|
|
|
|
assert _mytokens <= 1
|
|
|
|
|
assert _mytokens == 1
|
|
|
|
|
# Subprocesses always start with 1 token, so we have to destroy ours
|
|
|
|
|
# in order for the universe to stay in balance.
|
|
|
|
|
_destroy_tokens(1)
|
2018-12-02 23:15:37 -05:00
|
|
|
r, w = _make_pipe(50)
|
2010-11-13 04:36:44 -08:00
|
|
|
pid = os.fork()
|
|
|
|
|
if pid == 0:
|
|
|
|
|
# child
|
|
|
|
|
os.close(r)
|
2010-11-19 07:09:26 -08:00
|
|
|
rv = 201
|
2010-11-13 04:36:44 -08:00
|
|
|
try:
|
|
|
|
|
try:
|
2010-11-19 07:09:26 -08:00
|
|
|
rv = jobfunc() or 0
|
2018-12-02 23:15:37 -05:00
|
|
|
_debug('jobfunc completed (%r, %r)\n' % (jobfunc, rv))
|
|
|
|
|
except Exception: # pylint: disable=broad-except
|
2010-11-19 00:54:36 -08:00
|
|
|
import traceback
|
|
|
|
|
traceback.print_exc()
|
2010-11-13 04:36:44 -08:00
|
|
|
finally:
|
2010-11-21 07:09:47 -08:00
|
|
|
_debug('exit: %d\n' % rv)
|
2010-11-19 07:09:26 -08:00
|
|
|
os._exit(rv)
|
2010-12-11 18:24:10 -08:00
|
|
|
close_on_exec(r, True)
|
2010-11-12 20:08:38 -08:00
|
|
|
os.close(w)
|
2010-11-19 06:04:45 -08:00
|
|
|
pd = Job(reason, pid, donefunc)
|
2010-11-13 04:36:44 -08:00
|
|
|
_waitfds[r] = pd
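`start_job()` gives each child the write end of a private pipe and keeps the read end in `_waitfds`. When the child exits, its write end closes, the read end hits EOF, and `select()` in `_wait()` reports it as readable. The sketch below reproduces only that completion signal; `spawn` and `wait_done` are hypothetical names, plain `os.pipe()` stands in for `_make_pipe(50)`, and the token bookkeeping is left out.

```python
import os
import select


def spawn(jobfunc):
    """Fork a child that runs jobfunc(); return (pid, read_fd)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end; it closes when we exit.
        os.close(r)
        rv = 201
        try:
            rv = jobfunc() or 0
        finally:
            os._exit(rv)
    # Parent: drop the write end so EOF arrives once the child is gone.
    os.close(w)
    return pid, r


def wait_done(pid, r, timeout=None):
    """Block until the child's pipe hits EOF, then reap it."""
    readable, _, _ = select.select([r], [], [], timeout)
    if not readable:
        return None
    os.close(r)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Because the signal is an ordinary file descriptor, the parent can wait for job completion and for a jobserver token in the same `select()` call, which is what `_wait(want_token=1, ...)` relies on.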
|