redo-log: prioritize the "foreground" process.

When running a parallel build, redo-log -f (which is auto-started by
redo) tries to traverse the logs depth first, in the order in which
parent processes started their subprocesses.  This works pretty well,
but if a process's dependencies are locked, it might have to give up
its jobserver token while other processes build those dependencies.
After a dependency finishes, the parent might not be able to get a
token back for quite some time, and the logs will appear to stop.

To prevent this from happening, we can instantiate up to one "cheater"
token, only in the foreground process (the one locked by redo-log -f),
which will allow it to continue running, albeit a bit slowly (since it
only has one token out of possibly many).  When the process finishes,
we then destroy the fake token.  It gets a little complicated; see
explanation at the top of jwack.py.
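
As a rough illustration of the "cheater" token idea described above, here is a
toy sketch. It is not the code in jwack.py: the TokenPool class, its method
names, and the is_foreground flag are all hypothetical, and the real
jobserver passes tokens between processes rather than counting them in
one object. The sketch only shows the accounting rule: at most one fake
token exists, only the foreground process can get it, and it is
destroyed rather than returned to the pool.

```python
class TokenPool(object):
    """Toy jobserver: some real tokens plus at most one fake "cheater"."""

    def __init__(self, real_tokens):
        self.available = real_tokens  # real tokens currently free
        self.cheats = 0               # fake tokens in existence (0 or 1)

    def acquire(self, is_foreground=False):
        """Return 'real', 'cheat', or None if the caller has to wait."""
        if self.available > 0:
            self.available -= 1
            return 'real'
        if is_foreground and self.cheats == 0:
            # No real token is free, but the log follower is waiting on
            # this process, so mint a single fake token to keep it going.
            self.cheats = 1
            return 'cheat'
        return None

    def release(self, kind):
        """Give back a real token, or destroy the fake one."""
        if kind == 'real':
            self.available += 1
        elif kind == 'cheat':
            # The fake token never goes back into the pool.
            self.cheats = 0
        else:
            raise ValueError('unknown token kind: %r' % (kind,))


pool = TokenPool(real_tokens=1)
other = pool.acquire()                 # a background job takes the real token
fg = pool.acquire(is_foreground=True)  # the foreground process cheats
bg = pool.acquire()                    # anyone else still has to wait
pool.release(fg)
pool.release(other)
```

With a single real token, the foreground process still makes progress
(slowly) while every other process waits as usual, and the pool ends up
with exactly the number of tokens it started with.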
Avery Pennarun 2018-11-17 04:32:09 -05:00
commit 8b5a567b2e
7 changed files with 348 additions and 104 deletions

redo.py

@@ -1,5 +1,5 @@
 #!/usr/bin/env python2
-import sys, os
+import sys, os, traceback
 import options
 from helpers import atoi
@@ -54,7 +54,8 @@ from logs import warn, err
 try:
     if vars_init.is_toplevel:
-        builder.start_stdin_log_reader(status=opt.status, details=opt.details)
+        builder.start_stdin_log_reader(status=opt.status, details=opt.details,
+            debug_locks=opt.debug_locks, debug_pids=opt.debug_pids)
     for t in targets:
         if os.path.exists(t):
             f = state.File(name=t)
@@ -75,7 +76,12 @@ try:
         try:
             state.rollback()
         finally:
-            jwack.force_return_tokens()
+            try:
+                jwack.force_return_tokens()
+            except Exception, e:
+                traceback.print_exc(100, sys.stderr)
+                err('unexpected error: %r\n' % e)
+                retcode = 1
     if vars_init.is_toplevel:
         builder.await_log_reader()
     sys.exit(retcode)
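
The last hunk wraps the token cleanup so that a failure there is reported
and turned into a nonzero exit code, instead of escaping from the finally
block. A minimal standalone sketch of that pattern follows. The run()
function and its three callback parameters are hypothetical stand-ins for
the build, state.rollback(), and jwack.force_return_tokens(). It uses the
`except Exception as e` spelling, which works in Python 2.6+ and Python 3,
where the diff uses the Python 2 only `except Exception, e` form.

```python
import sys
import traceback


def run(build, rollback, return_tokens):
    """Run build(), always clean up, and return the exit code."""
    retcode = 1
    try:
        retcode = build()
    finally:
        try:
            rollback()
        finally:
            try:
                return_tokens()
            except Exception as e:
                # Report the cleanup failure, but keep going so the
                # caller can still shut down in an orderly way.
                traceback.print_exc(100, sys.stderr)
                sys.stderr.write('unexpected error: %r\n' % (e,))
                retcode = 1
    return retcode


def ok():
    return 0


def noop():
    pass


def broken():
    raise RuntimeError('token count mismatch')


clean_exit = run(ok, noop, noop)      # cleanup succeeds
failed_exit = run(ok, noop, broken)   # cleanup fails, build had succeeded
```

Because the exception is caught inside the innermost try, the code after
the cleanup (in the diff, waiting for the log reader and calling
sys.exit) still runs even when returning tokens fails.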