redo-log: capture and linearize the output of redo builds.

redo now saves the stderr from every .do script, for every target, into
a file in the .redo directory.  That means you can look up the logs
from the most recent build of any target using the new redo-log
command, for example:

	redo-log -r all

The default is to show logs non-recursively, that is, it'll show when a
target does redo-ifchange on another target, but it won't recurse into
the logs for the latter target.  With -r (recursive), it does recurse.
With -u (unchanged), it recurses even if redo-ifchange discovered that
the target was already up-to-date; in that case, it prints the logs
from the *most recent* time the target was generated.

With --no-details, redo-log will show only the 'redo' lines, not the
other log messages.  For very noisy build systems (like recursing into
a 'make' instance) this can be helpful to get an overview of what
happened, without all the cruft.

You can use the -f (follow) option like tail -f, to follow a build
that's currently in progress until it finishes.  redo itself spins up a
copy of redo-log -r -f while it runs, so you can see what's going on.

Still broken in this version:

- No man page or new tests yet.

- ANSI colors don't yet work (unless you use --raw-logs, which gives
  the old-style behaviour).

- You can't redirect the output of a sub-redo to a file or a
  pipe right now, because redo-log is eating it.

- The regex for matching 'redo' lines in the log is very gross.
  Instead, we should put the raw log files in a more machine-parseable
  format, and redo-log should turn that into human-readable format.

- redo-log tries to "linearize" the logs, which makes them
  comprehensible even for a large parallel build.  It recursively shows
  log messages for each target in depth-first tree order (by tracing
  into a new target every time it sees a 'redo' line).  This works
  really well, but in some specific cases, the "topmost" redo instance
  can get stuck waiting for a jwack token, which makes it look like the
  whole build has stalled, when really redo-log is just waiting a long
  time for a particular subprocess to be able to continue.  We'll need to
  add a specific workaround for that.
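
The depth-first linearization described in the last item can be
sketched roughly as follows.  This is an illustration only, not
redo-log's code: the in-memory `logs` dict, the "redo  <target>" line
format, the `linearize` helper, and the choice to skip targets that
were already shown are all assumptions made up for the example.

```python
import re

# Hypothetical marker format: a line such as "redo  sub/target" means the
# current target started building sub/target. The real log format differs.
REDO_LINE = re.compile(r'^redo\s+(\S+)$')


def linearize(logs, target, depth=0, seen=None):
    """Yield log lines for `target` in depth-first tree order.

    `logs` maps a target name to the list of lines its .do script logged.
    """
    if seen is None:
        seen = set()
    seen.add(target)
    for line in logs.get(target, []):
        yield '  ' * depth + line
        m = REDO_LINE.match(line)
        if m and m.group(1) not in seen:
            # Trace into the sub-target before continuing with the parent.
            for sub in linearize(logs, m.group(1), depth + 1, seen):
                yield sub


logs = {
    'all': ['redo  a', 'redo  b', 'all done'],
    'a': ['compiling a'],
    'b': ['redo  a', 'compiling b'],
}
output = list(linearize(logs, 'all'))
```

Even if 'a' and 'b' were built in parallel and their messages were
interleaved in time, each target's lines come out grouped under the
'redo' line that started it.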
Avery Pennarun 2018-11-03 22:09:18 -04:00
commit b2411fe483
10 changed files with 315 additions and 23 deletions

@@ -174,7 +174,7 @@ class File(object):
     # use this mostly to avoid accidentally assigning to typos
     __slots__ = ['id'] + _file_cols[1:]
-    def _init_from_idname(self, id, name):
+    def _init_from_idname(self, id, name, allow_add):
         q = ('select %s from Files ' % join(', ', _file_cols))
         if id != None:
             q += 'where rowid=?'
@@ -189,7 +189,9 @@ class File(object):
         row = d.execute(q, l).fetchone()
         if not row:
             if not name:
-                raise Exception('No file with id=%r name=%r' % (id, name))
+                raise KeyError('No file with id=%r name=%r' % (id, name))
+            elif not allow_add:
+                raise KeyError('No file with name=%r' % (name,))
             try:
                 _write('insert into Files (name) values (?)', [name])
             except sqlite3.IntegrityError:
@@ -207,17 +209,17 @@ class File(object):
         if self.name == ALWAYS and self.changed_runid < vars.RUNID:
             self.changed_runid = vars.RUNID
-    def __init__(self, id=None, name=None, cols=None):
+    def __init__(self, id=None, name=None, cols=None, allow_add=True):
         if cols:
             return self._init_from_cols(cols)
         else:
-            return self._init_from_idname(id, name)
+            return self._init_from_idname(id, name, allow_add=allow_add)
     def __repr__(self):
         return "File(%r)" % (self.nicename(),)
     def refresh(self):
-        self._init_from_idname(self.id, None)
+        self._init_from_idname(self.id, None, allow_add=False)
     def save(self):
         cols = join(', ', ['%s=?'%i for i in _file_cols[2:]])
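
The effect of the new allow_add parameter can be illustrated with a stand-alone sketch. It is not the code from this commit: the `lookup_id` helper and the single-column in-memory `Files` table are simplifications assumed for the example.

```python
import sqlite3


def lookup_id(db, name, allow_add=True):
    """Return the rowid for `name`, inserting a row only if allow_add is set."""
    row = db.execute('select rowid from Files where name=?', [name]).fetchone()
    if row:
        return row[0]
    if not allow_add:
        # Read-only callers get a KeyError instead of a new row.
        raise KeyError('No file with name=%r' % (name,))
    cur = db.execute('insert into Files (name) values (?)', [name])
    return cur.lastrowid


# Hypothetical stand-in for the real state database.
db = sqlite3.connect(':memory:')
db.execute('create table Files (name text unique)')

first = lookup_id(db, 'all')                   # not present yet, so it is added
again = lookup_id(db, 'all', allow_add=False)  # now found without adding
try:
    lookup_id(db, 'missing', allow_add=False)
    missing_raised = False
except KeyError:
    missing_raised = True
```

A caller that only reads the database can pass allow_add=False so that looking up an unknown name fails with KeyError instead of silently creating a row.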
@@ -324,6 +326,11 @@ def files():
         yield File(cols=cols)
+def logname(fid):
+    """Given the id of a File, return the filename of its build log."""
+    return os.path.join(vars.BASE, '.redo', 'log.%d' % fid)
 # FIXME: I really want to use fcntl F_SETLK, F_SETLKW, etc here.  But python
 # doesn't do the lockdata structure in a portable way, so we have to use
 # fcntl.lockf() instead.  Usually this is just a wrapper for fcntl, so it's
@@ -365,6 +372,7 @@ class Lock:
                 raise
         else:
             self.owned = True
+        return self.owned
     def waitlock(self):
         self.check()
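
The last hunk makes the non-blocking lock attempt report whether the lock was acquired. A minimal sketch of that pattern follows. The `TryLock` class is a hypothetical, simplified stand-in for the `Lock` class in the diff, and the except clause is assumed, since the hunk shows only its tail. It relies on `fcntl`, so it runs on Unix-like systems only.

```python
import errno
import fcntl
import tempfile


class TryLock(object):
    """Hypothetical, simplified stand-in for the Lock class in the diff."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.owned = False

    def trylock(self):
        try:
            # Non-blocking exclusive lock; raises if another process holds it.
            fcntl.lockf(self.fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except (IOError, OSError) as e:
            if e.errno not in (errno.EAGAIN, errno.EACCES):
                raise
            # Someone else holds the lock; leave owned as False.
        else:
            self.owned = True
        return self.owned

    def unlock(self):
        fcntl.lockf(self.fileobj, fcntl.LOCK_UN)
        self.owned = False


with tempfile.TemporaryFile() as f:
    lock = TryLock(f)
    got_it = lock.trylock()
    if got_it:
        lock.unlock()
```

Returning the flag lets a caller write `if lock.trylock():` instead of calling the method and then inspecting `lock.owned` separately.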