jwack: don't ever set the jobserver socket to O_NONBLOCK.

It creates a race condition: GNU Make might try to read while the socket is
O_NONBLOCK, get EAGAIN, and die; or else another redo might set it back to
blocking in between our call to make it O_NONBLOCK and our call to read().
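
The race exists because O_NONBLOCK is a file status flag kept on the open file description, which is shared by every process that inherited the jobserver descriptor. A minimal Python 3 sketch of that sharing, assuming a plain pipe and using os.dup() as a stand-in for a second process holding the same pipe (the function name is hypothetical, not part of redo):

```python
import fcntl
import os


def nonblock_leaks_to_other_holder():
    """Return True if O_NONBLOCK set via one descriptor shows up on another."""
    r, w = os.pipe()
    # Stand-in for another process (e.g. GNU Make) holding the same pipe:
    # a dup'd descriptor refers to the same open file description.
    other = os.dup(r)
    try:
        flags = fcntl.fcntl(r, fcntl.F_GETFL)
        fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        # Check the flag through the *other* descriptor.
        return bool(fcntl.fcntl(other, fcntl.F_GETFL) & os.O_NONBLOCK)
    finally:
        for fd in (r, w, other):
            os.close(fd)
```

If this returns True, any reader sharing the pipe can see EAGAIN while the flag is set, which is the failure described above.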

This method - setting an alarm() during the read - is hacky, but should work
every time. Unfortunately, in the rare case that another process takes the
token between our select() and our read(), we block for up to 1s until the
alarm fires. The good news is that this only happens when there are no tokens
available anyhow, so it won't affect performance much in any situation I can
imagine.
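
For readers on Python 3, the same select-then-alarm idea might look roughly like the sketch below. This is an illustration under stated assumptions, not the code from this commit: try_read and _ReadTimeout are hypothetical names, and the handler raises an exception because Python 3.5+ (PEP 475) automatically retries os.read() after EINTR when the signal handler returns normally.

```python
import os
import select
import signal


class _ReadTimeout(Exception):
    """Hypothetical marker exception raised by the SIGALRM handler."""


def _timeout(signum, frame):
    # Raising is required on Python 3.5+: a handler that simply returns
    # would cause os.read() to be retried instead of interrupted.
    raise _ReadTimeout()


def try_read(fd):
    """Return b'' for "try again", None for EOF, or the single byte read."""
    readable, _, _ = select.select([fd], [], [], 0)
    if not readable:
        return b''  # nothing to read right now
    # The fd is readable, but another process may take the byte first,
    # so guard the blocking read() with a 1-second alarm.
    old_handler = signal.signal(signal.SIGALRM, _timeout)
    try:
        signal.alarm(1)
        try:
            data = os.read(fd, 1)
        except (_ReadTimeout, BlockingIOError):
            return b''  # timed out, or the fd was nonblocking after all
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
    return data or None  # empty read means EOF
```

Note that signal.signal() only works in the main thread, so a sketch like this assumes a single-threaded caller.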
Avery Pennarun 2010-12-10 04:55:13 -08:00
commit 49ebea445f

@@ -1,7 +1,7 @@
 #
 # beware the jobberwack
 #
-import sys, os, errno, select, fcntl
+import sys, os, errno, select, fcntl, signal
 import atoi
 
 _toplevel = 0
@@ -24,22 +24,35 @@ def _release(n):
         _mytokens = 1
 
 
+def _timeout(sig, frame):
+    pass
+
+
 def _try_read(fd, n):
-    # FIXME: this isn't actually safe, because GNU make can't handle it if
-    # the socket is nonblocking. Ugh. That means we'll have to do their
-    # horrible SIGCHLD hack after all.
-    fcntl.fcntl(_fds[0], fcntl.F_SETFL, os.O_NONBLOCK)
+    # using djb's suggested way of doing non-blocking reads from a blocking
+    # socket: http://cr.yp.to/unix/nonblock.html
+    # We can't just make the socket non-blocking, because we want to be
+    # compatible with GNU Make, and they can't handle it.
+    r,w,x = select.select([fd], [], [], 0)
+    if not r:
+        return '' # try again
+    # ok, the socket is readable - but some other process might get there
+    # first. We have to set an alarm() in case our read() gets stuck.
+    oldh = signal.signal(signal.SIGALRM, _timeout)
     try:
+        signal.alarm(1) # emergency fallback
         try:
             b = os.read(_fds[0], 1)
         except OSError, e:
-            if e.errno == errno.EAGAIN:
-                return ''
+            if e.errno in (errno.EAGAIN, errno.EINTR):
+                # interrupted or it was nonblocking
+                return '' # try again
             else:
                 raise
     finally:
-        fcntl.fcntl(_fds[0], fcntl.F_SETFL, 0)
-    return b and b or None
+        signal.alarm(0)
+        signal.signal(signal.SIGALRM, oldh)
+    return b and b or None # None means EOF
 
 
 def setup(maxjobs):