Directory reorg: move code into redo/, generate binaries in bin/.

It's time to start preparing for a version of redo that doesn't work unless we build it first (because it will rely on C modules, and eventually be rewritten in C altogether).

To get rolling, remove the old-style symlinks to the main programs, and rename those programs from redo-*.py to redo/cmd_*.py. We'll also move all library functions into the redo/ dir, which is a more python-style naming convention.

Previously, install.do was generating wrappers for installing in /usr/bin, which extend sys.path and then import+run the right file. This made "installed" redo work quite differently from running redo inside its source tree. Instead, let's always generate the wrappers in bin/, and not make anything executable except those wrappers.

Since we're generating wrappers anyway, let's actually auto-detect the right version of python for the running system; distros can't seem to agree on what to call their python2 binaries (sigh). We'll fill in the right #! shebang lines. Since we're doing that, we can stop using /usr/bin/env, which will a) make things slightly faster, and b) let us use "python -S", which tells python not to load a bunch of extra crap we're not using, thus improving startup times.

Annoyingly, we now have to build redo using minimal/do, then run the tests using bin/redo. To make this less annoying, we add a toplevel ./do script that knows the right steps, and a Makefile (whee!) for people who are used to typing 'make' and 'make test' and 'make clean'.
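
To make the wrapper idea concrete, here is a minimal Python sketch of what generating one bin/ wrapper could look like. It is not the code from this commit: the candidate interpreter names, the `main()` entry point, and the assumption that the redo/ package sits one level above bin/ are all illustrative guesses. Only the redo/cmd_*.py naming and the `-S` flag come from the message above.

```python
import shutil

# Wrapper text; {python} and {module} are filled in by make_wrapper().
WRAPPER_TEMPLATE = """#!{python} -S
import os, sys
# Assumed layout: the redo/ package sits one level above bin/.
here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(here, '..'))
import {module}
{module}.main()  # main() is a hypothetical entry point
"""


def find_python(candidates=("python2.7", "python2", "python")):
    """Return the absolute path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def make_wrapper(python_path, cmd):
    """Return the text of a wrapper script that runs redo/cmd_<cmd>.py."""
    module = "redo.cmd_" + cmd.replace("-", "_")
    return WRAPPER_TEMPLATE.format(python=python_path, module=module)


if __name__ == "__main__":
    print(make_wrapper(find_python() or "/usr/bin/python", "ifchange"))
```

Writing the absolute interpreter path into the shebang is what makes `-S` usable: on Linux, everything after the interpreter path is passed as a single argument, so `#!/usr/bin/env python -S` would not be split the way one might hope.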
2018-12-03 21:39:15 -05:00
commit f6fe00db5c
parent 5bc7c861b6
140 changed files with 256 additions and 99 deletions
--- a/docs/cookbook/all.do
+++ b/docs/cookbook/all.do
@ -0,0 +1,3 @@
+for d in */all.do; do
+    echo "${d%.do}"
+done | xargs redo-ifchange
--- a/docs/cookbook/clean.do
+++ b/docs/cookbook/clean.do
@ -0,0 +1,3 @@
+for d in */clean.do; do
+    echo "${d%.do}"
+done | xargs redo
--- a/docs/cookbook/defaults/.gitignore
+++ b/docs/cookbook/defaults/.gitignore
@ -0,0 +1,5 @@
+date
+version
+include/version.h
+version.py
+test.txt
--- a/docs/cookbook/defaults/all.do
+++ b/docs/cookbook/defaults/all.do
@ -0,0 +1,2 @@
+redo-ifchange test.txt version.py \
+    test.py include/version.h
--- a/docs/cookbook/defaults/clean.do
+++ b/docs/cookbook/defaults/clean.do
@ -0,0 +1,2 @@
+rm -f date version include/version.h \
+    version.py test.txt *~ .*~
--- a/docs/cookbook/defaults/date.do
+++ b/docs/cookbook/defaults/date.do
@ -0,0 +1,3 @@
+date +'%Y-%m-%d' >$3
+redo-always
+redo-stamp <$3
--- a/docs/cookbook/defaults/default.do
+++ b/docs/cookbook/defaults/default.do
@ -0,0 +1,25 @@
+# $1 is the target name, eg. test.txt
+# $2 is the same as $1.  We'll talk about
+#    that in a later example.
+# $3 is the temporary output file we should
+#    create.  If this script is successful,
+#    redo will atomically replace $1 with $3.
+
+if [ -e "$1.in" ]; then
+    # if a .in file exists, then do some
+    # text substitution.
+    #
+    # Remember, the user asks redo to build
+    # a particular *target* name.  It's the .do
+    # file's job to figure out what source file(s)
+    # to use to generate the target.
+    redo-ifchange "$1.in" version date
+    read VERSION <version
+    read DATE <date
+    sed -e "s/%VERSION%/$VERSION/g" \
+        -e "s/%DATE%/$DATE/g" \
+        <$1.in >$3
+else
+    echo "$0: Fatal: don't know how to build '$1'" >&2
+    exit 99
+fi
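
For readers less comfortable with `sed`, the substitution step in default.do above can be approximated in a few lines of Python. This is only a sketch of the same idea, not part of the commit; the function name `substitute` is made up for illustration.

```python
def substitute(text, variables):
    """Replace every %NAME% in text with variables[NAME]."""
    for name, value in variables.items():
        text = text.replace("%" + name + "%", value)
    return text


template = (
    "This is the documentation for MyProgram version\n"
    "%VERSION%.  It was generated on %DATE%.\n"
)
print(substitute(template, {"VERSION": "1.0", "DATE": "1970-01-01"}), end="")
```

One difference worth keeping in mind: `str.replace` treats the values literally, whereas in the `sed` version a value containing `/`, `&`, or a backslash would be interpreted specially in the replacement text.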
--- a/docs/cookbook/defaults/include/version.h.in
+++ b/docs/cookbook/defaults/include/version.h.in
@ -0,0 +1,8 @@
+// C/C++ header file identifying the current version
+#ifndef __VERSION_H
+#define __VERSION_H
+
+#define VERSION "%VERSION%"
+#define DATE "%DATE%"
+
+#endif // __VERSION_H
--- a/docs/cookbook/defaults/index.md
+++ b/docs/cookbook/defaults/index.md
@ -0,0 +1,322 @@
+### Building a text preprocessor
+
+Let's say we want to process a bunch of files, each in the same way.  An
+easy example is a text preprocessor: we'll look through the text and replace
+variables of the form %VARNAME% with real content.  In our example, we'll
+use variable %VERSION%, representing a version number, and %DATE%, with
+today's date.
+
+To play with this code on your own machine, get the [redo
+source code](https://github.com/apenwarr/redo) and look in the
+`docs/cookbook/defaults/` directory.
+
+### Input files
+
+Let's create some input files we can use.  The input to our
+preprocessor will have file extension `.in`.  The output can be in various
+formats.
+<pre><code src='test.txt.in'></code></pre>
+<pre><code lang='c' src='include/version.h.in'></code></pre>
+<pre><code lang='python' src='version.py.in'></code></pre>
+
+For fun, let's also make a python test program that calls the newly
+generated version.py:
+<pre><code lang='python' src='test.py'></code></pre>
+
+Finally, we need to provide values for the variables.  Let's put each one
+in its own file, named `version` and `date`, respectively.
+
+<pre><b style='text-align: center; display: block'>version</b
+><code>1.0</code></pre>
+
+<pre><b style='text-align: center; display: block'>date</b
+><code>1970-01-01</code></pre>
+
+### default.do files
+
+Now we want to teach redo how to do our substitutions: how to, in general,
+generate file X from file X.in.
+
+We could write a separate .do file for every output file X.  For example, we
+might make `test.txt.do` and `version.py.do`.  But that gets tedious.  To
+make it easier, if there is no specific `X.do` for a target named X, redo
+will try using `default.do` instead.  Let's write a default.do for our text
+preprocessor.
+<pre><code lang='sh' src='default.do'></code></pre>
+
+If `default.do` is asked to build `X`, and there exists a file named `X.in`, we
+use `sed` to do our variable substitutions.  In that case, `default.do` uses
+redo-ifchange to depend on `X.in`, `version`, and `date`.  If a file named
+`X.in` does *not* exist, then we don't know what to do, so we give an error.
+
+On the other hand, if we try to generate a file that *already* exists, like
+`test.py`, redo does not call default.do at all.  redo only tries to create
+files that don't exist, or that were previously generated by redo.  This
+stops redo from accidentally overwriting your work.
+```shell
+$ redo test.txt
+redo  test.txt
+
+$ cat test.txt
+This is the documentation for MyProgram version
+1.0.  It was generated on 1970-01-01.
+
+$ redo chicken
+redo  chicken
+default.do: Fatal: don't know how to build 'chicken'
+redo  chicken (exit 99)
+
+$ redo version.py
+redo  version.py
+
+# test.py was created by us, so it's a "source" file.
+# redo does *not* call default.do to replace it.
+$ redo test.py
+redo: test.py: exists and not marked as generated; not redoing.
+
+$ python test.py
+Version '1.0' has build date '1970-01-01'
+```
+
+Nice!
+
+While we're here, let's make an `all.do` so that we don't have to tell redo
+exactly which files to rebuild, every single time.
+<pre><code lang='sh' src='all.do'></code></pre>
+
+Results:
+```shell
+$ redo
+redo  all
+redo    test.txt
+redo    version.py
+redo    include/version.h
+
+# input files didn't change, so nothing to rebuild
+$ redo
+redo  all
+
+$ touch test.txt.in
+
+$ redo
+redo  all
+redo    test.txt
+```
+
+
+### Auto-generating the version and date (redo-always and redo-stamp)
+
+Of course, in a real project, we won't want to hardcode the version number
+and date into a file.  Ideally, we can get the version number from a version
+control system, like git, and we can use today's date.
+
+To make that happen, we can replace the static `version` and `date` files with
+`version.do` and `date.do`.  `default.do` already uses redo-ifchange to
+depend on `version` and `date`, so redo will create them as needed, and
+if they change, redo will rebuild all the targets that depend on them.
+
+However, the `version` and `date` files are special: they depend on the
+environment outside redo itself.  That is, there's no way to declare a
+dependency on the current date.  We might generate the `date` file once, but
+tomorrow, there's no way for redo to know that its value should change.
+
+To handle this situation, redo has the `redo-always` command.  If we run
+redo-always from a .do file, it means every time someone depends on that
+target, it will be considered out-of-date and need to be rebuilt.  The
+result looks like this:
+```shell
+$ redo 
+redo  all
+redo    test.txt
+redo      version
+redo      date
+redo    version.py
+redo    include/version.h
+
+# version.do and date.do are redo-always, so
+# everything depending on them needs to rebuild
+# every time.
+$ redo
+redo  all
+redo    test.txt
+redo      version
+redo      date
+redo    version.py
+redo    include/version.h
+```
+
+Of course, for many uses, that's overcompensating: the version number and
+date don't change *that* often, so we might end up doing a lot of
+unnecessary work on every build.  To solve that, there's `redo-stamp`. 
+redo-stamp does the opposite of redo-always: while redo-always makes things
+build *more* often, redo-stamp makes things build *less* often. 
+Specifically, it lets a .do file provide a "stamp value" for its output; if
+that stamp value is the same as before, then the target should be considered
+unchanged after all.
+
+The most common stamp value is just the content itself.  Since in redo, we
+write the content to $3, we can also read it back from $3:
+<pre><code lang='sh' src='version.do'></code></pre>
+<pre><code lang='sh' src='date.do'></code></pre>
+
+And the final result is what we want.  Although `version` and `date` are
+generated every time, the targets which depend on them are not:
+```shell
+$ redo clean
+redo  clean
+
+# version and date are generated just once per run,
+# the first time they are used.
+$ redo
+redo  all
+redo    test.txt
+redo      version
+redo      date
+redo    version.py
+redo    include/version.h
+
+# Here, (test.txt) means redo is considering building
+# test.txt, but can't decide yet. In order to decide,
+# it needs to first build date and version.  After
+# that, it decides not to build test.txt after all.
+$ redo
+redo  all
+redo    (test.txt)
+redo    date
+redo    version
+```
+
+
+### Temporary overrides
+
+Sometimes you want to override a file even if it *is* a target (ie. it has
+previously been built by redo and has a valid .do file associated with it). 
+In our example, maybe you want to hardcode the version number because you're
+building a release.  This is easy: redo notices whenever you overwrite a
+file from outside redo, and will avoid replacing that file until you
+subsequently delete it:
+```shell
+$ echo "1.0" >version
+
+$ redo
+redo  all
+redo    (test.txt)
+redo    date
+redo: version - you modified it; skipping
+redo    test.txt
+redo    version.py
+redo    include/version.h
+
+$ redo
+redo  all
+redo    (test.txt)
+redo    date
+redo: version - you modified it; skipping
+
+$ rm version
+
+$ redo
+redo  all
+redo    (test.txt)
+redo    date
+redo    version
+redo    test.txt
+redo    version.py
+redo    include/version.h
+```
+
+### default.do, subdirectories, and redo-whichdo
+
+There's one more thing we should mention, which is the interaction of
+default.do with files in subdirectories.  Notice that we are building
+`include/version.h` in our example:
+```shell
+$ redo include/version.h
+redo  include/version.h
+redo    version
+redo    date
+
+$ cat include/version.h
+// C/C++ header file identifying the current version
+#ifndef __VERSION_H
+#define __VERSION_H
+
+#define VERSION "redo-0.31-3-g974eb9f"
+#define DATE "2018-11-26"
+
+#endif // __VERSION_H
+```
+
+redo works differently from the `make` command when you ask it to build
+files in subdirectories.  In make's case, it always looks for a `Makefile`
+in the *current* directory, and uses that for all build instructions.  So
+`make include/version.h` and `cd include && make version.h` are two
+different things; the first uses `Makefile`, and the second uses
+`include/Makefile` (or crashes if the latter does not exist).
+
+redo, on the other hand, always uses the same formula to find a .do file for
+a particular target.  For a file named X, that formula is as follows:
+
+- first, try X.do
+- then try default.do
+- then try ../default.do
+- then try ../../default.do
+- ...and so on...
+
+(Note: for targets with an extension, like X.o, redo actually tries even
+more .do files, like `default.o.do` and `../default.o.do`.  For precise
+details, read the [redo man page](../../redo).)
+
+You can see which .do files redo considers for a given target by using the
+`redo-whichdo` command.  If redo-whichdo returns successfully, the last name
+in the list is the .do file it finally decided to use.
+```shell
+$ redo-whichdo include/version.h
+include/version.h.do
+include/default.h.do
+include/default.do
+default.h.do
+default.do
+```
+
+
+### Redo always runs in the .do file's directory
+
+To ensure consistency, redo always changes the current directory to
+the directory *containing the selected .do file* (**not** the directory
+containing the target, if they are different).  As a result,
+`redo include/version.h` and `cd include && redo version.h` always have
+exactly the same effect:
+```shell
+$ redo include/version.h
+redo  include/version.h
+redo    version
+redo    date
+
+$ (cd include && redo version.h)
+redo  version.h
+redo    ../version
+redo    ../date
+```
+
+(redo's display is slightly different between the two: it always shows the
+files it's building relative to the $PWD at the time you started redo.)
+
+This feature is critical to redo's recursive nature; it's the reason that
+essays like [Recursive Make Considered Harmful](http://aegis.sourceforge.net/auug97.pdf)
+don't apply to redo.  Any redo target, anywhere in your source tree, can
+use redo-ifchange to depend on any of your other targets, and the dependency
+will work right.
+
+Why does redo change to the directory containing the .do file, instead of
+the directory containing the target?  Because usually, the .do file needs to
+refer to other dependencies, and it's easier to always express those
+dependencies without adjusting any paths.  In our text preprocessor example,
+`default.do` does `redo-ifchange version date`; this wouldn't work properly
+if it were running from the `include/` directory, because there are no files
+named `version` and `date` in there.
+
+Similarly, when compiling C programs, there are
+usually compiler options like `-I../mylib/include`.  If we're compiling
+`foo.o` and `mydir/bar.o`, we would like `-I../mylib/include` to have the
+same meaning in both cases.
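
The .do lookup order described in index.md above can be sketched in Python as follows. This is a simplified illustration, not redo's implementation: it assumes a relative target path and stops at the directory redo was started from, whereas the text above says real redo keeps going to `../default.do` and beyond. The function name is hypothetical.

```python
import os


def candidate_do_files(target):
    """List the .do files to try for target, most specific first."""
    dirname, base = os.path.split(target)
    candidates = [os.path.join(dirname, base + ".do")]

    # Extension suffixes, longest first: "a.b.c" gives [".b.c", ".c", ""].
    parts = base.split(".")
    suffixes = ["." + ".".join(parts[i:]) for i in range(1, len(parts))] + [""]

    # Walk from the target's directory up to the top of the relative path.
    while True:
        for suffix in suffixes:
            candidates.append(os.path.join(dirname, "default" + suffix + ".do"))
        parent = os.path.dirname(dirname)
        if parent == dirname:
            break
        dirname = parent
    return candidates


for name in candidate_do_files("include/version.h"):
    print(name)
```

For `include/version.h` this produces the same five names, in the same order, as the `redo-whichdo` output shown in index.md.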
--- a/docs/cookbook/defaults/test.py
+++ b/docs/cookbook/defaults/test.py
@ -0,0 +1,6 @@
+#!/usr/bin/env python
+"""Test program for auto-generated version.py"""
+import version
+
+print('Version %r has build date %r'
+      % (version.VERSION, version.DATE))
--- a/docs/cookbook/defaults/test.txt.in
+++ b/docs/cookbook/defaults/test.txt.in
@ -0,0 +1,2 @@
+This is the documentation for MyProgram version
+%VERSION%.  It was generated on %DATE%.
--- a/docs/cookbook/defaults/version.do
+++ b/docs/cookbook/defaults/version.do
@ -0,0 +1,7 @@
+# Try to get a version number from git, if possible.
+if ! git describe >$3; then
+    echo "$0: Falling back to static version." >&2
+    echo 'UNKNOWN' >$3
+fi
+redo-always
+redo-stamp <$3
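
The interaction of `redo-always` and `redo-stamp` in version.do and date.do can be modelled roughly like this: the script always runs, but its dependents are only considered out of date when a checksum of the output differs from the one recorded last time. The sketch below assumes a SHA-1 of the content as the stamp; both the hash choice and the function names are assumptions for illustration.

```python
import hashlib


def stamp(content):
    """Return a stamp value for some output (here: SHA-1 of its bytes)."""
    return hashlib.sha1(content).hexdigest()


def dependents_need_rebuild(previous_stamp, new_content):
    """True only if the freshly generated content differs from last time."""
    return previous_stamp != stamp(new_content)


yesterday = stamp(b"1970-01-01\n")
print(dependents_need_rebuild(yesterday, b"1970-01-01\n"))  # same date again
print(dependents_need_rebuild(yesterday, b"1970-01-02\n"))  # date rolled over
```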
--- a/docs/cookbook/defaults/version.py.in
+++ b/docs/cookbook/defaults/version.py.in
@ -0,0 +1,3 @@
+# python module identifying the current version
+VERSION='%VERSION%'
+DATE='%DATE%'
--- a/docs/cookbook/hello/.gitignore
+++ b/docs/cookbook/hello/.gitignore
@ -0,0 +1,3 @@
+hello
+*~
+.*~
--- a/docs/cookbook/hello/all.do
+++ b/docs/cookbook/hello/all.do
@ -0,0 +1 @@
+redo-ifchange hello
--- a/docs/cookbook/hello/clean.do
+++ b/docs/cookbook/hello/clean.do
@ -0,0 +1 @@
+rm -f hello *~ .*~
--- a/docs/cookbook/hello/hello.c
+++ b/docs/cookbook/hello/hello.c
@ -0,0 +1,6 @@
+#include <stdio.h>
+
+int main() {
+    printf("Hello, world!\n");
+    return 0;
+}
--- a/docs/cookbook/hello/hello.do
+++ b/docs/cookbook/hello/hello.do
@ -0,0 +1,16 @@
+# If hello.c changes, this script needs to be
+# re-run.
+redo-ifchange hello.c
+
+# Compile hello.c into the 'hello' binary.
+#
+# $3 is the redo variable that represents the
+# output filename.  We want to build a file
+# called "hello", but if we write that directly,
+# then an interruption could result in a
+# partially-written file.  Instead, write it to
+# $3, and redo will move our output into its
+# final location, only if this script completes
+# successfully.
+#
+cc -o $3 hello.c -Wall
--- a/docs/cookbook/hello/index.md
+++ b/docs/cookbook/hello/index.md
@ -0,0 +1,142 @@
+### Hello!
+
+Let's start with Hello World: famously, the simplest project that does
+anything interesting.  We'll write this one in C, but don't worry if
+you're not a C programmer.  The focus isn't the C code itself, just how to
+compile it.
+
+To play with the code on your own machine, get the [redo
+source code](https://github.com/apenwarr/redo) and look in the
+`docs/cookbook/hello/` directory.
+
+### Compiling the code
+
+First, let's create a source file that we want to compile:
+<pre><code lang='c' src='hello.c'></code></pre>
+
+Now we need a .do file to tell redo how to compile it:
+<pre><code lang='sh' src='hello.do'></code></pre>
+
+With those files in place, we can build and run the program:
+```shell
+$ redo hello
+redo  hello
+
+$ ./hello
+Hello, world!
+```
+
+Use the `redo` command to forcibly re-run a specific rule (in this case, the
+compiler).  Or, if you only want to recompile `hello` when its input
+files (dependencies) have changed, use `redo-ifchange`.
+```shell
+$ redo hello
+redo  hello
+
+# Rebuilds, whether we need it or not
+$ redo hello
+redo  hello
+
+# Does not rebuild because hello.c is unchanged
+$ redo-ifchange hello
+
+$ touch hello.c
+
+# Notices the change to hello.c
+$ redo-ifchange hello
+redo  hello
+```
+
+Usually we'll want to also provide an `all.do` file.  `all` is the
+default redo target when you don't specify one.
+<pre><code lang='sh' src='all.do'></code></pre>
+
+With that, now we can rebuild our project by just typing `redo`:
+```shell
+$ rm hello
+
+# 'redo' runs all.do, which calls into hello.do.
+$ redo
+redo  all
+redo    hello
+
+# Notice that this forcibly re-runs the 'all'
+# rule, but all.do calls redo-ifchange, so
+# hello itself is only recompiled if its
+# dependencies change.
+$ redo
+redo  all
+
+$ ./hello
+Hello, world!
+```
+
+
+### Debugging your .do scripts
+
+If you want to see exactly which commands are being run for each step,
+you can use redo's `-x` and `-v` options, which work similarly to
+`sh -x` and `sh -v`.
+
+```shell
+$ rm hello
+
+$ redo -x
+redo  all
+* sh -ex all.do all all all.redo2.tmp
+ redo-ifchange hello
+
+redo    hello
+* sh -ex hello.do hello hello hello.redo2.tmp
+ redo-ifchange hello.c
+ cc -o hello.redo2.tmp hello.c -Wall
+redo    hello (done)
+
+redo  all (done)
+```
+
+
+### Running integration tests
+
+What about tests?  We can, of course, compile a C program that has some
+unit tests.  But since our program isn't very complicated, let's write
+a shell "integration test" (also known as a "black box" test) to make
+sure it works as expected, without depending on implementation details:
+<pre><code lang='sh' src='test.do'></code></pre>
+
+Even if we rewrote our hello world program in python, javascript, or
+ruby, that integration test would still be useful.
+
+
+### Housekeeping
+
+Traditionally, it's considered polite to include a `clean` rule that
+restores your project to pristine status, so people can rebuild from
+scratch:
+<pre><code lang='sh' src='clean.do'></code></pre>
+
+Some people like to include a `.gitignore` file so that git won't pester
+you about files that would be cleaned up by `clean.do` anyway.  Let's add
+one:
+<pre><b align=center style="display: block">.gitignore</b><code>
+hello
+*~
+.*~
+</code></pre>
+
+Congratulations!  That's all it takes to make your first redo project.
+
+Here's what it looks like when we're done:
+```shell
+$ ls
+all.do  clean.do  hello.c  hello.do  test.do
+```
+
+Some people think this looks a little cluttered with .do files.  But
+notice one very useful feature: you can see, at a glance, exactly which
+operations are possible in your project.  You can redo all, clean,
+hello, or test.  Since most people downloading your project will just
+want to build it, it's helpful to have the available actions so
+prominently displayed.  And if they have a problem with one of the
+steps, it's very obvious which file contains the script that's causing
+the problem.
--- a/docs/cookbook/hello/test.do
+++ b/docs/cookbook/hello/test.do
@ -0,0 +1,12 @@
+# Make sure everything has been built before we start
+redo-ifchange all
+
+# Ensure that the hello program, when run, says
+# hello like we expect.
+if ./hello | grep -i 'hello' >/dev/null; then
+    echo "success" >&2
+    exit 0
+else
+    echo "missing 'hello' message!" >&2
+    exit 1
+fi
--- a/docs/cookbook/latex/.gitignore
+++ b/docs/cookbook/latex/.gitignore
@ -0,0 +1,7 @@
+*.eps
+*.dvi
+*.ps
+*.pdf
+*.tmp
+*~
+.*~
--- a/docs/cookbook/latex/all.do
+++ b/docs/cookbook/latex/all.do
@ -0,0 +1,8 @@
+for d in latex dvips dvipdf Rscript; do
+    if ! type "$d" >/dev/null 2>/dev/null; then
+        echo "$0: skipping: $d not installed." >&2
+        exit 0
+    fi
+done
+
+redo-ifchange paper.pdf paper.ps
--- a/docs/cookbook/latex/clean.do
+++ b/docs/cookbook/latex/clean.do
@ -0,0 +1,2 @@
+rm -f *.eps *.dvi *.ps *.pdf *~ .*~
+rm -rf *.tmp
--- a/docs/cookbook/latex/default.dvi.do
+++ b/docs/cookbook/latex/default.dvi.do
@ -0,0 +1,2 @@
+redo-ifchange "$2.runtex"
+ln "$2.tmp/$2.dvi" "$3"
--- a/docs/cookbook/latex/default.pdf.do
+++ b/docs/cookbook/latex/default.pdf.do
@ -0,0 +1,3 @@
+exec >&2
+redo-ifchange "$2.dvi"
+dvipdf "$2.dvi" "$3"
--- a/docs/cookbook/latex/default.ps.do
+++ b/docs/cookbook/latex/default.ps.do
@ -0,0 +1,3 @@
+exec >&2
+redo-ifchange "$2.dvi"
+dvips -o "$3" "$2.dvi"
--- a/docs/cookbook/latex/default.runtex.do
+++ b/docs/cookbook/latex/default.runtex.do
@ -0,0 +1,58 @@
+# latex produces log output on stdout, which is
+# not really correct.  Send it to stderr instead.
+exec >&2
+
+# We depend on both the .latex file and its .deps
+# file (which lists additional dependencies)
+redo-ifchange "$2.latex" "$2.deps"
+
+# Next, we have to depend on each dependency in
+# the .deps file.
+cat "$2.deps" | xargs redo-ifchange
+
+tmp="$2.tmp"
+rm -rf "$tmp"
+mkdir -p "$tmp"
+
+# latex generates eg.  the table of contents by
+# using a list of references ($2.aux) generated
+# during its run.  The first time, the table of
+# contents is empty, so we have to run again. 
+# But then the table of contents is non-empty,
+# which might cause page numbers to change, and
+# so on.  So we have to keep re-running until it
+# finally stops changing.
+touch "$tmp/$2.aux.old"
+ok=
+for i in $(seq 5); do
+    latex --halt-on-error \
+        --output-directory="$tmp" \
+        --recorder \
+        "$2.latex" </dev/null
+    if diff "$tmp/$2.aux.old" \
+            "$tmp/$2.aux" >/dev/null; then
+        # .aux file converged, so we're done
+        ok=1
+        break
+    fi
+    echo
+    echo "$0: $2.aux changed: try again (try #$i)"
+    echo
+    cp "$tmp/$2.aux" "$tmp/$2.aux.old"
+done
+
+if [ "$ok" = "" ]; then
+    echo "$0: fatal: $2.aux did not converge!"
+    exit 10
+fi
+
+# If the newly produced .dvi disappears, we need
+# to redo.
+redo-ifchange "$tmp/$2.dvi"
+
+# With --recorder, latex produces a list of files
+# it used during its run.  Let's depend on all of
+# them, so if they ever change, we'll redo.
+grep ^INPUT "$tmp/$2.fls" |
+    cut -d' ' -f2 |
+    xargs redo-ifchange
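
The retry loop in default.runtex.do above is a small fixed-point iteration: run the tool, compare the new `.aux` content with the previous one, and stop when they match. A minimal Python sketch of that control flow follows; `run_once` is a hypothetical stand-in for one `latex` run that returns the resulting `.aux` content.

```python
def run_until_stable(run_once, max_tries=5):
    """Call run_once() until its result stops changing; return the try count."""
    previous = ""  # mirrors the empty $2.aux.old created by touch
    for attempt in range(1, max_tries + 1):
        current = run_once()
        if current == previous:
            return attempt  # converged
        previous = current
    raise RuntimeError("did not converge after %d tries" % max_tries)


# Simulated runs: the table of contents settles on the third pass.
outputs = iter(["toc: empty", "toc: 2 pages", "toc: 2 pages"])
print(run_until_stable(lambda: next(outputs)))
```

The cap on the number of tries matters as much as the comparison: without it, a document whose page numbers keep oscillating would loop forever.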
--- a/docs/cookbook/latex/discovery.txt
+++ b/docs/cookbook/latex/discovery.txt
@ -0,0 +1 @@
+It seems that \(E = m c^2\).
--- a/docs/cookbook/latex/index.md
+++ b/docs/cookbook/latex/index.md
@ -0,0 +1,252 @@
+### A LaTeX typesetting example
+
+[LaTeX](https://www.latex-project.org/) is a typesetting system that's
+especially popular in academia.  Among other things, it lets you produce
+postscript and pdf files from a set of (mostly text) input files.
+
+LaTeX documents often include images and charts.  In our example, we'll show
+how to auto-generate a chart for inclusion using an [R script with
+ggplot2](https://ggplot2.tidyverse.org/).
+
+To play with this code on your own machine, get the [redo
+source code](https://github.com/apenwarr/redo) and look in the
+`docs/cookbook/latex/` directory.
+
+
+### Generating a plot from an R script
+
+First, let's tell redo how to generate our chart.  We'll use
+the R language, and ask it to plot some of its sample data (the mpg, "miles
+per gallon" data set) and save it to an eps (encapsulated postscript) file. 
+eps files are usually a good format for LaTeX embedded images, because they
+scale to any printer or display resolution.
+
+First, let's make an R script that generates a plot:
+<pre><code lang='r' src='mpg.R'></code></pre>
+
+And then a .do file to tie that into redo:
+<pre><code lang='sh' src='mpg.eps.do'></code></pre>
+
+We can build and view the image:
+```shell
+$ redo mpg.eps 
+redo  mpg.eps
+
+# View the file on Linux
+$ evince mpg.eps
+
+# View the file on MacOS
+$ open mpg.eps
+```
+
+
+### Running the LaTeX processor
+
+Here's the first draft of our very important scientific paper:
+<pre><code lang='tex' src='paper.latex'></code></pre>
+
+Notice how it refers to the chart from above, `mpg.eps`, and a text file,
+`discovery.txt`.  Let's create the latter as a static file.
+<pre><code lang='tex' src='discovery.txt'></code></pre>
+
+With all the parts of our document in place, we can now compile it directly
+using `pdflatex`:
+```shell
+$ pdflatex paper.latex 
+This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016/Debian) (preloaded format=pdflatex)
+ restricted \write18 enabled.
+entering extended mode
+...[a lot of unnecessary diagnostic messages]...
+Output written on paper.pdf (2 pages, 68257 bytes).
+Transcript written on paper.log.
+```
+
+But this has a few problems.  First of all, it doesn't understand
+dependencies; if `mpg.R` changes, it won't know to rebuild `mpg.eps`. 
+Secondly, the TeX/LaTeX toolchain has an idiosyncrasy that means you might
+have to rebuild your document more than once.  In our example, we generate a
+table of contents, but it ends up getting generated *before* processing the
+rest of the content in the document, so it's initially blank.  As it
+continues, LaTeX produces a file called `paper.aux` with a list of the
+references needed by the table of contents, and their page numbers.  If we
+run LaTeX over again, it'll use that to build a proper table of contents.
+
+Of course, life is not necessarily so easy.  Once the table of contents
+isn't blank, it might start to push content onto the next page.  This will
+change all the page numbers!  So we'd have to do it one more time.  And that
+might lead to even more subtle problems, like a reference to page 99
+changing to page 100, which pushes a word onto the next page, which changes
+some other page number, and so on.  Thus, we need a script that will keep
+looping, re-running LaTeX until `paper.aux` stabilizes.
+
+The whole script we'll use is below.  Instead of running `pdflatex`
+directly, we'll use the regular `latex` command, which produces a .dvi
+(DeVice Independent) intermediate file which we can later turn into a pdf or
+ps file.
+
+LaTeX produces a bunch of clutter files (like `paper.aux`) that can be used
+in future runs, but which also make its execution nondeterministic.  To
+avoid that problem, we tell it to use a temporary `--output-directory` that
+we delete and recreate before each build (although we might need to run
+`latex` multiple times in one build, to get `paper.aux` to converge).
+<pre><code lang='sh' src='default.runtex.do'></code></pre>
+
+
+### Virtual targets, side effects, and multiple outputs
+
+Why did we call our script `default.runtex.do`?  Why not `default.pdf.do` or
+`default.dvi.do`, depending what kind of file we ask LaTeX to produce?
+
+The problem is that the `latex` command actually produces several
+files in that temporary directory, and we might want to keep them around. 
+If we name our .do file after only *one* of those outputs, things get messy.
+
+The biggest problem is that redo requires a .do file to write its output to
+$3 (or stdout), so that it can guarantee the output gets replaced
+atomically.  When there is more than one output, at most one file can
+be sent to $3; how do you choose which one?  Even worse, some programs don't
+even have the ability to choose the output filename; for an input of
+`paper.latex`, the `latex` command just writes a bunch of files named
+`paper.*` directly.  You can't ask it to put just one of them in $3.
+
+The easiest way to handle this situation in redo is to use a "virtual
+target", which is a target name that doesn't actually get created as a file,
+and has only side effects.  You've seen these before: when we use `all.do`
+or `clean.do`, we don't expect to produce a file named `all` or `clean`.  We
+expect redo to run a collection of other commands.  In `make`, these are
+sometimes called ".PHONY rules" because of the way they are declared in a
+`Makefile`.  But the rules aren't phony, they really are executed; they just
+don't produce output.  So in redo we call them "virtual."
+
+When we `redo paper.runtex`, it builds our virtual target.  There is no
+`paper.runtex` file or directory generated.  But as a side effect, a
+directory named `paper.tmp` is created.
+
+(Side note: it's tempting to name the directory the same as the target.  So
+we could have a `paper.runtex` directory instead of `paper.tmp`.  This is
+not inherently a bad idea, but currently redo behaviour is undefined if you
+redo-ifchange a directory.  Directories are weird.  If one file in that
+directory disappears, does that mean you "modified" the output by hand? 
+What if two redo targets modify the same directory?  Should we require
+scripts to only atomically replace an entire output directory via $3?  And
+so on.  We might carefully define this behaviour eventually, but for now,
+it's better to use a separate directory name and avoid the undefined
+behaviour.)
+
+
+### Depending on side effects produced by virtual targets
+
+Next, we want to produce .pdf and .ps files from the collection of files
+produced by the `latex` command, particularly `paper.tmp/paper.dvi`.  To do
+that, we have to bring our files back from the "virtual target" world into
+the real world.
+
+Depending on virtual targets is easy; we'll just
+`redo-ifchange paper.runtex`.  Then we want to materialize `paper.dvi` from
+the temporary files in `paper.tmp/paper.dvi`, which we can do with an
+efficient [hardlink](https://en.wikipedia.org/wiki/Hard_link) (rather than
+making an unnecessary copy), like this:
+<pre><code lang='sh' src='default.dvi.do'></code></pre>
+
+Notice that we *don't* do `redo-ifchange paper.tmp/paper.dvi`.  That's
+because redo has no knowledge of that file.  If you ask redo to build that
+file for you, it doesn't know how to do it.  You have to ask for
+`paper.runtex`, which you know - but redo doesn't know - will produce the
+input file you want.  Then you can safely use it.
+
+Once we have a .do file that produces the "real" (non-virtual,
+non-side-effect) `paper.dvi` file, however, it's safe to depend directly on
+it.  Let's use that to produce our .ps and .pdf outputs:
+<pre><code lang='sh' src='default.ps.do'></code></pre>
+<pre><code lang='sh' src='default.pdf.do'></code></pre>
+
+(As above, we include `exec >&2` lines because LaTeX tools incorrectly write
+their log messages to stdout.  We need to redirect it all to stderr.  That
+way [redo-log](../../redo-log) can handle all the log output appropriately.)
+
+
+### Explicit dependencies
+
+We've made a generalized script, `default.runtex.do`, that can compile any
+.latex file and produce a .tmp directory with its output.  But that's not
+quite enough: different .latex files might have extra dependencies that need
+to exist *before* the compilation can continue.  In our case, we need the
+auto-generated `mpg.eps` that we discussed above.
+
+To make that work, `default.runtex.do` looks for a .deps file with the same
+name as the .latex file being processed.  It contains just a list of extra
+dependencies that need to be built.  Here's ours:
+<pre><code src='paper.deps'></code></pre>
+
+You can use this same ".deps" technique in many other places in redo.
+For example, you could have a default.do that can link a C program from any
+set of .o files.  To specify the right set of .o files for target `X`,
+default.do might look in an `X.deps` or `X.list` file.  If you later want to
+get even fancier, you could make an `X.deps.do` that programmatically
+generates the list of dependencies; for example, it might include one set of
+files on win32 platforms and a different set on unix platforms.
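+
As a rough sketch of the C example, here is how a `default.do` might read its object list from a `.deps` file. Everything here is an assumption made for illustration: the `hello` target, the `main.o` and `util.o` names, and the one-file-per-line format. The redo and compiler calls are left as comments so the snippet runs anywhere.

```shell
work=$(mktemp -d)
printf 'main.o\nutil.o\n' >"$work/hello.deps"   # hypothetical deps file

target=hello   # plays the role of the target name redo would pass in

# Turn the one-per-line list into a space-separated string.
objs=$(tr '\n' ' ' <"$work/$target.deps")
objs=${objs% }   # drop the trailing space left by the final newline

# In a real default.do, the next steps might be (assumptions):
#   redo-ifchange "$target.deps" $objs
#   cc -o "$3" $objs
echo "would link $target from: $objs"
```

Depending on the `.deps` file itself, as well as on the files it lists, means that editing the list also triggers a rebuild.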
+
+
+### Autodependencies
+
+Our `paper.latex` file actually includes two files: `mpg.eps`, which we
+explicitly depended upon above, and `discovery.txt`, which we didn't.  The
+latter is a static source file, so we can let redo discover it
+automatically, based on the set of files that LaTeX opens while it runs. 
+The `latex` command has a `--recorder` option to do this; it produces a file
+called `paper.tmp/paper.fls` (.fls is short for "File LiSt").
+
+One of redo's best features is that you can declare dependencies *after*
+you've done your build steps, when you have the best knowledge of which
+files were actually needed.  That's why in `default.runtex.do`, we parse the
+.fls file and then redo-ifchange on its contents right at the end.
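+
Parsing the .fls file takes only a few lines of shell. The sketch below runs against a hand-written sample rather than real `latex` output, so the paths in it are invented; the `PWD`/`INPUT`/`OUTPUT` line format is what the recorder writes. The final `redo-ifchange` call is left as a comment so the snippet works without redo installed.

```shell
work=$(mktemp -d)
cat >"$work/paper.fls" <<'EOF'
PWD /home/user/paper
INPUT paper.latex
INPUT discovery.txt
INPUT discovery.txt
OUTPUT paper.dvi
EOF

# Keep only the files latex read (INPUT lines), drop the keyword,
# and remove duplicates, since the same file is often listed repeatedly.
inputs=$(grep '^INPUT ' "$work/paper.fls" | cut -d' ' -f2- | sort -u)
echo "$inputs"

# In a .do script, the list would then be declared as dependencies:
#   echo "$inputs" | xargs redo-ifchange
```

`OUTPUT` lines are skipped on purpose: those are files that latex wrote, not files the target depends on.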
+
+(This brings up a rather subtle point about how redo works.  When you run
+redo-ifchange, redo adds to the list of files which, if they change, mean
+your target needs to be rebuilt.  But unlike make, redo will not actually
+rebuild those files merely because they're listed as a dependency; it just
+knows to rebuild your target, which means to run your .do file, which will
+run redo-ifchange *again* if it still needs those input files to be fresh.
+
+This avoids an annoying problem in `make` where you can teach it about
+which .h files your C program depended on last time, but if you change
+A.c to no longer include X.h, and then delete X.h, make might complain
+that X.h is missing, because A.c depended on it *last time*.  redo will
+simply notice that since X.h is missing, A.c needs to be recompiled, and let
+your compilation .do script report an error, or not.)
+
+Anyway, this feature catches not just our `discovery.txt` dependency, but
+also the implicit dependencies on various LaTeX template and font files, and
+so on.  If any of those change, our LaTeX file needs to be rebuilt.
+```shell
+$ redo --no-detail paper.pdf
+redo  paper.pdf
+redo    paper.dvi
+redo      paper.runtex
+redo        mpg.eps
+
+$ redo --no-detail paper.pdf
+redo  paper.pdf
+
+$ touch discovery.txt 
+
+$ redo --no-detail paper.pdf
+redo  paper.pdf
+redo    paper.dvi
+redo      paper.runtex
+
+$ redo --no-detail paper.pdf
+redo  paper.pdf
+```
+
+
+### Housekeeping
+
+As usual, to polish up our project, let's create an `all.do` and
+`clean.do`.
+
+Because this project is included in the redo source and we don't want redo
+to fail to build just because you don't have LaTeX or R installed, we'll
+have `all.do` quit politely if the necessary tools are missing.
+<pre><code lang='sh' src='all.do'></code></pre>
+<pre><code lang='sh' src='clean.do'></code></pre>
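+
One way to "quit politely" is to check for each tool with `command -v` before doing any work. This is a sketch of the idea, not the contents of the real `all.do`; the `have_tools` helper is a name invented for this example.

```shell
# have_tools: succeed only if every named command can be found on PATH.
have_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "skipping: '$tool' is not installed" >&2
            return 1
        fi
    done
}

# In an all.do, this might guard the real work (assumption):
#   have_tools latex Rscript || exit 0
#   redo-ifchange paper.pdf paper.ps
have_tools sh && echo "sh is available"
```

Exiting with status 0 when a tool is missing is what keeps the surrounding redo build from failing.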
--- a/docs/cookbook/latex/mpg.R
+++ b/docs/cookbook/latex/mpg.R
@ -0,0 +1,4 @@
+library(ggplot2)
+
+qplot(mpg, wt, data = mtcars) + facet_wrap(~cyl) + theme_bw()
+ggsave("mpg.new.eps", width=4, height=2, units='in')
--- a/docs/cookbook/latex/mpg.eps.do
+++ b/docs/cookbook/latex/mpg.eps.do
@ -0,0 +1,7 @@
+redo-ifchange mpg.R
+Rscript mpg.R >&2
+mv mpg.new.eps "$3"
+
+# Some buggy ggplot2 versions produce this
+# junk file; throw it away.
+rm -f Rplots.pdf
--- a/docs/cookbook/latex/paper.deps
+++ b/docs/cookbook/latex/paper.deps
@ -0,0 +1 @@
+mpg.eps
--- a/docs/cookbook/latex/paper.latex
+++ b/docs/cookbook/latex/paper.latex
@ -0,0 +1,17 @@
+\documentclass{article}
+\usepackage{graphicx}
+
+\title{A very brief note on relativity}
+\author{The Redo Contributors}
+
+\begin{document}
+\maketitle
+\tableofcontents
+
+\newpage
+\section{Amazing Discovery}
+\input{discovery.txt}
+
+\section{Irrelevant Chart}
+\includegraphics{mpg.eps}
+\end{document}