Cookbook: add an example of using default.do for text processing.

Avery Pennarun 2018-11-26 13:10:29 -05:00
commit 3b305edc7e
18 changed files with 377 additions and 28 deletions

@ -0,0 +1,3 @@
for d in */all.do; do
    echo "${d%.do}"
done | xargs redo-ifchange

@ -0,0 +1,3 @@
for d in */clean.do; do
    echo "${d%.do}"
done | xargs redo

@ -0,0 +1,283 @@
### Building a text preprocessor
Let's say we want to process a bunch of files, each in the same way. An
easy example is a text preprocessor: we'll look through the text and replace
variables of the form %VARNAME% with real content. In our example, we'll
use two variables: %VERSION%, representing a version number, and %DATE%, with
today's date.
To play with this code on your own machine, get the [redo
source code](https://github.com/apenwarr/redo) and look in the
`Documentation/cookbook/defaults/` directory.
### Input files
Let's create some input files we can use. The input to our
preprocessor will have file extension `.in`. The output can be in various
formats:
<pre><code src='test.txt.in'></code></pre>
<pre><code lang='c' src='include/version.h.in'></code></pre>
<pre><code lang='python' src='version.py.in'></code></pre>
For fun, let's also make a Python test program that calls the newly
generated `version.py`:
<pre><code lang='python' src='test.py'></code></pre>
Finally, we need to provide values for the variables. Let's put each one
in its own file, named `version` and `date`, respectively.
<pre><b style='text-align: center; display: block'>version</b
><code>1.0</code></pre>
<pre><b style='text-align: center; display: block'>date</b
><code>1970-01-01</code></pre>
### default.do files
Now we want to teach redo how to do our substitutions: how to, in general,
generate file X from file X.in.
We could write a separate .do file for every output file X. For example, we
might make `test.txt.do` and `version.py.do`. But that gets tedious. To
make it easier, if there is no specific `X.do` for a target named X, redo
will try using `default.do` instead. Let's write a default.do for our text
preprocessor.
<pre><code lang='sh' src='default.do'></code></pre>
If `default.do` is asked to build X, and there exists a file named `X.in`, we
use `sed` to do our variable substitutions. In that case, `default.do` uses
redo-ifchange to depend on `X.in`, `version`, and `date`. If a file named
`X.in` does *not* exist, then we don't know what to do, so we give an error.
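The substitution step on its own can be tried without redo at all. The following is a minimal sketch, assuming a throwaway directory and a made-up input file named `demo.txt.in` (not part of the cookbook); it only mirrors the `read` and `sed` lines of `default.do`:
```shell
# Stand-alone sketch of the substitution step (no redo needed).
# demo.txt.in and the temp directory are made up for this example.
tmp=$(mktemp -d)
printf 'Version %%VERSION%%, built %%DATE%%.\n' >"$tmp/demo.txt.in"
echo '1.0' >"$tmp/version"
echo '1970-01-01' >"$tmp/date"

# Read each value from its own file, as default.do does.
read -r VERSION <"$tmp/version"
read -r DATE <"$tmp/date"

# Replace every %VERSION% and %DATE% marker in the input.
result=$(sed -e "s/%VERSION%/$VERSION/g" \
             -e "s/%DATE%/$DATE/g" \
             <"$tmp/demo.txt.in")
echo "$result"
rm -rf "$tmp"
```
Note that this simple approach assumes the values contain no characters that are special to `sed`, such as `/` or `&`.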
On the other hand, if we try to generate a file that *already* exists, like
`test.py`, redo does not call default.do at all. redo only tries to create
files that don't exist, or that were previously generated by redo. This
stops redo from accidentally overwriting your work.
```shell
$ redo test.txt
redo test.txt
$ cat test.txt
This is the documentation for MyProgram version
1.0. It was generated on 1970-01-01.
$ redo chicken
redo chicken
default.do: Fatal: don't know how to build 'chicken'
redo chicken (exit 99)
$ redo version.py
redo version.py
# test.py was created by us, so it's a "source" file.
# redo does *not* call default.do to replace it.
$ redo test.py
redo: test.py: exists and not marked as generated; not redoing.
$ python test.py
Version '1.0' has build date '1970-01-01'
```
Nice!
While we're here, let's make an `all.do` so that we don't have to tell redo
exactly which files to rebuild, every single time.
<pre><code lang='sh' src='all.do'></code></pre>
```shell
$ redo
redo all
redo test.txt
redo version.py
redo include/version.h
# input files didn't change, so nothing to rebuild
$ redo
redo all
$ touch test.txt.in
$ redo
redo all
redo test.txt
```
### Auto-generating the version and date (redo-always and redo-stamp)
Of course, in a real project, we won't want to hardcode the version number
and date into a file. Ideally, we can get the version number from a version
control system, like git, and we can use today's date.
To make that happen, we can replace the static `version` and `date` files with
`version.do` and `date.do`. `default.do` already uses redo-ifchange to
depend on `version` and `date`, so redo will create them as needed, and
if they change, redo will rebuild all the targets that depend on them.
However, the `version` and `date` files are special: they depend on the
environment outside redo itself. That is, there's no way to declare a
dependency on the current date. We might generate the `date` file once, but
tomorrow, there's no way for redo to know that its value should change.
To handle this situation, redo has the `redo-always` command. If a .do
file runs redo-always, its target is considered out-of-date every time
something depends on it, so it needs to be rebuilt each time. The
result looks like this:
```shell
$ redo
redo all
redo test.txt
redo version
redo date
redo version.py
redo include/version.h
# version.do and date.do are redo-always, so
# everything depending on them needs to rebuild
# every time.
$ redo
redo all
redo test.txt
redo version
redo date
redo version.py
redo include/version.h
```
Of course, for many uses, that's overcompensating: the version number and
date don't change *that* often, so we might end up doing a lot of
unnecessary work on every build. To solve that, there's `redo-stamp`.
redo-stamp does the opposite of redo-always: while redo-always makes things
build *more* often, redo-stamp makes things build *less* often.
Specifically, it lets a .do file provide a "stamp value" for its output; if
that stamp value is the same as before, then its target should be considered
unchanged after all.
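To get a feel for the idea of a stamp value, here is a rough sketch in plain shell. It is *not* how redo implements redo-stamp; `stamp_changed` is a hypothetical helper that saves a checksum next to the file and reports whether it differs from the previous one:
```shell
# Hypothetical helper, NOT part of redo: succeed if the content of $1
# differs from the checksum saved in $2 by the previous call.
stamp_changed() {
    new_stamp=$(cksum <"$1")
    old_stamp=$(cat "$2" 2>/dev/null || true)
    echo "$new_stamp" >"$2"
    [ "$new_stamp" != "$old_stamp" ]
}

tmp=$(mktemp -d)
echo '2018-11-26' >"$tmp/date"
stamp_changed "$tmp/date" "$tmp/date.stamp" && first=changed || first=unchanged
# Regenerate the file with identical content: the stamp stays the same.
echo '2018-11-26' >"$tmp/date"
stamp_changed "$tmp/date" "$tmp/date.stamp" && second=changed || second=unchanged
# A new day: the content, and therefore the stamp, is different.
echo '2018-11-27' >"$tmp/date"
stamp_changed "$tmp/date" "$tmp/date.stamp" && third=changed || third=unchanged
echo "$first $second $third"
rm -rf "$tmp"
```
The file is rewritten every time, but only a change in its stamp would count as a reason to rebuild anything that depends on it.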
The most common stamp value is just the content itself. Since in redo, we
write the content to $3, we can also read it back from $3:
<pre><code lang='sh' src='version.do'></code></pre>
<pre><code lang='sh' src='date.do'></code></pre>
And the final result is what we want: although `version` and `date` are
generated every time, the targets which depend on them are not:
```shell
$ redo clean
redo clean
# version and date are generated just once per run,
# the first time they are used.
$ redo
redo all
redo test.txt
redo version
redo date
redo version.py
redo include/version.h
# Here, (test.txt) means redo is considering building
# test.txt, but can't decide yet. In order to decide,
# it needs to first build date and version. After
# that, it decides not to build test.txt after all.
$ redo
redo all
redo (test.txt)
redo date
redo version
```
### default.do, subdirectories, and redo-whichdo
There's one more thing we should mention, which is the interaction of
default.do with files in subdirectories. Notice that we are building
`include/version.h` in our example:
```shell
$ redo include/version.h
redo include/version.h
redo version
redo date
$ cat include/version.h
// C/C++ header file identifying the current version
#ifndef __VERSION_H
#define __VERSION_H
#define VERSION "redo-0.31-3-g974eb9f"
#define DATE "2018-11-26"
#endif // __VERSION_H
```
redo works differently from the `make` command when you ask it to build
files in subdirectories. In make's case, it always looks for a `Makefile`
in the *current* directory, and uses that for all build instructions. So
`make include/version.h` and `cd include && make version.h` are two
different things; the first uses `Makefile`, and the second uses
`include/Makefile` (or fails with an error if the latter does not exist).
redo, on the other hand, always uses the same formula to find a .do file for
a particular target. For a file named X, that formula is as follows:
- first, try X.do
- then try default.do
- then try ../default.do
- then try ../../default.do
- ...and so on...
(Note: for targets with an extension, like X.o, redo actually tries even
more .do files, like default.o.do and ../default.o.do. For precise details,
read the [redo man page](../../redo).)
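The simplified formula above can be sketched as a small shell function. `do_candidates` is a hypothetical name, not a redo command; it skips the extension-specific names like default.o.do and, as a further simplification, stops at the current directory instead of continuing upward:
```shell
# Hypothetical helper: print the simplified list of .do files that
# would be tried for a target, most specific first.
do_candidates() {
    echo "$1.do"
    dir=$(dirname "$1")
    while :; do
        case $dir in
            .) echo "default.do"; break ;;
            /) echo "/default.do"; break ;;
        esac
        echo "$dir/default.do"
        dir=$(dirname "$dir")
    done
}

candidates=$(do_candidates include/version.h)
echo "$candidates"
```
For `include/version.h`, this lists `include/version.h.do`, then `include/default.do`, then `default.do`.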
You can see which .do files redo considers for a given target by using the
`redo-whichdo` command. If redo-whichdo returns successfully, the last name
in the list is the .do file it finally decided to use.
```shell
$ redo-whichdo include/version.h
include/version.h.do
include/default.h.do
include/default.do
default.h.do
default.do
```
### Redo always runs from the .do file's directory
To ensure consistency, redo always changes the current directory to
the directory *containing the selected .do file* (**not** the directory
containing the target, if they are different). As a result,
`redo include/version.h` and `cd include && redo version.h` always have
exactly the same effect:
```shell
$ redo include/version.h
redo include/version.h
redo version
redo date
$ cd include && redo version.h
redo version.h
redo ../version
redo ../date
```
(redo's display is slightly different between the two: it always shows the
files it's building relative to the $PWD at the time you started redo.)
This feature is critical to redo's recursive nature; it's the reason that
essays like [Recursive Make Considered Harmful](http://aegis.sourceforge.net/auug97.pdf)
don't apply to redo. Any redo target, anywhere in your source tree, can
use redo-ifchange to depend on one of your other targets, and the dependency
will work right.
Why does redo change to the directory containing the .do file, instead of
the directory containing the target? Because usually, the .do file needs to
refer to other dependencies, and it's easier to always express those
dependencies without adjusting any paths. In our text preprocessor example,
`default.do` does `redo-ifchange version date`; this wouldn't work properly
if it were running from the `include/` directory, because there are no files
named `version` and `date` in there.
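Here is a small sketch of why the starting directory matters, using a throwaway directory rather than the real cookbook files. `run_from` is a made-up stand-in for "run the .do file from this directory"; all it does is try to read `version` using a relative path:
```shell
# Throwaway layout: a top-level 'version' file and an empty include/.
tmp=$(mktemp -d)
mkdir "$tmp/include"
echo '1.0' >"$tmp/version"

# Made-up stand-in for running a .do file from directory $1.
run_from() {
    (cd "$1" && cat version 2>/dev/null) || echo "no version file here"
}

from_top=$(run_from "$tmp")
from_include=$(run_from "$tmp/include")
echo "top: $from_top"
echo "include: $from_include"
rm -rf "$tmp"
```
The relative name `version` only resolves from the top directory, which is where `default.do` lives.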
Similarly, when compiling C programs, there are
usually compiler options like `-I../mylib/include`. If we're compiling
`foo.o` and `mydir/bar.o`, we would like `-I../mylib/include` to have the
same meaning in both cases.

@ -0,0 +1,5 @@
date
version
include/version.h
version.py
test.txt

@ -0,0 +1,2 @@
redo-ifchange test.txt version.py \
test.py include/version.h

@ -0,0 +1,2 @@
rm -f date version include/version.h \
version.py test.txt *~ .*~

@ -0,0 +1,3 @@
date +'%Y-%m-%d' >$3
redo-always
redo-stamp <$3

@ -0,0 +1,25 @@
# $1 is the target name, eg. test.txt
# $2 is the same as $1. We'll talk about
#    that in a later example.
# $3 is the temporary output file we should
#    create. If this script is successful,
#    redo will atomically replace $1 with $3.
if [ -e "$1.in" ]; then
    # if a .in file exists, then do some
    # text substitution.
    #
    # Remember, the user asks redo to build
    # a particular *target* name. It's the .do
    # file's job to figure out what source file(s)
    # to use to generate the target.
    redo-ifchange "$1.in" version date
    read VERSION <version
    read DATE <date
    sed -e "s/%VERSION%/$VERSION/g" \
        -e "s/%DATE%/$DATE/g" \
        <"$1.in" >$3
else
    echo "$0: Fatal: don't know how to build '$1'" >&2
    exit 99
fi

@ -0,0 +1,8 @@
// C/C++ header file identifying the current version
#ifndef __VERSION_H
#define __VERSION_H
#define VERSION "%VERSION%"
#define DATE "%DATE%"
#endif // __VERSION_H

@ -0,0 +1,6 @@
#!/usr/bin/env python
"""Test program for auto-generated version.py"""
import version
print('Version %r has build date %r'
% (version.VERSION, version.DATE))

@ -0,0 +1,2 @@
This is the documentation for MyProgram version
%VERSION%. It was generated on %DATE%.

@ -0,0 +1,7 @@
# Try to get a version number from git, if possible.
if ! git describe >$3; then
    echo "$0: Falling back to static version." >&2
    echo 'UNKNOWN' >$3
fi
redo-always
redo-stamp <$3

@ -0,0 +1,3 @@
# python module identifying the current version
VERSION='%VERSION%'
DATE='%DATE%'

@ -2,7 +2,8 @@
Let's start with Hello World: famously, the simplest project that does
anything interesting. We'll write this one in C, but don't worry if
you're not a C programmer! We'll keep this simple.
you're not a C programmer. The focus isn't the C code itself, just how to
compile it.
To play with the code on your own machine, get the [redo
source code](https://github.com/apenwarr/redo) and look in the
@ -13,10 +14,10 @@ source code](https://github.com/apenwarr/redo) and look in the
First, let's create a source file that we want to compile:
<pre><code lang='c' src='hello.c'></code></pre>
Now we need a .do file to tell redo how to build it:
Now we need a .do file to tell redo how to compile it:
<pre><code lang='sh' src='hello.do'></code></pre>
With those files in place, we can compile and run the program:
With those files in place, we can build and run the program:
```shell
$ redo hello
redo hello
@ -26,7 +27,7 @@ Hello, world!
```
Use the `redo` command to forcibly re-run a specific rule (in this case, the
compile rule). Or, if you only want to recompile `hello` when its input
compiler). Or, if you only want to recompile `hello` when its input
files (dependencies) have changed, use `redo-ifchange`.
```shell
$ redo hello
@ -60,14 +61,9 @@ redo all
redo hello
# Notice that this forcibly re-runs the 'all'
# rule, but all.do calls redo-ifchange, not redo,
# so it doesn't forcibly recompile hello itself.
#
# You can use this trick to experiment with one
# step in your build process at a time, without
# having to play tricks to make that step re-run,
# like you might do with make (eg. by deleting or
# touching files).
# rule, but all.do calls redo-ifchange, so
# hello itself is only recompiled if its
# dependencies change.
$ redo
redo all