### Building a text preprocessor

Let's say we want to process a bunch of files, each in the same way. An
easy example is a text preprocessor: we'll look through the text and replace
variables of the form %VARNAME% with real content. In our example, we'll
use variable %VERSION%, representing a version number, and %DATE%, with
today's date.

To play with this code on your own machine, get the [redo
source code](https://github.com/apenwarr/redo) and look in the
`Documentation/cookbook/defaults/` directory.

### Input files

Let's create some input files we can use. The input to our
preprocessor will have file extension `.in`. The output can be in various
formats.
<pre><code src='test.txt.in'></code></pre>
<pre><code lang='c' src='include/version.h.in'></code></pre>
<pre><code lang='python' src='version.py.in'></code></pre>

For fun, let's also make a Python test program that calls the newly
generated version.py:
<pre><code lang='python' src='test.py'></code></pre>

Finally, we need to provide values for the variables. Let's put each one
in its own file, named `version` and `date`, respectively.

<pre><b style='text-align: center; display: block'>version</b
><code>1.0</code></pre>

<pre><b style='text-align: center; display: block'>date</b
><code>1970-01-01</code></pre>

### default.do files

Now we want to teach redo how to do our substitutions: how to, in general,
generate file X from file X.in.

We could write a separate .do file for every output file X. For example, we
might make `test.txt.do` and `version.py.do`. But that gets tedious. To
make it easier, if there is no specific `X.do` for a target named X, redo
will try using `default.do` instead. Let's write a default.do for our text
preprocessor.
<pre><code lang='sh' src='default.do'></code></pre>

If `default.do` is asked to build X, and there exists a file named `X.in`, we
use `sed` to do our variable substitutions. In that case, `default.do` uses
redo-ifchange to depend on `X.in`, `version`, and `date`. If a file named
`X.in` does *not* exist, then we don't know what to do, so we give an error.

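To see the substitution step in isolation, here is a minimal sketch that runs without redo. It is not the cookbook's `default.do`: the `preprocess` function and the `demo.txt.in` file are hypothetical names made up for this illustration, and the sketch leaves out the redo-ifchange call and the error case described above. It also assumes the values contain no `/` or `&`, which `sed` would otherwise treat specially.

```shell
# preprocess INFILE VERSIONFILE DATEFILE   (hypothetical helper, not part of redo)
# Print INFILE with %VERSION% and %DATE% replaced by the contents of the
# two value files. Assumes the values contain no '/' or '&' characters.
preprocess() {
    version=$(cat "$2")
    date=$(cat "$3")
    sed -e "s/%VERSION%/$version/g" \
        -e "s/%DATE%/$date/g" \
        <"$1"
}

# Small demonstration in a scratch directory.
work=$(mktemp -d)
echo '1.0' >"$work/version"
echo '1970-01-01' >"$work/date"
echo 'MyProgram version %VERSION%, generated on %DATE%.' >"$work/demo.txt.in"

preprocess "$work/demo.txt.in" "$work/version" "$work/date"
# prints: MyProgram version 1.0, generated on 1970-01-01.

rm -rf "$work"
```

In a real .do file the output would go to redo's temporary output file rather than to the terminal.
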
On the other hand, if we try to generate a file that *already* exists, like
`test.py`, redo does not call default.do at all. redo only tries to create
files that don't exist, or that were previously generated by redo. This
stops redo from accidentally overwriting your work.

```shell
$ redo test.txt
redo test.txt

$ cat test.txt
This is the documentation for MyProgram version
1.0. It was generated on 1970-01-01.

$ redo chicken
redo chicken
default.do: Fatal: don't know how to build 'chicken'
redo chicken (exit 99)

$ redo version.py
redo version.py

# test.py was created by us, so it's a "source" file.
# redo does *not* call default.do to replace it.
$ redo test.py
redo: test.py: exists and not marked as generated; not redoing.

$ python test.py
Version '1.0' has build date '1970-01-01'
```

Nice!

While we're here, let's make an `all.do` so that we don't have to tell redo
exactly which files to rebuild, every single time.
<pre><code lang='sh' src='all.do'></code></pre>

```shell
$ redo
redo all
redo test.txt
redo version.py
redo include/version.h

# input files didn't change, so nothing to rebuild
$ redo
redo all

$ touch test.txt.in

$ redo
redo all
redo test.txt
```

### Auto-generating the version and date (redo-always and redo-stamp)

Of course, in a real project, we won't want to hardcode the version number
and date into a file. Ideally, we can get the version number from a version
control system, like git, and we can use today's date.

To make that happen, we can replace the static `version` and `date` files with
`version.do` and `date.do`. `default.do` already uses redo-ifchange to
depend on `version` and `date`, so redo will create them as needed, and
if they change, redo will rebuild all the targets that depend on them.

However, the `version` and `date` files are special: they depend on the
environment outside redo itself. That is, there's no way to declare a
dependency on the current date. We might generate the `date` file once, but
tomorrow, there's no way for redo to know that its value should change.

To handle this situation, redo has the `redo-always` command. Running
redo-always from a .do file tells redo that whenever anything depends on
that target, the target is considered out of date and must be rebuilt. The
result looks like this:
```shell
$ redo
redo all
redo test.txt
redo version
redo date
redo version.py
redo include/version.h

# version.do and date.do are redo-always, so
# everything depending on them needs to rebuild
# every time.
$ redo
redo all
redo test.txt
redo version
redo date
redo version.py
redo include/version.h
```

Of course, for many uses, that's overcompensating: the version number and
date don't change *that* often, so we might end up doing a lot of
unnecessary work on every build. To solve that, there's `redo-stamp`.
redo-stamp does the opposite of redo-always: while redo-always makes things
build *more* often, redo-stamp makes things build *less* often.
Specifically, it lets a .do file provide a "stamp value" for its output; if
that stamp value is the same as before, then its target should be considered
unchanged after all.

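The idea of a stamp value can be tried out without redo. The following is only a sketch of the concept, not how redo-stamp is implemented: `stamp_changed` and the `.stamp` file are hypothetical names invented for this example, and the POSIX `cksum` utility stands in for whatever redo uses internally.

```shell
# stamp_changed FILE STAMPFILE   (hypothetical helper, not part of redo)
# Succeeds (exit 0) if FILE's checksum differs from the one recorded in
# STAMPFILE, and records the new checksum. Fails (exit 1) if it is unchanged.
stamp_changed() {
    new_stamp=$(cksum <"$1")
    old_stamp=
    if [ -f "$2" ]; then
        old_stamp=$(cat "$2")
    fi
    if [ "$new_stamp" = "$old_stamp" ]; then
        return 1
    fi
    printf '%s\n' "$new_stamp" >"$2"
    return 0
}

work=$(mktemp -d)

# First generation: no stamp recorded yet, so the file counts as changed.
echo '1970-01-01' >"$work/date"
if stamp_changed "$work/date" "$work/date.stamp"; then
    echo "date changed: dependents would be rebuilt"
fi

# Regenerated with identical content: the stamp matches, so nothing changed.
echo '1970-01-01' >"$work/date"
if ! stamp_changed "$work/date" "$work/date.stamp"; then
    echo "date unchanged: dependents are left alone"
fi

rm -rf "$work"
```

The file is rewritten both times, but only a change in its content counts as a change for whatever depends on it.
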
The most common stamp value is just the content itself. Since a .do file
in redo writes its content to `$3`, it can also read that content back
from `$3`:
<pre><code lang='sh' src='version.do'></code></pre>
<pre><code lang='sh' src='date.do'></code></pre>

And the final result is what we want: although `version` and `date` are
generated every time, the targets which depend on them are not:
```shell
$ redo clean
redo clean

# version and date are generated just once per run,
# the first time they are used.
$ redo
redo all
redo test.txt
redo version
redo date
redo version.py
redo include/version.h

# Here, (test.txt) means redo is considering building
# test.txt, but can't decide yet. In order to decide,
# it needs to first build date and version. After
# that, it decides not to build test.txt after all.
$ redo
redo all
redo (test.txt)
redo date
redo version
```

### default.do, subdirectories, and redo-whichdo

There's one more thing we should mention, which is the interaction of
default.do with files in subdirectories. Notice that we are building
`include/version.h` in our example:
```shell
$ redo include/version.h
redo include/version.h
redo version
redo date

$ cat include/version.h
// C/C++ header file identifying the current version
#ifndef __VERSION_H
#define __VERSION_H

#define VERSION "redo-0.31-3-g974eb9f"
#define DATE "2018-11-26"

#endif // __VERSION_H
```

redo works differently from the `make` command when you ask it to build
files in subdirectories. In make's case, it always looks for a `Makefile`
in the *current* directory, and uses that for all build instructions. So
`make include/version.h` and `cd include && make version.h` are two
different things; the first uses `Makefile`, and the second uses
`include/Makefile` (or fails if the latter does not exist).

redo, on the other hand, always uses the same formula to find a .do file for
a particular target. For a file named X, that formula is as follows:

- first, try X.do
- then try default.do
- then try ../default.do
- then try ../../default.do
- ...and so on...

(Note: for targets with an extension, like X.o, redo actually tries even
more .do files, like default.o.do and ../default.o.do. For precise details,
read the [redo man page](../../redo).)

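The search order in the list above can be written down as a short shell function. This is a simplified sketch, not redo's own lookup code: `list_do_candidates` is a hypothetical name, it ignores the extra `default.EXT.do` variants mentioned in the note, it assumes the target is at or below the current directory, and it stops at the current directory instead of continuing upward through `../`.

```shell
# list_do_candidates TARGET   (hypothetical helper, not part of redo)
# Print the .do files that would be tried for TARGET, most specific first.
# Simplified: no default.EXT.do variants, and the walk stops at the
# current directory instead of continuing into ../ and beyond.
list_do_candidates() {
    target=$1
    printf '%s\n' "$target.do"
    dir=$(dirname "$target")
    while :; do
        case $dir in
            .) printf '%s\n' "default.do"; break ;;
            /) printf '%s\n' "/default.do"; break ;;
            *) printf '%s\n' "$dir/default.do" ;;
        esac
        dir=$(dirname "$dir")
    done
}

list_do_candidates include/version.h
# prints:
#   include/version.h.do
#   include/default.do
#   default.do
```
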
You can see which .do files redo considers for a given target by using the
`redo-whichdo` command. If redo-whichdo returns successfully, the last name
in the list is the .do file it finally decided to use.
```shell
$ redo-whichdo include/version.h
include/version.h.do
include/default.h.do
include/default.do
default.h.do
default.do
```

### Redo always runs from the .do file's directory

To ensure consistency, redo always changes the current directory to
the directory *containing the selected .do file* (**not** the directory
containing the target, if they are different). As a result,
`redo include/version.h` and `cd include && redo version.h` always have
exactly the same effect:
```shell
$ redo include/version.h
redo include/version.h
redo version
redo date

$ cd include && redo version.h
redo version.h
redo ../version
redo ../date
```

(redo's display is slightly different between the two: it always shows the
files it's building relative to the $PWD at the time you started redo.)

This feature is critical to redo's recursive nature; it's the reason that
essays like [Recursive Make Considered Harmful](http://aegis.sourceforge.net/auug97.pdf)
don't apply to redo. Any redo target, anywhere in your source tree, can
use redo-ifchange to depend on one of your other targets, and the dependency
will work right.

Why does redo change to the directory containing the .do file, instead of
the directory containing the target? Because usually, the .do file needs to
refer to other dependencies, and it's easier to always express those
dependencies without adjusting any paths. In our text preprocessor example,
`default.do` does `redo-ifchange version date`; this wouldn't work properly
if it were running from the `include/` directory, because there are no files
named `version` and `date` in there.

Similarly, when compiling C programs, there are
usually compiler options like `-I../mylib/include`. If we're compiling
`foo.o` and `mydir/bar.o`, we would like `-I../mylib/include` to have the
same meaning in both cases.

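The point about relative paths can be checked with plain shell, without redo. In this sketch, `dep_visible_from` is a hypothetical helper and the directory layout is a stand-in for the cookbook example; it only shows that a relative name like `version` resolves from the top directory but not from `include/`.

```shell
# dep_visible_from DIR NAME   (hypothetical helper, not part of redo)
# Succeeds if the relative path NAME exists when the working directory
# is DIR. The subshell keeps the caller's working directory unchanged.
dep_visible_from() {
    (cd "$1" && [ -e "$2" ])
}

# Stand-in layout: a top directory holding 'version', plus include/.
work=$(mktemp -d)
mkdir "$work/include"
echo '1.0' >"$work/version"

if dep_visible_from "$work" version; then
    echo "from the top directory: 'version' resolves"
fi
if ! dep_visible_from "$work/include" version; then
    echo "from include/: 'version' does not resolve"
fi

rm -rf "$work"
```
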