From 1b616ddcbb09fe44f30366a2d440c02c6577dc17 Mon Sep 17 00:00:00 2001 From: Avery Pennarun Date: Tue, 20 Nov 2018 09:50:37 -0500 Subject: [PATCH] Improved documentation. Removed the Cookbook link for now, since it's empty. --- Documentation/Contributing.md | 3 + Documentation/Cookbook.md | 1 - Documentation/FAQImpl.md | 126 ++++++++++++++++++++++--------- Documentation/GettingStarted.md | 18 ++++- Documentation/index.md | 127 ++++++++++++++++---------------- mkdocs.yml | 24 +++--- 6 files changed, 185 insertions(+), 114 deletions(-) delete mode 100644 Documentation/Cookbook.md diff --git a/Documentation/Contributing.md b/Documentation/Contributing.md index b0f1573..626b7d4 100644 --- a/Documentation/Contributing.md +++ b/Documentation/Contributing.md @@ -35,6 +35,9 @@ The best things you can do for redo are: in default redo build rules for common program types would probably be very popular. +- Convince your favourite OS distro to build and include (up to date!) redo + packages. + - Tell your friends! diff --git a/Documentation/Cookbook.md b/Documentation/Cookbook.md deleted file mode 100644 index b152ac0..0000000 --- a/Documentation/Cookbook.md +++ /dev/null @@ -1 +0,0 @@ -Not written yet! Sorry. \ No newline at end of file diff --git a/Documentation/FAQImpl.md b/Documentation/FAQImpl.md index f475246..88234e7 100644 --- a/Documentation/FAQImpl.md +++ b/Documentation/FAQImpl.md @@ -1,22 +1,14 @@ -# Hey, does redo even *run* on Windows? +# Does redo even run on Windows? -FIXME: -Probably under cygwin. But it hasn't been tested, so no. +redo works fine in [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/initialize-distro) +on Windows 10. You might consider that to be "real" Windows, or not. If +you use it, it runs more or less like it does on Linux. WSL is considerably +slower than native Linux, so there is definitely room for speed improvements. 
If I were going to port redo to Windows in a "native" way, I might grab the source code to a posix shell (like the one in MSYS) and link it directly into redo. -`make` also doesn't *really* run on Windows (unless you use -MSYS or Cygwin or something like that). There are versions -of make that do - like Microsoft's version - but their -syntax is horrifically different from one vendor to -another, so you might as well just be writing for a -vendor-specific tool. - -At least redo is simple enough that, theoretically, one -day, I can imagine it being cross platform. - One interesting project that has appeared recently is busybox-w32 (https://github.com/pclouds/busybox-w32). It's a port of busybox to win32 that includes a mostly POSIX @@ -28,11 +20,67 @@ all of this needs more experimentation. # Can a *.do file itself be generated as part of the build process? -Not currently. There's nothing fundamentally preventing us from allowing -it. However, it seems easier to reason about your build process if you -*aren't* auto-generating your build scripts on the fly. +Yes and no. Kind of. Redo doesn't stop you from doing this, and it will +use a .do file if you generate one. However, you have to do the steps in +the right order if you want this to work. For example, this should work +fine: -This might change someday. + redo-ifchange default.o.do + redo-ifchange foo.o + +You've told redo to generate `default.o.do` (presumably using a +script like default.do.do), and *then* you've told redo to generate `foo.o`. +When it considers how to build `foo.o`, it will look for a .do file, and +thus find `default.o.do`, so it'll run it. Great. + +Some people would like us to go a step further, and *automatically* look for +rules that will help produce default.o.do. 
That is, you might want to just +write this: + + redo-ifchange foo.o bar.o + +and then expect that, implicitly, redo will know it needs to look for +`default.o.do`, and if `default.o.do` doesn't exist yet, it should look for +`default.do.do`, and so on. + +The problem with this idea is... where does it end? If there's no +`default.o.do`, so we look for a `default.do.do`, what if that doesn't exist +either? Perhaps there's a `default.do.do.do` for generating `default.do.do` +files? And so on. You have to draw the line somewhere. + + +# What can I do instead of auto-generating *.do files? + +When people ask about auto-generating .do files, they usually mean one of +two things: + +1. They want to create a new directory and auto-populate it with .do files +copied from somewhere. + + - This can be solved in various ways. For example, you might make a + trivial toplevel `default.do` file in the new directory; this will + match all possible targets. You can then use the sh "source" operator + (.) and `redo-whichdo` to pull the "real" rules from elsewhere. An + example of [.do file delegation can be found in the wvstreams + project](https://github.com/apenwarr/wvstreams/blob/master/delegate.od). + +2. They want to generate, eg. `default.o.do` based on auto-detected compiler +settings, in order to support different brands of compilers, or different +architectures, etc. + + - The trick here is not to generate `default.o.do`, but rather to + generate another script instead. For example, you could have a + `compile.do` that generates a script called `compile`. `default.o.do` + would simply be hard coded to something like `./compile $2 $3`. The + wvstreams project also has [an example of + compile.do](https://github.com/apenwarr/wvstreams/blob/master/compile.do). 
+ + - The advantage of separating out into separate default.o.do and compile + scripts is that you can put all your system-dependent conditionals + into compile.do, so they run only once and choose the "right" compile + command. Then, for each file, you can simply run that command. If + this kind of micro-optimization doesn't appeal to you, then there's no + shame in just putting all the logic directly into `default.o.do`. # How does redo store dependencies? @@ -42,11 +90,19 @@ named `.redo`. That directory contains a sqlite3 database with dependency information. The format of the `.redo` directory is undocumented because -it may change at any time. Maybe it will turn out that we +it may change at any time. It will likely turn out that we can do something simpler than sqlite3. If you really need to make a tool that pokes around in there, please ask on the mailing list if we can standardize something for you. +Unfortunately, the design of having a *single* .redo directory at the +toplevel of a project has proven problematic: what exactly is the "top +level" of a "project"? If your project is a subdirectory of another project +that then switches to redo, should the .redo directory move up a level in +the hierarchy when that happens? And so on. Eventually, we will +probably migrate to a system where there is one .redo directory per target +directory. This avoids all kinds of problems with symlinks, directory +renames, nested projects, and so on. # Isn't using sqlite3 overkill? And un-djb-ish? @@ -66,18 +122,18 @@ it! Or if you want to be really horrified: root root 1995612 2009-02-03 13:54 /usr/lib/libmysqlclient.so.15.0.0 The mysql *client* library is two megs, and it doesn't even -have a database server in it! People who think SQL +have a database in it! People who think SQL databases are automatically bloated and gross have not yet actually experienced the joys of sqlite. SQL has a well-deserved bad reputation, but sqlite is another story entirely. 
It's excellent, and much simpler and better written than you'd expect. -But still, I'm pretty sure it's not very "djbish" to use a +Still, it's not very "djbish" to use a general-purpose database, especially one that has a *SQL parser* in it. (One of the great things about redo's design is that it doesn't ever need to parse anything, so -a SQL parser is a bit embarrassing.) +embedding a whole SQL parser is a bit embarrassing.) I'm pretty sure djb never would have done it that way. However, I don't think we can reach the performance we want @@ -85,20 +141,15 @@ with dependency/build/lock information stored in plain text files; among other things, that results in too much fstat/open activity, which is slow in general, and even slower if you want to run on Windows. That leads us to a -binary database, and if the binary database isn't sqlite or -libdb or something, that means we have to implement our own -data structures. Which is probably what djb would do, of -course, but I'm just not convinced that I can do a better -(or even a smaller) job of it than the sqlite guys did. +binary database. An example of the kind of structure we need is the [one +used by ninja](https://github.com/ninja-build/ninja/blob/master/src/deps_log.h#L29), +which is very simple, fast, and efficient. Most of the state database stuff has been isolated in state.py. If you're feeling brave, you can try to implement your own better state database, with or without sqlite. -It is almost certainly possible to do it much more nicely -than I have, so if you do, please send it in! - # What hash algorithm does redo-stamp use? @@ -131,7 +182,7 @@ automatically, however: know you absolutely want that. (With timestamps, you can just `touch filename` to rebuild everything that depends on `filename`.) - + - Targets that are just used for aggregation (ie. 
they don't produce any output of their own) would always have the same checksum - the checksum of a zero-byte file - @@ -139,26 +190,26 @@ automatically, however: - Calculating checksums for every output file adds time to the build, even if you don't need that feature. - + - Building stuff unnecessarily and then stamping it is much slower than just not building it in the first place, so for *almost* every use of redo-stamp, it's not the right solution anyway. - + - To steal a line from the Zen of Python: explicit is better than implicit. Making people think about when they're using the stamp feature - knowing that it's slow and a little annoying to do - will help people design better build scripts that depend on this feature as little as possible. - + - djb's (as yet unreleased) version of redo doesn't implement checksums, so doing that would produce an incompatible implementation. With redo-stamp and redo-always being separate programs, you can simply choose not to use them if you want to keep maximum compatibility for the future. - + - Bonus: the redo-stamp algorithm is interchangeable. You don't have to stamp the target file or the source files or anything in particular; you can stamp any data you @@ -168,7 +219,7 @@ automatically, however: would always have been needed, and then we'd have to explain when to use the explicit one and when to use the implicit one. - + Thus, we made the decision to only use checksums for targets that explicitly call `redo-stamp` (see previous question). @@ -181,6 +232,10 @@ were really annoying, and I definitely felt it. Adding redo-stamp and redo-always work the way they do made the pain disappear, so I stopped changing things. +A longer and even more detailed explanation of timestamp vs checksum-based +build dependencies can be found in +[mtime comparison considered harmful](https://apenwarr.ca/log/20181113). + # Why doesn't redo by default print the commands as they are run? 
@@ -243,7 +298,6 @@ choose to run only that step in verbose mode: + cat c.c.c.b.b.a + ./sleep 1.1 redo t/c.c.c.b.b (done) - If you're using an autobuilder or something that logs build results for future examination, you should probably set it to always run redo with diff --git a/Documentation/GettingStarted.md b/Documentation/GettingStarted.md index 1cf1d02..3ade7b5 100644 --- a/Documentation/GettingStarted.md +++ b/Documentation/GettingStarted.md @@ -6,7 +6,7 @@ Optional, but recommended, is the `ps` output prettier. In modern versions of Debian, sqlite3 is already part of the python2.7 package. -You can install the requirements like this: +You can install the prerequisites like this: ```sh sudo apt-get install python2.7 python-setproctitle ``` @@ -34,3 +34,19 @@ your home directory: ```sh PREFIX=$HOME ./redo install ``` + + +# Distro packages + +## MacOS + +redo is available from the [Homebrew](https://brew.sh/) project: + + brew install redo + +## Linux + +Various linux distributions include redo under different names. Most of the +packages are unfortunately obsolete and don't contain the most recent bug +fixes. At this time (late 2018), we recommend using the latest tagged +version [from github](https://github.com/apenwarr/redo). diff --git a/Documentation/index.md b/Documentation/index.md index c7452f8..c43edb8 100644 --- a/Documentation/index.md +++ b/Documentation/index.md @@ -1,49 +1,35 @@ # redo: a recursive, general-purpose build system `redo` is a competitor to the long-lived, but sadly imperfect, `make` -program. There are many such competitors, because many people over the -years have been dissatisfied with make's limitations. However, of all the -replacements I've seen, only redo captures the essential simplicity and -flexibility of make, while avoiding its flaws. To my great surprise, it -manages to do this while being simultaneously simpler than make, more -flexible than make, and more powerful than make. +program. 
Unlike other such competitors, redo captures the essential +simplicity and flexibility of make, while avoiding its flaws. It manages to +do this while being simultaneously simpler than make, more flexible than +make, and more powerful than make, and without sacrificing performance - a +rare combination of features. -Although I wrote redo and I would love to take credit for it, the magical -simplicity and flexibility comes because I copied verbatim a design by -Daniel J. Bernstein (creator of qmail and djbdns, among many other useful -things). He posted some very terse notes on his web site at one point -(there is no date) with the unassuming title, "[Rebuilding target files when -source files have changed](http://cr.yp.to/redo.html)." Those notes are -enough information to understand how the system is supposed to work; -unfortunately there's no code to go with it. I get the impression that the -hypothetical "djb redo" is incomplete and Bernstein doesn't yet consider it -ready for the real world. - -I was led to that particular page by random chance from a link on [The djb -way](http://thedjbway.b0llix.net/future.html), by Wayne Marshall. +The original design for redo comes from Daniel J. Bernstein (creator of +qmail and djbdns, among many other useful things). He posted some +terse notes on his web site at one point (there is no date) with the +unassuming title, "[Rebuilding target files when source files have +changed](http://cr.yp.to/redo.html)." Those notes are enough information to +understand how the system is supposed to work; unfortunately there's no code +to go with it. I wrote this implementation of redo from scratch, based on +that design. After I found out about djb redo, I searched the Internet for any sign that other people had discovered what I had: a hidden, unimplemented gem of brilliant code design. 
I found only one interesting link: Alan Grosskurth, whose [Master's thesis at the University of Waterloo](http://grosskurth.ca/papers/mmath-thesis.pdf) was about top-down software rebuilding, that is, djb redo. He wrote his -own (admittedly slow) implementation in about 250 lines of shell script. - -If you've ever thought about rewriting GNU make from scratch, the idea of -doing it in 250 lines of shell script probably didn't occur to you. redo is -so simple that it's actually possible. For testing, I actually wrote an -even more minimal version, which always rebuilds everything instead of -checking dependencies, in 210 lines of shell (about 4 kbytes). - -The design is simply that good. +own (admittedly slow) implementation in about 250 lines of shell script, +which gives an idea of how straightforward the system is. My implementation of redo is called `redo` for the same reason that there are 75 different versions of `make` that are all called `make`. It's somehow -easier that way. Hopefully it will turn out to be compatible with the other -implementations, should there be any. +easier that way. -My extremely minimal implementation, called `do`, is in the `minimal/` -directory of this repository. +I also provide an extremely minimal pure-POSIX-sh implementation, called +`do`, in the `minimal/` directory of this repository. (Want to discuss redo? See the bottom of this file for information about our mailing list.) @@ -51,18 +37,19 @@ information about our mailing list.) # What's so special about redo? -The theory behind redo is almost magical: it can do everything `make` can -do, only the implementation is vastly simpler, the syntax is cleaner, and you -can do even more flexible things without resorting to ugly hacks. Also, you -get all the speed of non-recursive `make` (only check dependencies once per -run) combined with all the cleanliness of recursive `make` (you don't have -code from one module stomping on code from another module). 
+The theory behind redo sounds too good to be true: it can do everything +`make` can do, but the implementation is vastly simpler, the syntax is +cleaner, and you have even more flexibility without resorting to ugly hacks. +Also, you get all the speed of non-recursive `make` (only check dependencies +once per run) combined with all the cleanliness of recursive `make` (you +don't have code from one module stomping on code from another module). (Disclaimer: my current implementation is not as fast as `make` for some things, because it's written in python. Eventually I'll rewrite it an C and it'll be very, very fast.) -The easiest way to show it is with an example. +The easiest way to show it is to jump into an example. Here's one for +compiling a C++ program. Create a file called default.o.do: @@ -128,34 +115,43 @@ run the *current script* over again." Dependencies are tracked in a persistent `.redo` database so that redo can check them later. If a file needs to be rebuilt, it re-executes the `whatever.do` script and regenerates the dependencies. If a file doesn't -need to be rebuilt, redo can calculate that just using its persistent +need to be rebuilt, redo figures that out just using its persistent `.redo` database, without re-running the script. And it can do that check -just once right at the start of your project build. +just once right at the start of your project build, which is really fast. -But best of all, as you can see in `default.o.do`, you can declare a -dependency *after* building the program. In C, you get your best dependency +Best of all, as you can see in `default.o.do`, you can declare a dependency +*after* building the program. In C, you get your best dependency information by trying to actually build, since that's how you find out which -headers you need. redo is based on the following simple insight: -you don't actually -care what the dependencies are *before* you build the target; if the target -doesn't exist, you obviously need to build it. 
Then, the build script -itself can provide the dependency information however it wants; unlike in -`make`, you don't need a special dependency syntax at all. You can even -declare some of your dependencies after building, which makes C-style -autodependencies much simpler. +headers you need. redo is based on this simple insight: you don't +actually care what the dependencies are *before* you build the target. If +the target doesn't exist, you obviously need to build it. + +Once you're building it anyway, the build script itself can calculate the +dependency information however it wants; unlike in `make`, you don't need a +special dependency syntax at all. You can even declare some of your +dependencies after building, which makes C-style autodependencies much +simpler. + +redo therefore is a unique combination of imperative and declarative +programming. The initial build is almost entirely imperative (running a +series of scripts). As part of that, the scripts declare dependencies a few +at a time, and redo assembles those into a larger data structure. Then, in +the future, it uses that pre-declared data structure to decide what work +needs to be redone. (GNU make supports putting some of your dependencies in include files, and auto-reloading those include files if they change. But this is very confusing - the program flow through a Makefile is hard to trace already, -and even harder if it restarts randomly from the beginning when a file -changes. With redo, you can just read the script from top to bottom. A -`redo-ifchange` call is like calling a function, which you can also read -from top to bottom.) +and even harder when it restarts from the beginning because an include file +changes at runtime. With redo, you can just read each build script from top +to bottom. A `redo-ifchange` call is like calling a function, which you can +also read from top to bottom.) # What projects use redo? 
-Here are a few open source examples: +Some larger proprietary projects are using it, but unfortunately they can't +easily be linked from this document. Here are a few open source examples: * [Liberation Circuit](https://github.com/linleyh/liberation-circuit) is a straightforward example of a C++ binary (a game) compiled with redo. @@ -177,11 +173,12 @@ Here are a few open source examples: [`t/111-example/`](t/111-example) subdir of the redo project itself. If you switch your program's build process to use redo, please let us know and -we can link to it here. +we can link to it here for some free publicity. -(Please don't use the code in the `t/` directory as serious examples of how -to use redo. Many of the tests are doing things in deliberately psychotic -ways in order to stress redo's code and find bugs.) +(Please don't use the integration testing code in the redo project's `t/` +directory as serious examples of how to use redo. Many of the tests are +doing things in intentionally psychotic ways in order to stress redo's code +and find bugs.) # How does this redo compare to other redo implementations? @@ -190,11 +187,13 @@ djb never released his version, so other people have implemented their own variants based on his [published specification](http://cr.yp.to/redo.html). This version, sometimes called apenwarr/redo, is probably the most advanced -one, including support for parallel builds, advanced build logs, and helpful -debugging features. It's currently written in python for easier +one, including support for parallel builds, +[resilient timestamps](https://apenwarr.ca/log/20181113) and checksums, +[build log linearization](https://apenwarr.ca/log/20181106), and +helpful debugging features. It's currently written in python for easier experimentation, but the plan is to eventually migrate it to plain C. (Some people like to call this version "python-redo", but I don't like that name. -We shouldn't have to rename it just because we port the code to C.) 
+We shouldn't have to rename it when we later transliterate the code to C.) Here are some other redo variants (thanks to Nils Dagsson Moskopp for many of these links): @@ -241,4 +240,4 @@ redo semantics, and/or have few or no automated tests. At the time of this writing, none of them except apenwarr/redo (ie. this project) support parallel builds (`redo -j`). For large projects, -parallel builds are usually essential. +parallel builds are usually considered essential. diff --git a/mkdocs.yml b/mkdocs.yml index 4ddc6e2..0623aa4 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -7,7 +7,6 @@ strict: true pages: - Introduction: index.md - Getting Started: GettingStarted.md - - Cookbook.md - FAQ: - Basics: FAQBasics.md - Semantics: FAQSemantics.md @@ -15,14 +14,15 @@ pages: - Parallel Builds: FAQParallel.md - Implementation Details: FAQImpl.md - Contributing.md - - Command Reference (man pages): - - redo: redo.md - - redo-ifchange: redo-ifchange.md - - redo-ifcreate: redo-ifcreate.md - - redo-always: redo-always.md - - redo-stamp: redo-stamp.md - - redo-sources: redo-sources.md - - redo-targets: redo-targets.md - - redo-ood: redo-ood.md - - redo-whichdo: redo-whichdo.md - - redo-log: redo-log.md + - Reference: + - Manual Pages: + - redo: redo.md + - redo-ifchange: redo-ifchange.md + - redo-ifcreate: redo-ifcreate.md + - redo-always: redo-always.md + - redo-stamp: redo-stamp.md + - redo-sources: redo-sources.md + - redo-targets: redo-targets.md + - redo-ood: redo-ood.md + - redo-whichdo: redo-whichdo.md + - redo-log: redo-log.md