The tests were getting way too ad hoc. Let's reorganize them so that there's a good, obvious, suggested sequence to run them in.
This fails if you run make test *twice* without the preceding patch. Unfortunately, I couldn't find a good way to make it fail if you only run make test once.