It was getting way too ad hoc in there. Let's reorganize the tests so that there's an obvious, suggested order to run them in.
This comes down to the lack of a 'seq' command (what?!) and the fact that BSD "wc -l" pads its output with leading whitespace, while the GNU version doesn't. We should be using numeric comparisons instead of string comparisons; then the padding no longer matters.
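
A minimal sketch of both workarounds, not the actual test code: count_lines and count_up are hypothetical helper names, and the sketch assumes a shell such as bash or dash, where the numeric operators of "[" accept a value with leading spaces.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a file.
# BSD "wc -l" may print something like "       3"; GNU prints "3".
count_lines() {
    wc -l < "$1"
}

# Hypothetical stand-in for 'seq': print the integers from $1 to $2.
count_up() {
    i=$1
    while [ "$i" -le "$2" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

tmp=$(mktemp)
count_up 1 3 > "$tmp"
n=$(count_lines "$tmp")
rm -f "$tmp"

# A string comparison such as [ "$n" = "3" ] fails when $n is padded.
# The numeric comparison below parses $n as an integer instead.
if [ "$n" -eq 3 ]; then
    echo "ok"
else
    echo "unexpected line count: $n"
fi
```

Another option would be to normalize the value once with arithmetic expansion, e.g. n=$((n)), and keep comparing strings after that.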