# Compression Integration Tests

Manual integration tests for compressed archive functionality.

## Background: archive I/O refactor

These tests exercise the rewritten archive I/O. The refactor deleted `archive_context.rs` and `archive_ops.rs` (1200+ lines of duplicated logic) and replaced them with four focused modules:

1. `open_archive()` - opens a file, detects compression, returns raw bytes
2. `read_archive()` - parses bytes into validated observations
3. `CompressionWriter` - writes bytes with any compression format
4. `WriteStrategy` - given a list of files, determines the input archive, output archive, output format, and which of four write modes to use:
   - **Create**: new archive, no input
   - **Append**: uncompressed input, seek to end
   - **AtomicSwap**: compressed input, rewrite via temp file
   - **CopyOnWrite**: different input/output paths, transcode between formats

Previously the output format could not be specified: appending always preserved the input format, and creating compressed archives didn't work. Now all four write modes work with any supported compression format, and AtomicSwap writes to a temp file and then renames it into place, making it crash-safe.

Trade-off: this approach prioritizes code clarity over syscall efficiency. The archive file may be opened and read multiple times during a single operation (once for format detection, once for reading state, once for copying content). A more optimized implementation could reuse file handles, but the current approach makes each step's purpose obvious.
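AtomicSwap's crash-safety is the standard write-then-rename pattern. A minimal shell sketch of the same idea (the file names are illustrative, not the tool's actual temp-file naming):

```sh
# Write the complete new contents to a temp file, then rename it over
# the original. rename(2) is atomic within a filesystem, so a crash at
# any point leaves either the old archive or the new one intact, never
# a half-written file.
gzip -c rebuilt.archive > test.archive.gz.tmp
mv test.archive.gz.tmp test.archive.gz
```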
These scripts exercise the tool's ability to:
- Read archives that were compressed by external programs (gzip, brotli, zstd)
- Append new observations to compressed archives
- Produce correct results whether reading compressed or uncompressed
## Scripts

### `generate_state.py <n>`

Generates a JSON state file with `n` items in each array. Output goes to stdout.

```sh
./generate_state.py 3
# Output: {"colors":["color_1","color_2","color_3"],"numbers":["number_1","number_2","number_3"],"animals":["animal_1","animal_2","animal_3"]}
```
### `generate_state_files.py <count> <output_dir>`

Generates a series of state files (state_1.json through state_N.json) with progressively more items.

```sh
./generate_state_files.py 9 ./data
# Creates: data/state_1.json, data/state_2.json, ... data/state_9.json
```
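Conceptually this amounts to a loop over `generate_state.py` (an assumption about the implementation; the script may build the files directly):

```sh
# Assumed equivalence: state_N.json is generate_state.py's output for N items.
mkdir -p ./data
for i in $(seq 1 9); do
  ./generate_state.py "$i" > "./data/state_$i.json"
done
```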
### `run_gzip_test.sh`

Tests the gzip compression workflow:

- Create archive from first state file
- Compress with gzip
- Append remaining 8 state files to the compressed archive
- Decompress and inspect (sketched below)
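Condensed, the workflow looks roughly like the following. The `archive-tool` binary and its `create`/`append` subcommands are hypothetical stand-ins for the actual CLI; only the gzip/gunzip invocations are literal:

```sh
archive-tool create test.archive data/state_1.json    # hypothetical CLI
gzip test.archive                                     # -> test.archive.gz
for i in $(seq 2 9); do
  archive-tool append test.archive.gz "data/state_$i.json"   # hypothetical CLI
done
gunzip -k test.archive.gz    # decompress for inspection, keep the .gz
```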
### `run_brotli_test.sh`

Same workflow but with brotli compression.
### `run_zstd_test.sh`

Same workflow but with zstd compression. The per-format commands for all three formats are summarized below.
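For reference, the compress/decompress commands the three scripts lean on, assuming stock CLI defaults:

```sh
gzip   test.archive          # -> test.archive.gz  (removes the source by default)
brotli test.archive          # -> test.archive.br  (keeps the source by default)
zstd   test.archive          # -> test.archive.zst (keeps the source by default)

gunzip -k test.archive.gz    # -> test.archive, keeps the .gz
brotli -d test.archive.br    # -> test.archive, keeps the .br
zstd   -d test.archive.zst   # -> test.archive, keeps the .zst
```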
### `run_all.sh`

Runs all compression tests in sequence.
### `validate.sh` (optional)

Smoke test to verify the final state matches expectations.
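One plausible shape for such a check, assuming a hypothetical `archive-tool state` command whose output matches the last generated state file (neither assumption is guaranteed by the real tool):

```sh
# After appending all 9 files, the final state should equal state_9.json.
diff <(archive-tool state test.archive.gz) data/state_9.json \
  && echo "final state matches"
```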
## Usage

```sh
cd tests/compression-integration

# Run all tests (generates data, builds, runs all compression formats)
./run_all.sh

# Or run individual steps:
./generate_state_files.py 9 ./data
./run_gzip_test.sh
./run_brotli_test.sh
./run_zstd_test.sh

# Optional: validate outputs match
./validate.sh
```
## What to look for

After running the tests, you can manually verify:

- The compressed archives were created
- Appending to compressed archives worked (check file sizes grew)
- The `info` command shows the same observation count for compressed and decompressed versions
- The `state` command returns the same final state
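Concretely, with the hypothetical `archive-tool` name again (the `info` and `state` subcommands are the ones described above):

```sh
ls -l test.archive.gz                         # archive exists; size grew after appends
archive-tool info test.archive.gz             # observation count, compressed
archive-tool info test.archive                # observation count, decompressed; should match
diff <(archive-tool state test.archive.gz) \
     <(archive-tool state test.archive)       # final states should be identical
```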
## Dependencies

- gzip (usually pre-installed)
- brotli (`brew install brotli`)
- zstd (`brew install zstd`)
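A quick way to confirm all three are on `PATH`:

```sh
for dep in gzip brotli zstd; do
  command -v "$dep" >/dev/null || echo "missing: $dep"
done
```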