Delete archive_context.rs and archive_ops.rs (1200+ lines of duplicated
logic). Replace with four focused modules:
1. open_archive() - opens a file, detects compression, returns raw bytes
2. read_archive() - parses bytes into validated observations
3. CompressionWriter - writes bytes with any compression format
4. WriteStrategy - given a list of files, determines input archive,
output archive, output format, and which of four write modes to use:
- Create: new archive, no input
- Append: uncompressed input, seek to end
- AtomicSwap: compressed input, rewrite via temp file
- CopyOnWrite: different input/output paths, transcode between formats
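The mode selection above could be sketched roughly like this. All names here (`WriteMode`, `choose_mode`, the parameter list) are illustrative, not the actual API:

```rust
use std::path::Path;

// The four write modes described above (hypothetical sketch).
#[derive(Debug, PartialEq)]
enum WriteMode {
    Create,      // new archive, no input
    Append,      // uncompressed input, seek to end
    AtomicSwap,  // compressed input, rewrite via temp file
    CopyOnWrite, // different input/output paths, transcode
}

// Pick a mode from the input/output paths and whether the input
// is compressed. Different paths take precedence, since that case
// transcodes regardless of the input's compression.
fn choose_mode(input: Option<&Path>, output: &Path, input_compressed: bool) -> WriteMode {
    match input {
        None => WriteMode::Create,
        Some(inp) if inp != output => WriteMode::CopyOnWrite,
        Some(_) if input_compressed => WriteMode::AtomicSwap,
        Some(_) => WriteMode::Append,
    }
}
```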
Previously you could not specify the output format: appending always
preserved the input format, and creating compressed archives didn't work.
Now all four cases work with any supported compression format.
Atomic swap now writes to temp file, then renames. Crash-safe.
Trade-off: This approach prioritizes code clarity over syscall efficiency.
The archive file may be opened and read multiple times during a single
operation (once for format detection, once for reading state, once for
copying content). A more optimized implementation could reuse file
handles, but the current approach makes each step's purpose obvious.
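The format-detection step could look something like the sketch below (a hypothetical helper, not the tool's actual code). gzip and zstd files start with well-known magic bytes; brotli has no magic number, so in practice it would have to be recognized by file extension instead:

```rust
// Hypothetical format detection by magic bytes. gzip frames begin
// with 0x1f 0x8b (RFC 1952); zstd frames with 0x28 0xb5 0x2f 0xfd
// (RFC 8878). Anything else is treated as plain bytes here.
#[derive(Debug, PartialEq)]
enum Format {
    Gzip,
    Zstd,
    Plain,
}

fn detect_format(header: &[u8]) -> Format {
    match header {
        [0x1f, 0x8b, ..] => Format::Gzip,
        [0x28, 0xb5, 0x2f, 0xfd, ..] => Format::Zstd,
        _ => Format::Plain,
    }
}
```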
When appending to a compressed archive (gzip, brotli, zstd), the tool
now handles compression automatically. Since some compression formats
don't support in-place appends, we write a new compressed file
containing all the data and atomically rename it over the original
(assuming the filesystem has enough free space).
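A minimal sketch of that write-then-rename step, with an assumed helper name (`atomic_replace`): the temp file lives next to the target so the rename stays on one filesystem, where rename(2) is atomic and readers see either the old file or the new one, never a partial write.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Write `data` to a temp file beside `target`, flush it to disk,
// then atomically rename it over `target`. Illustrative sketch;
// a real implementation would use a unique temp name.
fn atomic_replace(target: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = target.with_extension("tmp");
    let mut f = fs::File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?; // ensure bytes hit disk before the rename
    fs::rename(&tmp, target) // atomic on the same filesystem
}
```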
This means you can work with compressed archives the same way as
uncompressed ones. Point the tool at your .json.gz file and append
values. No manual decompression/recompression needed.