refactor: reorder readme sections
parent 2ab1c31993
commit 44d15e29b7

1 changed file with 54 additions and 53 deletions

README.md (54 additions, 53 deletions)
@ -20,54 +20,20 @@ json-archive data.json
json-archive data.json.archive data.json
```

## Installation

```bash
cargo install json-archive
```

Or build from source:

```bash
git clone <repo>
cd json-archive
cargo build --release
```

## Archive format

The format is JSONL with delta-based changes using [JSON Pointer](https://tools.ietf.org/html/rfc6901) paths. For complete technical details about the file format, see the [file format specification](docs/file-format-spec.md).
@ -89,6 +55,52 @@ Each observation records:
- When it happened
- A unique observation ID
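Those observations can be replayed in order to rebuild the latest document. The line and field shapes below (`snapshot`, `changes`, `op`, `path`) are illustrative assumptions, not the actual schema (see the file format specification for that); the sketch only shows the JSON Pointer replay mechanics:

```javascript
// Hypothetical archive lines: a snapshot followed by delta observations.
// Field names ("snapshot", "changes", "op", "path") are assumptions for
// illustration, not the real json-archive schema.
const lines = [
  '{"id":"obs-1","at":"2024-01-01T00:00:00Z","snapshot":{"title":"First title","view_count":100}}',
  '{"id":"obs-2","at":"2024-01-02T00:00:00Z","changes":[{"op":"set","path":"/view_count","value":150}]}',
  '{"id":"obs-3","at":"2024-01-03T00:00:00Z","changes":[{"op":"set","path":"/title","value":"Second title"}]}',
];

// Set a value at an RFC 6901 JSON Pointer path ("/a/b" -> doc.a.b).
function setPointer(doc, pointer, value) {
  const parts = pointer.split("/").slice(1)
    .map((p) => p.replace(/~1/g, "/").replace(/~0/g, "~"));
  let target = doc;
  for (const part of parts.slice(0, -1)) target = target[part];
  target[parts[parts.length - 1]] = value;
}

// Replay the observations in order to get the latest state.
let state = {};
for (const line of lines) {
  const obs = JSON.parse(line);
  if (obs.snapshot) state = JSON.parse(JSON.stringify(obs.snapshot));
  for (const change of obs.changes ?? []) {
    if (change.op === "set") setPointer(state, change.path, change.value);
  }
}

console.log(state); // the latest reconstructed document
```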
## Design philosophy

**Hackable over efficient**: The file format prioritizes human readability and scriptability over binary compactness. You can:

- Open archives in any text editor
- Grep through them for specific changes
- Parse them in JavaScript without special libraries
- Pipe them through standard Unix tools

**Minimal workflow changes**: Archive files sit next to your original JSON files with a `.archive` extension. Your existing scripts need minimal modification.

### Compression support

Compression libraries are a common security vulnerability vector. The default build includes them anyway because I want convenience. If you don't want to bundle compression libraries:

```bash
cargo install json-archive --no-default-features
```

The minimal build detects compressed files and errors with a clear message explaining that you need the full version or manual decompression.

While the file format keeps things simple and readable, the full build of this tool also works with compressed archives. You can read from and write to gzip, deflate, zlib, brotli, and zstd compressed files without special flags.
**Important caveat**: Compressed archives may require rewriting the entire file during updates (depending on the compression format). If your temporary filesystem is full or too small, updates can fail; in that case, manually specify an output destination with `-o` to write the new archive elsewhere. This works fine for the happy path with archive files up to a few hundred megabytes. If you want to track gigabytes of changes, consider writing custom code instead.
## Real-world use case

Perfect for tracking YouTube video metadata over time:

```bash
# Download video info with yt-dlp
yt-dlp --write-info-json -o "%(id)s.%(ext)s" "https://youtube.com/watch?v=..."

# Create initial archive (creates videoID.info.json.archive)
json-archive videoID.info.json

# Later, append new observations to the existing archive
json-archive videoID.info.json.archive videoID.info.json

# Or safely re-run (won't overwrite the existing archive)
json-archive videoID.info.json

# Run daily in a cron job to capture changes.
# The archive preserves your title/description experiments and view count history.
```
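Once the archive exists, pulling the view-count history back out is a few lines of plain JavaScript. As above, the record shape (`at`, `snapshot`, `changes`, `path`) is an illustrative assumption, not the documented schema:

```javascript
// Extract a [timestamp, view_count] time series from archive lines.
// The record shape here is assumed for illustration.
const lines = [
  '{"at":"2024-01-01","snapshot":{"view_count":100}}',
  '{"at":"2024-01-02","changes":[{"op":"set","path":"/view_count","value":150}]}',
  '{"at":"2024-01-03","changes":[{"op":"set","path":"/title","value":"New title"}]}',
];

const history = [];
for (const line of lines) {
  const obs = JSON.parse(line);
  if (obs.snapshot && "view_count" in obs.snapshot) {
    history.push([obs.at, obs.snapshot.view_count]);
  }
  for (const change of obs.changes ?? []) {
    if (change.path === "/view_count") history.push([obs.at, change.value]);
  }
}

console.log(history); // one [timestamp, count] pair per recorded change
```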
## Commands

The tool infers behavior from filenames:
@ -136,19 +148,6 @@ json-archive -s 50 data.json
json-archive --source "youtube-metadata" data.json
```

## File naming convention
@ -178,6 +177,8 @@ fetch('data.json.archive')
The format uses only standard JSON and organizes the data into roughly the shape you would need anyway.
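A minimal client-side parser, assuming only that the archive is one JSON object per line, might look like:

```javascript
// Split archive text into lines and JSON.parse each non-empty one.
// No special library needed, matching the fetch() example above.
function parseArchive(text) {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

// In the browser you would feed it the body of the fetch() response:
//   const observations = parseArchive(await (await fetch("data.json.archive")).text());
const observations = parseArchive('{"id":"obs-1"}\n{"id":"obs-2"}\n');
console.log(observations.length); // 2
```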
## Contributing

Contributions are welcome! However, you will need to sign a contributor license agreement with Peoples Grocers before we can accept your pull request.