feat: open source vibecoding example project

This commit is contained in:
nobody 2025-09-28 11:20:25 -07:00
commit b370f462f2
Signed by: GrocerPublishAgent
GPG key ID: 43B1C298CDDE181C
34 changed files with 7744 additions and 0 deletions

174
docs/demo/README.md Normal file

@@ -0,0 +1,174 @@
# json-archive Demo
This directory contains example JSON files that demonstrate how to use the `json-archive` tool to track changes over time.
## Sample Data
The demo uses three snapshots of the same JSON object as it evolves:
### [v1.json](v1.json)
```json
{"id": 1, "name": "Alice", "count": 0, "tags": ["initial"]}
```
### [v2.json](v2.json)
```json
{"id": 1, "name": "Alice", "count": 5, "tags": ["initial", "updated"], "lastSeen": "2025-01-15"}
```
### [v3.json](v3.json)
```json
{"id": 1, "name": "Bob", "count": 10, "tags": ["updated", "final"], "lastSeen": "2025-01-16", "score": 95}
```
## Basic Usage
### 1. Create Initial Archive
```bash
json-archive v1.json
```
**What happens:**
- Creates `v1.json.archive` with the initial state
- Leaves `v1.json` untouched
- The archive contains a header with the complete initial object
**Expected output:**
```
Created archive: v1.json.archive
```
**Archive contents (`v1.json.archive`):**
```jsonl
{"version": 1, "created": "2025-01-15T10:00:00Z", "initial": {"id": 1, "name": "Alice", "count": 0, "tags": ["initial"]}}
```
### 2. Append New Observation
```bash
json-archive v1.json.archive v2.json
```
**What happens:**
- Compares v2.json against the current state (v1.json)
- Appends delta changes to the existing archive
- Shows what fields changed between versions
**Expected output:**
```
Appended observation to: v1.json.archive
Changes detected: 3 modifications
```
**New archive contents:**
```jsonl
{"version": 1, "created": "2025-01-15T10:00:00Z", "initial": {"id": 1, "name": "Alice", "count": 0, "tags": ["initial"]}}
["observe", "obs-002", "2025-01-15T10:05:00Z", 3]
["change", "/count", 5, "obs-002"]
["add", "/lastSeen", "2025-01-15", "obs-002"]
["change", "/tags", ["initial", "updated"], "obs-002"]
```
### 3. Continue Building History
```bash
json-archive v1.json.archive v3.json
```
**Expected output:**
```
Appended observation to: v1.json.archive
Changes detected: 5 modifications
```
**Final archive contents:**
```jsonl
{"version": 1, "created": "2025-01-15T10:00:00Z", "initial": {"id": 1, "name": "Alice", "count": 0, "tags": ["initial"]}}
["observe", "obs-002", "2025-01-15T10:05:00Z", 3]
["change", "/count", 5, "obs-002"]
["add", "/lastSeen", "2025-01-15", "obs-002"]
["change", "/tags", ["initial", "updated"], "obs-002"]
["observe", "obs-003", "2025-01-15T10:10:00Z", 4]
["change", "/name", "Bob", "obs-003"]
["change", "/count", 10, "obs-003"]
["change", "/tags", ["updated", "final"], "obs-003"]
["change", "/lastSeen", "2025-01-16", "obs-003"]
["add", "/score", 95, "obs-003"]
```
## Real-World Workflow Example
```bash
# Daily automation script
curl -L -o /tmp/user-profile.json https://example.com/123456/user-profile.json
json-archive \
/mnt/share/backups/user-profile.json.archive /tmp/user-profile.json \
--source "my-backup.sh:example.com:123456" \
--remove-source-files
```
**What you're seeing here:**
1. **Self-documenting archives**: The archive filename `user-profile.json.archive` contains the original filename, making it clear what data is being tracked.
2. **File cleanup with `--remove-source-files`**: This flag moves the JSON file into the archive rather than copying it, automatically cleaning up temporary files in shell scripts.
3. **Flexible file handling**: You don't have to remove the source file. For example, you could snapshot a `state.json` file that some process uses as a database without disrupting the running process.
4. **Source labeling for data integrity**: The `--source` flag creates a unique ID for the JSON object you're tracking. When appending to an existing archive, if the source label doesn't match, the tool refuses to run to protect against data loss. Use your own naming convention to create meaningful identifiers (URLs, script names, etc.).
## Most Useful Features Tour
### 1. Create Archive All At Once
```bash
json-archive v1.json v2.json v3.json
```
**What happens:**
- Creates `v1.json.archive` with all three observations
- Processes each file sequentially, computing deltas between them
- Equivalent to the step-by-step approach above
### 2. Remove Source Files After Archiving
```bash
json-archive v1.json --remove-source-files
```
**What happens:**
- Creates `v1.json.archive`
- **Deletes** `v1.json` after successful archive creation
- Useful for cleanup workflows where you only want the archive
### 3. Force Overwrite Existing Archive
```bash
json-archive --force v1.json
```
**What happens:**
- Overwrites `v1.json.archive` if it already exists
- Without `--force`, the command safely refuses to overwrite
### 4. Custom Output Location
```bash
json-archive -o my-custom.json.archive v1.json v2.json v3.json
```
**What happens:**
- Creates archive at specified path instead of inferring from input filename
- Useful when you want a different naming convention
### 5. Add Source Metadata
```bash
json-archive --source "yt-dlp:youtube:dQw4w9WgXcQ" v1.json v2.json v3.json
```
**What happens:**
- Adds source identifier to archive header
- Helps track where the data came from in complex workflows

1
docs/demo/v1.json Normal file

@@ -0,0 +1 @@
{"id": 1, "name": "Alice", "count": 0, "tags": ["initial"]}

1
docs/demo/v2.json Normal file

@@ -0,0 +1 @@
{"id": 1, "name": "Alice", "count": 5, "tags": ["initial", "updated"], "lastSeen": "2025-01-15"}

1
docs/demo/v3.json Normal file

@@ -0,0 +1 @@
{"id": 1, "name": "Bob", "count": 10, "tags": ["updated", "final"], "lastSeen": "2025-01-16", "score": 95}

288
docs/file-format-spec.md Normal file

@@ -0,0 +1,288 @@
# JSON Archive Format Specification v1.0
## Overview
A JSONL (JSON Lines) format for tracking the evolution of JSON objects over time using delta-based changes, with JSON Pointer (RFC 6901) for path notation. Inspired by the asciinema v3 format.
## File Structure
```
{header}
[event]
[event]
...
```
- First line: JSON header object
- Following lines: Event arrays or comments
- File extension: `.json.archive`
- Encoding: UTF-8
### File Extension Rationale
The `.json.archive` extension is designed for workflows where a command line program generates the same `filename.json` file daily. By appending `.archive` to the existing filename, minimal changes are needed to existing processes.
For example, if you have a daily process that generates `data.json`, the archive becomes `data.json.archive`. This makes shell pipeline operations intuitive:
```bash
json-archive data.json.archive data.json --remove-source-files
```
This command clearly shows that `json-archive` is moving the contents of `data.json` into the archive, making the operation self-documenting and requiring minimal changes to existing workflows.
## Header Format
```json
{
"version": 1,
"created": "ISO-8601 timestamp",
"source": "optional source identifier",
"initial": { ... initial object state ... },
"metadata": { ... optional metadata ... }
}
```
Required fields:
- `version`: Format version (currently 1)
- `created`: ISO-8601 timestamp of archive creation
- `initial`: Complete initial state of the tracked object
## Event Types
Each event is a JSON array with the event type as the first element.
### 1. Observe Event
Marks the beginning of an observation with a count of following changes.
```json
["observe", observationId, timestamp, changeCount]
```
- `observationId`: Unique string identifier (can be any string: UUID, timestamp, sequential ID, etc.)
- `timestamp`: ISO-8601 timestamp
- `changeCount`: Number of add/change/remove/move events that follow
### 2. Add Event
Adds a new field to the object.
```json
["add", path, value, observationId]
```
- `path`: JSON Pointer path to the field
- `value`: Any JSON value
- `observationId`: String referencing the preceding observe event
### 3. Change Event
Modifies an existing field value.
```json
["change", path, newValue, observationId]
```
- `path`: JSON Pointer path to the field
- `newValue`: New value
- `observationId`: String referencing the preceding observe event
### 4. Remove Event
Removes a field from the object.
```json
["remove", path, observationId]
```
- `path`: JSON Pointer path to the field
- `observationId`: String referencing the preceding observe event
### 5. Move Event
Reorders existing elements within an array. Moves are applied sequentially.
```json
["move", path, [[fromIndex, toIndex], ...], observationId]
```
- `path`: JSON Pointer path to the array
- `[[fromIndex, toIndex], ...]`: List of move operations applied in order
- `observationId`: String referencing the preceding observe event
**Important implementation detail:** Each move operation should:
1. First, insert a copy of the element at `fromIndex` into position `toIndex`
2. Then, remove the original element from its position (accounting for any shift caused by the insertion)
This approach prevents index calculation errors. When `fromIndex > toIndex`, the removal happens at `fromIndex + 1`. When `fromIndex < toIndex`, the removal happens at `fromIndex`.
Example: Given array [A, B, C, D]:
```json
["move", "/items", [[3, 1]], "obs-001"]
```
Step 1: Insert D at index 1 → [A, D, B, C, D]
Step 2: Remove from index 4 → [A, D, B, C]
For multiple moves on [A, B, C]:
```json
["move", "/items", [[2, 0], [2, 1]], "obs-001"]
```
First move: Insert C at 0 → [C, A, B, C], remove from 3 → [C, A, B]
Second move: Insert B at 1 → [C, B, A, B], remove from 3 → [C, B, A]
Note: Use `add` events for new elements, not `move`. Move is strictly for reordering existing elements.
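To make the insert-then-remove rule concrete, here is a minimal Python sketch of move application (names are illustrative, not taken from the tool's codebase):
```python
def apply_moves(array, moves):
    """Apply [[fromIndex, toIndex], ...] move operations sequentially.

    Each move inserts a copy of the element first, then removes the
    original, so the removal index must account for the insertion shift.
    """
    for from_idx, to_idx in moves:
        element = array[from_idx]
        array.insert(to_idx, element)   # step 1: insert the copy
        if from_idx > to_idx:
            array.pop(from_idx + 1)     # original shifted right by the insert
        else:
            array.pop(from_idx)         # original position unaffected
    return array

# Reproduces the examples above:
assert apply_moves(["A", "B", "C", "D"], [[3, 1]]) == ["A", "D", "B", "C"]
assert apply_moves(["A", "B", "C"], [[2, 0], [2, 1]]) == ["C", "B", "A"]
```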
### 6. Snapshot Event
Stores complete object state for faster seeking and append performance.
```json
["snapshot", observationId, timestamp, object]
```
- `observationId`: Unique string identifier for this snapshot
- `timestamp`: ISO-8601 timestamp
- `object`: Complete object state at this point
Snapshot events are interchangeable with observe+delta sequences: you can rewrite
`["observe", observationId, timestamp, N]` followed by N delta events into a single
`["snapshot", observationId, timestamp, object]`, and vice versa.
**Implementation reality:** The current tool naively appends snapshots without removing the
equivalent observe+delta sequences. This is lazy programming for implementation ease;
better tools should replace the delta sequence with the snapshot, not duplicate
the information.
The snapshot placement strategy is about append performance optimization:
1. **Append requires replay:** To append a new observation, you must know the current
state. The tool seeks backward from EOF to find the most recent snapshot, then
replays forward from there. Placing snapshots near the end of large archives
minimizes this replay cost.
2. **Snapshot placement is arbitrary:** Unlike fixed-interval I-frames in video codecs,
snapshots can be placed anywhere. Archives can be "losslessly re-encoded" by
repositioning snapshots based on access patterns and file size, moving or adding
snapshots closer to the end to reduce replay cost on append.
3. **Practical examples:**
- Small files (10 versions): No snapshots needed, full replay is cheap
- Large files (GBs of updates): Place snapshots near end-of-file so append only replays
recent deltas
- High-frequency appends: Consider periodic re-encoding to maintain snapshots near EOF
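The backward seek in point 1 can be sketched in a few lines of Python. This assumes the archive's non-comment lines are already in memory (a real implementation would scan the file tail without loading everything; the function name is illustrative):
```python
import json

def find_replay_start(lines):
    """Return (base_state, index) so that replaying events from `index`
    onward reproduces the current state.

    `lines` is the archive's non-comment lines. Scanning backward from
    the end finds the most recent snapshot, which bounds replay cost
    for appends regardless of total archive size.
    """
    for i in range(len(lines) - 1, 0, -1):
        event = json.loads(lines[i])
        if isinstance(event, list) and event[0] == "snapshot":
            return event[3], i + 1                 # snapshot stores the complete state
    return json.loads(lines[0])["initial"], 1      # no snapshot: replay everything
```
Replaying the remaining events from the returned index, using the reading algorithm below, reproduces the current state.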
## Identifier Format
The `observationId` field used throughout events is an arbitrary string that must be unique within the file. Common patterns include:
- UUIDs: `"550e8400-e29b-41d4-a716-446655440000"`
- Timestamps: `"1705325400.123"`
- Sequential IDs: `"obs-001"`, `"obs-002"`
- ISO timestamps: `"2025-01-15T10:05:00.123Z"`
- Any other string scheme that guarantees uniqueness
## Path Notation
Paths use JSON Pointer notation (RFC 6901):
- Object fields: `/user/profile/name`
- Array elements: `/items/0`
- Root level: `/fieldname`
- Empty string key: `/`
- Escape sequences: `~0` for `~`, `~1` for `/`
Examples:
- `/users/0/email` - First user's email
- `/metadata/tags/3` - Fourth tag in tags array
- `/strange~1key` - Key containing forward slash
## Comments
Lines beginning with `#` are treated as comments and ignored by parsers.
```
# This is a comment
["observe", "obs-001", "2025-01-15T10:00:00Z", 1]
```
## Example File
```json
{"version": 1, "created": "2025-01-15T10:00:00Z", "initial": {"id": 1, "views": 0, "tags": ["api", "v1"]}}
# First observation - using sequential ID
["observe", "obs-001", "2025-01-15T10:05:00Z", 2]
["add", "/title", "Hello World", "obs-001"]
["change", "/views", 10, "obs-001"]
# Array modification - using UUID
["observe", "550e8400-e29b-41d4-a716-446655440000", "2025-01-15T10:10:00Z", 3]
["change", "/views", 25, "550e8400-e29b-41d4-a716-446655440000"]
["add", "/tags/2", "public", "550e8400-e29b-41d4-a716-446655440000"]
["move", "/tags", [[2, 0]], "550e8400-e29b-41d4-a716-446655440000"]
# Snapshot - using timestamp as ID
["snapshot", "1705325700.456", "2025-01-15T10:15:00Z", {"id": 1, "views": 25, "title": "Hello World", "tags": ["public", "api", "v1"]}]
["observe", "obs-003", "2025-01-15T10:20:00Z", 2]
["add", "/likes", 5, "obs-003"]
["remove", "/title", "obs-003"]
```
## Reading Algorithm
1. Parse header from first line
2. Initialize state from `header.initial`
3. For each subsequent line:
- Skip if comment (`#`)
- Parse JSON array
- Apply event based on type:
- `observe`: Note changeCount for bounded reading
- `add`: Set field at path
- `change`: Update field at path
- `remove`: Delete field at path
- `move`: Apply array reordering operations sequentially
- `snapshot`: Optionally update state completely
**Important:** Observations in the archive file are not required to be in chronological order. The reader implementation should parse all events and sort them by timestamp if chronological ordering is needed for the use case.
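A minimal Python rendering of this algorithm, applying events in file order (JSON Pointer escape handling and timestamp sorting are omitted; function names are illustrative):
```python
import json

def parent_and_key(state, path):
    """Resolve a JSON Pointer to its containing node and final key.

    "~0"/"~1" escape decoding is omitted here; see the JSON Pointer
    sketch under CLI Implementation Notes below.
    """
    tokens = path.split("/")[1:]
    node = state
    for t in tokens[:-1]:
        node = node[int(t)] if isinstance(node, list) else node[t]
    key = tokens[-1]
    return node, (int(key) if isinstance(node, list) else key)

def replay(archive_path):
    """Reconstruct the latest state by applying events in file order."""
    with open(archive_path, encoding="utf-8") as f:
        lines = [ln for ln in f if ln.strip() and not ln.startswith("#")]
    state = json.loads(lines[0])["initial"]
    for line in lines[1:]:
        event = json.loads(line)
        kind = event[0]
        if kind == "observe":
            continue                       # changeCount could bound reads here
        if kind == "snapshot":
            state = event[3]               # optional fast-forward to stored state
            continue
        parent, key = parent_and_key(state, event[1])
        if kind == "add" and isinstance(parent, list):
            parent.insert(key, event[2])
        elif kind in ("add", "change"):
            parent[key] = event[2]
        elif kind == "remove":
            del parent[key]
        elif kind == "move":
            arr = parent[key]
            for frm, to in event[2]:       # insert-then-remove, as specified above
                arr.insert(to, arr[frm])
                arr.pop(frm + 1 if frm > to else frm)
    return state
```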
## CLI Implementation Notes
### Basic Command Structure
```bash
json-archive -o output.json.archive input1.json input2.json ...
```
### Processing Logic
1. Read first JSON file as initial state
2. For each subsequent JSON file:
- Compare with current state
- Generate observe event with timestamp
- Generate add/change/remove/move events for differences (using JSON Pointer paths)
- Update current state
3. Optionally insert snapshots based on:
- Number of observations (e.g., every 100)
- Size of accumulated deltas
- Time intervals
### JSON Pointer Implementation
- Implement RFC 6901 compliant JSON Pointer resolution
- Handle escape sequences: `~0` → `~`, `~1` → `/`
- Validate paths before applying operations
- Array indices must be valid integers
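As a hedge against the classic escape-order bug, here is a minimal sketch of token splitting (the function name is illustrative):
```python
def pointer_tokens(pointer):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens.

    Decode order matters: "~1" must be replaced before "~0", otherwise
    the token "~01" would wrongly decode to "/" instead of "~1".
    """
    if pointer == "":
        return []                          # "" refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    return [t.replace("~1", "/").replace("~0", "~")
            for t in pointer.split("/")[1:]]

assert pointer_tokens("/users/0/email") == ["users", "0", "email"]
assert pointer_tokens("/strange~1key") == ["strange/key"]
assert pointer_tokens("/a~01b") == ["a~1b"]
```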
### Diff Generation
**For objects:**
1. Recursively traverse both objects
2. For keys in new but not old: generate `add`
3. For keys in old but not new: generate `remove`
4. For keys in both with different values: generate `change`
**For arrays:**
1. Identify common elements (present in both arrays)
2. Generate `remove` for elements only in old array
3. Generate `add` for elements only in new array
4. Generate `change` for common elements with different values
5. Generate minimal `move` sequence for common elements in different positions
**Important:** New array elements should always use `add` operations at their final positions. The `move` operation is strictly for reordering existing elements. This keeps the semantics clear and the diff minimal.
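A simplified object-diff sketch follows. It treats arrays as atomic values (matching the whole-array `change` events in the demo README) and omits the element-level array diff and move minimization described above; the names are illustrative:
```python
def escape(key):
    # RFC 6901: escape "~" before "/" so round-tripping is lossless
    return key.replace("~", "~0").replace("/", "~1")

def diff(old, new, path="", obs_id="obs-XXX"):
    """Generate add/change/remove events between two JSON values."""
    if isinstance(old, dict) and isinstance(new, dict):
        events = []
        for key, value in new.items():
            child = f"{path}/{escape(key)}"
            if key not in old:
                events.append(["add", child, value, obs_id])
            else:
                events.extend(diff(old[key], value, child, obs_id))
        for key in old:
            if key not in new:
                events.append(["remove", f"{path}/{escape(key)}", obs_id])
        return events
    return [] if old == new else [["change", path, new, obs_id]]

# Diffing demo v1.json against v2.json yields the same three events
# shown in the demo README (possibly in a different order).
```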
## Design Rationale
- **Delta-based**: Minimizes file size for small incremental changes
- **Self-contained**: Header contains initial state, making file complete
- **Human-readable**: JSONL with comments for debugging
- **Standards-compliant**: Uses RFC 6901 JSON Pointer for path notation
- **Seekable**: Snapshots allow jumping to recent states without full replay
- **Simple parsing**: Line-based format with standard JSON
- **Change bounds**: Observe events include count for predictable reads

160
docs/info-command.md Normal file

@@ -0,0 +1,160 @@
# Info Command
Shows what's in an archive file: metadata, observation timeline, and file statistics.
## Basic Usage
```bash
json-archive info file.archive
json-archive info --output json file.archive
```
The command reads the entire archive file to collect observation data, so expect memory usage proportional to file size.
## Output Modes
### Human-readable (default)
```bash
json-archive info docs/demo/v1.json.archive
```
```
Archive: docs/demo/v1.json.archive
Created: Sun 15:23:40 28-Sep-2025
3 observations from Sun 15:23:40 28-Sep-2025 to Sun 15:23:40 28-Sep-2025
# Observation ID Date & Time Changes JSON Size
────────────────────────────────────────────────────────────────────────────────────────
0 (initial) Sun 15:23:40 28-Sep-2025 - 52 bytes
1 obs-c4636428-1400-44... Sun 15:23:40 28-Sep-2025 3 86 bytes
2 obs-389b4a7c-4d78-42... Sun 15:23:40 28-Sep-2025 7 94 bytes
Total archive size: 1.1 KB (0 snapshots)
To get the JSON value at a specific observation:
json-archive state --index <#> docs/demo/v1.json.archive
json-archive state --id <observation-id> docs/demo/v1.json.archive
Examples:
json-archive state --index 0 docs/demo/v1.json.archive # Get initial state
json-archive state --index 2 docs/demo/v1.json.archive # Get state after observation 2
```
Use this mode when you want to quickly understand what's in an archive or debug observation timelines.
### JSON output
Use JSON output for scripting, monitoring, or feeding data into other tools.
```bash
json-archive info --output json docs/demo/v1.json.archive
```
```json
{
"archive": "docs/demo/v1.json.archive",
"created": "2025-09-28T15:23:40.633960+00:00",
"file_size": 1096,
"snapshot_count": 0,
"observations": [
{
"index": 0,
"id": "initial",
"timestamp": "2025-09-28T15:23:40.633960+00:00",
"changes": 0,
"json_size": 52
},
{
"index": 1,
"id": "obs-c4636428-1400-44d7-b30f-1c080c608e3c",
"timestamp": "2025-09-28T15:23:40.634520+00:00",
"changes": 3,
"json_size": 86
}
]
}
```
## Field Reference
### Human-readable fields
- **#**: Index number (0-indexed) for use with `--index` flag
- **Observation ID**: Unique identifier for use with `--id` flag
- **Date & Time**: When the observation was recorded
- **Changes**: Number of fields modified (dash for initial state)
- **JSON Size**: Size in bytes of the reconstructed JSON at this observation
### JSON output fields
- **archive**: File path
- **created**: Archive creation timestamp (ISO-8601)
- **file_size**: Archive file size in bytes
- **snapshot_count**: Number of snapshots for seeking optimization
- **observations[]**: Array of observation metadata
- **index**: 0-based position for `--index` access
- **id**: Unique ID for `--id` access ("initial" for index 0)
- **timestamp**: ISO-8601 timestamp
- **changes**: Change count (0 for initial)
- **json_size**: Reconstructed JSON size in bytes
## Practical Use Cases
### Monitoring archive growth
```bash
# Archive size and observation count
json-archive info --output json data.json.archive | jq '{file_size, observation_count: (.observations | length)}'
```
### Finding large observations
```bash
# Observations with JSON size > 1KB
json-archive info --output json data.json.archive | jq '.observations[] | select(.json_size > 1024)'
```
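The same queries can run from a script. A sketch assuming `json-archive` is on `PATH`; the alert threshold is an arbitrary example:
```python
import json
import subprocess

def archive_stats(path):
    """Fetch archive size and observation count via `info --output json`."""
    result = subprocess.run(
        ["json-archive", "info", "--output", "json", path],
        capture_output=True, text=True, check=True,
    )
    info = json.loads(result.stdout)
    return info["file_size"], len(info["observations"])

size, count = archive_stats("data.json.archive")
if size > 50 * 1024 * 1024:  # 50 MB: illustrative threshold only
    print(f"{size} bytes across {count} observations; consider re-encoding")
```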
## Performance Characteristics
- **Memory usage**: Loads entire archive into memory to reconstruct states
- **I/O pattern**: Single full file read, no seeking
- **CPU usage**: Minimal - mostly JSON parsing and formatting
For archives larger than available RAM, consider using [`json-archive state`](state-command.md) with specific observation IDs instead of getting full timeline info.
## Error Cases
### Missing archive file
```bash
$ json-archive info nonexistent.archive
error E051: Path not found
I couldn't find the archive file: nonexistent.archive
Make sure the file path is correct and the file exists.
Check for typos in the filename.
```
### Corrupt archive header
```bash
$ json-archive info corrupted.archive
error E003: Missing header
I couldn't parse the header: unexpected character at line 1
The archive file appears to be corrupted or not a valid json-archive file.
```
### Invalid output format
```bash
$ json-archive info --output xml data.json.archive
# Silently falls back to human-readable format
# (no validation on output parameter)
```
## Known Issues
**Command reference bug**: The human-readable output currently shows `json-archive get` commands in the usage examples, but the correct command is `json-archive state`. This is a display bug - the actual functionality is unaffected.
## See Also
- [`json-archive state`](state-command.md) - Retrieve JSON data at specific observations
- [File Format Specification](file-format-spec.md) - Archive format details
- [Getting Started Guide](../README.md) - Basic usage examples

202
docs/state-command.md Normal file

@@ -0,0 +1,202 @@
# State Command
The `json-archive state` command retrieves the JSON state at a specific observation.
## Flags and Semantics
Primary access methods:
- `--id <observation-id>`: Get state at a specific observation (unambiguous, primary method)
- `--index <n>`: Get state at the Nth observation in file order (convenient with output from the info command, with a caveat)
Timestamp-based access:
- `--as-of <timestamp>`: Most recent observation with timestamp ≤ given time ("state as of this moment")
- `--before <timestamp>`: Most recent observation with timestamp < given time (strictly before)
- `--after <timestamp>`: Earliest observation with timestamp > given time (first state after)
- `--latest`: Most recent observation by timestamp (default if no flags given)
## Access Methods
You must specify exactly one access method to identify which observation's state to retrieve:
### By Observation ID (Recommended)
```bash
json-archive state --id <OBSERVATION_ID> file.archive
```
Gets the state at the observation with the specified ID. This is the most unambiguous method since observation IDs are unique within the archive.
Example:
```bash
json-archive state --id obs-c4636428-1400-44d7-b30f-1c080c608e3c data.json.archive
```
### By File Index
```bash
json-archive state --index <INDEX> file.archive
```
Gets the state at the Nth observation in file order (0-indexed). **Note:** Observations are not guaranteed to be in chronological order in the file.
Example:
```bash
json-archive state --index 0 data.json.archive # Initial state (first observation)
json-archive state --index 1 data.json.archive # Second observation
```
### By Timestamp
#### As-Of Timestamp
```bash
json-archive state --as-of <TIMESTAMP> file.archive
```
Gets the state from the most recent observation with timestamp ≤ the given time.
Example:
```bash
json-archive state --as-of "2025-01-15T10:05:00Z" data.json.archive
```
#### Right Before Timestamp
```bash
json-archive state --before <TIMESTAMP> file.archive
```
Gets the state from the most recent observation with timestamp < the given time (strictly before).
Example:
```bash
json-archive state --before "2025-01-15T10:05:00Z" data.json.archive
```
#### After Timestamp
```bash
json-archive state --after <TIMESTAMP> file.archive
```
Gets the state from the earliest observation with timestamp > the given time.
Example:
```bash
json-archive state --after "2025-01-15T10:05:00Z" data.json.archive
```
### Latest State (Default)
```bash
json-archive state --latest file.archive
# Or simply:
json-archive state file.archive
```
Gets the state from the observation with the latest timestamp. This is the default behavior when no other access method is specified.
## Timestamp Format
All timestamps must be in ISO-8601 format with UTC timezone:
- `2025-01-15T10:05:00Z`
- `2025-01-15T10:05:00.123Z` (with milliseconds)
## Output
The command outputs the JSON state as pretty-printed JSON to stdout:
```json
{
"count": 10,
"id": 1,
"lastSeen": "2025-01-16",
"name": "Bob",
"score": 95,
"tags": [
"initial",
"final"
]
}
```
## Error Cases
### Non-existent Observation ID
```bash
$ json-archive state --id nonexistent-id file.archive
error E030: Non-existent observation ID
I couldn't find an observation with ID 'nonexistent-id'
Use 'json-archive info' to see available observation IDs
```
### Out-of-bounds Index
```bash
$ json-archive state --index 10 file.archive
error E053: Array index out of bounds
Index 10 is out of bounds. The archive has 3 observations (0-2)
Use 'json-archive info' to see available observation indices
```
### Invalid Timestamp
```bash
$ json-archive state --as-of "invalid-timestamp" file.archive
error W012: Invalid timestamp
I couldn't parse the timestamp 'invalid-timestamp'. Please use ISO-8601 format like '2025-01-15T10:05:00Z'
```
### No Observations Match Timestamp
```bash
$ json-archive state --as-of "2020-01-01T00:00:00Z" file.archive
error E051: Path not found
No observations found as of 2020-01-01 00:00:00 UTC
Try using --after to find the first observation after this time
```
### Multiple Access Methods
```bash
$ json-archive state --id obs-123 --index 1 file.archive
error E022: Wrong field count
Please specify only one access method (--id, --index, --as-of, --before, --after, or --latest)
Examples:
json-archive state --id obs-123 file.archive
json-archive state --index 2 file.archive
json-archive state --as-of "2025-01-15T10:05:00Z" file.archive
```
## Implementation Details and Design Rationale
### Why Timestamp Access Uses More Memory
The timestamp-based flags (`--as-of`, `--before`, `--after`, `--latest`) are **memory-intensive** because they must read the entire archive file into memory. Here's why:
**The Problem**: Since observations are not guaranteed to be in chronological order in the file, we cannot use efficient seeking or leverage snapshots for optimization. To find "the most recent observation ≤ timestamp", we must:
1. Read every single observation in the file
2. Parse all their timestamps
3. Sort them chronologically in memory
4. Find the target observation
5. Replay all events to reconstruct the state
This is fundamentally different from index-based access (`--index`, `--id`), which can potentially use snapshots and delta compression for efficiency.
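For concreteness, here is a sketch of the `--as-of` selection steps above (names are illustrative; replaying to the selected observation's state, step 5, is omitted):
```python
import json
from datetime import datetime

def observation_as_of(archive_path, cutoff_iso):
    """Most recent observation ID with timestamp <= cutoff, or None.

    Mirrors the steps above: read the whole file, collect every
    observation timestamp, sort in memory, then select.
    """
    cutoff = datetime.fromisoformat(cutoff_iso.replace("Z", "+00:00"))
    with open(archive_path, encoding="utf-8") as f:
        lines = [ln for ln in f if ln.strip() and not ln.startswith("#")]
    observations = []
    for line in lines[1:]:
        event = json.loads(line)
        if event[0] in ("observe", "snapshot"):
            ts = datetime.fromisoformat(event[2].replace("Z", "+00:00"))
            observations.append((ts, event[1]))
    observations.sort()  # file order is not chronological order
    eligible = [obs_id for ts, obs_id in observations if ts <= cutoff]
    return eligible[-1] if eligible else None
```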
### Why Files Aren't Chronologically Sorted
The tool accepts observations in any order because data collection is ad hoc. You don't always have perfect control over when or how observations are collected.
**Real scenario**: You have JSON files scattered across multiple hard drives from different systems. You can't mount all drives simultaneously, so you load them one at a time and absorb the data as you go. The observations end up out of chronological order, but that's fine - you consolidate first, then organize later.
**Another scenario**: You inherit data from multiple APIs and systems that were collecting observations over time. The timestamps are there, but the files weren't collected in chronological order. You want to merge everything into one archive without having to pre-sort.
**The approach**: "Octopus-style merging" - absorb all the data as-is, then make sense of it incrementally. The file format doesn't fight this workflow.
### Design Philosophy
Timestamp access works the way it does because the tool prioritizes data consolidation over performance optimization. When you use `--as-of` or `--latest`, you're asking the tool to figure out chronological relationships that may not exist in the file structure.
Index/ID access is direct - you're referencing observations by their position in the file or unique identifier. This is efficient because it doesn't require chronological analysis.
## See Also
- [`json-archive info`](info-command.md) - View archive metadata and observation timeline
- [File Format Specification](file-format-spec.md) - Details about the archive format
- [Getting Started Guide](../README.md) - Basic usage examples