refactor: increase parse error diagnostic coverage

Replace several of the generic serde_json parse error messages with
detailed descriptions of what went wrong.
nobody 2025-09-30 09:14:54 -07:00
commit 07e604ac25
Signed by: GrocerPublishAgent
GPG key ID: 43B1C298CDDE181C
4 changed files with 836 additions and 561 deletions


@@ -201,15 +201,28 @@ The format uses only standard JSON. No special parsing required.
## Contributing
Contributions are welcome! However, you will need to sign a contributor license agreement with Peoples Grocers before we can accept your pull request.
I promise to fix bugs quickly, but the overall design prioritizes being hackable over raw performance. This means many obvious performance improvements won't be implemented as they would compromise the tool's simplicity and inspectability.
Areas where contributions are especially appreciated:
- Additional CLI commands (validate, info, extract)
- Better diff algorithms for arrays
- More compression format support
- Bug fixes and edge case handling
## License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This means:
- You can use, modify, and distribute this software
- If you modify and distribute it, you must share your changes under the same license
- If you run a modified version on a server or embed it in a larger system, you must make the entire system's source code available to users
- No TiVoization - hardware restrictions that prevent users from running modified versions are prohibited
The AGPL ensures that improvements to this tool remain open and available to everyone, even when used in hosted services or embedded systems.
---
*Built with Rust for reliability and performance. Designed to be simple enough to understand, powerful enough to be useful.*

src/event_deserialize.rs Normal file

@@ -0,0 +1,570 @@
// json-archive is a tool for tracking JSON file changes over time
// Copyright (C) 2025 Peoples Grocers LLC
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU Affero General Public License as published
// by the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Affero General Public License for more details.
//
// You should have received a copy of the GNU Affero General Public License
// along with this program. If not, see <https://www.gnu.org/licenses/>.
//
// To purchase a license under different terms contact admin@peoplesgrocers.com
// To request changes, report bugs, or give user feedback contact
// marxism@peoplesgrocers.com
//
//! Event deserialization with diagnostic collection.
//!
//! ## Why this exists
//!
//! The .json.archive format uses arrays for events because that's compact and easy to work
//! with in JavaScript: `["add", "/path", value, "obs-id"]`. The format is human-editable
//! since people might want to experiment with it or fix issues by hand.
//!
//! Two problems in Rust:
//!
//! 1. **Array-based format**: Serde derive expects named struct fields. Deserializing from
//! positional arrays into structs requires custom Visitor implementation.
//!
//! 2. **Detailed error messages**: Goal is Elm-style diagnostics that show exactly what went
//! wrong, what was expected, and how to fix it. Serde's Deserialize trait only allows
//! returning string errors. To generate detailed diagnostics (with codes, severity levels,
//! advice), we need to manually implement the Visitor and collect errors in a wrapper type
//! instead of failing immediately. The wrapper gives us access to which field is being
//! parsed so we can say "expected observation ID at position 3" instead of "parse error".
//!
//! ## Library search
//!
//! Spent 30 minutes looking for existing solutions. Checked:
//! - serde_path_to_error: Adds field path context but still returns string errors
//! - figment: Configuration library, but it sounded like it could be used only for diagnostics
//! - config/serde_value: Similar issue
//! - json5: Relaxed JSON syntax, not diagnostic-focused
//! - miette: a diagnostic library for Rust. It includes a series of
//! traits/protocols that allow you to hook into its error reporting facilities,
//! and even write your own error reports. This is better than my home-built
//! Diagnostic struct, but does not help me with deserialization.
//!
//! Found no library that handles both array deserialization and rich diagnostic collection.
//! This could probably be automated or turned into a library, but for a simple format it was
//! faster to implement by hand. Also serves as exploration of what diagnostic-driven parsing
//! costs in terms of code.
//!
//! ## What this does
//!
//! EventDeserializer wraps Event and collects diagnostics during parsing. It implements
//! Deserialize with a custom Visitor that validates each array position and populates the
//! diagnostics vec instead of returning errors. The calling code (reader.rs) attaches
//! location information (filename, line number) after deserialization.
use serde::de::{Deserialize, Deserializer, SeqAccess, Visitor};
use serde_json::Value;
use std::fmt;
use chrono::{DateTime, Utc};

use crate::diagnostics::{Diagnostic, DiagnosticCode, DiagnosticLevel};
use crate::events::Event;

#[derive(Debug, Default)]
pub struct EventDeserializer {
    pub event: Option<Event>,
    pub diagnostics: Vec<Diagnostic>,
}

impl EventDeserializer {
    pub fn new() -> Self {
        Self::default()
    }

    fn add_diagnostic(&mut self, level: DiagnosticLevel, code: DiagnosticCode, message: String) {
        self.diagnostics.push(Diagnostic::new(level, code, message));
    }
}

impl<'de> Deserialize<'de> for EventDeserializer {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        deserializer.deserialize_seq(EventVisitor::new())
    }
}

struct EventVisitor {
    deserializer: EventDeserializer,
}

impl EventVisitor {
    fn new() -> Self {
        Self {
            deserializer: EventDeserializer::new(),
        }
    }
}

impl<'de> Visitor<'de> for EventVisitor {
    type Value = EventDeserializer;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("an array representing an event")
    }

    fn visit_seq<A>(mut self, mut seq: A) -> Result<Self::Value, A::Error>
    where
        A: SeqAccess<'de>,
    {
        let mut elements: Vec<Value> = Vec::new();
        while let Some(elem) = seq.next_element::<Value>()? {
            elements.push(elem);
        }

        if elements.is_empty() {
            self.deserializer.add_diagnostic(
                DiagnosticLevel::Fatal,
                DiagnosticCode::WrongFieldCount,
                "I found an empty array, but events must have at least a string type field as first element.".to_string(),
            );
            return Ok(self.deserializer);
        }

        let event_type = match elements[0].as_str() {
            Some(t) => t,
            None => {
                self.deserializer.add_diagnostic(
                    DiagnosticLevel::Fatal,
                    DiagnosticCode::WrongFieldType,
                    "I expected the first element of an event to be a string event type.".to_string(),
                );
                return Ok(self.deserializer);
            }
        };

        match event_type {
            "observe" => {
                if elements.len() != 4 {
                    self.deserializer.add_diagnostic(
                        DiagnosticLevel::Fatal,
                        DiagnosticCode::WrongFieldCount,
                        format!("I expected an observe event to have 4 fields, but found {}.", elements.len()),
                    );
                    return Ok(self.deserializer);
                }
                let id = match elements[1].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the observation ID to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let timestamp = match elements[2].as_str() {
                    Some(s) => match s.parse::<DateTime<Utc>>() {
                        Ok(dt) => dt,
                        Err(_) => {
                            self.deserializer.add_diagnostic(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::WrongFieldType,
                                "I expected the timestamp to be a valid ISO-8601 datetime string.".to_string(),
                            );
                            return Ok(self.deserializer);
                        }
                    },
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the timestamp to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let change_count = match elements[3].as_u64() {
                    Some(n) => n as usize,
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the change count to be a non-negative integer.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                self.deserializer.event = Some(Event::Observe {
                    observation_id: id,
                    timestamp,
                    change_count,
                });
            }
            "add" => {
                if elements.len() != 4 {
                    self.deserializer.add_diagnostic(
                        DiagnosticLevel::Fatal,
                        DiagnosticCode::WrongFieldCount,
                        format!("I expected an add event to have 4 fields, but found {}.", elements.len()),
                    );
                    return Ok(self.deserializer);
                }
                let path = match elements[1].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the path to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let value = elements[2].clone();
                let observation_id = match elements[3].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the observation ID to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                self.deserializer.event = Some(Event::Add {
                    path,
                    value,
                    observation_id,
                });
            }
            "change" => {
                if elements.len() != 4 {
                    self.deserializer.add_diagnostic(
                        DiagnosticLevel::Fatal,
                        DiagnosticCode::WrongFieldCount,
                        format!("I expected a change event to have 4 fields, but found {}.", elements.len()),
                    );
                    return Ok(self.deserializer);
                }
                let path = match elements[1].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the path to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let new_value = elements[2].clone();
                let observation_id = match elements[3].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the observation ID to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                self.deserializer.event = Some(Event::Change {
                    path,
                    new_value,
                    observation_id,
                });
            }
            "remove" => {
                if elements.len() != 3 {
                    self.deserializer.add_diagnostic(
                        DiagnosticLevel::Fatal,
                        DiagnosticCode::WrongFieldCount,
                        format!("I expected a remove event to have 3 fields, but found {}.", elements.len()),
                    );
                    return Ok(self.deserializer);
                }
                let path = match elements[1].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the path to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let observation_id = match elements[2].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the observation ID to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                self.deserializer.event = Some(Event::Remove {
                    path,
                    observation_id,
                });
            }
            "move" => {
                if elements.len() != 4 {
                    self.deserializer.add_diagnostic(
                        DiagnosticLevel::Fatal,
                        DiagnosticCode::WrongFieldCount,
                        format!("I expected a move event to have 4 fields, but found {}.", elements.len()),
                    );
                    return Ok(self.deserializer);
                }
                let path = match elements[1].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the path to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let moves = match self.parse_moves(&elements[2]) {
                    Ok(moves) => moves,
                    Err(err_msg) => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            err_msg,
                        );
                        return Ok(self.deserializer);
                    }
                };
                let observation_id = match elements[3].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the observation ID to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                self.deserializer.event = Some(Event::Move {
                    path,
                    moves,
                    observation_id,
                });
            }
            "snapshot" => {
                if elements.len() != 4 {
                    self.deserializer.add_diagnostic(
                        DiagnosticLevel::Fatal,
                        DiagnosticCode::WrongFieldCount,
                        format!("I expected a snapshot event to have 4 fields, but found {}.", elements.len()),
                    );
                    return Ok(self.deserializer);
                }
                let observation_id = match elements[1].as_str() {
                    Some(s) => s.to_string(),
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the observation ID to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let timestamp = match elements[2].as_str() {
                    Some(s) => match s.parse::<DateTime<Utc>>() {
                        Ok(dt) => dt,
                        Err(_) => {
                            self.deserializer.add_diagnostic(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::WrongFieldType,
                                "I expected the timestamp to be a valid ISO-8601 datetime string.".to_string(),
                            );
                            return Ok(self.deserializer);
                        }
                    },
                    None => {
                        self.deserializer.add_diagnostic(
                            DiagnosticLevel::Fatal,
                            DiagnosticCode::WrongFieldType,
                            "I expected the timestamp to be a string.".to_string(),
                        );
                        return Ok(self.deserializer);
                    }
                };
                let object = elements[3].clone();
                self.deserializer.event = Some(Event::Snapshot {
                    observation_id,
                    timestamp,
                    object,
                });
            }
            _ => {
                self.deserializer.add_diagnostic(
                    DiagnosticLevel::Warning,
                    DiagnosticCode::UnknownEventType,
                    format!("I found an unknown event type: '{}'", event_type),
                );
            }
        }

        Ok(self.deserializer)
    }
}

impl EventVisitor {
    fn parse_moves(&mut self, moves_value: &Value) -> Result<Vec<(usize, usize)>, String> {
        let moves_array = match moves_value.as_array() {
            Some(arr) => arr,
            None => {
                return Err("I expected the moves to be an array of [from, to] pairs.".to_string());
            }
        };
        let mut moves = Vec::new();
        for move_pair in moves_array {
            let pair = match move_pair.as_array() {
                Some(p) if p.len() == 2 => p,
                _ => {
                    return Err("I expected each move to be a [from, to] pair.".to_string());
                }
            };
            let from_idx = match pair[0].as_u64() {
                Some(i) => i as usize,
                None => {
                    return Err("I expected the 'from' index to be a non-negative integer.".to_string());
                }
            };
            let to_idx = match pair[1].as_u64() {
                Some(i) => i as usize,
                None => {
                    return Err("I expected the 'to' index to be a non-negative integer.".to_string());
                }
            };
            moves.push((from_idx, to_idx));
        }
        Ok(moves)
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;

    #[test]
    fn test_deserialize_observe_event() {
        let json = json!(["observe", "obs-1", "2025-01-01T00:00:00Z", 1]);
        let result: Result<EventDeserializer, _> = serde_json::from_value(json);
        assert!(result.is_ok());
        let deserializer = result.unwrap();
        assert!(deserializer.diagnostics.is_empty());
        assert!(matches!(
            deserializer.event,
            Some(Event::Observe { observation_id, timestamp: _, change_count })
                if observation_id == "obs-1" && change_count == 1
        ));
    }

    #[test]
    fn test_deserialize_add_event() {
        let json = json!(["add", "/count", 42, "obs-1"]);
        let result: Result<EventDeserializer, _> = serde_json::from_value(json);
        assert!(result.is_ok());
        let deserializer = result.unwrap();
        assert!(deserializer.diagnostics.is_empty());
        assert!(matches!(
            deserializer.event,
            Some(Event::Add { path, value, observation_id })
                if path == "/count" && value == json!(42) && observation_id == "obs-1"
        ));
    }

    #[test]
    fn test_deserialize_invalid_event_type() {
        let json = json!(["invalid", "some", "data"]);
        let result: Result<EventDeserializer, _> = serde_json::from_value(json);
        assert!(result.is_ok());
        let deserializer = result.unwrap();
        assert_eq!(deserializer.diagnostics.len(), 1);
        assert_eq!(deserializer.diagnostics[0].code, DiagnosticCode::UnknownEventType);
        assert!(deserializer.event.is_none());
    }

    #[test]
    fn test_deserialize_wrong_field_count() {
        let json = json!(["observe", "obs-1"]);
        let result: Result<EventDeserializer, _> = serde_json::from_value(json);
        assert!(result.is_ok());
        let deserializer = result.unwrap();
        assert_eq!(deserializer.diagnostics.len(), 1);
        assert_eq!(deserializer.diagnostics[0].code, DiagnosticCode::WrongFieldCount);
        assert!(deserializer.event.is_none());
    }

    #[test]
    fn test_deserialize_move_event() {
        let json = json!(["move", "/items", [[0, 2], [1, 0]], "obs-1"]);
        let result: Result<EventDeserializer, _> = serde_json::from_value(json);
        assert!(result.is_ok());
        let deserializer = result.unwrap();
        assert!(deserializer.diagnostics.is_empty());
        assert!(matches!(
            deserializer.event,
            Some(Event::Move { path, moves, observation_id })
                if path == "/items" && moves == vec![(0, 2), (1, 0)] && observation_id == "obs-1"
        ));
    }
}
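The collect-don't-fail pattern in this file can be sketched in miniature without serde. The names below (`Field`, `MiniEvent`, `parse_remove`) are hypothetical and not part of this crate; the point is just that validation hands back collected diagnostics alongside an optional result, instead of aborting with a bare error string at the first problem.

```rust
// A tiny stand-in for a JSON value: events arrive as positional arrays,
// so each position is validated by index, as in the real Visitor.
#[derive(Debug)]
enum Field {
    Str(String),
    Int(u64),
}

#[derive(Debug)]
enum MiniEvent {
    Remove { path: String, observation_id: String },
}

// Mirrors the "remove" arm of visit_seq: a wrong field count or field type
// produces a human-readable diagnostic rather than a serde error.
fn parse_remove(fields: &[Field]) -> (Option<MiniEvent>, Vec<String>) {
    let mut diagnostics = Vec::new();
    if fields.len() != 3 {
        diagnostics.push(format!(
            "I expected a remove event to have 3 fields, but found {}.",
            fields.len()
        ));
        return (None, diagnostics);
    }
    let path = match &fields[1] {
        Field::Str(s) => s.clone(),
        _ => {
            diagnostics.push("I expected the path to be a string.".to_string());
            return (None, diagnostics);
        }
    };
    let observation_id = match &fields[2] {
        Field::Str(s) => s.clone(),
        _ => {
            diagnostics.push("I expected the observation ID to be a string.".to_string());
            return (None, diagnostics);
        }
    };
    (Some(MiniEvent::Remove { path, observation_id }), diagnostics)
}

fn main() {
    // A well-formed event: some event, no diagnostics.
    let ok = parse_remove(&[
        Field::Str("remove".into()),
        Field::Str("/items/0".into()),
        Field::Str("obs-1".into()),
    ]);
    assert!(ok.0.is_some());
    assert!(ok.1.is_empty());

    // A malformed event: no event, one diagnostic explaining the position.
    let bad = parse_remove(&[
        Field::Str("remove".into()),
        Field::Int(7),
        Field::Str("obs-1".into()),
    ]);
    assert!(bad.0.is_none());
    assert_eq!(bad.1.len(), 1);
}
```

The calling code stays free to attach location information (filename, line number) to each collected message afterwards, which is exactly what reader.rs does with the real `EventDeserializer`.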


@@ -23,6 +23,7 @@ pub mod archive;
pub mod detection;
pub mod diagnostics;
pub mod diff;
pub mod event_deserialize;
pub mod events;
pub mod flags;
pub mod pointer;


@@ -26,7 +26,8 @@ use std::io::{BufRead, BufReader};
use std::path::Path;
use crate::diagnostics::{Diagnostic, DiagnosticCode, DiagnosticCollector, DiagnosticLevel};
use crate::event_deserialize::EventDeserializer;
use crate::events::{Event, Header};
use crate::pointer::JsonPointer;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
@@ -165,8 +166,8 @@ impl ArchiveReader {
                continue;
            }
            let event_deserializer = match serde_json::from_str::<EventDeserializer>(&line) {
                Ok(d) => d,
                Err(e) => {
                    diagnostics.add(
                        Diagnostic::new(
@@ -188,44 +189,28 @@ impl ArchiveReader {
                }
            };
            // Add any diagnostics from deserialization with location info
            for diagnostic in event_deserializer.diagnostics {
                diagnostics.add(
                    diagnostic
                        .with_location(self.filename.clone(), line_number)
                        .with_snippet(format!("{} | {}", line_number, line)),
                );
            }
            // Continue processing to collect additional errors before failing.
            // Even though this function must now return an error, we continue to help
            // the user identify all issues in the file at once rather than one at a time.
            let event = match event_deserializer.event {
                Some(e) => e,
                None => {
                    assert!(diagnostics.has_fatal(), "Expected a fatal diagnostic when deserialization fails");
                    continue
                },
            };
            match event {
                Event::Observe { observation_id, timestamp: _, change_count } => {
                    if let Some((_obs_id, obs_line, expected_count)) = &current_observation {
                        if events_in_observation != *expected_count {
                            diagnostics.add(
@@ -247,32 +232,12 @@ impl ArchiveReader {
                        }
                    }
                    if seen_observations.contains(&observation_id) {
                        diagnostics.add(
                            Diagnostic::new(
                                DiagnosticLevel::Warning,
                                DiagnosticCode::DuplicateObservationId,
                                format!("I found a duplicate observation ID: '{}'", observation_id),
                            )
                            .with_location(self.filename.clone(), line_number)
                            .with_advice(
@@ -283,42 +248,23 @@ impl ArchiveReader {
                        );
                    }
                    seen_observations.insert(observation_id.clone());
                    current_observation = Some((observation_id, line_number, change_count));
                    events_in_observation = 0;
                    observation_count += 1;
                }
                Event::Add { path, value, observation_id } => {
                    events_in_observation += 1;
                    if self.mode == ReadMode::FullValidation
                        && !seen_observations.contains(&observation_id)
                    {
                        diagnostics.add(
                            Diagnostic::new(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::NonExistentObservationId,
                                format!("I found a reference to observation '{}', but I haven't seen an observe event with that ID yet.", observation_id)
                            )
                            .with_location(self.filename.clone(), line_number)
                            .with_snippet(format!("{} | {}", line_number, line))
@@ -330,165 +276,86 @@ impl ArchiveReader {
                        continue;
                    }
                    if let Err(diag) = apply_add(&mut state, &path, value) {
                        diagnostics.add(diag.with_location(self.filename.clone(), line_number));
                        continue;
                    }
                }
                Event::Change { path, new_value, observation_id } => {
                    events_in_observation += 1;
                    if self.mode == ReadMode::FullValidation
                        && !seen_observations.contains(&observation_id)
                    {
                        diagnostics.add(
                            Diagnostic::new(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::NonExistentObservationId,
                                format!("I found a reference to observation '{}', but I haven't seen an observe event with that ID yet.", observation_id)
                            )
                            .with_location(self.filename.clone(), line_number)
                        );
                        continue;
                    }
                    if let Err(diag) = apply_change(&mut state, &path, new_value) {
                        diagnostics.add(diag.with_location(self.filename.clone(), line_number));
                        continue;
                    }
                }
                Event::Remove { path, observation_id } => {
                    events_in_observation += 1;
                    if self.mode == ReadMode::FullValidation
                        && !seen_observations.contains(&observation_id)
                    {
                        diagnostics.add(
                            Diagnostic::new(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::NonExistentObservationId,
                                format!("I found a reference to observation '{}', but I haven't seen an observe event with that ID yet.", observation_id)
                            )
                            .with_location(self.filename.clone(), line_number)
                        );
                        continue;
                    }
                    if let Err(diag) = apply_remove(&mut state, &path) {
                        diagnostics.add(diag.with_location(self.filename.clone(), line_number));
                        continue;
                    }
                }
                Event::Move { path, moves, observation_id } => {
                    events_in_observation += 1;
                    if self.mode == ReadMode::FullValidation
                        && !seen_observations.contains(&observation_id)
                    {
                        diagnostics.add(
                            Diagnostic::new(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::NonExistentObservationId,
                                format!("I found a reference to observation '{}', but I haven't seen an observe event with that ID yet.", observation_id)
                            )
                            .with_location(self.filename.clone(), line_number)
                        );
                        continue;
                    }
                    if let Err(diag) = apply_move(&mut state, &path, moves) {
                        diagnostics.add(diag.with_location(self.filename.clone(), line_number));
                        continue;
                    }
                }
                Event::Snapshot { observation_id: _, timestamp: _, object } => {
                    if self.mode == ReadMode::FullValidation && state != object {
                        diagnostics.add(
                            Diagnostic::new(
                                DiagnosticLevel::Fatal,
                                DiagnosticCode::SnapshotStateMismatch,
                                "I found a snapshot whose state doesn't match the replayed state up to this point.".to_string()
                            )
@@ -502,37 +369,8 @@ impl ArchiveReader {
                        );
                    }
                    state = object;
                }
            }
        }
@@ -630,120 +468,49 @@ impl ArchiveReader {
        }
    }
}

fn apply_add(state: &mut Value, path: &str, value: Value) -> Result<(), Diagnostic> {
    let pointer = JsonPointer::new(path).map_err(|diag| {
        diag.with_advice(
            "JSON Pointer paths must start with '/' and use '/' to separate segments.\n\
            Special characters: use ~0 for ~ and ~1 for /"
                .to_string()
        )
    })?;
    pointer.set(state, value).map_err(|diag| {
        diag.with_advice(
            "For add operations, the parent path must exist. \
            For example, to add /a/b/c, the paths /a and /a/b must already exist."
                .to_string()
        )
    })
}

fn apply_change(state: &mut Value, path: &str, new_value: Value) -> Result<(), Diagnostic> {
    let pointer = JsonPointer::new(path)?;
    pointer.set(state, new_value)?;
    Ok(())
}

fn apply_remove(state: &mut Value, path: &str) -> Result<(), Diagnostic> {
    let pointer = JsonPointer::new(path)?;
    pointer.remove(state)?;
    Ok(())
}

fn apply_move(
    state: &mut Value,
    path: &str,
    moves: Vec<(usize, usize)>,
) -> Result<(), Diagnostic> {
    let pointer = JsonPointer::new(path)?;
    let array = pointer.get(state)?;
    if !array.is_array() {
        return Err(
            Diagnostic::new(
                DiagnosticLevel::Fatal,
                DiagnosticCode::MoveOnNonArray,
@@ -752,88 +519,19 @@ impl ArchiveReader {
                    path
                ),
            )
            .with_advice(
                "Move operations can only reorder elements within an array. \
                The path must point to an array value."
                    .to_string(),
            ),
        );
    }
    let mut arr = array.as_array().unwrap().clone();
    for (from_idx, to_idx) in moves {
Some(m) => m,
None => {
diagnostics.add(
Diagnostic::new(
DiagnosticLevel::Fatal,
DiagnosticCode::WrongFieldType,
"I expected the moves to be an array of [from, to] pairs.".to_string(),
)
.with_location(self.filename.clone(), line_number),
);
return Err(());
}
};
for move_pair in moves {
let pair = match move_pair.as_array() {
Some(p) if p.len() == 2 => p,
_ => {
diagnostics.add(
Diagnostic::new(
DiagnosticLevel::Fatal,
DiagnosticCode::WrongFieldType,
"I expected each move to be a [from, to] pair.".to_string(),
)
.with_location(self.filename.clone(), line_number),
);
return Err(());
}
};
let from_idx = match pair[0].as_u64() {
Some(i) => i as usize,
None => {
diagnostics.add(
Diagnostic::new(
DiagnosticLevel::Fatal,
DiagnosticCode::InvalidMoveIndex,
"I expected the 'from' index to be a non-negative integer.".to_string(),
)
.with_location(self.filename.clone(), line_number),
);
return Err(());
}
};
let to_idx = match pair[1].as_u64() {
Some(i) => i as usize,
None => {
diagnostics.add(
Diagnostic::new(
DiagnosticLevel::Fatal,
DiagnosticCode::InvalidMoveIndex,
"I expected the 'to' index to be a non-negative integer.".to_string(),
)
.with_location(self.filename.clone(), line_number),
);
return Err(());
}
};
if from_idx >= arr.len() { if from_idx >= arr.len() {
diagnostics.add( return Err(
Diagnostic::new( Diagnostic::new(
DiagnosticLevel::Fatal, DiagnosticLevel::Fatal,
DiagnosticCode::MoveIndexOutOfBounds, DiagnosticCode::MoveIndexOutOfBounds,
@ -843,13 +541,11 @@ impl ArchiveReader {
arr.len() arr.len()
), ),
) )
.with_location(self.filename.clone(), line_number),
); );
return Err(());
} }
if to_idx > arr.len() { if to_idx > arr.len() {
diagnostics.add( return Err(
Diagnostic::new( Diagnostic::new(
DiagnosticLevel::Fatal, DiagnosticLevel::Fatal,
DiagnosticCode::MoveIndexOutOfBounds, DiagnosticCode::MoveIndexOutOfBounds,
@ -859,9 +555,7 @@ impl ArchiveReader {
arr.len() arr.len()
), ),
) )
.with_location(self.filename.clone(), line_number),
); );
return Err(());
} }
let element = arr[from_idx].clone(); let element = arr[from_idx].clone();
@ -874,10 +568,7 @@ impl ArchiveReader {
arr.remove(remove_idx); arr.remove(remove_idx);
} }
pointer.set(state, Value::Array(arr)).map_err(|diag| { pointer.set(state, Value::Array(arr))
diagnostics.add(diag.with_location(self.filename.clone(), line_number));
})
}
} }
#[cfg(test)] #[cfg(test)]
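The shape of this refactor is that the apply helpers stop threading `filename`, `line_number`, and a `DiagnosticCollector` through every call, and instead return `Result<(), Diagnostic>` so the caller attaches location context once via `map_err`. A minimal sketch of that flow, assuming simplified stand-in types (`Diagnostic`, `with_advice`, `with_location` here are hypothetical miniatures of the crate's real API, and the `apply_add` below operates on a plain `Vec` rather than a JSON value):

```rust
// Simplified stand-in for the crate's Diagnostic type.
#[derive(Debug, Clone, PartialEq)]
struct Diagnostic {
    message: String,
    advice: Option<String>,
    location: Option<(String, usize)>,
}

impl Diagnostic {
    fn new(message: &str) -> Self {
        Diagnostic { message: message.to_string(), advice: None, location: None }
    }
    // Builder-style methods, mirroring with_advice / with_location in the diff.
    fn with_advice(mut self, advice: String) -> Self {
        self.advice = Some(advice);
        self
    }
    fn with_location(mut self, file: String, line: usize) -> Self {
        self.location = Some((file, line));
        self
    }
}

// The helper no longer knows about files or line numbers; it only
// describes what went wrong and lets the error propagate upward.
fn apply_add(state: &mut Vec<i64>, index: usize, value: i64) -> Result<(), Diagnostic> {
    if index > state.len() {
        return Err(Diagnostic::new("index out of bounds")
            .with_advice("For add operations, the parent path must exist.".to_string()));
    }
    state.insert(index, value);
    Ok(())
}

fn main() {
    let mut state = vec![1, 2, 3];

    // The caller attaches location context exactly once, at the boundary
    // where the filename and line number are actually known.
    let err = apply_add(&mut state, 10, 99)
        .map_err(|diag| diag.with_location("archive.jsonl".to_string(), 42))
        .unwrap_err();

    assert_eq!(err.location, Some(("archive.jsonl".to_string(), 42)));
    assert!(err.advice.is_some());
    println!("diagnostic: {}", err.message);
}
```

The payoff is visible in the diff above: each helper shrinks from a `match`-and-collect ritual to a few `?`-propagated lines, and location stamping can no longer be forgotten in any one branch.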