2026-02-03 18:22:50 +00:00

3.4 KiB

Changelog

All notable changes to Veritext will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

Project scaffold with pyproject.toml and development tooling
Core exception hierarchy (VeritextError and subclasses)
Core types: ValidationContext, CheckResult, ValidationResult
Word tokeniser with Unicode normalisation support
Configuration module with pydantic-settings
Structured logging with structlog
Metrics module with Metric protocol, AggregateStats, and BatchResult types
BLEU metric implementation (BLEU-1 through BLEU-4 with brevity penalty)
Lexical similarity metric (Jaccard similarity and token overlap)
ROUGE metric (ROUGE-1, ROUGE-2, ROUGE-L with precision/recall/F-measure)
Flesch-Kincaid readability metrics (grade level and reading ease)
Batch scoring with aggregate statistics for all metrics
Validators module with Check protocol for validation checks
Metric-based validators: BleuValidator, RougeValidator, LexicalValidator
Constraint validators: LengthValidator, ReadabilityValidator, ContainsValidator, ExcludesValidator
Composite validators: AllOf (all checks must pass), AnyOf (at least one check must pass)
Factory functions for clean validator API (bleu(), rouge(), lexical(), length(), readability(), contains(), excludes(), all_of(), any_of())
Semantic similarity module with embedding-based text comparison (requires veritext[semantic] extra)
SemanticSimilarity metric using sentence-transformers for semantic relatedness
SemanticValidator for threshold-based semantic similarity validation
semantic() factory function for creating semantic validators
Embedding caching for performance optimisation in repeated comparisons
Native pytest plugin for CI/CD integration (entry point: pytest11)
validate_text() assertion function for expressive test assertions
text_validation marker for filtering validation tests
Pytest fixtures: text_validator factory and validation_context helper
Detailed failure messages with text preview and check diagnostics
Benchmark module for quality tracking and regression detection
Benchmark class for evaluating text quality over time with metric storage
BenchmarkRun and RegressionReport data models for tracking runs
SQLite storage backend with WAL mode for concurrent access
Rolling window baseline computation for historical comparison
check_regression() for statistical comparison against baseline
assert_no_regression() raises RegressionDetectedError for CI integration
Customisable tolerance threshold and window size for regression detection
Metadata support for tracking git SHA, model versions, etc.
Command-line interface (CLI) via veritext command
veritext validate command for inline and file-based text validation
JSONL input format support for batch validation (--file option)
Separate candidate/reference file support (--reference-file option)
Multiple output formats: table (default), JSON, and simple text
veritext benchmark run command for running evaluations and storing results
veritext benchmark show command for viewing benchmark history
veritext benchmark check command for regression detection with exit code 1 on failure
Rich-formatted terminal output with tables and coloured panels
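
The rolling window baseline and tolerance threshold listed above can be illustrated with a minimal, self-contained sketch. This is not Veritext's API: the function names `rolling_baseline` and `is_regression`, the use of a plain mean, and the relative (percentage) tolerance are all assumptions made for illustration, and `check_regression()` may well use a different statistic.

```python
from statistics import mean


def rolling_baseline(scores, window=5):
    """Return the mean of the most recent `window` historical scores.

    Hypothetical helper: the real baseline computation may differ.
    """
    if not scores:
        raise ValueError("no historical scores to build a baseline from")
    return mean(scores[-window:])


def is_regression(current, history, window=5, tolerance=0.05):
    """Return True when `current` falls more than `tolerance` below the baseline.

    Assumes a relative tolerance: 0.05 allows a drop of up to 5% of the baseline.
    """
    baseline = rolling_baseline(history, window)
    return current < baseline * (1 - tolerance)


# Example scores for one metric across six earlier runs (made-up values).
history = [0.80, 0.82, 0.81, 0.79, 0.83, 0.82]

print(is_regression(0.81, history))  # within tolerance of the baseline
print(is_regression(0.70, history))  # well below the baseline
```

In a CI job, a check of this kind would typically be followed by a non-zero exit code when a regression is found, which is the behaviour the changelog describes for `veritext benchmark check`.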
