veritext/changelog.md
Kai Chappell 8b7c087de7 docs(changelog): add CLI entries
Document command-line interface including validate command,
benchmark subcommands, and output formatting options.
2026-02-03 18:22:50 +00:00

Changelog

All notable changes to Veritext will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Project scaffold with pyproject.toml and development tooling
  • Core exception hierarchy (VeritextError and subclasses)
  • Core types: ValidationContext, CheckResult, ValidationResult
  • Word tokeniser with Unicode normalisation support
  • Configuration module with pydantic-settings
  • Structured logging with structlog
  • Metrics module with Metric protocol, AggregateStats, and BatchResult types
  • BLEU metric implementation (BLEU-1 through BLEU-4 with brevity penalty)
  • Lexical similarity metric (Jaccard similarity and token overlap)
  • ROUGE metric (ROUGE-1, ROUGE-2, ROUGE-L with precision/recall/F-measure)
  • Flesch-Kincaid readability metrics (grade level and reading ease)
  • Batch scoring with aggregate statistics for all metrics
  • Validators module with Check protocol for validation checks
  • Metric-based validators: BleuValidator, RougeValidator, LexicalValidator
  • Constraint validators: LengthValidator, ReadabilityValidator, ContainsValidator, ExcludesValidator
  • Composite validators: AllOf (every check must pass), AnyOf (at least one check must pass)
  • Factory functions for clean validator API (bleu(), rouge(), lexical(), length(), readability(), contains(), excludes(), all_of(), any_of())
  • Semantic similarity module with embedding-based text comparison (requires veritext[semantic] extra)
  • SemanticSimilarity metric using sentence-transformers for semantic relatedness
  • SemanticValidator for threshold-based semantic similarity validation
  • semantic() factory function for creating semantic validators
  • Embedding caching for performance optimisation in repeated comparisons
  • Native pytest plugin for CI/CD integration (registered under the pytest11 entry point group)
  • validate_text() assertion function for expressive test assertions
  • text_validation marker for filtering validation tests
  • Pytest fixtures: text_validator factory and validation_context helper
  • Detailed failure messages with text preview and check diagnostics
  • Benchmark module for quality tracking and regression detection
  • Benchmark class for evaluating text quality over time with metric storage
  • BenchmarkRun and RegressionReport data models for tracking runs
  • SQLite storage backend with WAL mode for concurrent access
  • Rolling window baseline computation for historical comparison
  • check_regression() for statistical comparison against baseline
  • assert_no_regression() raises RegressionDetectedError for CI integration
  • Customisable tolerance threshold and window size for regression detection
  • Metadata support for tracking git SHA, model versions, etc.
  • Command-line interface (CLI) via veritext command
  • veritext validate command for inline and file-based text validation
  • JSONL input format support for batch validation (--file option)
  • Separate candidate/reference file support (--reference-file option)
  • Multiple output formats: table (default), JSON, and simple text
  • veritext benchmark run command for running evaluations and storing results
  • veritext benchmark show command for viewing benchmark history
  • veritext benchmark check command for regression detection with exit code 1 on failure
  • Rich-formatted terminal output with tables and coloured panels
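
The regression-detection entries above (rolling window baseline, tolerance threshold, window size) can be illustrated with a minimal standalone sketch. This is not Veritext's implementation or API: the function names `rolling_baseline` and `has_regressed`, the default values, and the choice of an absolute-drop comparison against the window mean are all assumptions made for illustration.

```python
from statistics import mean


def rolling_baseline(history: list[float], window: int) -> float:
    """Return the mean of the most recent `window` scores.

    Hypothetical helper: uses all available scores when fewer than
    `window` runs have been recorded.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not history:
        raise ValueError("no historical scores to build a baseline from")
    return mean(history[-window:])


def has_regressed(
    history: list[float],
    current: float,
    window: int = 5,
    tolerance: float = 0.05,
) -> bool:
    """Report whether `current` falls more than `tolerance` below the baseline.

    Assumption: higher scores are better and the tolerance is an
    absolute drop, not a relative one.
    """
    baseline = rolling_baseline(history, window)
    return (baseline - current) > tolerance


# Example: five earlier runs, baseline taken over the last three.
scores = [0.80, 0.82, 0.81, 0.79, 0.83]
print(has_regressed(scores, current=0.70, window=3))  # large drop
print(has_regressed(scores, current=0.80, window=3))  # within tolerance
```

A CI job built on this idea would exit with a non-zero status when the check reports a regression, which is the behaviour the `veritext benchmark check` entry describes.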