From 1754556c99e3fb4c516097539e8b2c9ee3bbc118 Mon Sep 17 00:00:00 2001
From: Kai Chappell
Date: Tue, 3 Feb 2026 19:16:37 +0000
Subject: [PATCH] docs(changelog): release 0.1.0

Initial release with metrics, validators, pytest plugin, benchmark module, CLI, and comprehensive documentation.
---
 changelog.md | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/changelog.md b/changelog.md
index 9b7ccbd..d1f87c9 100644
--- a/changelog.md
+++ b/changelog.md
@@ -5,37 +5,56 @@ All notable changes to Veritext will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [Unreleased]
+## [0.1.0] — 2026-02-03
+
+Initial release of Veritext, a semantic text validation framework for Python.
 
 ### Added
 
+#### Core
+
 - Project scaffold with pyproject.toml and development tooling
 - Core exception hierarchy (`VeritextError` and subclasses)
 - Core types: `ValidationContext`, `CheckResult`, `ValidationResult`
 - Word tokeniser with Unicode normalisation support
 - Configuration module with pydantic-settings
 - Structured logging with structlog
+
+#### Metrics
+
 - Metrics module with `Metric` protocol, `AggregateStats`, and `BatchResult` types
 - BLEU metric implementation (BLEU-1 through BLEU-4 with brevity penalty)
-- Lexical similarity metric (Jaccard similarity and token overlap)
 - ROUGE metric (ROUGE-1, ROUGE-2, ROUGE-L with precision/recall/F-measure)
+- Lexical similarity metric (Jaccard similarity and token overlap)
 - Flesch-Kincaid readability metrics (grade level and reading ease)
 - Batch scoring with aggregate statistics for all metrics
+
+#### Validators
+
 - Validators module with `Check` protocol for validation checks
 - Metric-based validators: `BleuValidator`, `RougeValidator`, `LexicalValidator`
 - Constraint validators: `LengthValidator`, `ReadabilityValidator`, `ContainsValidator`, `ExcludesValidator`
 - Composite validators: `AllOf` (all checks must pass), `AnyOf` (any check must pass)
 - Factory functions for clean validator API (`bleu()`, `rouge()`, `lexical()`, `length()`, `readability()`, `contains()`, `excludes()`, `all_of()`, `any_of()`)
+
+#### Semantic Similarity
+
 - Semantic similarity module with embedding-based text comparison (requires `veritext[semantic]` extra)
 - `SemanticSimilarity` metric using sentence-transformers for semantic relatedness
 - `SemanticValidator` for threshold-based semantic similarity validation
 - `semantic()` factory function for creating semantic validators
 - Embedding caching for performance optimisation in repeated comparisons
+
+#### Pytest Plugin
+
 - Native pytest plugin for CI/CD integration (entry point: `pytest11`)
 - `validate_text()` assertion function for expressive test assertions
 - `text_validation` marker for filtering validation tests
 - Pytest fixtures: `text_validator` factory and `validation_context` helper
 - Detailed failure messages with text preview and check diagnostics
+
+#### Benchmarking
+
 - Benchmark module for quality tracking and regression detection
 - `Benchmark` class for evaluating text quality over time with metric storage
 - `BenchmarkRun` and `RegressionReport` data models for tracking runs
@@ -45,6 +64,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `assert_no_regression()` raises `RegressionDetectedError` for CI integration
 - Customisable tolerance threshold and window size for regression detection
 - Metadata support for tracking git SHA, model versions, etc.
+
+#### CLI
+
 - Command-line interface (CLI) via `veritext` command
 - `veritext validate` command for inline and file-based text validation
 - JSONL input format support for batch validation (`--file` option)
@@ -54,3 +76,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `veritext benchmark show` command for viewing benchmark history
 - `veritext benchmark check` command for regression detection with exit code 1 on failure
 - Rich-formatted terminal output with tables and coloured panels
+
+#### Documentation
+
+- Comprehensive readme with usage examples
+- Example scripts: basic validation, chatbot testing, benchmark regression