Changelog
All notable changes to Veritext will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.1.0] — 2026-02-03
Initial release of Veritext, a semantic text validation framework for Python.
Added
Core
- Project scaffold with pyproject.toml and development tooling
- Core exception hierarchy (`VeritextError` and subclasses)
- Core types: `ValidationContext`, `CheckResult`, `ValidationResult`
- Word tokeniser with Unicode normalisation support
- Configuration module with pydantic-settings
- Structured logging with structlog
Metrics
- Metrics module with `Metric` protocol, `AggregateStats`, and `BatchResult` types
- BLEU metric implementation (BLEU-1 through BLEU-4 with brevity penalty)
- ROUGE metric (ROUGE-1, ROUGE-2, ROUGE-L with precision/recall/F-measure)
- Lexical similarity metric (Jaccard similarity and token overlap)
- Flesch-Kincaid readability metrics (grade level and reading ease)
- Batch scoring with aggregate statistics for all metrics
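For readers unfamiliar with the lexical and readability metrics listed above, the following is a minimal sketch of the standard formulas behind them. It is not Veritext's implementation: the function names and the regex tokeniser are assumptions made for illustration, and the readability functions take pre-computed counts rather than counting syllables themselves.

```python
import re


def tokenize(text: str) -> list[str]:
    # Assumed tokeniser: lowercase word characters only.
    return re.findall(r"\w+", text.lower())


def jaccard(candidate: str, reference: str) -> float:
    # Jaccard similarity: |A ∩ B| / |A ∪ B| over the sets of tokens.
    a, b = set(tokenize(candidate)), set(tokenize(reference))
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    # Standard Flesch reading ease formula (higher means easier to read).
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    # Standard Flesch-Kincaid grade level formula (approximate US school grade).
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```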
Validators
- Validators module with `Check` protocol for validation checks
- Metric-based validators: `BleuValidator`, `RougeValidator`, `LexicalValidator`
- Constraint validators: `LengthValidator`, `ReadabilityValidator`, `ContainsValidator`, `ExcludesValidator`
- Composite validators: `AllOf` (all checks must pass), `AnyOf` (any check must pass)
- Factory functions for clean validator API (`bleu()`, `rouge()`, `lexical()`, `length()`, `readability()`, `contains()`, `excludes()`, `all_of()`, `any_of()`)
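The all-must-pass and any-must-pass semantics of the composite validators can be illustrated with plain callables. This is only a sketch of the idea: `max_length`, `must_contain`, `combine_all`, and `combine_any` are hypothetical stand-ins and do not reflect the signatures of Veritext's own validators or factory functions.

```python
from typing import Callable

# Hypothetical check type: takes the text, returns True if it passes.
Check = Callable[[str], bool]


def max_length(limit: int) -> Check:
    # Passes when the text is at most `limit` characters long.
    return lambda text: len(text) <= limit


def must_contain(phrase: str) -> Check:
    # Passes when the phrase occurs in the text (case-insensitive).
    return lambda text: phrase.lower() in text.lower()


def combine_all(*checks: Check) -> Check:
    # Composite that passes only if every check passes.
    return lambda text: all(check(text) for check in checks)


def combine_any(*checks: Check) -> Check:
    # Composite that passes if at least one check passes.
    return lambda text: any(check(text) for check in checks)


strict = combine_all(max_length(40), must_contain("refund"))
lenient = combine_any(max_length(10), must_contain("refund"))
```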
Semantic Similarity
- Semantic similarity module with embedding-based text comparison (requires `veritext[semantic]` extra)
- `SemanticSimilarity` metric using sentence-transformers for semantic relatedness
- `SemanticValidator` for threshold-based semantic similarity validation
- `semantic()` factory function for creating semantic validators
- Embedding caching for performance optimisation in repeated comparisons
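As a rough illustration of why embedding caching helps, the sketch below memoises an embedding function and compares vectors by cosine similarity. The `embed` function here is a toy letter-frequency stand-in so the example runs without extra dependencies; it is an assumption for illustration and not how Veritext or sentence-transformers compute embeddings.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=1024)
def embed(text: str) -> tuple[float, ...]:
    # Toy stand-in for a real embedding model: a 26-dimensional
    # letter-frequency vector. Cached so repeated texts are embedded once.
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return tuple(counts)


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity(candidate: str, reference: str) -> float:
    # Both lookups go through the cache, so a reference text that is
    # compared against many candidates is only embedded the first time.
    return cosine(embed(candidate), embed(reference))
```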
Pytest Plugin
- Native pytest plugin for CI/CD integration (entry point: `pytest11`)
- `validate_text()` assertion function for expressive test assertions
- `text_validation` marker for filtering validation tests
- Pytest fixtures: `text_validator` factory and `validation_context` helper
- Detailed failure messages with text preview and check diagnostics
Benchmarking
- Benchmark module for quality tracking and regression detection
- `Benchmark` class for evaluating text quality over time with metric storage
- `BenchmarkRun` and `RegressionReport` data models for tracking runs
- SQLite storage backend with WAL mode for concurrent access
- Rolling window baseline computation for historical comparison
- `check_regression()` for statistical comparison against baseline
- `assert_no_regression()` raises `RegressionDetectedError` for CI integration
- Customisable tolerance threshold and window size for regression detection
- Metadata support for tracking git SHA, model versions, etc.
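The rolling-window baseline and tolerance threshold can be pictured with a small sketch. It assumes the baseline is the mean of the most recent scores and that the tolerance is relative; the actual statistic Veritext uses may differ, and `rolling_baseline` and `is_regression` are hypothetical names, not the library's API.

```python
from statistics import mean


def rolling_baseline(history: list[float], window: int) -> float:
    # Baseline = mean of the most recent `window` scores (assumed definition).
    return mean(history[-window:])


def is_regression(
    history: list[float],
    current: float,
    window: int = 5,
    tolerance: float = 0.05,
) -> bool:
    # Flags a regression when the current score falls more than `tolerance`
    # (relative, assumed) below the rolling baseline. With no history there
    # is nothing to compare against, so nothing is flagged.
    if not history:
        return False
    baseline = rolling_baseline(history, window)
    return current < baseline * (1 - tolerance)
```

With a history of `[0.80, 0.82, 0.78, 0.81, 0.79]` the baseline is about 0.80, so under these assumptions a new score of 0.70 would be flagged while 0.78 would not.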
CLI
- Command-line interface (CLI) via `veritext` command
- `veritext validate` command for inline and file-based text validation
- JSONL input format support for batch validation (`--file` option)
- Separate candidate/reference file support (`--reference-file` option)
- Multiple output formats: table (default), JSON, and simple text
- `veritext benchmark run` command for running evaluations and storing results
- `veritext benchmark show` command for viewing benchmark history
- `veritext benchmark check` command for regression detection with exit code 1 on failure
- Rich-formatted terminal output with tables and coloured panels
Documentation
- Comprehensive README with usage examples
- Example scripts: basic validation, chatbot testing, benchmark regression