docs(changelog): release 0.1.0
Initial release with metrics, validators, pytest plugin, benchmark module, CLI, and comprehensive documentation.
31
changelog.md
@@ -5,37 +5,56 @@ All notable changes to Veritext will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [Unreleased]
+## [0.1.0] — 2026-02-03
 
+Initial release of Veritext, a semantic text validation framework for Python.
+
 ### Added
 
+#### Core
+
 - Project scaffold with pyproject.toml and development tooling
 - Core exception hierarchy (`VeritextError` and subclasses)
 - Core types: `ValidationContext`, `CheckResult`, `ValidationResult`
 - Word tokeniser with Unicode normalisation support
 - Configuration module with pydantic-settings
 - Structured logging with structlog
+
+#### Metrics
+
 - Metrics module with `Metric` protocol, `AggregateStats`, and `BatchResult` types
 - BLEU metric implementation (BLEU-1 through BLEU-4 with brevity penalty)
-- Lexical similarity metric (Jaccard similarity and token overlap)
 - ROUGE metric (ROUGE-1, ROUGE-2, ROUGE-L with precision/recall/F-measure)
+- Lexical similarity metric (Jaccard similarity and token overlap)
 - Flesch-Kincaid readability metrics (grade level and reading ease)
 - Batch scoring with aggregate statistics for all metrics
+
+#### Validators
+
 - Validators module with `Check` protocol for validation checks
 - Metric-based validators: `BleuValidator`, `RougeValidator`, `LexicalValidator`
 - Constraint validators: `LengthValidator`, `ReadabilityValidator`, `ContainsValidator`, `ExcludesValidator`
 - Composite validators: `AllOf` (all checks must pass), `AnyOf` (any check must pass)
 - Factory functions for clean validator API (`bleu()`, `rouge()`, `lexical()`, `length()`, `readability()`, `contains()`, `excludes()`, `all_of()`, `any_of()`)
+
+#### Semantic Similarity
+
 - Semantic similarity module with embedding-based text comparison (requires `veritext[semantic]` extra)
 - `SemanticSimilarity` metric using sentence-transformers for semantic relatedness
 - `SemanticValidator` for threshold-based semantic similarity validation
 - `semantic()` factory function for creating semantic validators
 - Embedding caching for performance optimisation in repeated comparisons
+
+#### Pytest Plugin
+
 - Native pytest plugin for CI/CD integration (entry point: `pytest11`)
 - `validate_text()` assertion function for expressive test assertions
 - `text_validation` marker for filtering validation tests
 - Pytest fixtures: `text_validator` factory and `validation_context` helper
 - Detailed failure messages with text preview and check diagnostics
+
+#### Benchmarking
+
 - Benchmark module for quality tracking and regression detection
 - `Benchmark` class for evaluating text quality over time with metric storage
 - `BenchmarkRun` and `RegressionReport` data models for tracking runs
@@ -45,6 +64,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `assert_no_regression()` raises `RegressionDetectedError` for CI integration
 - Customisable tolerance threshold and window size for regression detection
 - Metadata support for tracking git SHA, model versions, etc.
+
+#### CLI
+
 - Command-line interface (CLI) via `veritext` command
 - `veritext validate` command for inline and file-based text validation
 - JSONL input format support for batch validation (`--file` option)
@@ -54,3 +76,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `veritext benchmark show` command for viewing benchmark history
 - `veritext benchmark check` command for regression detection with exit code 1 on failure
 - Rich-formatted terminal output with tables and coloured panels
+
+#### Documentation
+
+- Comprehensive readme with usage examples
+- Example scripts: basic validation, chatbot testing, benchmark regression
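
Note on the Metrics entries: the changelog names Jaccard similarity and token overlap but does not define them. The sketch below shows the usual definitions over sets of word tokens. It is not taken from Veritext; the function names, the regex-based tokeniser, and the choice of the reference text as the denominator for overlap are all assumptions made for illustration.

```python
import re
import unicodedata


def tokenise(text: str) -> list[str]:
    """NFKC-normalise, lower-case, and split text into word tokens (assumed scheme)."""
    normalised = unicodedata.normalize("NFKC", text).lower()
    return re.findall(r"\w+", normalised)


def jaccard_similarity(candidate: str, reference: str) -> float:
    """Shared vocabulary size divided by combined vocabulary size."""
    a, b = set(tokenise(candidate)), set(tokenise(reference))
    if not a and not b:
        return 1.0  # convention chosen here: two empty texts count as identical
    return len(a & b) / len(a | b)


def token_overlap(candidate: str, reference: str) -> float:
    """Fraction of distinct reference tokens that also occur in the candidate."""
    a, b = set(tokenise(candidate)), set(tokenise(reference))
    if not b:
        return 0.0
    return len(a & b) / len(b)


score = jaccard_similarity("The cat sat", "the cat ran")   # 2 shared / 4 total
overlap = token_overlap("The cat sat", "the cat ran")      # 2 shared / 3 reference
```

Jaccard is symmetric in its two arguments, while the overlap variant above is not, which may be why a library would expose both.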
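
Note on the Validators entries: the composite validators are described as "all checks must pass" and "any check must pass". A minimal way to picture that composition is with plain callables, as below. The names `contains_check`, `max_words_check`, `all_of_checks`, and `any_of_checks` are hypothetical and deliberately differ from the factory functions listed in the changelog, whose real signatures are not shown in this diff.

```python
from typing import Callable

# Hypothetical stand-in for a validation check: text in, pass/fail out.
TextCheck = Callable[[str], bool]


def contains_check(phrase: str) -> TextCheck:
    """Pass when the phrase occurs in the text (case-insensitive)."""
    return lambda text: phrase.lower() in text.lower()


def max_words_check(limit: int) -> TextCheck:
    """Pass when the text has at most `limit` whitespace-separated words."""
    return lambda text: len(text.split()) <= limit


def all_of_checks(*checks: TextCheck) -> TextCheck:
    """Pass only when every wrapped check passes."""
    return lambda text: all(check(text) for check in checks)


def any_of_checks(*checks: TextCheck) -> TextCheck:
    """Pass when at least one wrapped check passes."""
    return lambda text: any(check(text) for check in checks)


# A reply must be short AND mention at least one of two remedies.
reply_ok = all_of_checks(
    max_words_check(20),
    any_of_checks(contains_check("refund"), contains_check("replacement")),
)
```

Because a composite is itself a check, composites nest freely. A real implementation would presumably return a structured result with per-check diagnostics rather than a bare boolean.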
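
Note on the Benchmarking entries: the changelog mentions a customisable tolerance threshold and window size for regression detection without saying how they combine. One plausible reading, sketched below, compares the latest score with the mean of the last `window` runs and flags a regression when it falls more than `tolerance` below that baseline. The absolute (rather than relative) tolerance, the use of the mean, and the names `is_regression`, `fail_on_regression`, and `ScoreRegression` are assumptions, not Veritext's documented behaviour.

```python
from statistics import mean


class ScoreRegression(Exception):
    """Hypothetical error type for this sketch (not Veritext's exception)."""


def is_regression(
    history: list[float],
    latest: float,
    tolerance: float = 0.05,
    window: int = 5,
) -> bool:
    """Return True when `latest` drops more than `tolerance` below the recent mean."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = history[-window:]
    if not recent:
        return False  # nothing to compare against yet
    baseline = mean(recent)
    return latest < baseline - tolerance


def fail_on_regression(history: list[float], latest: float, **kwargs) -> None:
    """Raise instead of returning a flag, so a CI job exits non-zero."""
    if is_regression(history, latest, **kwargs):
        raise ScoreRegression(f"score {latest:.3f} regressed against recent runs")


past_scores = [0.80, 0.82, 0.81]
small_dip = is_regression(past_scores, 0.78)   # within tolerance of the ~0.81 baseline
large_drop = is_regression(past_scores, 0.70)  # well below baseline minus tolerance
```

A larger window smooths out noisy runs at the cost of reacting more slowly to a genuine drop.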