6d1bece815
test(benchmark): add benchmark module tests
Comprehensive tests for models, storage, regression detection, and runner.
2026-02-03 18:10:13 +00:00
40fa39485e
feat(benchmark): add module exports
Public API exports for the benchmark module.
2026-02-03 18:10:07 +00:00
9115f0c25b
feat(benchmark): add Benchmark runner class
Main Benchmark class for evaluating text quality and tracking regressions.
2026-02-03 18:10:01 +00:00
83c4b4bee5
feat(benchmark): add regression detection
Rolling window baseline computation and statistical regression detection.
2026-02-03 18:09:55 +00:00
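The commit above names a rolling-window baseline and statistical regression detection without showing how they fit together. The following is a minimal sketch of one common approach, not the project's actual code: the function names, the window size, and the z-score threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


def rolling_baseline(scores, window=5):
    """Return (mean, sample standard deviation) of the last `window` scores."""
    recent = list(scores)[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two historical scores")
    return mean(recent), stdev(recent)


def is_regression(history, new_score, window=5, z_threshold=2.0):
    """Flag new_score when it sits more than z_threshold deviations below the baseline."""
    baseline, spread = rolling_baseline(history, window)
    if spread == 0:
        # A perfectly flat baseline: treat any drop as a regression (assumption).
        return new_score < baseline
    return (baseline - new_score) / spread > z_threshold
```

Only drops are flagged here, since a higher quality score is presumably an improvement; a two-sided check would compare the absolute deviation instead.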
44e3e8f4ea
feat(benchmark): add SQLite storage backend
Persistent storage for benchmark history with WAL mode for concurrent access.
2026-02-03 18:09:49 +00:00
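For readers unfamiliar with WAL mode, the sketch below shows how a SQLite database is typically switched to write-ahead logging using Python's standard `sqlite3` module. The function name `open_history_db` and the `runs` table schema are hypothetical and are not taken from the commit.

```python
import sqlite3


def open_history_db(path):
    """Open (or create) a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode actually in effect after the call.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, score REAL NOT NULL)"
    )
    conn.commit()
    return conn, mode
```

WAL mode lets readers proceed while a single writer appends to the log, which is why it suits concurrent access. It applies only to file-backed databases; an in-memory database reports the mode `memory` instead.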
45dfe07772
feat(benchmark): add BenchmarkRun and RegressionReport models
Data models for benchmark runs and regression reports using Pydantic.
2026-02-03 18:09:43 +00:00
6bafc43754
docs(changelog): add pytest plugin entries
2026-02-03 17:40:52 +00:00
012b306749
test(pytest-plugin): add plugin tests
Cover validate_text assertions, fixture factories, marker registration,
and pytest integration using pytester for subprocess testing.
2026-02-03 17:40:46 +00:00
ac7c5c69cf
feat(pytest-plugin): add validate_text assertion
Primary API for text validation in pytest with keyword arguments
for BLEU, ROUGE, semantic similarity, length, readability, and
pattern matching. Includes detailed failure formatting.
2026-02-03 17:40:40 +00:00
cd36c54e22
feat(pytest-plugin): add plugin hooks and markers
Register text_validation marker via pytest_configure hook.
2026-02-03 17:40:33 +00:00
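Registering a marker through the `pytest_configure` hook, as the commit above describes, usually looks like the sketch below. `config.addinivalue_line` is pytest's documented way to add a `markers` entry; the marker description text is an assumption, since the commit does not give it.

```python
def pytest_configure(config):
    """Register the custom marker so `pytest --strict-markers` accepts it."""
    config.addinivalue_line(
        "markers",
        # The description after the colon is illustrative, not from the commit.
        "text_validation: mark a test as a text validation test",
    )
```

With the marker registered, a test can be decorated with `@pytest.mark.text_validation` and selected with `pytest -m text_validation`.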
107fc4e275
docs(changelog): add semantic similarity entries
2026-02-03 17:31:14 +00:00
571b770281
test(semantic): add semantic similarity tests
2026-02-03 17:31:07 +00:00
8b3536873e
feat(validators): add SemanticValidator
2026-02-03 17:31:01 +00:00
9a4ac359a3
feat(semantic): add SemanticSimilarity metric
2026-02-03 17:30:56 +00:00
de5ad93524
feat(metrics): add SemanticResult type
2026-02-03 17:30:50 +00:00
cab8099d06
docs(changelog): add validator entries
Document validators module with Check protocol, metric validators,
constraint validators, composite validators, and factory functions.
2026-02-03 17:14:37 +00:00
e2be3daffd
test(validators): add validator tests
Add comprehensive tests for metric validators, constraint validators,
and composite validators covering pass/fail cases and error handling.
2026-02-03 17:14:32 +00:00
9239300fd9
feat(validators): add factory functions and exports
Export all validators and provide factory functions for clean API:
bleu(), rouge(), lexical(), length(), readability(), contains(),
excludes(), all_of(), any_of().
2026-02-03 17:14:26 +00:00
b9f805b2f4
feat(validators): add composite validators
Implement AllOf and AnyOf for combining multiple checks into
composite validation rules.
2026-02-03 17:14:20 +00:00
75cd7b68de
feat(validators): add constraint validators
Implement LengthValidator, ReadabilityValidator, ContainsValidator, and
ExcludesValidator for text constraints without reference text.
2026-02-03 17:14:14 +00:00
b2b5eb1518
feat(validators): add metric-based validators
Implement BleuValidator, RougeValidator, and LexicalValidator for
validating text against reference using metric thresholds.
2026-02-03 17:14:09 +00:00
9e7b0131b3
feat(validators): add Check protocol and base types
Define the Check protocol for validation checks that compute a score
and return pass/fail results with diagnostics.
2026-02-03 17:14:03 +00:00
b8ab5811dd
docs(changelog): add ROUGE and readability entries
2026-02-03 17:03:39 +00:00
62fac688e4
test(metrics): add ROUGE and readability tests
2026-02-03 17:03:34 +00:00
14ac7dbbb9
feat(metrics): export ROUGE and readability from module
2026-02-03 17:03:28 +00:00
aad933f9c4
feat(metrics): add readability implementation
2026-02-03 17:03:24 +00:00
2a7476046d
feat(metrics): add ROUGE implementation
2026-02-03 17:03:19 +00:00
914c738013
feat(metrics): add ROUGE and readability result types
2026-02-03 17:03:14 +00:00
a4f5fa4cc6
docs(changelog): add metrics module entries
2026-02-03 16:46:03 +00:00
027d2d3beb
test(metrics): add BLEU and lexical tests
Add comprehensive tests for BLEU and lexical metrics including edge
cases, batch scoring, and aggregate statistics.
2026-02-03 16:45:57 +00:00
74ee8c2e7b
feat(metrics): add lexical similarity metrics
Implement Jaccard similarity and token overlap metrics with batch
scoring support.
2026-02-03 16:45:51 +00:00
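As a reference point for the metrics named in the commit above, here is a minimal sketch of Jaccard similarity over token sets. The function names are hypothetical, and `token_overlap` uses one plausible definition (the share of distinct reference tokens found in the candidate); the commit does not say which definition the project uses.

```python
def jaccard_similarity(candidate_tokens, reference_tokens):
    """Size of the intersection divided by size of the union of the two token sets."""
    a, b = set(candidate_tokens), set(reference_tokens)
    if not a and not b:
        return 1.0  # Assumption: two empty texts count as identical.
    return len(a & b) / len(a | b)


def token_overlap(candidate_tokens, reference_tokens):
    """Fraction of distinct reference tokens that also occur in the candidate."""
    reference = set(reference_tokens)
    if not reference:
        return 0.0  # Assumption: nothing to overlap with.
    return len(reference & set(candidate_tokens)) / len(reference)
```

For example, `["the", "cat", "sat"]` against `["the", "cat", "ran"]` shares two of four distinct tokens, giving a Jaccard similarity of 0.5.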
e1c8c25142
feat(metrics): add BLEU implementation
Implement BLEU-1 through BLEU-4 with modified n-gram precision,
brevity penalty, and support for multiple references.
2026-02-03 16:45:45 +00:00
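The three ingredients listed in the commit above (modified n-gram precision, brevity penalty, multiple references) combine as in the standard BLEU definition. The sketch below follows that textbook formulation with uniform weights and no smoothing; it is not the project's implementation, and all function names are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clip each candidate n-gram count by its maximum count in any reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalise candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties go to the shorter reference.
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n=4):
    """BLEU-max_n: brevity penalty times the geometric mean of the precisions."""
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # No smoothing: one zero precision zeroes the score.
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)
```

Passing `max_n=1` through `max_n=4` gives BLEU-1 through BLEU-4. Clipping is what stops a degenerate candidate such as seven repetitions of "the" from scoring a perfect unigram precision.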
e6167005e5
feat(metrics): add metric protocol and batch types
Add Metric protocol, AggregateStats for statistical summaries, and
BatchResult for batch processing support.
2026-02-03 16:45:38 +00:00
14dcddcbba
chore: add .gitignore and remove cached files
Add a comprehensive .gitignore for Python projects. Remove accidentally
committed __pycache__ directories.
2026-02-03 16:16:33 +00:00
1e3618e637
test(core): add tokenisation and types tests
Cover WordTokeniser (Unicode, empty input, punctuation, multiple scripts)
and validation types (immutability, edge cases, failure summary).
2026-02-03 16:16:20 +00:00
a65249fa44
feat(core): add config and structured logging
Implement pydantic-settings-based configuration with environment variable
support and structlog integration for JSON/console output modes.
2026-02-03 16:16:13 +00:00
697b1ddfeb
feat(core): add tokenisation with Unicode support
Implement Tokeniser protocol and WordTokeniser class with NFC Unicode
normalisation, optional lowercasing, and punctuation removal.
2026-02-03 16:16:07 +00:00
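The three steps named in the commit above (NFC normalisation, optional lowercasing, punctuation removal) can be sketched with the standard `unicodedata` module. This is an illustration under assumptions, not the actual `WordTokeniser`: the function name, the parameter names, and the choice to split on whitespace are all hypothetical.

```python
import unicodedata


def tokenise(text, lowercase=True, strip_punctuation=True):
    """Normalise to NFC, optionally lowercase and drop punctuation, then split."""
    text = unicodedata.normalize("NFC", text)
    if lowercase:
        text = text.lower()
    if strip_punctuation:
        # Replace every character in a Unicode punctuation category (P*) with a space.
        text = "".join(
            " " if unicodedata.category(ch).startswith("P") else ch for ch in text
        )
    return text.split()
```

NFC normalisation matters because "é" can arrive either as one code point or as "e" followed by a combining accent; without it the two spellings would produce different tokens. Replacing punctuation with a space means "don't" becomes two tokens, which is one possible design choice among several.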
efc6a031a3
feat(core): add validation types
Implement ValidationContext, CheckResult, and ValidationResult models
using Pydantic with frozen (immutable) configuration.
2026-02-03 16:16:01 +00:00
a1e862550c
feat(core): add exception hierarchy
Implement VeritextError base class and specialised exceptions:
MetricError, ValidationError, BenchmarkError, ConfigurationError, DependencyError.
2026-02-03 16:15:55 +00:00
60aaa33327
chore(project): add pyproject.toml and project configuration
Configure the Python project with pydantic, structlog, typer, and rich as dependencies.
Set up ruff, mypy, and pytest tooling with strict type checking.
2026-02-03 16:15:48 +00:00
818e241ab2
docs(plans): improve consistency and add edge case handling
- Add requires_reference property to Metric protocol for standalone metrics
- Make reference parameter optional in score/batch_score methods
- Add comprehensive Edge Case Handling section (empty text, Unicode, etc.)
- Expand phase tasks with explicit test coverage requirements
- Fix path reference to use relative workspace path
- Add missing test_runner.py to directory structure
- Clarify SemanticValidator integration in Phase 5
- Fix tuple/list type annotation in Benchmark.evaluate()
2026-02-03 16:04:02 +00:00
49f1e27cb1
docs: add project and implementation plans
Comprehensive documentation for the Veritext semantic text validation framework:
- Project plan with architecture, use cases, and success criteria
- Implementation plan with 9 phases, interfaces, and verification steps
2026-02-03 15:27:00 +00:00