Configure Python project with pydantic, structlog, typer, rich dependencies. Set up ruff, mypy, pytest tooling with strict type checking.
CLAUDE.md
Guidelines for working on the Veritext project.
Project Overview
Veritext is a semantic text validation framework for Python. It validates text outputs against quality criteria using metrics like BLEU, ROUGE, and semantic similarity.
Directory Structure
veritext/
├── src/veritext/ # Package source
│ ├── core/ # Shared types, tokenisation, config
│ ├── metrics/ # BLEU, ROUGE, lexical, readability
│ ├── semantic/ # Optional embedding-based similarity
│ ├── validators/ # Composable validation checks
│ ├── benchmark/ # Quality tracking & regression detection
│ ├── pytest_plugin/ # Native pytest integration
│ └── cli/ # Command-line interface
├── tests/ # Test suite (mirrors src structure)
├── docs/ # Project documentation
└── examples/ # Usage examples
Code Style
Python Conventions
- Python 3.11+ with modern type hints
- UK English in all text (colour, behaviour, summarisation, tokenisation)
- snake_case for variables, functions, modules
- PascalCase for classes
- Absolute imports from package root:
from veritext.core.types import ...
Quality Gates
All must pass with zero issues before any commit:
uv run ruff check . # Linting
uv run ruff format --check . # Formatting
uv run mypy src/ # Type checking
uv run pytest # Tests
Documentation
- Docstrings for all public APIs (Google style)
- Type hints on all function signatures
- Keep docstrings concise; let types speak where possible
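A minimal sketch of a public function that follows these conventions. `count_tokens` is a hypothetical helper used only for illustration; it is not taken from the Veritext codebase.

```python
def count_tokens(text: str) -> int:
    """Count whitespace-separated tokens in the text.

    Args:
        text: Input text to tokenise.

    Returns:
        Number of tokens; 0 for empty or whitespace-only text.
    """
    # str.split() with no arguments splits on any run of whitespace
    # and returns an empty list for empty or whitespace-only input.
    return len(text.split())
```

The signature already states the types, so the docstring only adds what the types cannot express (the behaviour on empty input).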
Architecture
Layer Dependencies
CLI / pytest_plugin (presentation)
↓
validators / benchmark (decision logic)
↓
metrics (pure computation)
↓
core (shared types, tokenisation)
Each layer depends only on layers below it.
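The layering rule can be expressed as a small check, sketched below. The ranks are read off the diagram above; the function names and the example module paths are hypothetical. The sketch assumes that imports within one package are always allowed and that packages on the same level (for example validators and benchmark) may not import each other. The semantic package is left out because the diagram does not place it.

```python
# Layer ranks read off the diagram; a higher rank sits nearer the presentation layer.
LAYER_RANK: dict[str, int] = {
    "core": 0,
    "metrics": 1,
    "validators": 2,
    "benchmark": 2,
    "cli": 3,
    "pytest_plugin": 3,
}


def layer_of(module: str) -> str:
    """Return the layer name of a dotted path such as 'veritext.metrics.bleu'."""
    parts = module.split(".")
    if len(parts) < 2 or parts[0] != "veritext":
        raise ValueError(f"not a veritext module: {module!r}")
    if parts[1] not in LAYER_RANK:
        raise ValueError(f"layer not covered by this sketch: {parts[1]!r}")
    return parts[1]


def is_allowed_import(importer: str, imported: str) -> bool:
    """Return True if importer may import imported under the layering rule."""
    src, dst = layer_of(importer), layer_of(imported)
    if src == dst:
        return True  # assumption: imports within one package are fine
    # Otherwise the imported layer must sit strictly below the importer.
    return LAYER_RANK[dst] < LAYER_RANK[src]
```

For example, `is_allowed_import("veritext.validators.length", "veritext.metrics.bleu")` returns `True`, while the reverse direction returns `False`.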
Metrics vs Validators
| Concept | Responsibility | Output |
|---|---|---|
| Metric | Compute a score | Typed result (e.g., BleuResult) |
| Validator | Make pass/fail decision | ValidationResult with diagnostics |
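The split between the two concepts might look like the sketch below. It uses a toy token-overlap metric rather than BLEU, and every name and field here (`OverlapResult`, `token_overlap`, `validate_overlap`, the fields of `ValidationResult`) is an assumption made for illustration, not Veritext's actual API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OverlapResult:
    """Typed metric result (hypothetical stand-in for something like BleuResult)."""

    score: float


@dataclass(frozen=True)
class ValidationResult:
    """Pass/fail decision with diagnostics (fields assumed for illustration)."""

    passed: bool
    diagnostics: list[str] = field(default_factory=list)


def token_overlap(candidate: str, reference: str) -> OverlapResult:
    """Metric: fraction of distinct reference tokens that appear in the candidate."""
    ref = set(reference.split())
    if not ref:
        raise ValueError("reference must not be empty")
    cand = set(candidate.split())
    return OverlapResult(score=len(ref & cand) / len(ref))


def validate_overlap(
    candidate: str, reference: str, threshold: float = 0.5
) -> ValidationResult:
    """Validator: turn the metric score into a pass/fail decision."""
    result = token_overlap(candidate, reference)
    if result.score >= threshold:
        return ValidationResult(passed=True)
    return ValidationResult(
        passed=False,
        diagnostics=[
            f"overlap {result.score:.2f} is below threshold {threshold:.2f}"
        ],
    )
```

The metric knows nothing about thresholds, and the validator contains no scoring logic, so either side can be swapped or tested on its own.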
Edge Case Handling
- Empty text: Metrics return zero scores; validators fail
- Empty reference: Comparison metrics raise ValueError
- Whitespace-only: Treated as empty after tokenisation
- Unicode: NFC normalisation by default
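The edge-case rules above can be sketched with the standard library alone. `tokenise` and `exact_match_ratio` are hypothetical names, and the position-wise matching is a toy metric chosen only to keep the example short.

```python
import unicodedata


def tokenise(text: str) -> list[str]:
    """Apply NFC normalisation, then split on whitespace."""
    # Whitespace-only text yields an empty list, so it is treated as empty.
    return unicodedata.normalize("NFC", text).split()


def exact_match_ratio(candidate: str, reference: str) -> float:
    """Toy comparison metric: share of reference positions matched by the candidate."""
    ref_tokens = tokenise(reference)
    if not ref_tokens:
        # Empty reference: comparison metrics raise ValueError.
        raise ValueError("reference is empty")
    cand_tokens = tokenise(candidate)
    if not cand_tokens:
        # Empty text: the metric returns a zero score.
        return 0.0
    matches = sum(c == r for c, r in zip(cand_tokens, ref_tokens))
    return matches / len(ref_tokens)
```

Because both inputs pass through NFC, a decomposed "e" followed by a combining acute accent compares equal to the precomposed "é".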
Git Workflow
Commits
- Format: type(scope): description
- Types: feat, fix, chore, refactor, docs, test
- Atomic: ≤3 new files, ≤150 LOC per commit
- Update changelog.md before completing a task
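A commit subject can be checked against the format above with a regular expression, as in this sketch. The set of characters allowed in the scope (lowercase letters, digits, `_` and `-`) is an assumption, as are the function name and the example messages.

```python
import re

# Types listed in the guideline above.
COMMIT_TYPES = ("feat", "fix", "chore", "refactor", "docs", "test")

# type(scope): description, e.g. "feat(metrics): add ROUGE-L score"
COMMIT_RE = re.compile(
    rf"(?P<type>{'|'.join(COMMIT_TYPES)})"
    r"\((?P<scope>[a-z0-9_-]+)\)"  # assumed scope character set
    r": (?P<description>\S.*)"
)


def is_valid_commit_subject(subject: str) -> bool:
    """Return True if the subject line matches type(scope): description."""
    return COMMIT_RE.fullmatch(subject) is not None
```

A check like this could run in a commit-msg hook, which is one possible way to enforce the rule before review.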
Branches
- feat/kebab-case: new features
- fix/kebab-case: bug fixes
- chore/: maintenance
- refactor/: code restructure
- docs/: documentation only
Testing
- Test files mirror source structure: tests/test_core/test_types.py
- Use pytest fixtures for common setup
- Target ≥80% coverage
- Include edge cases: empty input, Unicode, boundary values
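A test module following these points might look like the sketch below. The `tokenise` function is a stand-in defined inline so the example is self-contained; in the real suite it would be imported from the package, and its actual name and location are assumptions here.

```python
import pytest


def tokenise(text: str) -> list[str]:
    """Stand-in for the function under test (hypothetical)."""
    return text.split()


@pytest.fixture
def sample_text() -> str:
    """Shared input reused across tests."""
    return "colour and behaviour"


def test_tokenise_splits_on_whitespace(sample_text: str) -> None:
    assert tokenise(sample_text) == ["colour", "and", "behaviour"]


@pytest.mark.parametrize(
    ("text", "expected"),
    [
        ("", []),  # empty input
        ("   \n\t", []),  # whitespace-only
        ("naïve café", ["naïve", "café"]),  # Unicode
        ("a", ["a"]),  # boundary: single token
    ],
)
def test_tokenise_edge_cases(text: str, expected: list[str]) -> None:
    assert tokenise(text) == expected
```

Parametrising the edge cases keeps each one visible as its own entry in the pytest report, so a failure points directly at the offending input.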
Pre-Completion Checklist
Before marking ANY task complete:
- All linting/formatting/type checks pass
- Tests pass with adequate coverage
- changelog.md updated if user-facing changes
- Filenames are lowercase (except CLAUDE.md)
- Commit follows the type(scope): description format