Kai Chappell 60aaa33327 chore(project): add pyproject.toml and project configuration
Configure Python project with pydantic, structlog, typer, rich dependencies.
Set up ruff, mypy, pytest tooling with strict type checking.
2026-02-03 16:15:48 +00:00

CLAUDE.md

Guidelines for working on the Veritext project.

Project Overview

Veritext is a semantic text validation framework for Python. It validates text outputs against quality criteria using metrics like BLEU, ROUGE, and semantic similarity.

Directory Structure

veritext/
├── src/veritext/          # Package source
│   ├── core/              # Shared types, tokenisation, config
│   ├── metrics/           # BLEU, ROUGE, lexical, readability
│   ├── semantic/          # Optional embedding-based similarity
│   ├── validators/        # Composable validation checks
│   ├── benchmark/         # Quality tracking & regression detection
│   ├── pytest_plugin/     # Native pytest integration
│   └── cli/               # Command-line interface
├── tests/                 # Test suite (mirrors src structure)
├── docs/                  # Project documentation
└── examples/              # Usage examples

Code Style

Python Conventions

  • Python 3.11+ with modern type hints
  • UK English in all text (colour, behaviour, summarisation, tokenisation)
  • snake_case for variables, functions, modules
  • PascalCase for classes
  • Absolute imports from package root: from veritext.core.types import ...

Quality Gates

All of the following must pass with zero issues before any commit:

uv run ruff check .              # Linting
uv run ruff format --check .     # Formatting
uv run mypy src/                 # Type checking
uv run pytest                    # Tests

Documentation

  • Docstrings for all public APIs (Google style)
  • Type hints on all function signatures
  • Keep docstrings concise; let types speak where possible
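
As an illustration of the conventions above, here is a minimal sketch of a public function with full type hints and a concise Google-style docstring. The names `TokenCount` and `count_tokens` are hypothetical and are not part of Veritext.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenCount:
    """Token counts for a text (hypothetical example type)."""

    total: int
    unique: int


def count_tokens(text: str) -> TokenCount:
    """Count whitespace-separated tokens in a text.

    Args:
        text: The text to analyse.

    Returns:
        Total and unique token counts; both are zero for empty text.
    """
    tokens = text.split()
    return TokenCount(total=len(tokens), unique=len(set(tokens)))
```

The return type carries most of the information, so the docstring only states what the types cannot: how tokens are split and what happens on empty input.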

Architecture

Layer Dependencies

CLI / pytest_plugin  (presentation)
        ↓
validators / benchmark  (decision logic)
        ↓
metrics  (pure computation)
        ↓
core  (shared types, tokenisation)

Each layer depends only on layers below it.
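
The layering rule can be expressed as a small check, sketched below. The ranks are read off the diagram; the `semantic` package is left out because the diagram does not place it. Allowing imports within the same package, and forbidding imports between sibling packages of equal rank, are assumptions rather than stated project rules. Module names such as `veritext.metrics.bleu` are hypothetical.

```python
# Layer ranks taken from the diagram; a higher number sits higher in the stack.
LAYER_RANK: dict[str, int] = {
    "core": 0,
    "metrics": 1,
    "validators": 2,
    "benchmark": 2,
    "cli": 3,
    "pytest_plugin": 3,
}


def layer_of(module: str) -> str:
    """Return the top-level package of a dotted module path under veritext."""
    parts = module.split(".")
    if len(parts) < 2 or parts[0] != "veritext":
        raise ValueError(f"not a veritext module: {module!r}")
    return parts[1]


def import_allowed(importer: str, imported: str) -> bool:
    """Check whether `importer` may import `imported` under the layering rule."""
    src, dst = layer_of(importer), layer_of(imported)
    if src == dst:
        # Assumption: imports within one package are always fine.
        return True
    # Otherwise the imported package must sit strictly below the importer.
    return LAYER_RANK[dst] < LAYER_RANK[src]
```

For example, `import_allowed("veritext.validators.checks", "veritext.metrics.bleu")` is true, while the reverse direction is false.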

Metrics vs Validators

Concept     Responsibility            Output
Metric      Compute a score           Typed result (e.g., BleuResult)
Validator   Make pass/fail decision   ValidationResult with diagnostics
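
The split might look like the sketch below. Only the names `BleuResult` and `ValidationResult` appear in this file; the fields, the `OverlapResult` type, the toy overlap metric, and the threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OverlapResult:
    """Typed metric output (hypothetical; stands in for a type like BleuResult)."""

    score: float


@dataclass(frozen=True)
class ValidationResult:
    """Pass/fail decision plus diagnostics (field names are assumed)."""

    passed: bool
    diagnostics: list[str] = field(default_factory=list)


def token_overlap(candidate: str, reference: str) -> OverlapResult:
    """Metric: fraction of unique reference tokens present in the candidate."""
    ref_tokens = set(reference.split())
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    cand_tokens = set(candidate.split())
    return OverlapResult(score=len(ref_tokens & cand_tokens) / len(ref_tokens))


def validate_overlap(
    candidate: str, reference: str, threshold: float = 0.5
) -> ValidationResult:
    """Validator: turn the metric's score into a pass/fail decision."""
    result = token_overlap(candidate, reference)
    if result.score >= threshold:
        return ValidationResult(passed=True)
    message = f"overlap {result.score:.2f} is below threshold {threshold:.2f}"
    return ValidationResult(passed=False, diagnostics=[message])
```

The metric knows nothing about thresholds, and the validator does no scoring of its own, which keeps the computation pure and the decision logic easy to compose.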

Edge Case Handling

  • Empty text: Metrics return zero scores; validators fail
  • Empty reference: Comparison metrics raise ValueError
  • Whitespace-only: Treated as empty after tokenisation
  • Unicode: NFC normalisation by default
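
A minimal sketch of how these rules could combine in a comparison metric follows. The functions `tokenise` and `exact_match` are hypothetical, and whitespace splitting is an assumption about tokenisation, not a description of Veritext's tokeniser.

```python
import unicodedata


def tokenise(text: str) -> list[str]:
    """NFC-normalise, then split on whitespace; whitespace-only text yields []."""
    return unicodedata.normalize("NFC", text).split()


def exact_match(candidate: str, reference: str) -> float:
    """Toy comparison metric: 1.0 if the token sequences are equal, else 0.0."""
    reference_tokens = tokenise(reference)
    if not reference_tokens:
        # Empty (or whitespace-only) reference: nothing to compare against.
        raise ValueError("reference is empty")
    candidate_tokens = tokenise(candidate)
    if not candidate_tokens:
        # Empty candidate text scores zero instead of raising.
        return 0.0
    return 1.0 if candidate_tokens == reference_tokens else 0.0
```

With NFC normalisation, a decomposed "e" plus combining acute accent compares equal to the precomposed "é", so `exact_match("cafe\u0301", "caf\u00e9")` returns 1.0.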

Git Workflow

Commits

  • Format: type(scope): description
  • Types: feat, fix, chore, refactor, docs, test
  • Atomic: ≤3 new files, ≤150 LOC per commit
  • Update changelog.md before completing a task
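
The subject-line format can be checked mechanically. The sketch below is one possible pattern; the exact characters permitted in the scope are an assumption, and `is_valid_commit_subject` is a hypothetical helper.

```python
import re

# Commit types listed in the guidelines.
COMMIT_TYPES = ("feat", "fix", "chore", "refactor", "docs", "test")

# type(scope): description
# Assumption: the scope is any run of characters without spaces or parentheses.
COMMIT_PATTERN = re.compile(
    rf"(?P<type>{'|'.join(COMMIT_TYPES)})\((?P<scope>[^()\s]+)\): (?P<description>\S.*)"
)


def is_valid_commit_subject(subject: str) -> bool:
    """Return True if the subject line follows type(scope): description."""
    return COMMIT_PATTERN.fullmatch(subject) is not None
```

For instance, `chore(project): add pyproject.toml and project configuration` passes, while `Update readme` does not.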

Branches

  • feat/kebab-case — new features
  • fix/kebab-case — bug fixes
  • chore/ — maintenance
  • refactor/ — code restructure
  • docs/ — documentation only
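
Branch names can be validated in the same spirit. The list states kebab-case only for `feat/` and `fix/`; applying it to the other prefixes is an assumption in this sketch, as is the helper name `is_valid_branch_name`.

```python
import re

BRANCH_PREFIXES = ("feat", "fix", "chore", "refactor", "docs")

# Lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = r"[a-z0-9]+(?:-[a-z0-9]+)*"

BRANCH_PATTERN = re.compile(rf"(?:{'|'.join(BRANCH_PREFIXES)})/{KEBAB_CASE}")


def is_valid_branch_name(name: str) -> bool:
    """Return True if the branch name is prefix/kebab-case with a known prefix."""
    return BRANCH_PATTERN.fullmatch(name) is not None
```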

Testing

  • Test files mirror source structure: tests/test_core/test_types.py
  • Use pytest fixtures for common setup
  • Target ≥80% coverage
  • Include edge cases: empty input, Unicode, boundary values
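
The mirroring convention can be written as a path mapping, sketched below. It generalises from the single example above; prefixing every nested directory with `test_` is an assumption, and `expected_test_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def expected_test_path(source_path: str) -> str:
    """Map a source file under src/veritext/ to its mirrored test file."""
    parts = PurePosixPath(source_path).parts
    if len(parts) < 3 or parts[:2] != ("src", "veritext"):
        raise ValueError(f"not a file under src/veritext/: {source_path!r}")
    # Everything after src/veritext/ is zero or more packages, then the file.
    *packages, filename = parts[2:]
    test_dirs = [f"test_{name}" for name in packages]
    return str(PurePosixPath("tests", *test_dirs, f"test_{filename}"))
```

Given `src/veritext/core/types.py`, this returns `tests/test_core/test_types.py`, matching the example in the list.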

Pre-Completion Checklist

Before marking ANY task complete:

  • All linting/formatting/type checks pass
  • Tests pass with ≥80% coverage
  • changelog.md updated if there are user-facing changes
  • Filenames are lowercase (except CLAUDE.md)
  • Commit follows type(scope): description format