# CLAUDE.md

Guidelines for working on the Veritext project.

## Project Overview

Veritext is a semantic text validation framework for Python. It validates text outputs
against quality criteria using metrics like BLEU, ROUGE, and semantic similarity.

## Directory Structure

```
veritext/
├── src/veritext/          # Package source
│   ├── core/              # Shared types, tokenisation, config
│   ├── metrics/           # BLEU, ROUGE, lexical, readability
│   ├── semantic/          # Optional embedding-based similarity
│   ├── validators/        # Composable validation checks
│   ├── benchmark/         # Quality tracking & regression detection
│   ├── pytest_plugin/     # Native pytest integration
│   └── cli/               # Command-line interface
├── tests/                 # Test suite (mirrors src structure)
├── docs/                  # Project documentation
└── examples/              # Usage examples
```

## Code Style

### Python Conventions

- **Python 3.11+** with modern type hints
- **UK English** in all text (colour, behaviour, summarisation, tokenisation)
- **snake_case** for variables, functions, modules
- **PascalCase** for classes
- Absolute imports from package root: `from veritext.core.types import ...`

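As an illustration of these conventions, here is a minimal sketch. The names `TokenCount` and `count_tokens` are hypothetical and are not taken from the Veritext codebase; the snippet only shows the naming, typing, and spelling style described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenCount:  # PascalCase for classes (hypothetical type)
    """Number of tokens found in a piece of text."""

    total: int
    unique: int


def count_tokens(text: str, *, lowercase: bool = True) -> TokenCount:
    """Count whitespace-separated tokens (illustrative tokenisation only)."""
    # snake_case names and built-in generics such as list[str]
    tokens: list[str] = text.lower().split() if lowercase else text.split()
    return TokenCount(total=len(tokens), unique=len(set(tokens)))
```

Inside the package itself, such a type would be imported absolutely from the package root rather than with a relative import.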
### Quality Gates

All must pass with zero issues before any commit:

```bash
uv run ruff check .            # Linting
uv run ruff format --check .   # Formatting
uv run mypy src/               # Type checking
uv run pytest                  # Tests
```

### Documentation

- Docstrings for all public APIs (Google style)
- Type hints on all function signatures
- Keep docstrings concise; let types speak where possible

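A Google-style docstring on a fully typed function might look like the sketch below. The function `truncate_words` is a hypothetical example, not part of Veritext.

```python
def truncate_words(text: str, max_words: int) -> str:
    """Return the first ``max_words`` whitespace-separated words of ``text``.

    Args:
        text: Input text to shorten.
        max_words: Maximum number of words to keep; must be non-negative.

    Returns:
        The leading words joined by single spaces.

    Raises:
        ValueError: If ``max_words`` is negative.
    """
    if max_words < 0:
        raise ValueError("max_words must be non-negative")
    return " ".join(text.split()[:max_words])
```

The docstring does not restate the parameter types, because the signature already carries them.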
## Architecture

### Layer Dependencies

```
CLI / pytest_plugin     (presentation)
        ↓
validators / benchmark  (decision logic)
        ↓
metrics                 (pure computation)
        ↓
core                    (shared types, tokenisation)
```

Each layer depends only on layers below it.

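The dependency rule can be expressed as a small check. This is a sketch only: the `LAYER_RANK` table and `is_allowed_import` helper are hypothetical, the ranks are read off the diagram above, and treating same-rank packages (for example `validators` and `benchmark`) as unable to import each other is an assumption. The `semantic` package is left out because the diagram does not place it.

```python
# Layer ranks read from the diagram; a lower rank means a lower layer.
LAYER_RANK: dict[str, int] = {
    "core": 0,
    "metrics": 1,
    "validators": 2,
    "benchmark": 2,
    "cli": 3,
    "pytest_plugin": 3,
}


def is_allowed_import(importer: str, imported: str) -> bool:
    """Return True if package ``importer`` may depend on package ``imported``.

    A package may import from itself or from any strictly lower layer.
    """
    if importer == imported:
        return True
    return LAYER_RANK[imported] < LAYER_RANK[importer]
```

For example, `is_allowed_import("validators", "metrics")` is true, while `is_allowed_import("core", "metrics")` is false because `core` sits below `metrics`.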
### Metrics vs Validators

| Concept | Responsibility | Output |
|---------|----------------|--------|
| **Metric** | Compute a score | Typed result (e.g., `BleuResult`) |
| **Validator** | Make a pass/fail decision | `ValidationResult` with diagnostics |

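The split between the two concepts can be sketched as follows. `OverlapResult` and `CheckResult` are hypothetical stand-ins for typed results such as `BleuResult` and `ValidationResult`, whose real fields are not shown in this document, and the token-overlap metric is chosen only because it is short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlapResult:
    """Typed metric output (hypothetical stand-in for e.g. ``BleuResult``)."""

    score: float


@dataclass(frozen=True)
class CheckResult:
    """Pass/fail outcome with diagnostics (stand-in for ``ValidationResult``)."""

    passed: bool
    diagnostics: list[str]


def token_overlap(text: str, reference: str) -> OverlapResult:
    """Metric: fraction of reference tokens that also appear in ``text``."""
    reference_tokens = set(reference.split())
    if not reference_tokens:
        raise ValueError("reference must not be empty")
    shared = reference_tokens & set(text.split())
    return OverlapResult(score=len(shared) / len(reference_tokens))


def validate_overlap(
    text: str, reference: str, threshold: float = 0.5
) -> CheckResult:
    """Validator: turn the metric score into a pass/fail decision."""
    result = token_overlap(text, reference)
    if result.score >= threshold:
        return CheckResult(passed=True, diagnostics=[])
    message = f"overlap {result.score:.2f} is below threshold {threshold:.2f}"
    return CheckResult(passed=False, diagnostics=[message])
```

The metric knows nothing about thresholds, and the validator does no scoring of its own, which keeps the computation reusable across different checks.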
### Edge Case Handling

- Empty text: Metrics return zero scores; validators fail
- Empty reference: Comparison metrics raise `ValueError`
- Whitespace-only: Treated as empty after tokenisation
- Unicode: NFC normalisation by default

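The rules above could be applied in a comparison metric roughly as follows. This is a sketch under assumptions: `tokenise` and `exact_match_score` are hypothetical helpers, and whitespace splitting stands in for whatever tokenisation Veritext actually uses.

```python
import unicodedata


def tokenise(text: str) -> list[str]:
    """Apply NFC normalisation, then split on whitespace (illustrative only)."""
    return unicodedata.normalize("NFC", text).split()


def exact_match_score(text: str, reference: str) -> float:
    """Hypothetical comparison metric following the edge-case rules above."""
    reference_tokens = tokenise(reference)
    if not reference_tokens:
        # Empty or whitespace-only reference: the comparison is undefined.
        raise ValueError("reference must not be empty")
    text_tokens = tokenise(text)
    if not text_tokens:
        # Empty or whitespace-only text: return a zero score.
        return 0.0
    return 1.0 if text_tokens == reference_tokens else 0.0
```

Because both inputs are normalised to NFC first, a decomposed `"e"` plus combining acute accent compares equal to the precomposed `"é"`.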
## Git Workflow

### Before Starting Work

When starting work from a plan, create a new branch matching the plan's scope before
making any changes. Do not reuse an existing branch from previous work, even if related.

### Commits

- Format: `type(scope): description`
- Types: feat, fix, chore, refactor, docs, test
- Atomic: ≤3 new files, ≤150 LOC per commit
- Update changelog.md before completing a task

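The commit format can be checked mechanically, for instance from a commit-msg hook. The sketch below is hypothetical and not part of the project's tooling; the allowed scope characters (lowercase letters, digits, `_`, `-`) are an assumption, since the guidelines do not define them.

```python
import re

# Commit types listed above. The scope character set is an assumption.
COMMIT_TYPES = ("feat", "fix", "chore", "refactor", "docs", "test")
_TYPES = "|".join(COMMIT_TYPES)
COMMIT_PATTERN = re.compile(rf"({_TYPES})\(([a-z0-9_-]+)\): \S.*")


def is_valid_commit_subject(subject: str) -> bool:
    """Return True if ``subject`` matches ``type(scope): description``."""
    return COMMIT_PATTERN.fullmatch(subject) is not None
```

With this pattern, `feat(metrics): add ROUGE-L scorer` is accepted, while `fix: missing scope` is rejected because it has no scope.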
### Branches

- `feat/kebab-case`: new features
- `fix/kebab-case`: bug fixes
- `chore/`: maintenance
- `refactor/`: code restructure
- `docs/`: documentation only

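When creating a branch for a new plan, the name can be derived from the plan title. The helper below is a hypothetical sketch; it assumes that kebab-case applies after every prefix, although the list above only spells it out for `feat/` and `fix/`.

```python
import re

# Prefixes listed above.
BRANCH_PREFIXES = ("feat", "fix", "chore", "refactor", "docs")


def branch_name(prefix: str, title: str) -> str:
    """Build ``prefix/kebab-case-title`` from a free-text plan title."""
    if prefix not in BRANCH_PREFIXES:
        raise ValueError(f"unknown branch prefix: {prefix!r}")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not slug:
        raise ValueError("title must contain at least one letter or digit")
    return f"{prefix}/{slug}"
```

For example, `branch_name("feat", "Add ROUGE-L metric")` gives `feat/add-rouge-l-metric`, which can then be passed to `git switch -c`.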
## Testing

- Test files mirror source structure: `tests/test_core/test_types.py`
- Use pytest fixtures for common setup
- Target ≥80% coverage
- Include edge cases: empty input, Unicode, boundary values

## Pre-Completion Checklist

Before marking ANY task complete:

- [ ] All linting/formatting/type checks pass
- [ ] Tests pass with adequate coverage
- [ ] changelog.md updated if there are user-facing changes
- [ ] Filenames are lowercase (except CLAUDE.md)
- [ ] Commit follows `type(scope): description` format