refactor: CLI cleanup and documentation updates

- Refactor CLI metric computation to eliminate code duplication
- Update version format to PEP 440 compliance (0.1.0.dev0)
- Cache Settings instance via @lru_cache for performance
- Document composite validators' protocol deviation
- Consolidate redundant empty checks in ROUGE-L computation
- Add Phase 10 (Portfolio Demos) to implementation plan
2026-02-04 15:38:46 +00:00
parent 7de4505e31
commit 0699e97e1d
8 changed files with 224 additions and 66 deletions
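
Reviewer note: the commit message mentions caching the Settings instance via `@lru_cache`, but that change is not part of the hunk shown here. The following is only a minimal sketch of the pattern, not the project's code. The `Settings` fields, the `get_settings` name, and the `VERITEXT_LOG_LEVEL` variable are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; the real fields are not shown in this commit."""

    log_level: str = "INFO"


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Build Settings on the first call; later calls return the cached instance."""
    # VERITEXT_LOG_LEVEL is an assumed variable name, used only for illustration.
    return Settings(log_level=os.environ.get("VERITEXT_LOG_LEVEL", "INFO"))
```

With this pattern, tests that change the environment would need to call `get_settings.cache_clear()` so the next call rebuilds the instance.
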
--- a/docs/project-plan.md
+++ b/docs/project-plan.md
@@ -488,3 +488,47 @@ benchmark.assert_no_regression(tolerance=0.03)

 5. **Natural portfolio narrative** — "I was building X and needed a better way to test
   it, so I built this tool." Every interviewer has faced similar problems.
+
+---
+
+## Portfolio Demos (Future)
+
+Interactive demos to showcase Veritext without requiring installation.
+
+### Streamlit Demo
+
+A quick interactive web UI for general visitors and recruiters.
+
+**Features:**
+- Text input boxes (candidate + reference)
+- Metric selector (BLEU, ROUGE, lexical, readability)
+- Threshold sliders for pass/fail validation
+- Results table with scores and status
+
+**Deployment:** Self-hosted on a home server (e.g., `veritext.kschappell.com`)
+
+**Effort:** ~30 minutes
+
+### Jupyter Notebook Collection
+
+Deep-dive notebooks targeting data science and ML recruiters.
+
+**Notebooks:**
+
+| Notebook | Purpose |
+|----------|---------|
+| `01-metrics-overview.ipynb` | Introduction to each metric with visualisations |
+| `02-batch-evaluation.ipynb` | Evaluating model outputs at scale, statistical analysis |
+| `03-regression-detection.ipynb` | Tracking quality over time, detecting degradation |
+| `04-chatbot-validation.ipynb` | Real-world use case: validating chatbot responses |
+
+**Hosting:** JupyterLite (static files, runs in browser via WebAssembly)
+
+**Deployment:** Self-hosted alongside Streamlit demo
+
+**Why both:**
+
+| Demo Type | Audience | Value |
+|-----------|----------|-------|
+| Streamlit | General visitors | Quick, interactive, no friction |
+| Notebooks | Data/ML recruiters | Shows analytical depth, speaks their language |
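
Reviewer note: the Streamlit demo described in this hunk (text inputs, threshold slider, results table) could be sketched roughly as below. This is a hedged illustration under stated assumptions, not the planned implementation. `unigram_f1` is a stand-in metric written for this example and is not Veritext's scoring code; `evaluate` and `main` are hypothetical names. The metric selector is left out to keep the sketch short.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Stand-in metric (token-overlap F1); NOT Veritext's scoring code."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both texts.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(candidate: str, reference: str, threshold: float) -> dict:
    """Return one results-table row: metric name, score, pass/fail status."""
    score = unigram_f1(candidate, reference)
    return {
        "metric": "unigram_f1",
        "score": round(score, 3),
        "status": "pass" if score >= threshold else "fail",
    }


def main() -> None:
    """Render the demo page; intended to be called from a Streamlit script."""
    # Imported lazily so the pure functions above work without Streamlit installed.
    import streamlit as st

    st.title("Veritext demo (sketch)")
    candidate = st.text_area("Candidate text")
    reference = st.text_area("Reference text")
    threshold = st.slider("Pass threshold", 0.0, 1.0, 0.5)
    if candidate and reference:
        st.table([evaluate(candidate, reference, threshold)])
```

To try it, one would add a `main()` call at the bottom of the file and launch it with `streamlit run`. Keeping the scoring in plain functions means the same logic can be unit-tested without the UI.
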