# Arbiter

A multi-agent code review system that shows its work.

## What is this?

Arbiter is a code review tool where specialised AI agents independently analyse pull
requests, then deliberate to produce unified feedback. Unlike black-box AI reviewers,
Arbiter exposes the reasoning process: you see how agents disagree, weigh trade-offs,
and reach consensus.

## Why?

Current AI code review tools give you a verdict but hide their reasoning. When they
flag something, you can't tell whether it's a security expert's concern or a style
nitpick. Arbiter treats its agents like an editorial board and surfaces their
discussion alongside the verdict.

## Features

- **Static analysis pre-pass**: ruff, mypy, bandit, and radon run first
- **Specialised agents**: Security, Style, Complexity (LLM-powered)
- **Transparent deliberation**: See how agents reason and resolve conflicts
- **Configurable policies**: Adapt to your team's standards
- **Cost controls**: Token budgets, model selection, response caching
- **GitHub/GitLab integration**: Webhook-driven, posts comments to PRs

## Architecture

```
GitHub/GitLab
      │
      │ Webhook (PR opened/updated)
      ▼
┌─────────────────────────────────────────────┐
│             FastAPI Application             │
│                                             │
│  Webhook ──► Redis Queue ──► Worker         │
│                                │            │
│                                ▼            │
│  ┌───────────────────────────────────────┐  │
│  │          Review Orchestrator          │  │
│  │                                       │  │
│  │  1. Static analysis (ruff, mypy...)   │  │
│  │  2. Agents in parallel                │  │
│  │  3. Deliberation                      │  │
│  │  4. Post results                      │  │
│  │                                       │  │
│  │  ┌──────────┐ ┌───────┐ ┌──────────┐  │  │
│  │  │ Security │ │ Style │ │Complexity│  │  │
│  │  └────┬─────┘ └───┬───┘ └────┬─────┘  │  │
│  │       └───────────┼──────────┘        │  │
│  │                   ▼                   │  │
│  │            ┌─────────────┐            │  │
│  │            │ Coordinator │            │  │
│  │            └─────────────┘            │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
      │
      ├──► PR Comment
      ├──► Database (history)
      └──► Metrics
```

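
The four orchestrator steps in the diagram can be sketched roughly as follows. This is a minimal illustration, not Arbiter's actual code: `Finding`, `run_static_analysis`, `run_agent`, `deliberate`, and `review` are hypothetical names, and the agents here are stubs rather than LLM calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    source: str    # "static", "security", "style", or "complexity"
    severity: str  # "low", "medium", or "high"
    message: str


async def run_static_analysis(diff: str) -> list[Finding]:
    """Stub for the ruff/mypy/bandit/radon pre-pass (hypothetical)."""
    return [Finding("static", "medium", "placeholder static finding")]


async def run_agent(name: str, diff: str, static: list[Finding]) -> list[Finding]:
    """Stub for one agent; a real agent would call a model here."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return [Finding(name, "low", f"{name} looked at {len(diff)} chars")]


def deliberate(findings: list[Finding]) -> str:
    """Stub coordinator: request changes if anything is high severity."""
    if any(f.severity == "high" for f in findings):
        return "request_changes"
    return "approve"


async def review(diff: str) -> tuple[str, list[Finding]]:
    static = await run_static_analysis(diff)  # 1. static analysis
    batches = await asyncio.gather(           # 2. agents in parallel
        *(run_agent(n, diff, static) for n in ("security", "style", "complexity"))
    )
    findings = static + [f for batch in batches for f in batch]
    verdict = deliberate(findings)            # 3. deliberation
    return verdict, findings                  # 4. the caller posts the results


verdict, findings = asyncio.run(review("+ print('hello')"))
```

`asyncio.gather` returns results in the order the coroutines were passed, so the findings keep a stable agent order even though the agents run concurrently.
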
## Tech Stack

| Component | Technology |
|-----------|------------|
| Backend | Python 3.12, FastAPI |
| Queue | Redis, arq |
| Database | PostgreSQL |
| LLM | LiteLLM (OpenAI, Anthropic, local) |
| Static analysis | ruff, mypy, bandit, radon |

## Quick Start

```bash
# Clone the repository
git clone https://gitea.kschappell.com/kschappell/arbiter.git
cd arbiter

# Start infrastructure
docker compose up -d db redis

# Install dependencies
pip install -e ".[dev]"

# Run migrations
alembic upgrade head

# Start API server
uvicorn src.arbiter.main:app --reload

# Start worker (separate terminal)
arq src.arbiter.worker.tasks.WorkerSettings
```

## CLI Usage

Review a local diff without running the full server:

```bash
# Review a diff file
arbiter review changes.diff --policy .arbiter/policy.yaml

# Review staged changes
git diff --cached | arbiter review - --policy .arbiter/policy.yaml
```

## Configuration

Create `.arbiter/policy.yaml` in your repository:

```yaml
version: "1.0"

static_analysis:
  ruff:
    enabled: true
  mypy:
    enabled: true
  bandit:
    enabled: true
    severity_threshold: medium

agents:
  security:
    enabled: true
    model: "gpt-4o"
    severity_threshold: medium

  style:
    enabled: true
    model: "gpt-4o-mini"
    config:
      naming_convention: snake_case

  complexity:
    enabled: true
    model: "gpt-4o-mini"
    thresholds:
      max_cyclomatic: 10

deliberation:
  conflict_resolution: security_first
  minimum_confidence: 0.7

cost_controls:
  max_tokens_per_review: 50000
  max_cost_per_review_usd: 0.50
  cache_similar_diffs: true
```

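
To make the `deliberation` settings more concrete, here is one way they might be applied. The exact semantics are not documented here, so this sketch assumes that `minimum_confidence` drops low-confidence findings and that `security_first` means a security finding wins when several agents comment on the same line. `Finding` and `resolve` are hypothetical names, and the policy is written as a plain dict to keep the example dependency-free.

```python
from dataclasses import dataclass

# Mirrors the `deliberation` block of the policy file.
POLICY = {"conflict_resolution": "security_first", "minimum_confidence": 0.7}


@dataclass(frozen=True)
class Finding:
    agent: str
    line: int
    confidence: float
    message: str


def resolve(findings: list[Finding], policy: dict) -> list[Finding]:
    # Drop findings below the confidence floor.
    kept = [f for f in findings if f.confidence >= policy["minimum_confidence"]]
    if policy["conflict_resolution"] != "security_first":
        return kept

    # Assumed meaning of security_first: when several agents comment on
    # the same line, keep only the security finding(s) if there are any.
    by_line: dict[int, list[Finding]] = {}
    for f in kept:
        by_line.setdefault(f.line, []).append(f)

    resolved: list[Finding] = []
    for line in sorted(by_line):
        group = by_line[line]
        security = [f for f in group if f.agent == "security"]
        resolved.extend(security or group)
    return resolved


findings = [
    Finding("style", 47, 0.9, "rename handler"),
    Finding("security", 47, 0.95, "missing auth decorator"),
    Finding("complexity", 12, 0.5, "maybe too complex"),
    Finding("style", 23, 0.8, "use snake_case"),
]
result = resolve(findings, POLICY)
```

In this example the complexity finding falls below the 0.7 floor, and the style comment on line 47 gives way to the security finding on the same line.
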
## Example Output

```markdown
## Arbiter Review

**Verdict:** Request changes (confidence: 92%)

### Static Analysis
- **bandit** B105: Possible hardcoded password (line 52)
- **radon** CC: Function `process_data` has complexity 12 (threshold: 10)

### Agent Findings

🔒 **Security** (High)
Line 47: Endpoint `/api/admin/export` has no authentication decorator.
→ All admin endpoints should use `@require_admin` per project patterns.

📐 **Style** (Low)
Line 23: Function name `getData` doesn't match snake_case convention.

### Deliberation

All agents agree authentication is missing. Static analysis confirms
hardcoded password on line 52. Both issues require resolution.
```

## Dashboard

Arbiter includes a React dashboard for exploring reviews and monitoring metrics:

- **Review List**: Browse all reviews, filtering by repository, status, verdict, and author
- **Review Detail**: View findings grouped by severity with expandable cards
- **Deliberation Explorer**: Step-by-step timeline of how agents reached their verdict
- **Metrics**: Charts showing verdicts, severities, and review trends over time

Start the dashboard:

```bash
cd dashboard
npm install
npm run dev
```

Access it at `http://localhost:5173`. Configure the API URL via the `VITE_API_URL` environment variable.

## API Documentation

The API server provides interactive documentation:

- **Swagger UI**: `http://localhost:8000/docs`
- **ReDoc**: `http://localhost:8000/redoc`
- **OpenAPI Schema**: `http://localhost:8000/openapi.json`

For detailed endpoint documentation, see [docs/api.md](docs/api.md).

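
The OpenAPI schema is also handy for scripting. The sketch below lists every operation a schema declares. It relies only on the standard OpenAPI layout (`paths` mapping each path to its HTTP methods); `fetch_schema` and `list_operations` are illustrative helpers, and the sample paths are made up rather than taken from Arbiter's real API.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema from a running API server."""
    with urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)


def list_operations(schema: dict) -> list[str]:
    """Return 'METHOD /path' strings for every operation in the schema."""
    operations = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for method in sorted(item):
            if method in HTTP_METHODS:  # skip non-method keys like "parameters"
                operations.append(f"{method.upper()} {path}")
    return operations


# Hypothetical schema fragment, used so the example runs without a server.
sample = {
    "openapi": "3.1.0",
    "paths": {
        "/reviews": {"get": {}, "post": {}},
        "/health": {"get": {}, "parameters": []},
    },
}
operations = list_operations(sample)
```

With the API server running, `list_operations(fetch_schema())` would print the real endpoint list instead.
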
## Environment Variables

Key environment variables. Prefix each name with `ARBITER_` (for example, `ARBITER_DEFAULT_MODEL`):

| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection URL | `postgresql+asyncpg://arbiter:arbiter@localhost:5432/arbiter` |
| `REDIS_URL` | Redis connection URL | `redis://localhost:6379/0` |
| `DEFAULT_MODEL` | LLM model for agents | `gpt-4o` |
| `GITHUB_TOKEN` | GitHub API token | - |
| `GITHUB_WEBHOOK_SECRET` | Webhook HMAC secret | - |
| `GITLAB_TOKEN` | GitLab API token | - |
| `GITLAB_WEBHOOK_TOKEN` | Webhook verification token | - |
| `POST_COMMENTS` | Post review comments to PRs | `true` |
| `UPDATE_STATUS` | Update commit status checks | `true` |

See [.env.example](.env.example) for the complete list.

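
As an illustration of how the `ARBITER_` prefix and the defaults in the table fit together, here is a small standard-library sketch. It is not Arbiter's own settings loader: `Settings`, `load_settings`, and the accepted boolean spellings are assumptions made for the example.

```python
import os
from dataclasses import dataclass

PREFIX = "ARBITER_"


def _env(name: str, default: str) -> str:
    """Read ARBITER_<name>, falling back to the documented default."""
    return os.environ.get(PREFIX + name, default)


def _env_bool(name: str, default: bool) -> bool:
    raw = os.environ.get(PREFIX + name)
    if raw is None:
        return default
    # Accepted spellings are an assumption, not taken from Arbiter.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    default_model: str
    post_comments: bool
    update_status: bool


def load_settings() -> Settings:
    return Settings(
        database_url=_env(
            "DATABASE_URL",
            "postgresql+asyncpg://arbiter:arbiter@localhost:5432/arbiter",
        ),
        redis_url=_env("REDIS_URL", "redis://localhost:6379/0"),
        default_model=_env("DEFAULT_MODEL", "gpt-4o"),
        post_comments=_env_bool("POST_COMMENTS", True),
        update_status=_env_bool("UPDATE_STATUS", True),
    )


# Override one value, as you would with `export ARBITER_DEFAULT_MODEL=...`.
os.environ["ARBITER_DEFAULT_MODEL"] = "gpt-4o-mini"
settings = load_settings()
```
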
## Deployment

For production deployment instructions, see [docs/deployment.md](docs/deployment.md).

## Troubleshooting

### Worker not processing jobs

Check that Redis is running and accessible:

```bash
redis-cli ping  # Should return PONG
```

Verify the worker is connected:

```bash
arq src.arbiter.worker.tasks.WorkerSettings --check
```

### Webhook not receiving events

1. Verify the webhook URL is publicly accessible
2. Check that webhook secrets match between GitHub/GitLab and your configuration
3. Inspect webhook deliveries in GitHub/GitLab settings for error responses

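
When debugging a secret mismatch, it can help to reproduce the check by hand. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`, while GitLab sends the configured token verbatim in `X-Gitlab-Token`. The sketch below shows both checks; the function names are illustrative and this is not necessarily how Arbiter implements them.

```python
import hashlib
import hmac


def github_signature(secret: str, body: bytes) -> str:
    """Compute the value GitHub would send in X-Hub-Signature-256."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_github(secret: str, body: bytes, header_value: str) -> bool:
    """Constant-time comparison against the received header."""
    expected = github_signature(secret, body)
    return hmac.compare_digest(expected.encode(), header_value.encode())


def verify_gitlab(token: str, header_value: str) -> bool:
    """GitLab sends the shared token itself in X-Gitlab-Token."""
    return hmac.compare_digest(token.encode(), header_value.encode())


# Example payload and secret are placeholders.
body = b'{"action": "opened"}'
signature = github_signature("s3cret", body)
ok = verify_github("s3cret", body, signature)
mismatch = verify_github("wrong-secret", body, signature)
```

The signature must be computed over the exact bytes received. Re-serialising the parsed JSON before hashing is a common cause of mismatches.
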
### LLM timeouts

Increase the timeout and switch to a smaller, faster model:

```bash
export ARBITER_LLM_TIMEOUT=120
export ARBITER_DEFAULT_MODEL=gpt-4o-mini
```

### Database connection errors

Ensure PostgreSQL is running and the connection URL is correct:

```bash
# psql does not understand the SQLAlchemy "+asyncpg" driver suffix, so strip it
psql "${ARBITER_DATABASE_URL/+asyncpg/}" -c "SELECT 1"  # Test connection
alembic upgrade head                                    # Run pending migrations
```

### Review not appearing in dashboard

1. Check that the API server is running
2. Verify that the CORS settings include your dashboard URL
3. Check the browser console for API errors

## License

MIT