
Arbiter

A multi-agent code review system that shows its work.

What is this?

Arbiter is a code review tool where specialised AI agents independently analyse pull requests, then deliberate to produce unified feedback. Unlike black-box AI reviewers, Arbiter exposes the reasoning process — you see how agents disagree, weigh trade-offs, and reach consensus.

Why?

Current AI code review tools give you a verdict but hide their reasoning. When they flag something, you can't tell whether it is a serious security concern or a style nitpick. Arbiter treats its agents like an editorial board and shows you their discussion.

Features

  • Static analysis pre-pass — ruff, mypy, bandit, radon run first
  • Specialised agents — Security, Style, Complexity (LLM-powered)
  • Transparent deliberation — See how agents reason and resolve conflicts
  • Configurable policies — Adapt to your team's standards
  • Cost controls — Token budgets, model selection, response caching
  • GitHub/GitLab integration — Webhook-driven, posts comments to PRs

Architecture

GitHub/GitLab
     │
     │ Webhook (PR opened/updated)
     ▼
┌─────────────────────────────────────────────┐
│           FastAPI Application               │
│                                             │
│  Webhook ──► Redis Queue ──► Worker         │
│                                │            │
│                                ▼            │
│  ┌───────────────────────────────────────┐  │
│  │        Review Orchestrator            │  │
│  │                                       │  │
│  │  1. Static analysis (ruff, mypy...)  │  │
│  │  2. Agents in parallel               │  │
│  │  3. Deliberation                     │  │
│  │  4. Post results                     │  │
│  │                                       │  │
│  │  ┌──────────┐ ┌───────┐ ┌──────────┐ │  │
│  │  │ Security │ │ Style │ │Complexity│ │  │
│  │  └────┬─────┘ └───┬───┘ └────┬─────┘ │  │
│  │       └───────────┼──────────┘       │  │
│  │                   ▼                  │  │
│  │           ┌─────────────┐            │  │
│  │           │ Coordinator │            │  │
│  │           └─────────────┘            │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
     │
     ├──► PR Comment
     ├──► Database (history)
     └──► Metrics
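
The four orchestrator steps in the diagram can be sketched in plain Python. This is a minimal illustration, not Arbiter's actual orchestrator: the `Finding` dataclass and the functions `run_static_analysis`, `run_agent`, `deliberate`, and `review` are hypothetical names, and the analysis and LLM calls are replaced by placeholders.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    source: str    # e.g. "security", "style", "bandit"
    severity: str  # "low", "medium" or "high"
    line: int
    message: str


async def run_static_analysis(diff: str) -> list[Finding]:
    # Placeholder: a real pre-pass would invoke ruff, mypy, bandit and radon.
    return [Finding("bandit", "medium", 52, "Possible hardcoded password")]


async def run_agent(name: str, diff: str, static: list[Finding]) -> list[Finding]:
    # Placeholder for an LLM call; each agent sees the diff and the static findings.
    await asyncio.sleep(0)
    return [Finding(name, "low", 1, f"{name} agent saw {len(static)} static finding(s)")]


def deliberate(findings: list[Finding]) -> str:
    # Placeholder coordinator: any medium or high finding blocks the PR.
    blocking = [f for f in findings if f.severity in ("medium", "high")]
    return "request_changes" if blocking else "approve"


async def review(diff: str) -> tuple[str, list[Finding]]:
    static = await run_static_analysis(diff)            # 1. static analysis
    agent_results = await asyncio.gather(               # 2. agents in parallel
        *(run_agent(name, diff, static) for name in ("security", "style", "complexity"))
    )
    findings = static + [f for result in agent_results for f in result]
    verdict = deliberate(findings)                      # 3. deliberation
    return verdict, findings                            # 4. caller posts the results


if __name__ == "__main__":
    verdict, findings = asyncio.run(review("diff --git a/app.py b/app.py"))
    print(verdict, [f.source for f in findings])
```

`asyncio.gather` returns results in the order the agents were submitted, so the combined findings stay deterministic even though the agents run concurrently.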

Tech Stack

Component        Technology
Backend          Python 3.12, FastAPI
Queue            Redis, arq
Database         PostgreSQL
LLM              LiteLLM (OpenAI, Anthropic, local)
Static analysis  ruff, mypy, bandit, radon

Quick Start

# Clone the repository
git clone https://gitea.kschappell.com/kschappell/arbiter.git
cd arbiter

# Start infrastructure
docker compose up -d db redis

# Install dependencies
pip install -e ".[dev]"

# Run migrations
alembic upgrade head

# Start API server
uvicorn src.arbiter.main:app --reload

# Start worker (separate terminal)
arq src.arbiter.worker.tasks.WorkerSettings

CLI Usage

Review a local diff without running the full server:

# Review a diff file
arbiter review changes.diff --policy .arbiter/policy.yaml

# Review staged changes
git diff --cached | arbiter review - --policy .arbiter/policy.yaml
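
For readers curious how a command-line tool can accept either a file path or `-` for standard input, here is a small sketch using only the standard library. It is an assumption about the general approach, not Arbiter's actual CLI code; `build_parser` and `read_diff` are hypothetical names.

```python
import argparse
import sys
from pathlib import Path
from typing import TextIO


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="arbiter")
    commands = parser.add_subparsers(dest="command", required=True)
    review = commands.add_parser("review", help="review a diff")
    review.add_argument("diff", help="path to a diff file, or '-' to read stdin")
    review.add_argument("--policy", default=".arbiter/policy.yaml")
    return parser


def read_diff(source: str, stdin: TextIO = sys.stdin) -> str:
    # By convention a lone "-" means "read from standard input".
    if source == "-":
        return stdin.read()
    return Path(source).read_text(encoding="utf-8")


if __name__ == "__main__":
    args = build_parser().parse_args()
    diff_text = read_diff(args.diff)
    print(f"Loaded {len(diff_text)} characters; policy file: {args.policy}")
```

argparse treats a lone `-` as a positional argument rather than an option, which is what makes the piped form work.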

Configuration

Create .arbiter/policy.yaml in your repository:

version: "1.0"

static_analysis:
  ruff:
    enabled: true
  mypy:
    enabled: true
  bandit:
    enabled: true
    severity_threshold: medium

agents:
  security:
    enabled: true
    model: "gpt-4o"
    severity_threshold: medium

  style:
    enabled: true
    model: "gpt-4o-mini"
    config:
      naming_convention: snake_case

  complexity:
    enabled: true
    model: "gpt-4o-mini"
    thresholds:
      max_cyclomatic: 10

deliberation:
  conflict_resolution: security_first
  minimum_confidence: 0.7

cost_controls:
  max_tokens_per_review: 50000
  max_cost_per_review_usd: 0.50
  cache_similar_diffs: true
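
To make the `cost_controls` keys concrete, the sketch below shows one way a per-review budget and a diff cache key could work. It is illustrative only: `ReviewBudget` and `diff_cache_key` are hypothetical names, and the cache key matches diffs that are identical after light normalisation, which is a simplification of whatever "similar" means in Arbiter itself.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ReviewBudget:
    max_tokens: int = 50_000    # mirrors max_tokens_per_review
    max_cost_usd: float = 0.50  # mirrors max_cost_per_review_usd
    tokens_used: int = 0
    cost_usd: float = 0.0

    def can_afford(self, tokens: int, cost_usd: float) -> bool:
        return (
            self.tokens_used + tokens <= self.max_tokens
            and self.cost_usd + cost_usd <= self.max_cost_usd
        )

    def charge(self, tokens: int, cost_usd: float) -> None:
        # Refuse the call up front instead of discovering the overrun afterwards.
        if not self.can_afford(tokens, cost_usd):
            raise RuntimeError("review budget exceeded")
        self.tokens_used += tokens
        self.cost_usd += cost_usd


def diff_cache_key(diff: str, model: str) -> str:
    # Drop git "index" lines (blob hashes) and trailing whitespace so that
    # cosmetically different copies of the same change share a key.
    lines = [ln.rstrip() for ln in diff.splitlines() if not ln.startswith("index ")]
    payload = model + "\n" + "\n".join(lines)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Including the model name in the key keeps a cached `gpt-4o-mini` response from being served for a `gpt-4o` request.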

Example Output

## Arbiter Review

**Verdict:** Request changes (confidence: 92%)

### Static Analysis
- **bandit** B105: Possible hardcoded password (line 52)
- **radon** CC: Function `process_data` has complexity 12 (threshold: 10)

### Agent Findings

🔒 **Security** (High)
Line 47: Endpoint `/api/admin/export` has no authentication decorator.
→ All admin endpoints should use `@require_admin` per project patterns.

📐 **Style** (Low)
Line 23: Function name `getData` doesn't match snake_case convention.

### Deliberation

All agents agree authentication is missing. Static analysis confirms
hardcoded password on line 52. Both issues require resolution.
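
The `deliberation` settings in the policy file (`conflict_resolution: security_first`, `minimum_confidence: 0.7`) suggest a resolution step along these lines. This is a sketch under stated assumptions, not Arbiter's coordinator: it assumes two opinions conflict when they target the same line, and `Opinion` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Opinion:
    agent: str
    line: int
    severity: str
    confidence: float
    message: str


def resolve(
    opinions: list[Opinion],
    minimum_confidence: float = 0.7,
    conflict_resolution: str = "security_first",
) -> list[Opinion]:
    def priority(o: Opinion) -> tuple[bool, int, float]:
        # Under security_first the security agent wins a conflict outright;
        # otherwise higher severity, then higher confidence, wins.
        wins_by_policy = conflict_resolution == "security_first" and o.agent == "security"
        return (wins_by_policy, SEVERITY_RANK[o.severity], o.confidence)

    # Discard low-confidence opinions before any conflict is considered.
    by_line: dict[int, list[Opinion]] = {}
    for o in opinions:
        if o.confidence >= minimum_confidence:
            by_line.setdefault(o.line, []).append(o)

    # Keep one winning opinion per line, in line order.
    return [max(group, key=priority) for _, group in sorted(by_line.items())]
```

With the findings from the example above, a security opinion and a style opinion on the same line would resolve to the security one, and any opinion below 0.7 confidence would never reach the final comment.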

Dashboard

Arbiter includes a React dashboard for exploring reviews and monitoring metrics:

  • Review List — Browse all reviews with filtering by repository, status, verdict, and author
  • Review Detail — View findings grouped by severity with expandable cards
  • Deliberation Explorer — Step-by-step timeline of how agents reached their verdict
  • Metrics — Charts showing verdicts, severities, and review trends over time

Start the dashboard:

cd dashboard
npm install
npm run dev

Access it at http://localhost:5173. Set the API URL with the VITE_API_URL environment variable.

API Documentation

The API server provides interactive documentation:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
  • OpenAPI Schema: http://localhost:8000/openapi.json

For detailed endpoint documentation, see docs/api.md.

Environment Variables

Key environment variables. Prefix each name with ARBITER_ when setting it (for example, ARBITER_DATABASE_URL):

Variable               Description                  Default
DATABASE_URL           PostgreSQL connection URL    postgresql+asyncpg://arbiter:arbiter@localhost:5432/arbiter
REDIS_URL              Redis connection URL         redis://localhost:6379/0
DEFAULT_MODEL          LLM model for agents         gpt-4o
GITHUB_TOKEN           GitHub API token             -
GITHUB_WEBHOOK_SECRET  Webhook HMAC secret          -
GITLAB_TOKEN           GitLab API token             -
GITLAB_WEBHOOK_TOKEN   Webhook verification token   -
POST_COMMENTS          Post review comments to PRs  true
UPDATE_STATUS          Update commit status checks  true

See .env.example for the complete list.
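
As an illustration of how the ARBITER_ prefix and the defaults fit together, here is a standard-library sketch of a settings loader. It is not Arbiter's actual configuration code, which is not shown in this README; `Settings`, `load_settings`, and the set of accepted boolean spellings are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "ARBITER_"
TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings for boolean flags


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    default_model: str
    github_token: Optional[str]
    github_webhook_secret: Optional[str]
    post_comments: bool
    update_status: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    def text(name: str, default: Optional[str] = None) -> Optional[str]:
        return env.get(PREFIX + name, default)

    def flag(name: str, default: bool) -> bool:
        raw = env.get(PREFIX + name)
        return default if raw is None else raw.strip().lower() in TRUTHY

    return Settings(
        database_url=text(
            "DATABASE_URL",
            "postgresql+asyncpg://arbiter:arbiter@localhost:5432/arbiter",
        ),
        redis_url=text("REDIS_URL", "redis://localhost:6379/0"),
        default_model=text("DEFAULT_MODEL", "gpt-4o"),
        github_token=text("GITHUB_TOKEN"),
        github_webhook_secret=text("GITHUB_WEBHOOK_SECRET"),
        post_comments=flag("POST_COMMENTS", True),
        update_status=flag("UPDATE_STATUS", True),
    )
```

Passing the environment in as a mapping keeps the loader easy to test: `load_settings({"ARBITER_POST_COMMENTS": "false"})` disables PR comments without touching the real process environment.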

Deployment

For production deployment instructions, see docs/deployment.md.

Troubleshooting

Worker not processing jobs

Check that Redis is running and accessible:

redis-cli ping  # Should return PONG

Verify the worker is connected:

arq src.arbiter.worker.tasks.WorkerSettings --check

Webhook not receiving events

  1. Verify the webhook URL is publicly accessible
  2. Check that webhook secrets match between GitHub/GitLab and your configuration
  3. Inspect webhook deliveries in GitHub/GitLab settings for error responses
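
When checking step 2, it can help to reproduce the verification by hand. GitHub signs each delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`, while GitLab sends the configured secret token verbatim in the `X-Gitlab-Token` header. The sketch below implements both checks with the standard library; the function names are hypothetical and this is not Arbiter's own handler.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    # The signature must be computed over the raw request body bytes.
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


def verify_gitlab_token(expected_token: str, token_header: str) -> bool:
    # GitLab sends the shared token as-is, so a constant-time equality check suffices.
    return hmac.compare_digest(expected_token.encode("utf-8"), token_header.encode("utf-8"))
```

A common cause of mismatches is hashing a re-serialised JSON payload instead of the exact bytes received, since any change in whitespace or key order alters the digest.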

LLM timeouts

Increase the timeout, or switch to a smaller, faster model:

export ARBITER_LLM_TIMEOUT=120
export ARBITER_DEFAULT_MODEL=gpt-4o-mini

Database connection errors

Ensure PostgreSQL is running and the connection URL is correct:

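# psql expects a plain postgresql:// URL, so strip the +asyncpg driver suffix (bash/zsh)
psql "${ARBITER_DATABASE_URL/+asyncpg/}" -c "SELECT 1"  # Test connection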
alembic upgrade head  # Run pending migrations

Review not appearing in dashboard

  1. Check that the API server is running
  2. Verify CORS settings include your dashboard URL
  3. Check browser console for API errors

License

MIT
