# Catalyst Agent Harness

## Purpose

Catalyst Agent is the general-purpose agent harness inside CATALYST Command Center. It combines:

- Cloudflare Workers AI as the reasoning provider.
- Default model: `@cf/moonshotai/kimi-k2.6`.
- Catalyst HMK-style session memory for persistent continuity.
- Browser Run grounding intent for research and dashboard validation.
- Connector manifests for GitHub, email, Cloudflare, memory, and research tools.
- Approval-gated execution for anything that mutates external systems.
- Multi-agent decomposition into planner, grounder, and operator roles.

The harness is hosted in `cloudflare-product/` and deployed as a Cloudflare Worker.

## Workers AI Configuration

No external Kimi or Moonshot API key is required. The Worker uses the Cloudflare Workers AI binding:

```toml
[ai]
binding = "AI"
```

Optional Worker vars:

```text
AGENT_MODEL=@cf/moonshotai/kimi-k2.6
AGENT_FAST_MODEL=@cf/meta/llama-3.1-8b-instruct
```

The production path calls:

```ts
await env.AI.run("@cf/moonshotai/kimi-k2.6", { messages, temperature: 0.2 });
```

If `env.AI` is unavailable in local tests, the harness falls back to deterministic mode. That fallback exists for tests and local validation, not for production claims.

## Production Boundary

The agent can plan and prepare Git, email, and Cloudflare actions. It must not silently:

- send emails or messages
- mutate repositories
- deploy Cloudflare resources
- create API/OAuth keys
- change permissions
- delete data
- transmit sensitive data

Those actions require explicit action-time approval.

## API Endpoints

### Capabilities

```text
GET /v1/agent/capabilities
```

Returns model, provider, connector manifest, and safety boundary.

### Respond

```text
POST /v1/agent/respond
```

```json
{
  "tenantId": "demo",
  "sessionId": "founder-hq",
  "message": "Research the repo, draft a deploy plan, and prepare a launch email.",
  "mode": "plan",
  "connectors": ["memory", "research", "browser_run", "github", "email", "cloudflare"],
  "context": ["project:CATALYST Command", "release:agent-harness"]
}
```

Response includes:

- provider: `workers-ai` or `deterministic-fallback`
- HMK session key
- retained context facts
- planner/grounder/operator swarm
- connector action plan
- approval gates
- harness benchmark score

### Benchmark

```text
POST /v1/agent/benchmark
```

```json
{
  "tenantId": "demo",
  "sessionId": "bench",
  "connectors": ["memory", "research", "browser_run", "github", "email", "cloudflare"]
}
```

Returns harness-level benchmark dimensions:

- grounding
- memory
- tool safety
- orchestration

These are harness checks, not claims about live model intelligence.

## Browser Run Grounding Plan

Current implementation:

- exposes `browser_run` as a read-only action intent
- displays Browser Run evidence and approval gates in the UI
- keeps grounded research separate from account mutations

Next adapter:

- add Browser Run Quick Actions for markdown extraction and screenshots
- store research packets in R2 under `ghost-forge/<tenant>/<session>/research/`
- reuse Browser Sessions through Durable Objects for long-running research tasks

## Connector Model

| Connector | Role | Mutating |
| --- | --- | --- |
| `memory` | HMK continuity and artifact retrieval lanes | No |
| `research` | artifact synthesis and evidence comparison | No |
| `browser_run` | grounded web/browser validation | No |
| `github` | repository inspection, issue/PR drafting, patch planning | Yes |
| `email` | inbox triage and reply drafting | Yes |
| `cloudflare` | deployment status, deploy planning, resource inspection | Yes |

Mutating connectors are approval-gated whenever the user asks for execution, deploys, sends, commits, pushes, permission changes, key creation, or destructive work.

## Test Path

```bash
cd cloudflare-product
npm run check
npm run test
```

The suite verifies:

- deterministic HMK keys
- Workers AI `env.AI.run()` path with `@cf/moonshotai/kimi-k2.6`
- safe fallback without an AI binding
- approval gating for mutating connectors
- module ordering
- dashboard product landmarks
- benchmark scoring

## Cloudflare Artifact Path

`ghost-forge` is treated as a Cloudflare-native product artifact bundle:

- runtime Worker: `catalyst-enterprise-api`
- Pages dashboard: `catalyst-holographic-dashboard`
- R2 artifact bucket: `catalyst-artifacts`
- manifest: `.cloudflare/ghost-forge/pipeline.json`

Use `scripts/ghost-forge-ci.sh` for a local/CI deploy lane that validates, publishes docs to R2, deploys the Worker, then deploys the Pages site.

## Deployment

Deploy Worker:

```bash
cd cloudflare-product
npx wrangler deploy
```

Deploy dashboard:

```bash
npx wrangler pages deploy cloudflare-pages-site --project-name catalyst-holographic-dashboard
```

Publish Cloudflare artifacts:

```bash
npx wrangler r2 object put catalyst-artifacts/ghost-forge/latest/pipeline.json --file .cloudflare/ghost-forge/pipeline.json
```

## Roadmap

1. Persist session summaries to Durable Object SQLite.
2. Store long-horizon artifacts in R2 and index them in Vectorize.
3. Add Browser Run Quick Actions adapter.
4. Add draft-only GitHub/email/Cloudflare connectors.
5. Add eval suite for real tasks: repo triage, deployment planning, docs update, and research synthesis.
