Added WardGate comparison

welsberr 2026-02-27 20:39:02 -05:00
parent d31c506de8
commit 26eef11b09
1 changed files with 155 additions and 0 deletions

docs/wardgate_comparison.md Normal file

@ -0,0 +1,155 @@
# Wardgate vs ThreeGate: Comparison, Gaps, and Integration Plan
This document analyzes Wardgate's approach to agent security and compares it to ThreeGate's
compartmentalized architecture (CORE/FETCH/TOOL-EXEC). It then proposes concrete upgrades to
ThreeGate based on Wardgate's strengths.
Sources:
- Wardgate repository + README (credential-isolated API proxying + policy-gated remote execution “conclaves”).
- Wardgate documentation (policies, audit logging, approvals, sensitive data filtering, conclaves).
---
## 1. Wardgate Summary (as implemented)
Wardgate is a security gateway for AI agents with two primary functions:
1) Credential-isolated API proxying (agent never sees real credentials)
2) Policy-gated remote command execution in isolated environments ("conclaves")
It includes audit logging, approval workflows, and sensitive data filtering.
Key design posture:
- Central policy enforcement point (allow/deny/ask) for external actions
- Conclave execution is mediated by a gateway and policy checks
- Output is streamed and logged (with filtering features where applicable)
---
## 2. ThreeGate Summary (current)
ThreeGate aims to harden a local research assistant by splitting responsibilities:
- CORE (analysis/writing): no network, no tool execution
- FETCH (retrieval): allowlisted egress only; produces sanitized “Research Packets”
- TOOL-EXEC (execution): bounded, approval-gated execution (Monty/WASM/ERA)
ThreeGate's primary control plane is artifact validation + network segmentation + policy files mounted read-only.
---
## 3. Key Differences
### 3.1 Security Boundary Focus
- Wardgate: protects *credentials and external side effects* by routing actions through a gateway policy engine.
- ThreeGate: protects *agent cognition and tool abuse* by compartmentalization and one-way handoffs.
### 3.2 Execution Model
- Wardgate: remote command execution via a gateway-to-conclave channel (conclave-side exec agent); policies evaluate command intent.
- ThreeGate: local execution via constrained runtimes and strict Tool Request schemas; shell constructs are forbidden by default.
### 3.3 Prompt Injection Mitigation
- Wardgate: primarily reduces blast radius of “bad actions” by gating APIs/commands, but does not inherently enforce a multi-instance “browse vs reason” split.
- ThreeGate: explicitly splits browse/sanitize vs reason/synthesize and treats all fetched content as hostile data.
### 3.4 Secret Handling
- Wardgate: “agent never sees secrets” is a first-class principle via proxying.
- ThreeGate: currently focuses on network isolation and content sanitization; secrets are out of scope unless we add “actuator roles” later.
---
## 4. Wardgate Insights to Incorporate into ThreeGate
### 4.1 Dynamic Grants (Short-lived capability expansions)
Wardgate docs describe policy + approvals as core functions (and community discussions highlight human kill-switch semantics).
ThreeGate improvement:
- Add a “Dynamic Grant” artifact type that temporarily authorizes a narrowly scoped exception:
  - allow one redirect hop
  - allow one domain for N minutes
  - allow one tool template execution once
- Grants must be explicit, auditable, time-bounded, and revocable.
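
The grant properties listed above (narrow scope, time bound, single use, revocability) can be sketched as a small data structure. This is only an illustration under assumptions: the `DynamicGrant` class, its field names, and the scope string format are hypothetical and do not come from ThreeGate or Wardgate.

```python
from dataclasses import dataclass


@dataclass
class DynamicGrant:
    """Hypothetical grant artifact: one narrow exception, bounded in time and uses."""
    grant_id: str
    scope: str          # assumed format, e.g. "fetch:domain:example.org"
    issued_at: float    # seconds since the epoch
    ttl_seconds: float
    max_uses: int = 1
    uses: int = 0
    revoked: bool = False

    def is_valid(self, now: float) -> bool:
        # Valid only if not revoked, not exhausted, and inside the time window.
        return (
            not self.revoked
            and self.uses < self.max_uses
            and self.issued_at <= now < self.issued_at + self.ttl_seconds
        )

    def consume(self, scope: str, now: float) -> bool:
        """Record one use and return True if the grant covers `scope` at `now`."""
        if scope != self.scope or not self.is_valid(now):
            return False
        self.uses += 1
        return True
```

Passing `now` explicitly keeps the check deterministic and easy to audit; a real implementation would also need to log every `consume` call, which this sketch leaves out.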
### 4.2 Ask/Approve as a first-class action across lanes
Wardgate includes approval workflows.
ThreeGate improvement:
- Extend approval gates beyond TOOL-EXEC to:
  - FETCH exceptions (redirect hops, content-type exceptions)
  - Output release gates (suspected sensitive content in tool output)
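
One way to picture approvals spanning lanes is a single allow/deny/ask decision function with a default of deny. The sketch below is an assumption for illustration: the `RULES` table, the event names, and the `approver` callback are hypothetical, not an existing ThreeGate interface.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"


# Hypothetical rule table keyed by (lane, event). Anything not listed is denied.
RULES = {
    ("TOOL-EXEC", "execute"): Decision.ASK,
    ("FETCH", "allowlisted_get"): Decision.ALLOW,
    ("FETCH", "redirect_hop"): Decision.ASK,
    ("FETCH", "content_type_exception"): Decision.ASK,
    ("OUTPUT", "suspected_sensitive"): Decision.ASK,
}


def decide(lane, event):
    """Look up the policy decision, falling back to DENY for unknown events."""
    return RULES.get((lane, event), Decision.DENY)


def gate(lane, event, approver):
    """Return True if the action may proceed; ASK defers to a human approver callback."""
    decision = decide(lane, event)
    if decision is Decision.ASK:
        return bool(approver(lane, event))
    return decision is Decision.ALLOW
```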
### 4.3 Output caps and local allowlists (defense-in-depth)
Wardgate emphasizes gating and filtering, alongside auditing and approvals.
ThreeGate improvement:
- Standardize resource and output caps across Monty/WASM/ERA:
  - max stdout/stderr bytes
  - max number of output files
  - max total output bytes
- Add allowlisted tool manifests (e.g., wasm tool manifest) rather than “any module under a directory”.
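
The caps and the manifest check above could share one backend-independent helper. The following is a minimal sketch under assumptions: `OutputCaps`, `check_output`, and `is_allowlisted` are hypothetical names, and the manifest is assumed to map a module name to the SHA-256 hex digest of its reviewed contents.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputCaps:
    max_stream_bytes: int   # applied to stdout and stderr separately
    max_files: int
    max_total_bytes: int


def check_output(caps, stdout, stderr, files):
    """Return a list of cap violations; an empty list means the output is within caps.

    `files` maps an output file name to its bytes.
    """
    violations = []
    if len(stdout) > caps.max_stream_bytes:
        violations.append("stdout exceeds cap")
    if len(stderr) > caps.max_stream_bytes:
        violations.append("stderr exceeds cap")
    if len(files) > caps.max_files:
        violations.append("too many output files")
    total = len(stdout) + len(stderr) + sum(len(b) for b in files.values())
    if total > caps.max_total_bytes:
        violations.append("total output exceeds cap")
    return violations


def is_allowlisted(manifest, name, module_bytes):
    """True only if `name` is in the manifest and its SHA-256 digest matches."""
    expected = manifest.get(name)
    if expected is None:
        return False
    actual = hashlib.sha256(module_bytes).hexdigest()
    return hmac.compare_digest(actual, expected)
```

Pinning digests means that dropping a new file into the tool directory does not make it runnable; the operator has to update the manifest.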
### 4.4 Tool Templates instead of arbitrary commands
Wardgate positions “templates” as a safer way to expose actions (agent provides args; system owns the template).
ThreeGate improvement:
- Introduce Tool Templates:
  - operator-defined, reviewed, versioned
  - structured arguments (typed JSON), not shell lines
  - backend chosen by template (monty/wasm/era)
- This reduces both validator friction and parsing ambiguity.
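
A template in this sense might look like the sketch below, where the operator owns the argv shape and the agent supplies only typed values. The `ToolTemplate` class and its placeholder syntax are assumptions made for illustration; ThreeGate's actual Tool Request schema is not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolTemplate:
    """Hypothetical operator-owned template; the agent supplies only typed args."""
    name: str
    version: str
    backend: str    # "monty", "wasm", or "era"
    argv: tuple     # fixed argv; entries like "{path}" are placeholders
    params: dict    # parameter name -> required Python type

    def render(self, args):
        """Validate `args` against `params` and return an argv list (no shell string)."""
        if set(args) != set(self.params):
            raise ValueError("arguments do not match template parameters")
        for key, expected in self.params.items():
            # Exact type match, so that True is not accepted where an int is required.
            if type(args[key]) is not expected:
                raise TypeError(f"{key} must be {expected.__name__}")
        rendered = []
        for part in self.argv:
            if part.startswith("{") and part.endswith("}"):
                rendered.append(str(args[part[1:-1]]))
            else:
                rendered.append(part)
        return rendered
```

Because each value occupies exactly one argv slot and no shell is involved, a value such as `a.txt; rm -rf /` stays a single, odd file name. Values that begin with `-` can still be read as options by the target program, so a template may need per-parameter validation on top of type checks.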
### 4.5 Sensitive data filtering is defense-in-depth, not primary control
Wardgate includes “sensitive data filtering”.
However, filtering is brittle: encoded values, partial secrets, and secrets embedded in structured data can slip past pattern matching.
ThreeGate improvement:
- Add content redaction as an *optional* output filter for:
  - tool outputs released to CORE
  - fetched packets released to CORE
- Keep primary controls as: separation-of-duties + allowlists + explicit approvals.
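
The brittleness noted above is easy to demonstrate with a pattern-based redactor. The sketch below is illustrative only: the two patterns are assumptions (one matches the well-known AWS access key ID shape, the other a generic `name = value` form), and neither reflects Wardgate's actual filter rules.

```python
import base64
import re

# Hypothetical patterns for illustration; a real deployment would maintain its own list.
PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"(?i)\b(api[_-]?key|token|password)\b\s*[:=]\s*\S+"),
]


def redact(text):
    """Replace every pattern match with a marker; return (redacted_text, match_count)."""
    count = 0
    for pattern in PATTERNS:
        text, n = pattern.subn("[REDACTED]", text)
        count += n
    return text, count


# A plain key is caught, but the same key base64-encoded passes through untouched.
plain = "key AKIAIOSFODNN7EXAMPLE here"
encoded = base64.b64encode(b"AKIAIOSFODNN7EXAMPLE").decode()
plain_result = redact(plain)
encoded_result = redact(encoded)
```

Here `plain_result` reports one redaction while `encoded_result` reports none, which is why redaction belongs behind the primary controls rather than in place of them.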
---
## 5. Risks / Vulnerabilities Still Present in Wardgate-style designs
### 5.1 Gateway compromise becomes catastrophic
Wardgate concentrates credentials and enforcement in the gateway, a role its docs emphasize.
A compromised gateway is therefore a high-impact failure: it exposes every credential the gateway holds and removes the enforcement point at the same time.
ThreeGate lesson:
- minimize secret-bearing components
- isolate secret store from orchestrator
- prefer short-lived tokens and per-capability keys
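
Short-lived, per-capability tokens can be sketched with an HMAC over the capability name and an expiry time. This is a minimal illustration, assuming one signing key per capability held by the secret store; `issue_token`, `verify_token`, and the `capability|expiry|signature` layout are hypothetical.

```python
import hashlib
import hmac


def issue_token(key, capability, expires_at):
    """Sign "capability|expires_at" with this capability's key."""
    message = f"{capability}|{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{capability}|{expires_at}|{signature}"


def verify_token(key, token, capability, now):
    """Accept only an unexpired token for `capability` signed with `key`."""
    try:
        cap, exp, signature = token.rsplit("|", 2)
        expires_at = int(exp)
    except ValueError:
        return False
    expected = hmac.new(key, f"{cap}|{exp}".encode(), hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(signature.encode(), expected.encode())
        and cap == capability
        and now < expires_at
    )
```

With a separate key per capability, leaking one key or one token limits the damage to that capability and to the remaining lifetime of the token.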
### 5.2 Policy complexity (regex/wildcards) can create bypasses
Policy engines are only as safe as their least restrictive rule.
ThreeGate lesson:
- “policy linting” + unit tests for denies
- prefer templates/typed args over regex matching on raw command strings
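
Both lessons can be exercised with very little code: a lint pass over allow patterns and a regression list of commands that must stay denied. The sketch assumes a policy engine that matches regexes against raw command strings; `lint_allow_pattern`, `is_allowed`, and `failing_denies` are hypothetical helpers, and the lint rules are simple heuristics rather than a complete check.

```python
import re


def lint_allow_pattern(pattern):
    """Flag allow-rule regexes that commonly admit more than intended (heuristic)."""
    findings = []
    if not pattern.startswith("^") or not pattern.endswith("$"):
        findings.append("pattern is not anchored at both ends")
    if ".*" in pattern or ".+" in pattern:
        findings.append("pattern contains an unbounded wildcard")
    return findings


def is_allowed(command, allow_patterns):
    """Assumed engine behavior: a command is allowed if any pattern matches anywhere."""
    return any(re.search(p, command) for p in allow_patterns)


def failing_denies(allow_patterns, must_deny):
    """Return the commands from `must_deny` that the policy would wrongly allow."""
    return [c for c in must_deny if is_allowed(c, allow_patterns)]


MUST_DENY = ["ls -l; curl http://evil.example"]
loose = [r"ls .*"]
strict = [r"^ls( -[al]+)?$"]
```

The loose rule fails the deny test because `.*` swallows the chained command, while the anchored rule passes. Note that `$` in Python also matches just before a trailing newline, which is one more reason to prefer typed arguments over string matching.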
### 5.3 Command parsing mismatch risk (shell semantics)
Any design that attempts to parse shell/pipelines risks mismatch between parser and actual shell semantics.
ThreeGate posture:
- keep “no shell metacharacters” default
- allow pipelines only via templates that execute without a shell
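
A pipeline can be run without any shell by executing each stage from an argv list and passing bytes between stages. The sketch below uses Python's standard `subprocess.run`; the metacharacter set and the helper names are assumptions, and the stages run one after another rather than streaming concurrently.

```python
import subprocess

# Hypothetical default-deny character set for raw arguments.
SHELL_METACHARS = set(";|&$`<>(){}[]*?!~\\\"'\n")


def has_shell_metachars(arg):
    """True if `arg` contains any character the default policy rejects."""
    return any(ch in SHELL_METACHARS for ch in arg)


def run_pipeline(stages, input_bytes=b"", timeout=10):
    """Run argv lists in order, feeding each stage's stdout to the next stage's stdin.

    No shell is involved: every stage is executed directly from its argv list,
    so characters such as ';' or '|' inside an argument have no special meaning.
    """
    data = input_bytes
    for argv in stages:
        result = subprocess.run(
            argv,
            input=data,
            stdout=subprocess.PIPE,
            check=True,       # raise CalledProcessError on a non-zero exit status
            timeout=timeout,  # raise TimeoutExpired if a stage runs too long
        )
        data = result.stdout
    return data
```

Buffering each stage's full output keeps the sketch short but holds it all in memory, so output caps should be applied per stage in practice.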
### 5.4 Long-lived exec channels expand attack surface
Conclave channels are operationally convenient but require strong authn, rotation, and output controls.
ThreeGate lesson:
- if adding “remote conclave execution,” use mTLS, short-lived credentials, and strict output caps.
---
## 6. Conclusion
Wardgate's strongest transferable ideas for ThreeGate are:
- dynamic grants
- ask/approval as a first-class control
- tool templates (typed args)
- consistent output caps and allowlist manifests
Wardgate is complementary to ThreeGate:
- ThreeGate hardens cognition/data flow (prompt injection resistance)
- Wardgate hardens actuators/secrets (credential and side-effect isolation)