ThreeGate/docs/threat-model.md

# Threat Model

This document defines the threat model for ThreeGate, including assets, adversaries, attack surfaces, mitigations, and explicit out-of-scope threats.

ThreeGate is designed for **single-user local operation** and prioritizes structural containment over behavioral promises.

---

## 1. Assets to Protect

### Primary Assets
- **User data**: notes, drafts, PDFs, research corpora, local documents
- **Secrets**: API keys, tokens, credentials, SSH keys, cookies
- **System integrity**: host OS, container images, configs, policy files
- **Assistant integrity**: component separation, network isolation, validation pipelines
- **Provenance**: citations, source traces, execution logs (auditability)

### Secondary Assets
- Model weights and caches (integrity and confidentiality)
- Execution results and intermediate artifacts
- System availability (denial of service is relevant but not primary)

---

## 2. Adversaries and Capabilities

### A. Malicious Content Provider
- Controls a webpage, PDF, or document that FETCH retrieves or that the user ingests
- Attempts **indirect prompt injection** to cause unsafe actions

Capabilities:
- Embed malicious instructions and deceptive content
- Craft content to manipulate citations and reasoning
- Provide poisoned research artifacts

### B. Malicious User (or User Mistake)
- Provides prompts that request unsafe actions
- Pastes untrusted code for execution
- Misconfigures allowlists or mounts

Capabilities:
- Trigger tool requests
- Place files into ingestion directories
- Approve execution unintentionally

### C. Supply-Chain Attacker
- Tampers with container images, dependencies, the ERA binary, or model weights

Capabilities:
- Replace artifacts at build or update time
- Introduce malicious binaries or scripts

### D. Network Attacker
- Attempts MITM, DNS poisoning, or proxy abuse
- Tries to induce exfiltration through allowed domains

Capabilities:
- Manipulate network paths
- Exploit weak TLS validation or DNS configuration

---

## 3. Security Goals

### G1: Prevent Untrusted Content from Triggering Action
Untrusted documents must not cause execution, installation, persistence, or exfiltration.

### G2: Minimize Blast Radius of Compromise
A compromise of any single component must not yield end-to-end authority.

### G3: Preserve Auditability
Key actions must be attributable, logged, and reviewable:
- Fetch operations and sources
- Packets accepted vs quarantined
- Execution requests and approvals
- Execution results and metadata

### G4: Enforce Least Privilege by Construction
Topology and filesystem permissions must ensure least privilege even if the model misbehaves.

---

## 4. Attack Surfaces

### CORE
- Prompt injection via Research Packets and local documents
- Attempts to coerce policy violations (“ignore rules”, “run commands”, etc.)
- Attempts to encode tool requests to bypass human review

### FETCH
- Malicious websites attempting instruction injection
- Response content masquerading as policy, commands, or credentials
- Proxy bypass attempts, domain confusion attacks

### TOOL-EXEC
- Malicious code in execution requests (intended or unintended)
- Attempted sandbox escape (microVM/container breakout)
- Attempts to write unexpected outputs or encode exfiltration payloads

### Shared
- Handoff directories (malformed artifacts, schema bypass)
- Proxy allowlist and DNS resolution
- Container runtime configuration drift

---

## 5. Key Mitigations (Mapped to Threats)

### M1: Compartmentalization (CORE/FETCH/TOOL-EXEC)
Mitigates end-to-end compromise by ensuring no single component:
- both browses and executes
- both reasons and acts

### M2: Network Topology Enforcement
- CORE has no internet route
- FETCH reaches the internet only through an allowlisted proxy
- TOOL-EXEC has no network access by default

Mitigates exfiltration and unauthorized retrieval.
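
As an illustration of the allowlist idea, the check below is a minimal sketch of how a proxy-side filter might decide whether FETCH may retrieve a URL. The function name `is_allowed`, the `ALLOWED_HOSTS` entries, and the https-only rule are assumptions made for this example; they are not taken from ThreeGate's actual proxy configuration. The sketch uses exact host matching so that lookalike hosts (a domain confusion attack) are rejected.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; real entries would come from the operator's proxy config.
ALLOWED_HOSTS = frozenset({"arxiv.org", "export.arxiv.org"})


def is_allowed(url: str, allowed: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host exactly matches an allowlist entry."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").rstrip(".").lower()
        has_userinfo = parts.username is not None or parts.password is not None
    except ValueError:
        return False  # malformed URL: fail closed
    if parts.scheme != "https":
        return False  # assumption: plain http is never allowed
    if has_userinfo:
        return False  # rejects "https://arxiv.org@evil.example/" style tricks
    # Exact match only: "arxiv.org.evil.example" must not pass as "arxiv.org".
    return host in allowed
```

Exact matching is deliberately stricter than suffix or substring matching: a suffix rule that is written carelessly tends to admit hosts the operator never intended.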

### M3: Deterministic Validation + Quarantine
- Research Packets must match strict schema
- Tool results must match strict schema
- Rejections go to quarantine; CORE never consumes them

Mitigates indirect injection and “format smuggling.”
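
The validate-then-quarantine flow can be sketched as follows. This is a minimal illustration under stated assumptions: the field names in `PACKET_FIELDS`, the function names, and the directory layout are hypothetical and do not describe the real Research Packet schema. The point it shows is that the decision is deterministic (type and field checks only, no model involvement) and that anything failing validation is moved out of CORE's reach.

```python
import json
import shutil
from pathlib import Path

# Hypothetical schema: field name -> required type. Unknown fields are rejected.
PACKET_FIELDS = {
    "source_url": str,
    "retrieved_at": str,
    "summary": str,
    "citations": list,
}


def validate_packet(packet: object) -> list:
    """Return a list of schema violations; an empty list means the packet is valid."""
    if not isinstance(packet, dict):
        return ["packet is not a JSON object"]
    errors = [f"unexpected field: {name}" for name in sorted(set(packet) - set(PACKET_FIELDS))]
    for name, expected in PACKET_FIELDS.items():
        if name not in packet:
            errors.append(f"missing field: {name}")
        elif not isinstance(packet[name], expected):
            errors.append(f"wrong type for {name}")
    return errors


def route_packet(path: Path, accepted: Path, quarantine: Path) -> bool:
    """Move a packet file to `accepted` if valid, otherwise to `quarantine`."""
    try:
        errors = validate_packet(json.loads(path.read_text(encoding="utf-8")))
    except (OSError, ValueError) as exc:  # unreadable file or invalid JSON/encoding
        errors = [f"unreadable packet: {exc}"]
    dest = quarantine if errors else accepted
    dest.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(dest / path.name))
    return not errors
```

Rejecting unknown fields, rather than ignoring them, is what closes off "format smuggling": extra keys cannot ride along into CORE's input.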

### M4: Human Approval Gate for Execution
- CORE may draft requests, but cannot execute
- A human must promote execution requests into TOOL-EXEC
- Every execution is logged

Mitigates automated tool abuse.
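
One way to picture the promotion step is the sketch below. It assumes requests are files that an operator-run script moves from a drafts directory into a TOOL-EXEC inbox; the function `promote_request`, the `confirm` callback, the directory names, and the JSONL log format are all hypothetical. Logging declined requests as well as approved ones is a choice made for this example, not something the source specifies.

```python
import hashlib
import json
import shutil
import time
from pathlib import Path


def promote_request(request: Path, exec_inbox: Path, audit_log: Path, confirm) -> bool:
    """Move a drafted request into the TOOL-EXEC inbox only if the operator confirms.

    `confirm(text, sha256_hex)` is a hypothetical callback that shows the request
    to a human and returns True only on explicit approval.
    """
    body = request.read_bytes()
    digest = hashlib.sha256(body).hexdigest()
    approved = confirm(body.decode("utf-8", errors="replace"), digest) is True

    # Append one JSON line per decision so approvals and refusals are both reviewable.
    entry = {"ts": time.time(), "request": request.name, "sha256": digest, "approved": approved}
    with audit_log.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")

    if approved:
        exec_inbox.mkdir(parents=True, exist_ok=True)
        shutil.move(str(request), str(exec_inbox / request.name))
    return approved
```

Recording the SHA-256 of the request body ties each log entry to the exact bytes the human reviewed, which helps when auditing whether a request was altered after approval.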

### M5: Read-Only Policy Mounts and Immutable Configuration
- Policy files mounted read-only into containers
- Configuration changes require explicit operator action

Mitigates self-modification and persistence via prompt.

### M6: Supply-Chain Hygiene (recommended)
- Pin image digests
- Verify releases (hash/signature where possible)
- Keep minimal base images
- Prefer reproducible builds

Mitigates tampered artifacts.
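
For the "verify releases" item, a hash check can be as small as the sketch below. It assumes the operator has obtained the expected SHA-256 digest from a trusted channel; the function names are illustrative and are not part of ThreeGate. A hash check of this kind only detects a mismatch against the pinned value and does not replace signature verification where signatures are available.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest equals the pinned (hypothetical) value."""
    return sha256_file(path) == expected_sha256.strip().lower()
```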

---

## 6. Explicit Out-of-Scope Threats

ThreeGate does not attempt to mitigate:
- Hardware fault induction (e.g., RowHammer)
- Microarchitectural side channels
- Kernel/firmware compromise
- Hostile multi-tenant co-residency scenarios

These threats fall outside the single-user, local operating assumptions that ThreeGate is designed for.

---

## 7. Residual Risks

Even with compartmentalization, residual risks include:
- User approving unsafe execution requests
- Allowlist misconfiguration enabling exfiltration channels
- Supply-chain compromise of container images or binaries
- Weak local host hygiene (unpatched kernel, insecure Docker daemon)

ThreeGate reduces the consequences of these failures but cannot replace operator diligence.

---

## 8. Security Posture Summary

ThreeGate assumes model fallibility and focuses on:
- strict separation of duties
- deterministic validation
- constrained connectivity
- human-gated execution
- auditable workflows