Threat Model
This document defines the threat model for ThreeGate, including assets, adversaries, attack surfaces, mitigations, and explicit out-of-scope threats.
ThreeGate is designed for single-user local operation and prioritizes structural containment over behavioral promises.
1. Assets to Protect
Primary Assets
- User data: notes, drafts, PDFs, research corpora, local documents
- Secrets: API keys, tokens, credentials, SSH keys, cookies
- System integrity: host OS, container images, configs, policy files
- Assistant integrity: component separation, network isolation, validation pipelines
- Provenance: citations, source traces, execution logs (auditability)
Secondary Assets
- Model weights and caches (integrity and confidentiality)
- Execution results and intermediate artifacts
- System availability (denial of service is relevant but not primary)
2. Adversaries and Capabilities
A. Malicious Content Provider
- Controls a webpage, PDF, or document that FETCH retrieves or that the user ingests
- Attempts indirect prompt injection to cause unsafe actions
Capabilities:
- Embed malicious instructions and deceptive content
- Craft content to manipulate citations and reasoning
- Provide poisoned research artifacts
B. Malicious User (or User Mistake)
- Provides prompts that request unsafe actions
- Pastes untrusted code for execution
- Misconfigures allowlists or mounts
Capabilities:
- Trigger tool requests
- Place files into ingestion directories
- Approve execution unintentionally
C. Supply-Chain Attacker
- Tampers with container images, dependencies, the ERA binary, or model weights
Capabilities:
- Replace artifacts at build or update time
- Introduce malicious binaries or scripts
D. Network Attacker
- Attempts MITM, DNS poisoning, or proxy abuse
- Tries to induce exfiltration through allowed domains
Capabilities:
- Manipulate network paths
- Exploit weak TLS validation or DNS configuration
3. Security Goals
G1: Prevent Untrusted Content from Triggering Action
Untrusted documents must not cause execution, installation, persistence, or exfiltration.
G2: Minimize Blast Radius of Compromise
A compromise of any single component must not yield end-to-end authority.
G3: Preserve Auditability
Key actions must be attributable, logged, and reviewable:
- Fetch operations and sources
- Packets accepted vs quarantined
- Execution requests and approvals
- Execution results and metadata
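One way to make these actions reviewable is an append-only JSON Lines log. The sketch below is illustrative only: the event names, the record fields, and the `append_audit_event` helper are assumptions chosen to mirror the list above, not ThreeGate's actual log format.

```python
import json
import time
from pathlib import Path

# Hypothetical event names mirroring the four categories listed above.
EVENT_TYPES = {
    "fetch", "packet_accepted", "packet_quarantined",
    "exec_requested", "exec_approved", "exec_result",
}

def append_audit_event(log_path: Path, event_type: str, actor: str, details: dict) -> dict:
    """Append one audit record as a single JSON line and return the record."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    record = {
        "ts": time.time(),      # seconds since the epoch
        "event": event_type,
        "actor": actor,         # e.g. "FETCH", "CORE", or an operator name
        "details": details,
    }
    # Mode "a" only ever appends; existing records are never rewritten.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

Keeping one record per line means the log can be reviewed with ordinary text tools and a partially written record affects only the final line.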
G4: Enforce Least Privilege by Construction
Topology and filesystem permissions must ensure least privilege even if the model misbehaves.
4. Attack Surfaces
CORE
- Prompt injection via Research Packets and local documents
- Attempts to coerce policy violations (“ignore rules”, “run commands”, etc.)
- Attempts to encode tool requests to bypass human review
FETCH
- Malicious websites attempting instruction injection
- Response content masquerading as policy, commands, or credentials
- Proxy bypass attempts, domain confusion attacks
TOOL-EXEC
- Malicious code in execution requests (intended or unintended)
- Attempted sandbox escape (microVM/container breakout)
- Attempts to write unexpected outputs or encode exfiltration payloads
Shared
- Handoff directories (malformed artifacts, schema bypass)
- Proxy allowlist and DNS resolution
- Container runtime configuration drift
5. Key Mitigations (Mapped to Threats)
M1: Compartmentalization (CORE/FETCH/TOOL-EXEC)
Mitigates end-to-end compromise by ensuring no single component:
- both browses and executes
- both reasons and acts
M2: Network Topology Enforcement
- CORE has no internet route
- FETCH reaches the internet only via an allowlisted proxy
- TOOL-EXEC has no network access by default
Mitigates exfiltration and unauthorized retrieval.
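The allowlist decision itself is small enough to sketch. The example below assumes an exact-match host allowlist and HTTPS-only fetching; the host names and the `is_fetch_allowed` function are hypothetical and do not describe ThreeGate's actual proxy configuration. Exact matching on the parsed hostname is one way to resist the domain confusion attacks listed under FETCH.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy.
ALLOWED_HOSTS = frozenset({"example.org", "docs.example.org"})

def is_fetch_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS URLs on the default port whose host is allowlisted."""
    try:
        parts = urlsplit(url)
        port = parts.port          # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    if port not in (None, 443):
        return False
    host = parts.hostname          # lowercased; userinfo and port removed
    if host is None:
        return False
    # Exact match only: no suffix or substring matching.
    return host.rstrip(".") in allowed_hosts
```

Because the comparison uses the parsed hostname rather than the raw string, look-alike URLs such as `https://example.org.evil.test/` or `https://example.org@evil.test/` are rejected.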
M3: Deterministic Validation + Quarantine
- Research Packets must match strict schema
- Tool results must match strict schema
- Rejections go to quarantine; CORE never consumes them
Mitigates indirect injection and “format smuggling.”
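The validate-then-route step can be sketched as follows. The Research Packet schema is not specified in this document, so the field names in `PACKET_SCHEMA`, the directory layout, and both function names are assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path

# Hypothetical schema: field name -> required Python type.
PACKET_SCHEMA = {
    "source_url": str,
    "retrieved_at": str,
    "summary": str,
    "citations": list,
}

def validate_packet(raw: bytes) -> list[str]:
    """Return a list of violations; an empty list means the packet is valid."""
    try:
        packet = json.loads(raw)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return ["not valid JSON"]
    if not isinstance(packet, dict):
        return ["top-level value must be an object"]
    errors = []
    for key in sorted(set(PACKET_SCHEMA) - set(packet)):
        errors.append(f"missing field: {key}")
    for key in sorted(set(packet) - set(PACKET_SCHEMA)):
        errors.append(f"unexpected field: {key}")  # strict: no extras allowed
    for key, expected in PACKET_SCHEMA.items():
        if key in packet and not isinstance(packet[key], expected):
            errors.append(f"wrong type for {key}")
    return errors

def route_packet(path: Path, accepted_dir: Path, quarantine_dir: Path) -> Path:
    """Move a packet file to accepted_dir if valid, otherwise to quarantine_dir."""
    errors = validate_packet(path.read_bytes())
    target_dir = quarantine_dir if errors else accepted_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(path), str(target_dir / path.name)))
```

The check is deterministic: the same bytes always produce the same verdict, and no model output takes part in the decision. In this sketch CORE would read only from the accepted directory.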
M4: Human Approval Gate for Execution
- CORE may draft requests, but cannot execute
- A human must promote execution requests into TOOL-EXEC
- Every execution is logged
Mitigates automated tool abuse.
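A minimal in-memory sketch of such a gate is shown below. The `ExecutionGate` class, the digest-prefix confirmation step, and the log tuple layout are all assumptions for illustration; the document does not state how promotion is actually implemented.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class ExecutionGate:
    pending: dict = field(default_factory=dict)    # digest -> request text
    approved: list = field(default_factory=list)   # requests promoted to TOOL-EXEC
    log: list = field(default_factory=list)        # every decision is recorded

    def draft(self, request_text: str) -> str:
        """Stage a request on CORE's behalf. Nothing is executed here."""
        digest = hashlib.sha256(request_text.encode("utf-8")).hexdigest()
        self.pending[digest] = request_text
        self.log.append(("drafted", digest))
        return digest

    def promote(self, digest: str, operator: str, confirmation: str) -> str:
        """Promote a request only if the operator retypes its digest prefix."""
        if digest not in self.pending:
            raise KeyError("no such pending request")
        if confirmation != digest[:8]:
            self.log.append(("rejected", digest, operator))
            raise PermissionError("confirmation does not match request digest")
        request_text = self.pending.pop(digest)
        self.approved.append(request_text)
        self.log.append(("approved", digest, operator))
        return request_text
```

Binding the approval to a digest of the exact request text means an approval cannot be reused for a request that was altered after review.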
M5: Read-Only Policy Mounts and Immutable Configuration
- Policy files mounted read-only into containers
- Configuration changes require explicit operator action
Mitigates prompt-driven self-modification and persistence.
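A configuration audit can catch a policy mount that was accidentally left writable. The sketch below assumes bind mounts written in the `source:target[:options]` short form used by Docker's `-v` flag, POSIX paths without embedded colons, and a hypothetical `/policy` target prefix; none of these details come from ThreeGate itself.

```python
def mount_is_read_only(spec: str) -> bool:
    """True if a 'source:target[:options]' mount spec carries the 'ro' option."""
    parts = spec.split(":")
    if len(parts) != 3:
        return False  # no options field: treat as read-write
    return "ro" in parts[2].split(",")

def writable_policy_mounts(specs, policy_prefix="/policy"):
    """Return the mount specs that target the policy path but are not read-only."""
    bad = []
    for spec in specs:
        parts = spec.split(":")
        targets_policy = len(parts) >= 2 and parts[1].startswith(policy_prefix)
        if targets_policy and not mount_is_read_only(spec):
            bad.append(spec)
    return bad
```

Running a check like this before startup turns a silent misconfiguration into an explicit failure that the operator must resolve.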
M6: Supply-Chain Hygiene (recommended)
- Pin image digests
- Verify releases (hash/signature where possible)
- Keep minimal base images
- Prefer reproducible builds
Mitigates tampered artifacts.
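Two of these checks are easy to automate: confirming that an image reference is pinned by digest, and confirming that a downloaded artifact matches a published SHA-256 hash. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the image-reference pattern is deliberately simplified rather than a full reference parser.

```python
import hashlib
import re
from pathlib import Path

# Simplified pattern: anything without '@' or whitespace, then '@sha256:<64 hex>'.
_DIGEST_PINNED = re.compile(r"[^@\s]+@sha256:[0-9a-f]{64}")

def is_digest_pinned(image_ref: str) -> bool:
    """True if the image reference ends in an explicit sha256 digest."""
    return _DIGEST_PINNED.fullmatch(image_ref) is not None

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Compare the file's SHA-256 with the expected hex digest."""
    return sha256_file(path) == expected_sha256.strip().lower()
```

A tag such as `alpine:latest` can be repointed to a different image at any time, whereas a digest identifies one specific image. A hash check is only as trustworthy as the channel that delivered the expected value, which is why signature verification is preferable where it is available.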
6. Explicit Out-of-Scope Threats
ThreeGate does not attempt to mitigate:
- Hardware fault induction (e.g., RowHammer)
- Microarchitectural side channels
- Kernel/firmware compromise
- Hostile multi-tenant co-residency scenarios
These threats fall outside the single-user, local operating assumptions that ThreeGate is designed for.
7. Residual Risks
Even with compartmentalization, residual risks include:
- User approving unsafe execution requests
- Allowlist misconfiguration enabling exfiltration channels
- Supply-chain compromise of container images or binaries
- Weak local host hygiene (unpatched kernel, insecure Docker daemon)
ThreeGate reduces consequences, but cannot replace operator diligence.
8. Security Posture Summary
ThreeGate assumes model fallibility and focuses on:
- strict separation of duties
- deterministic validation
- constrained connectivity
- human-gated execution
- auditable workflows