# Threat Model

This document defines the threat model for ThreeGate, including assets, adversaries, attack surfaces, mitigations, and explicit out-of-scope threats.

ThreeGate is designed for **single-user local operation** and prioritizes structural containment over behavioral promises.

---

## 1. Assets to Protect

### Primary Assets
- **User data**: notes, drafts, PDFs, research corpora, local documents
- **Secrets**: API keys, tokens, credentials, SSH keys, cookies
- **System integrity**: host OS, container images, configs, policy files
- **Assistant integrity**: component separation, network isolation, validation pipelines
- **Provenance**: citations, source traces, execution logs (auditability)

### Secondary Assets
- Model weights and caches (integrity and confidentiality)
- Execution results and intermediate artifacts
- System availability (denial of service is relevant but not primary)

---

## 2. Adversaries and Capabilities

### A. Malicious Content Provider
- Controls a webpage, PDF, or document that FETCH retrieves or the user ingests
- Attempts **indirect prompt injection** to cause unsafe actions

Capabilities:
- Embed malicious instructions and deceptive content
- Craft content to manipulate citations and reasoning
- Provide poisoned research artifacts

### B. Malicious User (or User Mistake)
- Provides prompts that request unsafe actions
- Pastes untrusted code for execution
- Misconfigures allowlists or mounts

Capabilities:
- Trigger tool requests
- Place files into ingestion directories
- Approve execution unintentionally

### C. Supply-Chain Attacker
- Tampers with container images, dependencies, the ERA binary, or model weights

Capabilities:
- Replace artifacts at build or update time
- Introduce malicious binaries or scripts

### D. Network Attacker
- Attempts MITM, DNS poisoning, or proxy abuse
- Tries to induce exfiltration through allowed domains

Capabilities:
- Manipulate network paths
- Exploit weak TLS validation or DNS configuration

---

## 3. Security Goals

### G1: Prevent Untrusted Content from Triggering Action
Untrusted documents must not cause execution, installation, persistence, or exfiltration.

### G2: Minimize Blast Radius of Compromise
A compromise of any single component must not yield end-to-end authority.

### G3: Preserve Auditability
Key actions must be attributable, logged, and reviewable:
- Fetch operations and sources
- Packets accepted vs quarantined
- Execution requests and approvals
- Execution results and metadata
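
One way to make these actions reviewable is an append-only log in which each record carries the hash of the previous one, so that later edits to the log become detectable. The sketch below is illustrative only: the event names, the record fields, and the hash-chaining scheme are assumptions, not ThreeGate's actual log format.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical event names mirroring the four bullet points above.
EVENT_TYPES = {
    "fetch", "packet_accepted", "packet_quarantined",
    "exec_requested", "exec_approved", "exec_result",
}


def _digest(fields):
    """Hash a record's fields in a stable (sorted-key) JSON encoding."""
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_audit_record(event, component, detail, prev_hash=""):
    """Build one audit record linked to the previous record's hash."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown audit event: {event!r}")
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "component": component,  # e.g. "CORE", "FETCH", "TOOL-EXEC"
        "detail": detail,        # must be JSON-serializable
        "prev_hash": prev_hash,
    }
    record["hash"] = _digest(record)
    return record


def verify_chain(records):
    """Return True if every record's hash and back-link are intact."""
    prev_hash = ""
    for record in records:
        fields = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash:
            return False
        if _digest(fields) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True
```

A reviewer could run `verify_chain` over the stored records before trusting them. The chain only detects tampering; it does not prevent an attacker with write access from truncating or replacing the whole log.
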

### G4: Enforce Least Privilege by Construction
Topology and filesystem permissions must ensure least privilege even if the model misbehaves.

---

## 4. Attack Surfaces

### CORE
- Prompt injection via Research Packets and local documents
- Attempts to coerce policy violations (“ignore rules”, “run commands”, etc.)
- Attempts to encode tool requests to bypass human review

### FETCH
- Malicious websites attempting instruction injection
- Response content masquerading as policy, commands, or credentials
- Proxy bypass attempts, domain confusion attacks

### TOOL-EXEC
- Malicious code in execution requests (intended or unintended)
- Attempted sandbox escape (microVM/container breakout)
- Attempts to write unexpected outputs or encode exfiltration payloads

### Shared
- Handoff directories (malformed artifacts, schema bypass)
- Proxy allowlist and DNS resolution
- Container runtime configuration drift

---

## 5. Key Mitigations (Mapped to Threats)

### M1: Compartmentalization (CORE/FETCH/TOOL-EXEC)
Mitigates end-to-end compromise by ensuring no single component:
- both browses and executes
- both reasons and acts
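
The separation rule above can be stated as a check over a capability table. This is a minimal sketch: the capability names are hypothetical, and treating "acts" as the `execute` capability is an interpretation rather than something this document defines.

```python
# Hypothetical capability names; the component names come from this document.
CAPABILITIES = {
    "CORE": {"reason"},
    "FETCH": {"browse"},
    "TOOL-EXEC": {"execute"},
}

# Capability pairs that no single component may hold together.
# Assumption: "acts" is modeled here as the "execute" capability.
FORBIDDEN_PAIRS = [
    {"browse", "execute"},
    {"reason", "execute"},
]


def separation_violations(capabilities):
    """Return (component, pair) for every forbidden combination found."""
    violations = []
    for component, granted in capabilities.items():
        for pair in FORBIDDEN_PAIRS:
            if pair <= granted:  # subset test: component holds both
                violations.append((component, tuple(sorted(pair))))
    return violations
```

A check like `assert not separation_violations(CAPABILITIES)` could run in CI against the deployed configuration, so that a change granting CORE an execution path fails loudly instead of drifting in unnoticed.
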

### M2: Network Topology Enforcement
- CORE has no internet route
- FETCH reaches the internet only through an allowlisted proxy
- TOOL-EXEC has no network access by default

Mitigates exfiltration and unauthorized retrieval.
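
The allowlist decision itself can be sketched in a few lines. The example below assumes exact hostname matching over HTTPS only, which is one conservative policy among several; the hosts in `ALLOWED_HOSTS` are placeholders, and in practice the proxy, not application code, would be the enforcement point.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; real entries would come from the operator's proxy config.
ALLOWED_HOSTS = {"example.org", "docs.example.org"}


def is_fetch_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Allow only HTTPS URLs whose hostname exactly matches an allowlist entry."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # urlsplit lowercases the hostname and separates userinfo and port,
    # which defeats tricks such as "https://example.org@evil.example/".
    host = (parts.hostname or "").rstrip(".")
    return host in allowed_hosts
```

Exact matching avoids the domain confusion cases listed under the FETCH attack surface, such as `example.org.evil.example`, at the cost of having to list every subdomain explicitly.
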

### M3: Deterministic Validation + Quarantine
- Research Packets must match a strict schema
- Tool results must match a strict schema
- Rejections go to quarantine; CORE never consumes them

Mitigates indirect injection and “format smuggling.”
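
A minimal sketch of the validate-then-route step follows. The field names in `PACKET_SCHEMA` and the directory layout are assumptions made for illustration; the actual Research Packet schema is not specified in this document.

```python
import json
import shutil
from pathlib import Path

# Hypothetical Research Packet schema: field name -> required Python type.
PACKET_SCHEMA = {
    "source_url": str,
    "retrieved_at": str,
    "summary": str,
    "citations": list,
}


def validate_packet(packet):
    """Return a list of schema errors; an empty list means the packet is valid."""
    if not isinstance(packet, dict):
        return ["packet is not a JSON object"]
    errors = []
    for field, expected in PACKET_SCHEMA.items():
        if field not in packet:
            errors.append(f"missing field: {field}")
        elif not isinstance(packet[field], expected):
            errors.append(f"wrong type for field: {field}")
    # Strict: unknown fields are rejected rather than silently ignored.
    for field in sorted(set(packet) - set(PACKET_SCHEMA)):
        errors.append(f"unexpected field: {field}")
    return errors


def route_packet(path, accepted_dir, quarantine_dir):
    """Move a packet file to accepted_dir if valid, else to quarantine_dir."""
    path = Path(path)
    try:
        errors = validate_packet(json.loads(path.read_text(encoding="utf-8")))
    except ValueError as exc:  # covers JSON and Unicode decoding errors
        errors = [f"unparseable packet: {exc}"]
    target_dir = Path(quarantine_dir if errors else accepted_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(target_dir / path.name))
    return not errors, errors
```

Because the check is plain code with no model in the loop, the same input always yields the same decision. CORE would be given read access to the accepted directory only, so a quarantined packet never reaches it.
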

### M4: Human Approval Gate for Execution
- CORE may draft requests, but cannot execute
- Human must promote execution requests into TOOL-EXEC
- Every execution is logged

Mitigates automated tool abuse.
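
The promotion step can be sketched as a function that copies a drafted request into a TOOL-EXEC inbox only after a human confirms it, and records the decision either way. The function name, the inbox directory, and the log record fields are hypothetical, chosen only to illustrate the gate.

```python
import shutil
from pathlib import Path


def promote_request(request_path, exec_inbox, approved_by, confirm, audit_log):
    """Copy a drafted request into the TOOL-EXEC inbox after explicit approval.

    `confirm` is a callable that shows the request text to a human and
    returns True only on an explicit yes. `audit_log` is any list-like sink.
    """
    request_path = Path(request_path)
    if not approved_by:
        raise PermissionError("an identified human approver is required")
    request_text = request_path.read_text(encoding="utf-8")
    # Anything other than a literal True counts as a rejection.
    approved = confirm(request_text) is True
    audit_log.append({
        "request": request_path.name,
        "approver": approved_by,
        "decision": "approved" if approved else "rejected",
    })
    if not approved:
        return None  # Rejected: nothing reaches TOOL-EXEC.
    inbox = Path(exec_inbox)
    inbox.mkdir(parents=True, exist_ok=True)
    promoted = inbox / request_path.name
    shutil.copy2(request_path, promoted)
    return promoted
```

For interactive use, `confirm` could be something like `lambda text: input(text + "\nRun this? [y/N] ").strip().lower() == "y"`. The gate is only structural if CORE has no write access to the inbox; otherwise the function is a convention that CORE could bypass.
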

### M5: Read-Only Policy Mounts and Immutable Configuration
- Policy files mounted read-only into containers
- Configuration changes require explicit operator action

Mitigates self-modification and persistence via prompt.

### M6: Supply-Chain Hygiene (recommended)
- Pin image digests
- Verify releases (hash/signature where possible)
- Keep minimal base images
- Prefer reproducible builds

Mitigates tampered artifacts.
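
Hash verification against a pinned digest is the simplest of these checks to automate. The sketch below assumes the operator has recorded a known-good SHA-256 value out of band; the function names are illustrative, and signature verification, where available, would be a stronger check than a bare hash.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large artifacts (e.g. model weights) fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, pinned_sha256):
    """Return True only if the file's digest equals the pinned value."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, pinned_sha256.strip().lower())
```
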

---

## 6. Explicit Out-of-Scope Threats

ThreeGate does not attempt to mitigate:
- Hardware fault induction (e.g., RowHammer)
- Microarchitectural side channels
- Kernel/firmware compromise
- Hostile multi-tenant co-residency scenarios

These threats fall outside the single-user, local operating assumptions that ThreeGate is designed for.

---

## 7. Residual Risks

Even with compartmentalization, residual risks include:
- User approving unsafe execution requests
- Allowlist misconfiguration enabling exfiltration channels
- Supply-chain compromise of container images or binaries
- Weak local host hygiene (unpatched kernel, insecure Docker daemon)

ThreeGate reduces the consequences of these failures but cannot replace operator diligence.

---

## 8. Security Posture Summary

ThreeGate assumes model fallibility and focuses on:
- strict separation of duties
- deterministic validation
- constrained connectivity
- human-gated execution
- auditable workflows