Why ThreeGate Is Safer Than Agent-Based Systems

This document explains why the ThreeGate architecture materially reduces risk compared to common agent and tool-using AI frameworks.


The Core Problem with Agents

Most agent frameworks combine:

  • Untrusted input ingestion
  • Reasoning
  • Tool execution
  • Network access
  • Persistent state

…inside a single loop.

If prompt injection succeeds — and it eventually will — the model can immediately act with real-world authority.

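This is an instance of the confused deputy problem: a component that holds legitimate authority is tricked into exercising it on an attacker's behalf.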


ThreeGate's Structural Advantage

ThreeGate prevents confused deputies by separating authority.

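| Capability      | FETCH            | CORE    | TOOL-EXEC      |
|-----------------|------------------|---------|----------------|
| Internet access | Yes (restricted) | No      | No (default)   |
| Reasoning       | No               | Yes     | No             |
| Execution       | No               | No      | Yes (gated)    |
| Persistence     | Minimal          | Limited | None (default) |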

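No single component has enough authority to turn untrusted input into a harmful action on its own: FETCH cannot reason or execute, CORE cannot reach the network or run tools, and TOOL-EXEC cannot reason and, by default, cannot reach the network.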
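
The separation in the table can be written down as data and checked mechanically. The following is a minimal sketch, not ThreeGate's actual configuration format: the `Capabilities` class and the `check_separation` helper are hypothetical names, "restricted" and "gated" are flattened to plain booleans, and persistence is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    internet: bool
    reasoning: bool
    execution: bool


# Defaults copied from the capability table; "Yes (restricted)" and
# "Yes (gated)" are simplified to True in this sketch.
GATES = {
    "FETCH": Capabilities(internet=True, reasoning=False, execution=False),
    "CORE": Capabilities(internet=False, reasoning=True, execution=False),
    "TOOL-EXEC": Capabilities(internet=False, reasoning=False, execution=True),
}


def check_separation(gates):
    """Return the names of components that hold more than one capability."""
    return [
        name
        for name, caps in gates.items()
        if sum((caps.internet, caps.reasoning, caps.execution)) > 1
    ]


# A conventional single-loop agent, for comparison (hypothetical entry).
MONOLITH = {"AGENT": Capabilities(internet=True, reasoning=True, execution=True)}
```

With these definitions, `check_separation(GATES)` returns an empty list, while `check_separation(MONOLITH)` flags `"AGENT"`. A check like this could run at startup so that a configuration change that merges authority fails loudly rather than silently.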


Prompt Injection Is Assumed, Not Denied

ThreeGate assumes:

  • Prompt injection cannot be perfectly prevented
  • Indirect injection via documents and web pages is common
  • Behavioral safeguards alone are insufficient

Therefore:

  • All external content is treated as data, not instructions
  • Outputs are constrained and validated
  • Consequences are limited by topology
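
One way to picture the first two points is sketched below. It assumes, purely for illustration, that the reasoning component emits a small JSON object; the field names, the `ALLOWED_ACTIONS` set, and both helper functions are hypothetical and are not taken from ThreeGate itself.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UntrustedDocument:
    """Fetched content, carried as inert data together with its origin."""
    source: str
    text: str


def build_core_input(task, doc):
    """Keep the operator's task and the fetched text in separate fields."""
    return {
        "task": task,
        "documents": [{"source": doc.source, "text": doc.text}],
    }


# Hypothetical set of actions the reasoning component may propose.
ALLOWED_ACTIONS = frozenset({"summarize", "answer", "request_fetch"})


def validate_core_output(raw):
    """Accept only a JSON object with a known action and a string payload."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(obj, dict) or set(obj) != {"action", "payload"}:
        raise ValueError("unexpected output shape")
    if not isinstance(obj["action"], str) or obj["action"] not in ALLOWED_ACTIONS:
        raise ValueError("action not allowed")
    if not isinstance(obj["payload"], str):
        raise ValueError("payload must be a string")
    return obj
```

Keeping fetched text in its own field does not stop a model from being influenced by it. The validator is what bounds the result: anything outside the expected shape is rejected before another component sees it.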

Tool Use Is the Primary Risk Multiplier

Execution is where AI systems most often fail catastrophically.

ThreeGate:

  • Makes execution optional
  • Requires explicit human approval
  • Sandboxes execution in an isolated environment
  • Treats execution output as hostile input

This limits the blast radius of a compromised model to what a human has approved and what the sandbox permits, in contrast to agent loops that auto-execute.
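
The four properties listed above can be sketched as a single gate function. This is an illustrative outline under stated assumptions, not ThreeGate's implementation: `approve` stands in for whatever human approval step is used, `run_in_sandbox` stands in for the isolated runner, and all names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class ExecutionRequest:
    command: Tuple[str, ...]
    reason: str


@dataclass(frozen=True)
class HostileOutput:
    """Sandbox output, labeled as untrusted data before it goes anywhere."""
    text: str


def gated_execute(
    request: ExecutionRequest,
    approve: Callable[[ExecutionRequest], bool],
    run_in_sandbox: Callable[[Tuple[str, ...]], str],
    max_output_chars: int = 4000,
) -> Optional[HostileOutput]:
    """Run a command only after explicit approval; wrap whatever comes back."""
    if not approve(request):
        # Execution is optional: a refusal simply yields nothing.
        return None
    raw = run_in_sandbox(request.command)
    # Truncate and wrap so downstream code cannot mistake it for instructions.
    return HostileOutput(text=raw[:max_output_chars])
```

Because the approval callback and the sandbox runner are passed in, the gate itself holds no authority: without a human saying yes, `run_in_sandbox` is never called.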


Network Access Is Physically Constrained

Many systems rely on the model to “decide responsibly” when using the network.

ThreeGate instead:

  • Removes network access entirely from CORE
  • Forces FETCH through an allowlisted proxy
  • Defaults TOOL-EXEC to no network

This is security by topology, not trust.
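
The policy an allowlisting proxy applies can be illustrated with a short check. This sketch covers only the decision logic; in a real deployment the enforcement would live in the proxy and network configuration, not in application code. The host names and the `is_fetch_allowed` function are hypothetical examples.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy.
ALLOWED_HOSTS = frozenset({"docs.python.org", "en.wikipedia.org"})


def is_fetch_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Allow only HTTPS URLs whose exact host name is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    # .hostname excludes userinfo and port, and is lowercased.
    host = parts.hostname
    return host is not None and host in allowed_hosts
```

Matching on the parsed host name rather than on a substring of the URL matters: `https://en.wikipedia.org@evil.example/` and `https://en.wikipedia.org.evil.example/` both contain an allowed name but resolve to a different host, and both are rejected here.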


Residual Risk Is Explicitly Scoped

ThreeGate does not claim to defend against:

  • Hardware fault induction (e.g., RowHammer)
  • Microarchitectural side channels
  • Kernel or firmware exploits
  • Hostile multi-tenant environments

The system is designed for single-user local operation and documents its threat boundaries clearly.


Why This Matters

ThreeGate demonstrates a crucial shift in thinking:

Safety does not come from making models behave better.
Safety comes from making misbehavior inconsequential.

By breaking the agent loop into gated components, ThreeGate enables powerful assistance without granting unbounded authority.


Summary

ThreeGate is safer because it:

  • Eliminates confused deputies
  • Treats all external input as hostile
  • Separates reasoning from action
  • Makes execution rare and auditable
  • Enforces policy at the OS and network level

This is not an optimization of existing agent designs.
It is a different class of system.