Updated README.md for project.

2026-02-09 21:34:29 -05:00 · 2026-02-09 21:34:29 -05:00 · 3423028db8
parent ecba73903d
commit 3423028db8
1 changed file with 146 additions and 71 deletions
--- a/README.md
+++ b/README.md
@@ -1,111 +1,186 @@
 # ThreeGate

-**ThreeGate** is a compartmentalized architecture for building **secure, local AI assistants** that perform goal-directed tasks *without* relying on autonomous agents or trusting large language models to behave safely.
+**ThreeGate** is a security-first architecture for local, agent-assisted research and analysis.

-ThreeGate separates **thinking**, **retrieval**, and **execution** into distinct, least-privileged components with enforced trust boundaries.
+It is designed to support *useful* agent behavior while strictly limiting blast radius by enforcing
+**three independent security gates** between reasoning, retrieval, and execution.

-> If prompt injection is inevitable, safety must come from structure.
+The system is intentionally conservative: capability is earned incrementally, audited at every boundary,
+and never co-located with reasoning.

 ---

-## What ThreeGate Is
+## Core Design Goal

-ThreeGate is:
+> Enable powerful local research assistance **without** granting an AI system uncontrolled execution,
+> network access, or persistence.

- A **reference architecture** for secure local assistants
- A **defense-in-depth design** against prompt injection, tool abuse, and data exfiltration
- A **human-governed system**, not an autonomous agent
- Designed for **single-user, local operation**
- Explicitly extensible to multiple roles (research, policy analysis, data science, auditing)
+ThreeGate is especially suited to:
+- academic and technical research assistants
+- policy analysis
+- code review and synthesis
+- data transformation and ranking tasks
+
+It is *not* intended to run autonomous agents with self-directed persistence or open-ended tool use.

 ---

-## What ThreeGate Is Not
+## The Three Gates (Non-Negotiable)

-ThreeGate is **not**:
+### Gate 1 — **CORE (Reasoning & Synthesis)**

- An autonomous agent framework
- A self-modifying system
- A browsing-and-executing AI loop
- A cloud-first or multi-tenant platform
- A system that trusts LLM outputs without validation
+- No network access
+- No execution capability
+- Consumes *only validated artifacts*
+- Produces analysis, summaries, and *Tool Requests* for human approval
+
+CORE is the only place where LLM reasoning occurs.

 ---

-## Core Insight
+### Gate 2 — **FETCH (Controlled Retrieval)**

-Most unsafe AI systems fail because they allow a single component to:
+- HTTPS only
+- Strict domain allowlist
+- Proxy-enforced egress
+- Size-capped and content-typed retrieval
+- Emits **Research Packets** only (never executable instructions)

-> **Read untrusted input, reason about it, and immediately act on the world.**
-
-ThreeGate prevents this by enforcing **three independent gates**:
-
-1. **FETCH** — retrieves untrusted external content
-2. **CORE** — performs reasoning and synthesis
-3. **TOOL-EXEC** — executes code, only when explicitly approved
-
-No component crosses more than one gate.
+FETCH treats *all external content as hostile data*.

 ---
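The FETCH rules in this hunk (HTTPS only, strict allowlist, size cap, content typing) could be checked roughly as follows. This is a minimal sketch, not the repository's implementation: the function names, the allowlisted hosts, the accepted media types, and the 2 MiB cap are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical policy values; the real ones would come from the project's policy files.
ALLOWED_DOMAINS = {"api.crossref.org", "doi.org"}
ALLOWED_CONTENT_TYPES = {"application/json", "text/html", "text/plain"}
MAX_BYTES = 2 * 1024 * 1024  # assumed size cap


def url_is_allowed(url: str) -> bool:
    """Accept only HTTPS URLs whose host matches an allowlist entry exactly."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URLs are rejected, not repaired
    if parts.scheme != "https":
        return False
    # .hostname strips userinfo and port, so "doi.org@evil.example" resolves to evil.example
    return (parts.hostname or "") in ALLOWED_DOMAINS


def response_is_acceptable(content_type: str, body_length: int) -> bool:
    """Check the declared media type and the body size against the policy."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_CONTENT_TYPES and 0 <= body_length <= MAX_BYTES
```

Exact host matching is deliberate in this sketch: suffix matching would need extra care to keep `api.crossref.org.evil.example` out.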

-## High-Level Architecture
+### Gate 3 — **TOOL-EXEC (Constrained Execution)**

-     Internet
-         ↑
-    [ Managed Proxy ]
-         ↑
-     FETCH (retrieval)
-         ↓
-    Research Packets
-         ↓
-     CORE (analysis)
-         ↓
- (optional, human-approved)
-	 ↓
- TOOL-EXEC (sandboxed execution)
+Execution is split into **two distinct backends**:
+
+#### TOOL-EXEC-Lite (Monty)
+- Python-subset interpreter
+- No filesystem
+- No environment
+- No network
+- No subprocess
+- No external functions (by default)
+- Stdio-only inputs/outputs
+
+Used for:
+- JSON transformations
+- ranking/scoring
+- small algorithms
+- validation helpers
+
+This is the **default execution lane**.
+
+#### TOOL-EXEC-Heavy (ERA microVM)
+- Full isolation via microVM
+- Used only when Monty cannot express the task
+- Requires explicit justification
+
+Both execution lanes:
+- Require human-approved Tool Requests
+- Emit **Tool Results** as immutable artifacts
+- Are never allowed to feed results back into FETCH or execute recursively

 ---
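A validator for the two execution lanes might look like the sketch below. The backend names `monty` and `ERA` come from the README text; the `ToolRequest` fields (`approved_by`, `justification`) and the function name are hypothetical, since the diff does not show the actual request schema.

```python
from dataclasses import dataclass
from typing import List

BACKENDS = {"monty", "ERA"}  # names as written in the README


@dataclass(frozen=True)
class ToolRequest:
    backend: str
    code: str
    approved_by: str = ""    # hypothetical field: identity of the approving human
    justification: str = ""  # hypothetical field: why Monty is not sufficient


def validate_tool_request(req: ToolRequest) -> List[str]:
    """Return a list of rule violations; an empty list means the request may proceed."""
    errors = []
    if req.backend not in BACKENDS:
        errors.append(f"unknown backend: {req.backend!r}")
    if not req.approved_by.strip():
        errors.append("missing human approval")
    if req.backend == "ERA" and not req.justification.strip():
        errors.append("ERA requires an explicit justification")
    return errors
```

Returning every violation, rather than failing on the first, gives the operator the full picture before they decide whether to approve a revised request.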

-## Initial Target Role
-
-The first concrete role implemented using ThreeGate is a:
-
-**Secure Local Research Assistant**
-
-Capabilities:
- Scholarly retrieval (controlled, allowlisted)
- Analysis and writing
- Optional sandboxed computation
- No autonomous browsing or execution
-
---
-
-## Repository Structure (Initial)
+## Data Flow (One-Way, Audited)

 ```
-ThreeGate/
-├── README.md
-├── docs/
-│ ├── architecture.md
-│ ├── threat-model.md
-│ └── why-this-is-safer.md
+
+External Sources
+↓
+FETCH
+↓  (Research Packets)
+handoff/
+↓
+CORE
+↓  (Approved Tool Requests)
+TOOL-EXEC
+↓  (Tool Results)
+handoff/
+↓
+CORE
+
+```
+
+No component both **decides** and **acts**.
+
+---
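One way to make the one-way flow in the diagram checkable is an explicit table of permitted hand-offs. The sketch below is an illustration under assumptions: the artifact labels and the function name are invented here, and only the three edges shown in the diagram are allowed.

```python
# Permitted hand-offs read off the data-flow diagram: (producer, artifact, consumer).
# The lowercase artifact labels are hypothetical identifiers for this sketch.
ALLOWED_HANDOFFS = {
    ("FETCH", "research_packet", "CORE"),
    ("CORE", "tool_request", "TOOL-EXEC"),  # only after human approval
    ("TOOL-EXEC", "tool_result", "CORE"),
}


def handoff_is_allowed(producer: str, artifact: str, consumer: str) -> bool:
    """Deny by default: anything not listed in the table is rejected."""
    return (producer, artifact, consumer) in ALLOWED_HANDOFFS
```

Because the table is deny-by-default, a Tool Result routed to FETCH, or a Tool Request sent anywhere but TOOL-EXEC, fails the check without needing a dedicated rule.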
+
+## Artifact Types
+
+### Research Packet
+- Markdown + strict front matter
+- Metadata, bounded excerpts, provenance
+- No instructions, no executable content
+- Validated before CORE ingestion
+
+### Tool Request
+- Human-approved
+- Backend-specific (`ERA` or `monty`)
+- Declarative constraints
+- No self-modifying behavior
+
+### Tool Result
+- Immutable output
+- Captured stdout/stderr
+- Content hashes
+- Treated as untrusted data by CORE
+
+---
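The Tool Result properties listed above (immutable output, captured stdout/stderr, content hashes) can be sketched with a frozen dataclass. The class layout and the choice of SHA-256 are assumptions; the README says only "content hashes" and does not name an algorithm.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attribute assignment raises after construction
class ToolResult:
    stdout: bytes
    stderr: bytes
    stdout_sha256: str
    stderr_sha256: str


def make_tool_result(stdout: bytes, stderr: bytes) -> ToolResult:
    """Capture both streams and record a hash of each at creation time."""
    return ToolResult(
        stdout=stdout,
        stderr=stderr,
        stdout_sha256=hashlib.sha256(stdout).hexdigest(),
        stderr_sha256=hashlib.sha256(stderr).hexdigest(),
    )


def verify_tool_result(result: ToolResult) -> bool:
    """Recompute the hashes; a mismatch means the artifact was altered or corrupted."""
    return (
        hashlib.sha256(result.stdout).hexdigest() == result.stdout_sha256
        and hashlib.sha256(result.stderr).hexdigest() == result.stderr_sha256
    )
```

A consumer on the CORE side would call `verify_tool_result` before reading the output and still treat the verified bytes as untrusted data, as the README requires.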
+
+## Security Principles (Do Not Violate)
+
+- No reasoning component may execute code
+- No execution component may reason or fetch
+- Network access is centralized and audited
+- Redirects are never trusted without re-validation
+- All cross-gate artifacts are hostile by default
+- Escalation of capability is a security change
+
+---
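The principle that redirects are never trusted without re-validation can be illustrated without any network access by walking a recorded chain of `Location` values. This is a sketch under stated assumptions: the allowlist, the hop limit of 5, and both function names are hypothetical, and a real fetcher would have to disable automatic redirect following in its HTTP client to apply the same check per hop.

```python
from urllib.parse import urljoin, urlsplit

ALLOWED_HOSTS = {"doi.org", "api.crossref.org"}  # hypothetical allowlist


def is_allowed(url: str) -> bool:
    """HTTPS only, exact host match against the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


def follow_validated(start_url: str, locations: list, max_hops: int = 5) -> str:
    """Resolve each Location value against the current URL and re-validate every hop."""
    if len(locations) > max_hops:
        raise ValueError("too many redirects")
    if not is_allowed(start_url):
        raise ValueError(f"start URL rejected: {start_url}")
    current = start_url
    for location in locations:
        current = urljoin(current, location)  # handles relative and absolute targets
        if not is_allowed(current):
            raise ValueError(f"redirect target rejected: {current}")
    return current
```

Resolving with `urljoin` before validating matters: a scheme-relative target such as `//evil.example/x` only reveals its host once it has been joined to the current URL.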
+
+## Repository Structure (Key Paths)
+
+```
+
+core/                 # CORE consumers (read-only)
+fetch/                # FETCH retrievers (proxy-bound)
+tool-exec/
+  monty/              # TOOL-EXEC-Lite (pure compute)
+  era/                # TOOL-EXEC-Heavy (microVM)
+tools/                # Validators and shared helpers
+policy/               # Human-readable enforcement rules
+infra/                # Docker, proxy, firewall scaffolding
+docs/                 # Architecture and operator guides
+
 ```

 ---

 ## Status

-This repository is in **early specification and reference implementation phase**.
+This repository currently provides:
+- Full FETCH scaffolding with allowlisted, size-capped retrieval
+- Crossref DOI metadata ingestion
+- Redirect-safe URL fetching
+- Monty execution backend (pure compute)
+- ERA execution stubs
+- Validators enforcing backend-specific rules

-The design is intentionally conservative. Convenience features are added *only* when they preserve trust boundaries.
+It is suitable for **local research workflows** and **controlled experimentation**.

 ---

-## License & Philosophy
+## Philosophy

-ThreeGate favors:
- Explicit over implicit authority
- Structural safety over behavioral promises
- Human-in-the-loop over automation
+ThreeGate assumes:
+- LLMs are powerful but non-deterministic
+- External content is adversarial by default
+- Execution is the highest-risk capability
+- Separation of duties beats clever sandboxing
+
+The goal is not to build an autonomous agent.
+The goal is to build a *trustworthy assistant*.
+```

-If a feature weakens a trust boundary, it does not belong here.