# Logging & Metrics (Privacy + Security)

ThreeGate benefits from audit logs, but logs can become an exfiltration and privacy liability.

This document defines what to log, what not to log, and how to bound retention.

---

## Goals

- Enable debugging and security review
- Prove that boundaries are enforced
- Avoid capturing sensitive content or secrets
- Avoid turning logs into a data lake

---

## Log What (Recommended)

### System events (safe)
- Container/service start/stop
- Validator ACCEPT/REJECT with artifact filename and reason codes
- Proxy allowlist changes (who/when/what)
- Firewall rules application success/failure
- Tool execution metadata:
  - request_id
  - backend
  - runtime_sec
  - exit_code
  - artifact hashes (sha256)
  - size of stdout/stderr (bytes)
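
The tool execution metadata listed above can be sketched as a single structured record. This is a minimal illustration, not ThreeGate's actual logger: the function name `tool_run_event`, the `event` key, and the JSON-lines output are assumptions. The point it shows is that stdout, stderr, and artifact bytes are consumed only to compute sizes and sha256 digests, so their content never reaches the log line.

```python
import hashlib
import json


def tool_run_event(request_id, backend, runtime_sec, exit_code,
                   stdout, stderr, artifacts):
    """Build a metadata-only record for one tool run (hypothetical helper).

    stdout and stderr are bytes; only their lengths are recorded.
    artifacts maps a filename to its bytes; only sha256 digests are recorded.
    """
    return {
        "event": "tool_run",  # assumed event label
        "request_id": request_id,
        "backend": backend,
        "runtime_sec": runtime_sec,
        "exit_code": exit_code,
        "artifact_sha256": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(artifacts.items())
        },
        "stdout_bytes": len(stdout),
        "stderr_bytes": len(stderr),
    }


record = tool_run_event(
    request_id="req-0001",
    backend="sandbox",  # placeholder backend name
    runtime_sec=1.25,
    exit_code=0,
    stdout=b"hello\n",
    stderr=b"",
    artifacts={"result.json": b"{}"},
)
# One JSON object per line keeps the log small and machine-readable.
line = json.dumps(record, sort_keys=True)
```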

### Retrieval metadata (safe)
- source_kind
- source_ref (URL/DOI) *if not sensitive*
- retrieved_utc
- content_type
- byte count
- redirect chain hosts (not full query strings)
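
A retrieval record along these lines might look like the sketch below. The helper name `retrieval_event` and the key `redirect_hosts` are assumptions made for illustration. The redirect chain is reduced to host names with `urllib.parse.urlsplit`, so paths and query strings from intermediate hops are never written, and the caller passes `None` for `source_ref` when the reference itself is sensitive.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def retrieval_event(source_kind, source_ref, content_type, byte_count,
                    redirect_chain):
    """Build a metadata-only record for one fetch (hypothetical helper)."""
    # Keep only the host of each hop; drop path, query string, and fragment.
    hosts = [urlsplit(url).hostname or "" for url in redirect_chain]
    return {
        "event": "retrieval",  # assumed event label
        "source_kind": source_kind,
        "source_ref": source_ref,  # None when the reference is sensitive
        "retrieved_utc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "content_type": content_type,
        "byte_count": byte_count,
        "redirect_hosts": hosts,
    }


event = retrieval_event(
    source_kind="web",
    source_ref="https://example.org/paper",
    content_type="text/html",
    byte_count=1024,
    redirect_chain=[
        "http://example.org/a?token=abc",
        "https://www.example.org/a?token=abc",
    ],
)
```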

---

## Do NOT Log (Hard Prohibitions)

- Full fetched page content / HTML bodies
- Full PDFs or extracted text
- Tool stdout/stderr content by default (store as artifacts, not logs)
- Secrets or tokens
- Local filesystem paths that reveal private structure (beyond controlled volumes)
- User prompts if they may contain sensitive content
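
One way to make these prohibitions hard to violate by accident is a default-deny field filter applied just before a record is written. The sketch below is an assumption, not part of ThreeGate: `ALLOWED_FIELDS`, `scrub`, and `dropped_fields` are hypothetical names, and the allowlist simply reuses the field names from the safe lists earlier in this document. It records the names of dropped fields (never their values) so that a misbehaving caller is visible in review. An allowlist of keys does not inspect values, so a secret placed inside an allowed field would still pass.

```python
# Hypothetical allowlist built from the "safe" fields named in this document.
ALLOWED_FIELDS = frozenset({
    "event", "request_id", "backend", "runtime_sec", "exit_code",
    "artifact_sha256", "stdout_bytes", "stderr_bytes",
    "source_kind", "source_ref", "retrieved_utc", "content_type",
    "byte_count", "redirect_hosts",
})


def scrub(record):
    """Return a copy of record containing only allowlisted keys.

    Names of rejected keys are kept under "dropped_fields"; their values
    are discarded.
    """
    kept = {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
    dropped = sorted(key for key in record if key not in ALLOWED_FIELDS)
    if dropped:
        kept["dropped_fields"] = dropped
    return kept


clean = scrub({
    "event": "tool_run",
    "request_id": "req-0001",
    "stdout": "raw tool output",   # prohibited: content, not metadata
    "prompt": "user text",         # prohibited: may be sensitive
})
# clean == {"event": "tool_run", "request_id": "req-0001",
#           "dropped_fields": ["prompt", "stdout"]}
```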

---

## Retention

- Default: 7–30 days for operational logs
- Keep artifacts (packets/results) under your normal project retention policy
- Rotate proxy logs aggressively (high volume)
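
If operational logs are written from Python, the retention window can be approximated with the standard library's `TimedRotatingFileHandler`, which rotates on a schedule and deletes files beyond `backupCount`. This is a sketch under assumptions: the logger name `threegate.operational`, the file name `operational.log`, and the 14-day value are illustrative choices within the 7–30 day range, and deployments that log through the container runtime or a proxy would configure rotation there instead.

```python
import logging
import logging.handlers
import os

RETENTION_DAYS = 14  # assumption: any value in the 7-30 day range


def make_operational_logger(log_dir, retention_days=RETENTION_DAYS):
    """Create a logger that rotates daily and keeps a bounded number of files."""
    handler = logging.handlers.TimedRotatingFileHandler(
        os.path.join(log_dir, "operational.log"),  # hypothetical file name
        when="midnight",             # rotate once per day
        backupCount=retention_days,  # older rotated files are deleted
        utc=True,
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("threegate.operational")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep records out of the root logger's handlers
    return logger, handler
```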

---

## Redaction

If you must log URLs, consider stripping:
- query strings (`?x=y`)
- fragments (`#section`)
- known tracking parameters
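
The stripping rules above can be sketched with `urllib.parse`. The function name `redact_url` and the tracking-parameter lists are assumptions; extend them for your own traffic. By default the whole query string and the fragment are dropped. With `keep_query=True` only the listed tracking parameters are removed. The sketch also drops any `user:password@` prefix, which goes beyond the list above but follows from the prohibition on logging secrets.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking parameters; not an exhaustive list.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = frozenset({"gclid", "fbclid"})


def redact_url(url, keep_query=False):
    """Return url without fragment, credentials, and (by default) query string."""
    parts = urlsplit(url)
    query = ""
    if keep_query:
        pairs = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in TRACKING_NAMES
            and not key.lower().startswith(TRACKING_PREFIXES)
        ]
        query = urlencode(pairs)
    # Rebuild the network location from host and port only (no userinfo).
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets restored
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, query, ""))


print(redact_url("https://user:pw@Example.org:8443/a/b?x=y&utm_source=z#section"))
# prints https://example.org:8443/a/b
print(redact_url("https://example.org/p?id=7&utm_source=z&gclid=1#frag", keep_query=True))
# prints https://example.org/p?id=7
```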

---

## Metrics (Minimal)

- count_validations_accept / count_validations_reject
- count_fetch_requests, bytes_fetched
- count_tool_runs by backend
- mean runtime_sec by backend
- quarantine counts (packets/requests/results)
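
These metrics need nothing more than counters and a running sum per backend. The in-memory sketch below is illustrative only: the class name `Metrics`, its method names, and the dotted key format (for example `count_tool_runs.<backend>`) are assumptions, and a real deployment would likely export the same numbers through whatever metrics system it already uses. No request content is stored, only counts, byte totals, and runtimes.

```python
from collections import Counter, defaultdict


class Metrics:
    """Minimal in-memory counters for the metrics listed above (hypothetical)."""

    def __init__(self):
        self.counts = Counter()
        self._runtime_sum = defaultdict(float)

    def validation(self, accepted):
        key = "count_validations_accept" if accepted else "count_validations_reject"
        self.counts[key] += 1

    def fetch(self, nbytes):
        self.counts["count_fetch_requests"] += 1
        self.counts["bytes_fetched"] += nbytes

    def tool_run(self, backend, runtime_sec):
        self.counts[f"count_tool_runs.{backend}"] += 1
        self._runtime_sum[backend] += runtime_sec

    def quarantine(self, kind):
        # kind is one of "packets", "requests", "results"
        self.counts[f"quarantine.{kind}"] += 1

    def mean_runtime_sec(self, backend):
        runs = self.counts[f"count_tool_runs.{backend}"]
        return self._runtime_sum[backend] / runs if runs else 0.0


metrics = Metrics()
metrics.validation(accepted=True)
metrics.validation(accepted=False)
metrics.fetch(nbytes=100)
metrics.fetch(nbytes=50)
metrics.tool_run("sandbox", 1.0)  # "sandbox" is a placeholder backend name
metrics.tool_run("sandbox", 3.0)
metrics.quarantine("packets")
```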

---

## Summary

Audit metadata is useful; content logging is dangerous.

Prefer:
- hashed artifacts + deterministic validators
- small, structured logs
- strict retention + rotation