# Logging & Metrics (Privacy + Security)
ThreeGate benefits from audit logs, but logs can become an exfiltration and privacy liability.
This document defines what to log, what not to log, and how to bound retention.
---
## Goals
- Enable debugging and security review
- Prove that boundaries are enforced
- Avoid capturing sensitive content or secrets
- Avoid turning logs into a data lake
---
## What to Log (Recommended)
### System events (safe)
- Container/service start/stop
- Validator ACCEPT/REJECT with artifact filename and reason codes
- Proxy allowlist changes (who/when/what)
- Firewall rules application success/failure
- Tool execution metadata:
  - request_id
  - backend
  - runtime_sec
  - exit_code
  - artifact hashes (sha256)
  - size of stdout/stderr (bytes)
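
A minimal sketch of a metadata-only record for one tool run, using the fields listed above. The function name, the `event` value, and the `artifact_sha256`, `stdout_bytes`, and `stderr_bytes` key names are assumptions made for illustration, not part of ThreeGate. Only sizes and hashes are recorded; the output content itself never enters the record.

```python
import hashlib
import json
import uuid


def tool_run_record(backend, runtime_sec, exit_code, stdout, stderr, artifacts):
    """Build a metadata-only log record for one tool run.

    stdout and stderr are bytes; only their lengths are recorded.
    artifacts maps an artifact filename to its raw bytes; only hashes are kept.
    """
    return {
        "event": "tool_run",  # hypothetical event name
        "request_id": str(uuid.uuid4()),
        "backend": backend,
        "runtime_sec": round(runtime_sec, 3),
        "exit_code": exit_code,
        "artifact_sha256": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in artifacts.items()
        },
        "stdout_bytes": len(stdout),
        "stderr_bytes": len(stderr),
    }


record = tool_run_record(
    backend="sandbox",  # hypothetical backend name
    runtime_sec=1.5,
    exit_code=0,
    stdout=b"hello\n",
    stderr=b"",
    artifacts={"result.json": b"{}"},
)
print(json.dumps(record, sort_keys=True))
```

Emitting one JSON object per line keeps the log small and structured, and makes it easy to check mechanically that no content field slipped in.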
### Retrieval metadata (safe)
- source_kind
- source_ref (URL/DOI) *if not sensitive*
- retrieved_utc
- content_type
- byte count
- redirect chain hosts (not full query strings)
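
As a rough illustration, a retrieval record can be built so that the redirect chain is reduced to hostnames before anything is written. The helper names and the `byte_count` and `redirect_hosts` keys are hypothetical; the sketch assumes the caller decides whether `source_ref` is sensitive and passes `None` if it is.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def redirect_hosts(urls):
    """Reduce a redirect chain to hostnames, dropping paths and query strings."""
    return [urlsplit(u).hostname for u in urls]


def retrieval_record(source_kind, source_ref, content_type, body, redirect_chain):
    """Build a metadata-only record for one fetch; the body is measured, not stored."""
    return {
        "source_kind": source_kind,
        "source_ref": source_ref,  # assumption: None when the reference is sensitive
        "retrieved_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "content_type": content_type,
        "byte_count": len(body),
        "redirect_hosts": redirect_hosts(redirect_chain),
    }


hosts = redirect_hosts([
    "https://doi.org/10.1000/example?token=abc",
    "https://Example.org/paper.pdf",
])
print(hosts)
```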
---
## Do NOT Log (Hard Prohibitions)
- Full fetched page content / HTML bodies
- Full PDFs or extracted text
- Tool stdout/stderr content by default (store as artifacts, not logs)
- Secrets or tokens
- Local filesystem paths that reveal private structure (beyond controlled volumes)
- User prompts if they may contain sensitive content
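
One way to make these prohibitions hard to violate by accident is an allowlist applied just before a record is written: any field not explicitly approved is dropped. This is a sketch under assumptions; the `scrub` function and the exact field names in `ALLOWED_FIELDS` are illustrative and would need to match whatever record schema is actually in use.

```python
# Assumption: field names mirror the "safe" metadata lists in this document.
ALLOWED_FIELDS = frozenset({
    "event", "request_id", "backend", "runtime_sec", "exit_code",
    "artifact_sha256", "stdout_bytes", "stderr_bytes",
    "source_kind", "source_ref", "retrieved_utc", "content_type",
    "byte_count", "redirect_hosts", "reason_codes",
})


def scrub(record):
    """Return (kept, dropped_names); fields outside the allowlist are removed."""
    kept = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    dropped = sorted(set(record) - ALLOWED_FIELDS)
    return kept, dropped


kept, dropped = scrub({
    "request_id": "r1",
    "exit_code": 0,
    "stdout": "raw tool output",  # prohibited: content, not metadata
    "prompt": "user text",        # prohibited: may be sensitive
})
print(kept, dropped)
```

An allowlist fails closed: a newly added field stays out of the logs until someone decides it is safe, whereas a denylist would let it through by default.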
---
## Retention
- Default: 7-30 days for operational logs
- Keep artifacts (packets/results) under your normal project retention policy
- Rotate proxy logs aggressively (high volume)
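
Bounded retention can be approximated with daily rotation and a fixed number of kept files. The sketch below uses Python's standard `logging.handlers.TimedRotatingFileHandler`; the logger name, the filename, and the 14-day value in the usage lines are purely illustrative, and the retention window is left as a parameter so it can follow whatever policy applies.

```python
import logging
import logging.handlers
import os
import tempfile


def build_ops_logger(log_dir, retention_days):
    """Create a logger that rotates at midnight UTC and keeps a bounded history."""
    handler = logging.handlers.TimedRotatingFileHandler(
        os.path.join(log_dir, "threegate-ops.log"),  # hypothetical filename
        when="midnight",
        backupCount=retention_days,  # older rotated files are deleted
        utc=True,
        delay=True,  # do not create the file until the first record
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("threegate.ops")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep records out of the root logger
    return logger


log_dir = tempfile.mkdtemp()
logger = build_ops_logger(log_dir, retention_days=14)  # illustrative value
logger.info("validator ACCEPT artifact=%s reason=%s", "packet-0001.json", "OK")
```

High-volume proxy logs would typically get a much smaller `retention_days`, or size-based rotation via `RotatingFileHandler`, rather than sharing this window.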
---
## Redaction
If you must log URLs, consider stripping:
- query strings (`?x=y`)
- fragments (`#section`)
- known tracking parameters
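
The stripping rules above can be sketched with the standard `urllib.parse` module. Two variants are shown: a strict one that removes the whole query string and fragment, and a lenient one that removes only tracking parameters. The function names and the contents of `TRACKING_PARAMS` are assumptions for illustration, not an authoritative list. Both variants also drop any `user:password@` prefix from the host, since that would otherwise leak a secret.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: an illustrative, incomplete set of tracking parameter names.
TRACKING_PARAMS = {"fbclid", "gclid", "mc_eid"}
TRACKING_PREFIXES = ("utm_",)


def _host_only(parts):
    """Rebuild the network location without userinfo (user:password@)."""
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets back
        host = f"[{host}]"
    return f"{host}:{parts.port}" if parts.port else host


def strip_query_and_fragment(url):
    """Strict redaction: keep only scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, _host_only(parts), parts.path, "", ""))


def strip_tracking_params(url):
    """Lenient redaction: drop the fragment and known tracking parameters."""
    parts = urlsplit(url)
    kept = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
        and not k.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(
        (parts.scheme, _host_only(parts), parts.path, urlencode(kept), "")
    )


print(strip_query_and_fragment("https://user:pw@example.org/a/b?x=y#section"))
print(strip_tracking_params("https://example.org/p?id=7&utm_source=news&fbclid=abc#top"))
```

The strict variant is the safer default for logs, because query parameters that are not "known tracking" can still carry tokens or personal data.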
---
## Metrics (Minimal)
- count_validations_accept / count_validations_reject
- count_fetch_requests, bytes_fetched
- count_tool_runs by backend
- mean runtime_sec by backend
- quarantine counts (packets/requests/results)
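
A minimal in-process sketch of these metrics, assuming plain counters are enough and no metrics backend is involved. The `Metrics` class, its method names, and the dotted key scheme (`count_tool_runs.<backend>`, `quarantine.<kind>`) are hypothetical. Everything stored is a number keyed by a short label, so no content can leak through this path.

```python
from collections import Counter, defaultdict


class Metrics:
    """Counters only: no payloads, URLs, or prompts are ever stored here."""

    def __init__(self):
        self.counts = Counter()
        self.runtime_total = defaultdict(float)

    def validation(self, accepted):
        key = "count_validations_accept" if accepted else "count_validations_reject"
        self.counts[key] += 1

    def fetch(self, nbytes):
        self.counts["count_fetch_requests"] += 1
        self.counts["bytes_fetched"] += nbytes

    def tool_run(self, backend, runtime_sec):
        self.counts[f"count_tool_runs.{backend}"] += 1
        self.runtime_total[backend] += runtime_sec

    def quarantine(self, kind):
        # kind is one of "packets", "requests", "results"
        self.counts[f"quarantine.{kind}"] += 1

    def mean_runtime_sec(self, backend):
        runs = self.counts[f"count_tool_runs.{backend}"]
        return self.runtime_total[backend] / runs if runs else 0.0


m = Metrics()
m.validation(accepted=True)
m.validation(accepted=False)
m.fetch(2048)
m.tool_run("sandbox", 1.0)  # hypothetical backend name
m.tool_run("sandbox", 3.0)
m.quarantine("packets")
print(dict(m.counts), m.mean_runtime_sec("sandbox"))
```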
---
## Summary
Audit metadata is useful; content logging is dangerous.
Prefer:
- hashed artifacts + deterministic validators
- small, structured logs
- strict retention + rotation