# Configuration

RoleMesh Gateway loads configuration from a YAML file (default: `configs/models.yaml`). Set `ROLE_MESH_CONFIG` to override.

## Top-level schema

```yaml
version: 1
default_model: writer
gateway:
  host: 0.0.0.0
  port: 8000
auth:
  client_api_keys: ["..."]
  node_api_keys: ["..."]
models:
  <alias>:
    type: proxy | discovered
    openai_model_name: ...
```

- `<alias>` is what clients pass as `model` in `/v1/chat/completions`.
- `openai_model_name` is the model id returned by `/v1/models` (usually the same as the alias).

## Roles are aliases, not a fixed list

RoleMesh Gateway does not reserve a built-in set of roles.

- The keys under `models:` are your project-specific role names
- Clients send those keys in the OpenAI `model` field
- You can rename or replace the sample roles entirely
- Different projects can use different role layouts with the same gateway

Example custom role set:

```yaml
models:
  researcher:
    type: proxy
    openai_model_name: researcher
    proxy_url: http://127.0.0.1:8011
  summarizer:
    type: proxy
    openai_model_name: summarizer
    proxy_url: http://127.0.0.1:8012
  security-reviewer:
    type: proxy
    openai_model_name: security-reviewer
    proxy_url: http://127.0.0.1:8013
```

## Where the actual model weights are selected

This depends on the backend pattern.

### For `type: proxy`

The gateway alias does **not** point directly to a weight file. It points to an already-running inference server:

```yaml
models:
  writer:
    type: proxy
    proxy_url: http://127.0.0.1:8012
```

The actual model weights are chosen by that upstream server, not by RoleMesh Gateway.
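Because the alias travels in the standard OpenAI `model` field, no special client is needed. As a minimal sketch (the alias name and message are illustrative, and `build_chat_request` is a hypothetical helper, not part of RoleMesh), a client request body might be assembled like this:

```python
import json

def build_chat_request(alias: str, user_message: str) -> str:
    """Return the JSON body for POST /v1/chat/completions.

    The gateway alias (a key under `models:` in the gateway config) goes
    straight into the standard OpenAI "model" field -- no extra header.
    """
    return json.dumps({
        "model": alias,
        "messages": [{"role": "user", "content": user_message}],
    })

body = build_chat_request("writer", "Draft a release note.")
print(json.loads(body)["model"])  # -> writer
```

Swapping role layouts between projects then only means changing the alias string the client sends.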
Examples:

- `llamafile --server -m /path/to/model.gguf ...`
- `llama-server -m /path/to/model.gguf ...`
- Ollama with `defaults.model: dolphin3:latest`

| Upstream type | Where weights/model are chosen | RoleMesh fields involved |
| --- | --- | --- |
| `llamafile --server` | backend startup CLI, usually `-m /path/to/model.gguf` | `proxy_url` |
| `llama-server` | backend startup CLI, usually `-m /path/to/model.gguf` | `proxy_url` |
| Ollama | request JSON `model`, optionally injected by the gateway | `proxy_url`, `defaults.model` |

### For `type: discovered`

The gateway still does not point directly to a weight file. It points to a role served by a registered node. The actual weight file is defined on the node side, usually in the node-agent config:

```yaml
models:
  - model_id: "planner-gguf"
    path: "/models/SomePlannerModel.Q5_K_M.gguf"
    roles: ["planner"]
```

In that setup:

- gateway alias -> discovered role
- discovered role -> registered node + concrete upstream model ID
- node-agent `path` -> actual weight file on disk

## Proxy models

Route to a fixed upstream (any host reachable from the gateway):

```yaml
models:
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
```

Notes:

- The model alias (`writer` above) is what the client sends in `model`.
- `openai_model_name` is what the gateway returns from `GET /v1/models`.
- `proxy_url` is the actual upstream backend to call.
- `defaults` are only applied when the incoming request does not already set those keys.
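The `defaults` semantics above (a default fills a key only when the client did not set it) can be sketched as follows. This is an illustrative sketch of the merge rule, not the gateway's actual implementation:

```python
def apply_defaults(request: dict, defaults: dict) -> dict:
    """Merge per-alias defaults into a request body.

    A default is applied only when the incoming request does not already
    set that key; client-provided values always win.
    """
    merged = dict(request)
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged

req = {"model": "writer", "messages": [], "temperature": 0.9}
out = apply_defaults(req, {"temperature": 0.6, "max_tokens": 256})
print(out["temperature"], out["max_tokens"])  # -> 0.9 256
```

Here the client's `temperature: 0.9` survives, while `max_tokens` is filled from the alias defaults.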
## Discovered models

Route to a dynamically registered model instance that can satisfy the role:

```yaml
models:
  reviewer:
    type: discovered
    openai_model_name: reviewer
    role: reviewer
    strategy: round_robin
```

Supported discovered-node strategies:

- `round_robin`: rotate requests across fresh matching nodes
- `random`: choose a fresh matching node at random for each request

### Registering nodes

Nodes register to `POST /v1/nodes/register`:

```json
{
  "node_id": "gpu-box-1",
  "base_url": "http://10.0.0.12:8014",
  "served_models": [
    {
      "model_id": "qwen3-8b",
      "roles": ["reviewer", "planner"],
      "meta": {"family": "Qwen3", "quant": "Q5_K_M"}
    },
    {
      "model_id": "qwen2.5-coder-14b",
      "roles": ["coder"],
      "meta": {"family": "Qwen2.5-Coder", "quant": "Q5_K_M"}
    }
  ],
  "meta": {"gpu": "Tesla P40", "notes": "llama-server on GPU0"}
}
```

`served_models` is now the preferred registration schema.

- `model_id`: concrete model name the upstream node expects in the forwarded OpenAI request
- `roles`: workflow roles that this model can satisfy
- `meta`: optional operator-facing metadata

Legacy flat `roles` registration is still accepted for compatibility, but it is treated as a fallback where `model_id == role`.

## Authentication

If `auth.client_api_keys` is set (non-empty), callers of `/v1/models` and `/v1/chat/completions` must provide an API key. If `auth.node_api_keys` is set (non-empty), node agents calling `/v1/nodes/register` and `/v1/nodes/heartbeat` must provide a node key.

Supported headers:

- Clients: `Authorization: Bearer <key>` or `X-Api-Key: <key>`
- Nodes: `Authorization: Bearer <key>` or `X-RoleMesh-Node-Key: <key>`

## Availability behavior

Configured aliases are not automatically assumed healthy.
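Putting registration and node auth together, a node agent's registration call might be prepared like this. This is a hedged sketch: the gateway address and key value are placeholders, and it only builds the request rather than assuming a running gateway:

```python
import json
import urllib.request

# Registration payload, following the served_models schema shown above.
payload = {
    "node_id": "gpu-box-1",
    "base_url": "http://10.0.0.12:8014",
    "served_models": [
        {"model_id": "qwen3-8b", "roles": ["reviewer", "planner"]},
    ],
}

# Gateway URL and node key are placeholders for this sketch.
req = urllib.request.Request(
    "http://127.0.0.1:8000/v1/nodes/register",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "X-RoleMesh-Node-Key": "change-me-node-key",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it when a gateway is listening.
print(req.get_method(), req.full_url)
```

The same payload works with `Authorization: Bearer <key>` in place of `X-RoleMesh-Node-Key` if that style is preferred.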
- `GET /v1/models` probes configured upstreams and only returns aliases that are currently reachable
- unavailable aliases are included separately in `rolemesh.unavailable_models`
- `GET /ready` returns success only when the configured `default_model` is currently usable
- discovered nodes are only considered routable while they are fresh
- gateway metadata marks stale registered nodes so operators can distinguish them from healthy nodes

This is especially important when a config contains multiple optional roles but only some backends are up.

For discovered-node freshness, the gateway uses the `ROLE_MESH_NODE_STALE_AFTER_S` environment variable. Default: `30`.

## Quick example

```yaml
version: 1
default_model: planner
auth:
  client_api_keys: ["change-me-client-key"]
  node_api_keys: ["change-me-node-key"]
models:
  planner:
    type: proxy
    openai_model_name: planner
    proxy_url: http://127.0.0.1:8011
    defaults:
      temperature: 0
      max_tokens: 128
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
      max_tokens: 256
```

## Base config plus local override

Recommended pattern:

- keep the tracked repo config generic
- keep machine-specific values in a separate local YAML
- merge the local YAML at launch

Gateway:

```bash
rolemesh-gateway --config configs/models.example.yaml --config-override configs/models.local.yaml
```

Node agent:

```bash
rolemesh-node-agent --config configs/node_agent.example.yaml --config-override configs/node_agent.local.yaml
```

The merge is recursive for mappings:

- nested dictionaries are merged
- lists and scalar values are replaced by the override file

This is useful for separating:

- real model weight paths
- local host IPs
- local API keys
- local `llama-server` paths
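The merge rules above (mappings merge recursively; lists and scalars are replaced outright) can be sketched as follows. This is an illustrative sketch of the assumed semantics, not RoleMesh's actual merge code:

```python
def deep_merge(base, override):
    """Merge an override config into a base config.

    Nested mappings are merged key by key; lists and scalar values from
    the override replace the base value entirely.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            merged[key] = deep_merge(base[key], value) if key in base else value
        return merged
    return override  # lists and scalars: override wins

base = {"models": {"writer": {"proxy_url": "http://127.0.0.1:8012",
                              "defaults": {"temperature": 0.6}}}}
local = {"models": {"writer": {"proxy_url": "http://10.0.0.5:8012"}}}
print(deep_merge(base, local)["models"]["writer"]["proxy_url"])  # -> http://10.0.0.5:8012
```

Note that because lists are replaced rather than appended, an override that sets `auth.client_api_keys` fully replaces the base key list instead of extending it.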