# Configuration
RoleMesh Gateway loads its configuration from a YAML file (default: `configs/models.yaml`).
Set the `ROLE_MESH_CONFIG` environment variable to override the path.
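For example, pointing the gateway at a machine-specific file (the path is illustrative, and this assumes the gateway is then started without an explicit `--config` flag):

```bash
# ROLE_MESH_CONFIG overrides the default configs/models.yaml
export ROLE_MESH_CONFIG=/etc/rolemesh/models.yaml
rolemesh-gateway
```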
## Top-level schema
```yaml
version: 1
default_model: writer

gateway:
  host: 0.0.0.0
  port: 8000

auth:
  client_api_keys: ["..."]
  node_api_keys: ["..."]

models:
  <alias>:
    type: proxy | discovered
    openai_model_name: <string>
    ...
```
`<alias>` is what clients pass as `model` in `/v1/chat/completions`. `openai_model_name` is the model id returned by `/v1/models` (usually the same as the alias).
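To make the mapping concrete, here is a sketch of a client call; the host, port, and key are placeholders taken from the examples in this document, and `writer` is the alias:

```bash
# The client selects the "writer" alias via the standard OpenAI "model" field.
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer change-me-client-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "writer", "messages": [{"role": "user", "content": "Hello"}]}'
```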
## Roles are aliases, not a fixed list
RoleMesh Gateway does not reserve a built-in set of roles.
- The keys under `models:` are your project-specific role names
- Clients send those keys in the OpenAI `model` field
- You can rename or replace the sample roles entirely
- Different projects can use different role layouts with the same gateway
Example custom role set:
```yaml
models:
  researcher:
    type: proxy
    openai_model_name: researcher
    proxy_url: http://127.0.0.1:8011
  summarizer:
    type: proxy
    openai_model_name: summarizer
    proxy_url: http://127.0.0.1:8012
  security-reviewer:
    type: proxy
    openai_model_name: security-reviewer
    proxy_url: http://127.0.0.1:8013
```
## Where the actual model weights are selected
This depends on the backend pattern.
### For `type: proxy`
The gateway alias does not point directly to a weight file. It points to an already-running inference server:
```yaml
models:
  writer:
    type: proxy
    proxy_url: http://127.0.0.1:8012
```
The actual model weights are chosen by that upstream server, not by RoleMesh Gateway.
Examples:

- `llamafile --server -m /path/to/model.gguf ...`
- `llama-server -m /path/to/model.gguf ...`
- Ollama with `defaults.model: dolphin3:latest`
| Upstream type | Where weights/model are chosen | RoleMesh fields involved |
|---|---|---|
| `llamafile --server` | backend startup CLI, usually `-m /path/to/model.gguf` | `proxy_url` |
| `llama-server` | backend startup CLI, usually `-m /path/to/model.gguf` | `proxy_url` |
| Ollama | request JSON `model`, optionally injected by the gateway | `proxy_url`, `defaults.model` |
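For the Ollama row, a sketch of what that looks like in the gateway config; the alias `chat` is illustrative, 11434 is Ollama's usual default port, and `defaults.model` names the model the gateway can inject into the forwarded request:

```yaml
models:
  chat:
    type: proxy
    openai_model_name: chat
    proxy_url: http://127.0.0.1:11434   # local Ollama server (illustrative)
    defaults:
      model: dolphin3:latest            # per the table: optionally injected by the gateway
```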
### For `type: discovered`
The gateway still does not point directly to a weight file. It points to a role served by a registered node. The actual weight file is defined on the node side, usually in the node-agent config:
```yaml
models:
  - model_id: "planner-gguf"
    path: "/models/SomePlannerModel.Q5_K_M.gguf"
    roles: ["planner"]
```
In that setup:
- gateway alias -> discovered role
- discovered role -> registered node + concrete upstream model ID
- node-agent `path` -> actual weight file on disk
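For completeness, the matching gateway-side entry for that node-agent snippet could look like this sketch (the `discovered` fields are described under Discovered models below):

```yaml
models:
  planner:
    type: discovered
    openai_model_name: planner
    role: planner   # matched against the node-agent roles list above
```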
## Proxy models
Route to a fixed upstream (any host reachable from the gateway):
```yaml
models:
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
```
Notes:
- The model alias (`writer` above) is what the client sends in `model`.
- `openai_model_name` is what the gateway returns from `GET /v1/models`.
- `proxy_url` is the actual upstream backend to call.
- `defaults` are only applied when the incoming request does not already set those keys.
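For example, this request omits `temperature`, so the gateway fills in the configured default of `0.6` before forwarding; had the client set `temperature` explicitly, that value would be kept:

```json
{
  "model": "writer",
  "messages": [{"role": "user", "content": "Draft an intro paragraph."}]
}
```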
## Discovered models
Route to a dynamically registered model instance that can satisfy the role:
```yaml
models:
  reviewer:
    type: discovered
    openai_model_name: reviewer
    role: reviewer
    strategy: round_robin
```
Supported discovered-node strategies:
- `round_robin`: rotate requests across fresh matching nodes
- `random`: choose a fresh matching node at random for each request
## Registering nodes
Nodes register via `POST /v1/nodes/register`:
```json
{
  "node_id": "gpu-box-1",
  "base_url": "http://10.0.0.12:8014",
  "served_models": [
    {
      "model_id": "qwen3-8b",
      "roles": ["reviewer", "planner"],
      "meta": {"family": "Qwen3", "quant": "Q5_K_M"}
    },
    {
      "model_id": "qwen2.5-coder-14b",
      "roles": ["coder"],
      "meta": {"family": "Qwen2.5-Coder", "quant": "Q5_K_M"}
    }
  ],
  "meta": {"gpu": "Tesla P40", "notes": "llama-server on GPU0"}
}
```
`served_models` is now the preferred registration schema.

- `model_id`: concrete model name the upstream node expects in the forwarded OpenAI request
- `roles`: workflow roles that this model can satisfy
- `meta`: optional operator-facing metadata
Legacy flat `roles` registration is still accepted for compatibility, but it is treated as a fallback where `model_id == role`.
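A node agent could perform this registration with a plain HTTP call. A sketch, assuming the gateway listens on `127.0.0.1:8000` and the payload above is saved as `register.json` (the node-key header is covered under Authentication below):

```bash
curl -X POST http://127.0.0.1:8000/v1/nodes/register \
  -H "Authorization: Bearer change-me-node-key" \
  -H "Content-Type: application/json" \
  -d @register.json
```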
## Authentication

If `auth.client_api_keys` is set (non-empty), callers of `/v1/models` and `/v1/chat/completions` must provide an API key.

If `auth.node_api_keys` is set (non-empty), node agents calling `/v1/nodes/register` and `/v1/nodes/heartbeat` must provide a node key.
Supported headers:
- Clients: `Authorization: Bearer <key>` or `X-Api-Key: <key>`
- Nodes: `Authorization: Bearer <node_key>` or `X-RoleMesh-Node-Key: <node_key>`
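For example, a client listing the available aliases with the alternate header form (placeholder key, default host/port):

```bash
curl -H "X-Api-Key: change-me-client-key" http://127.0.0.1:8000/v1/models
```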
## Availability behavior
Configured aliases are not automatically assumed healthy.
- `GET /v1/models` probes configured upstreams and only returns aliases that are currently reachable
- unavailable aliases are included separately in `rolemesh.unavailable_models`
- `GET /ready` returns success only when the configured `default_model` is currently usable
- discovered nodes are only considered routable while they are fresh
- gateway metadata marks stale registered nodes so operators can distinguish them from healthy nodes
This is especially important when a config contains multiple optional roles but only some backends are up.
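As a rough illustration (the exact response envelope is an assumption; only the `rolemesh.unavailable_models` field is documented above), a `GET /v1/models` call with the `planner` backend down might return something like:

```json
{
  "object": "list",
  "data": [
    {"id": "writer", "object": "model"}
  ],
  "rolemesh": {"unavailable_models": ["planner"]}
}
```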
For discovered-node freshness, the gateway uses the `ROLE_MESH_NODE_STALE_AFTER_S` environment variable. Default: `30`.
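For example, to tolerate slower heartbeats (the `_S` suffix suggests the value is in seconds):

```bash
export ROLE_MESH_NODE_STALE_AFTER_S=60
```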
## Quick example
```yaml
version: 1
default_model: planner

auth:
  client_api_keys: ["change-me-client-key"]
  node_api_keys: ["change-me-node-key"]

models:
  planner:
    type: proxy
    openai_model_name: planner
    proxy_url: http://127.0.0.1:8011
    defaults:
      temperature: 0
      max_tokens: 128
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
      max_tokens: 256
```
## Base config plus local override
Recommended pattern:
- keep tracked repo config generic
- keep machine-specific values in a separate local YAML
- merge the local YAML at launch
Gateway:

```bash
rolemesh-gateway --config configs/models.example.yaml --config-override configs/models.local.yaml
```

Node agent:

```bash
rolemesh-node-agent --config configs/node_agent.example.yaml --config-override configs/node_agent.local.yaml
```
The merge is recursive for mappings:
- nested dictionaries are merged
- lists and scalar values are replaced by the override file
This is useful for separating:
- real model weight paths
- local host IPs
- local API keys
- local API keys
- local `llama-server` paths
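A sketch of what an untracked `configs/models.local.yaml` could contain; per the merge rules above, the nested `proxy_url` overrides just that key, while the `client_api_keys` list replaces the base list wholesale:

```yaml
# configs/models.local.yaml -- machine-specific values, kept out of version control
models:
  writer:
    proxy_url: http://10.0.0.12:8012       # real local host IP (illustrative)
auth:
  client_api_keys: ["my-real-client-key"]  # lists are replaced, not merged
```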