RoleMesh-Gateway/docs/CONFIG.md


# Configuration

RoleMesh Gateway loads its configuration from a YAML file (default: `configs/models.yaml`). Set the `ROLE_MESH_CONFIG` environment variable to override the path.

## Top-level schema

```yaml
version: 1
default_model: writer
gateway:
  host: 0.0.0.0
  port: 8000
auth:
  client_api_keys: ["..."]
  node_api_keys: ["..."]
models:
  <alias>:
    type: proxy | discovered
    openai_model_name: <string>
    ...
```

- `<alias>` is what clients pass as `model` in `/v1/chat/completions`.
- `openai_model_name` is the model id returned by `/v1/models` (usually the same as the alias).
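To make the alias-to-`model` mapping concrete, here is a minimal sketch of how a client could build a request. The gateway address `http://127.0.0.1:8000` and the key value are assumptions for illustration; any OpenAI-compatible client works the same way.

```python
import json
from urllib import request

GATEWAY = "http://127.0.0.1:8000"      # assumption: local gateway address
CLIENT_KEY = "change-me-client-key"    # assumption: a key from auth.client_api_keys

# The alias configured under `models:` goes in the standard OpenAI "model" field.
payload = {
    "model": "writer",
    "messages": [{"role": "user", "content": "Draft a release note."}],
}
req = request.Request(
    f"{GATEWAY}/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {CLIENT_KEY}",
        "Content-Type": "application/json",
    },
)
# response = request.urlopen(req)  # uncomment against a running gateway
```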

## Roles are aliases, not a fixed list

RoleMesh Gateway does not reserve a built-in set of roles.

- The keys under `models:` are your project-specific role names.
- Clients send those keys in the OpenAI `model` field.
- You can rename or replace the sample roles entirely.
- Different projects can use different role layouts with the same gateway.

Example custom role set:

```yaml
models:
  researcher:
    type: proxy
    openai_model_name: researcher
    proxy_url: http://127.0.0.1:8011
  summarizer:
    type: proxy
    openai_model_name: summarizer
    proxy_url: http://127.0.0.1:8012
  security-reviewer:
    type: proxy
    openai_model_name: security-reviewer
    proxy_url: http://127.0.0.1:8013
```

## Proxy models

Route to a fixed upstream (any host reachable from the gateway):

```yaml
models:
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
```

Notes:

- The model alias (`writer` above) is what the client sends in `model`.
- `openai_model_name` is what the gateway returns from `GET /v1/models`.
- `proxy_url` is the actual upstream backend to call.
- `defaults` are applied only when the incoming request does not already set those keys.
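The defaults-merge rule can be sketched like this (a hypothetical helper for illustration, not the gateway's actual code):

```python
def apply_defaults(request_body: dict, defaults: dict) -> dict:
    """Fill in configured defaults only for keys the client did not set."""
    merged = dict(request_body)
    for key, value in defaults.items():
        merged.setdefault(key, value)  # never overwrites a client-set key
    return merged

# The client-set temperature wins; max_tokens falls back to the configured default.
body = {"model": "writer", "temperature": 0.9}
merged = apply_defaults(body, {"temperature": 0.6, "max_tokens": 256})
```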

## Discovered models

Route to a dynamically registered node that claims the role:

```yaml
models:
  reviewer:
    type: discovered
    openai_model_name: reviewer
    role: reviewer
    strategy: round_robin
```

Supported discovered-node strategies:

- `round_robin`: rotate requests across fresh matching nodes
- `random`: choose a fresh matching node at random for each request
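The two strategies behave roughly as follows (an illustrative sketch over a hypothetical set of fresh nodes, not the gateway's internal code):

```python
import itertools
import random

# Nodes currently fresh for the requested role (hypothetical example data).
nodes = ["gpu-box-1", "gpu-box-2", "gpu-box-3"]

# round_robin: rotate across the fresh matching nodes, wrapping around.
rr = itertools.cycle(nodes)
picks = [next(rr) for _ in range(4)]  # box-1, box-2, box-3, then box-1 again

# random: an independent uniform choice for each request.
pick = random.choice(nodes)
```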

## Registering nodes

Nodes register by calling `POST /v1/nodes/register`:

```json
{
  "node_id": "gpu-box-1",
  "base_url": "http://10.0.0.12:8014",
  "roles": ["reviewer", "planner"],
  "meta": {"gpu": "Tesla P40", "notes": "llama-server on GPU0"}
}
```

## Authentication

If `auth.client_api_keys` is set (non-empty), callers of `/v1/models` and `/v1/chat/completions` must provide a client API key.

If `auth.node_api_keys` is set (non-empty), node agents calling `/v1/nodes/register` and `/v1/nodes/heartbeat` must provide a node key.

Supported headers:

- Clients: `Authorization: Bearer <key>` or `X-Api-Key: <key>`
- Nodes: `Authorization: Bearer <node_key>` or `X-RoleMesh-Node-Key: <node_key>`
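Putting registration and node auth together, a node agent's call might look like the sketch below. The gateway address and key value are assumptions for illustration.

```python
import json
from urllib import request

payload = {
    "node_id": "gpu-box-1",
    "base_url": "http://10.0.0.12:8014",
    "roles": ["reviewer", "planner"],
}
req = request.Request(
    "http://127.0.0.1:8000/v1/nodes/register",  # assumption: gateway address
    data=json.dumps(payload).encode(),
    headers={
        # A node key via the dedicated header; `Authorization: Bearer` also works.
        "X-RoleMesh-Node-Key": "change-me-node-key",  # assumption: a key from auth.node_api_keys
        "Content-Type": "application/json",
    },
)
# request.urlopen(req)  # uncomment against a running gateway
```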

## Availability behavior

Configured aliases are not automatically assumed healthy.

- `GET /v1/models` probes configured upstreams and only returns aliases that are currently reachable.
- Unavailable aliases are listed separately in `rolemesh.unavailable_models`.
- `GET /ready` returns success only when the configured `default_model` is currently usable.
- Discovered nodes are only considered routable while they are fresh.
- Gateway metadata marks stale registered nodes so operators can distinguish them from healthy ones.

This is especially important when a config contains multiple optional roles but only some backends are up.
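For example, with `writer` up and `reviewer` down, a `GET /v1/models` response could look roughly like this. Only the `rolemesh.unavailable_models` field is documented above; the surrounding list shape is an assumption based on the standard OpenAI models endpoint.

```json
{
  "object": "list",
  "data": [
    {"id": "writer", "object": "model"}
  ],
  "rolemesh": {
    "unavailable_models": ["reviewer"]
  }
}
```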

For discovered-node freshness, the gateway reads the `ROLE_MESH_NODE_STALE_AFTER_S` environment variable (default: 30 seconds).
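The freshness rule amounts to a heartbeat-age check, roughly as in this hypothetical helper (not the gateway's actual code):

```python
import os

# Default window of 30 seconds, overridable via the environment.
STALE_AFTER_S = float(os.environ.get("ROLE_MESH_NODE_STALE_AFTER_S", "30"))

def is_fresh(last_heartbeat: float, now: float,
             stale_after: float = STALE_AFTER_S) -> bool:
    """A node is routable only while its last heartbeat is recent enough."""
    return (now - last_heartbeat) <= stale_after

# With a 30s window: a 10s-old heartbeat is fresh, a 60s-old one is stale.
is_fresh(1000.0, 1010.0, stale_after=30.0)  # True
is_fresh(1000.0, 1060.0, stale_after=30.0)  # False
```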

## Quick example

```yaml
version: 1
default_model: planner
auth:
  client_api_keys: ["change-me-client-key"]
  node_api_keys: ["change-me-node-key"]
models:
  planner:
    type: proxy
    openai_model_name: planner
    proxy_url: http://127.0.0.1:8011
    defaults:
      temperature: 0
      max_tokens: 128
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
      max_tokens: 256
```