RoleMesh-Gateway/docs/CONFIG.md


# Configuration

RoleMesh Gateway loads its configuration from a YAML file (default: `configs/models.yaml`). Set the `ROLE_MESH_CONFIG` environment variable to override the path.

## Top-level schema

```yaml
version: 1
default_model: writer
gateway:
  host: 0.0.0.0
  port: 8000
auth:
  client_api_keys: ["..."]
  node_api_keys: ["..."]
models:
  <alias>:
    type: proxy | discovered
    openai_model_name: <string>
    ...
```

- `<alias>` is what clients pass as `model` in `/v1/chat/completions`.
- `openai_model_name` is the model id returned by `/v1/models` (usually the same as the alias).
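To make the alias-to-`model` mapping concrete, here is a minimal sketch of how a client could build a request. The gateway address `http://127.0.0.1:8000` and the key value are assumptions for illustration; any OpenAI-compatible client works the same way.

```python
import json
from urllib import request

GATEWAY = "http://127.0.0.1:8000"      # assumption: local gateway address
CLIENT_KEY = "change-me-client-key"    # assumption: a key from auth.client_api_keys

# The alias configured under `models:` goes in the standard OpenAI "model" field.
payload = {
    "model": "writer",
    "messages": [{"role": "user", "content": "Draft a release note."}],
}
req = request.Request(
    f"{GATEWAY}/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {CLIENT_KEY}",
        "Content-Type": "application/json",
    },
)
# response = request.urlopen(req)  # uncomment against a running gateway
```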

## Roles are aliases, not a fixed list

RoleMesh Gateway does not reserve a built-in set of roles.

- The keys under `models:` are your project-specific role names.
- Clients send those keys in the OpenAI `model` field.
- You can rename or replace the sample roles entirely.
- Different projects can use different role layouts with the same gateway.

Example custom role set:

```yaml
models:
  researcher:
    type: proxy
    openai_model_name: researcher
    proxy_url: http://127.0.0.1:8011
  summarizer:
    type: proxy
    openai_model_name: summarizer
    proxy_url: http://127.0.0.1:8012
  security-reviewer:
    type: proxy
    openai_model_name: security-reviewer
    proxy_url: http://127.0.0.1:8013
```

## Proxy models

Route to a fixed upstream (any host reachable from the gateway):

```yaml
models:
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
```

Notes:

- The model alias (`writer` above) is what the client sends in `model`.
- `openai_model_name` is what the gateway returns from `GET /v1/models`.
- `proxy_url` is the actual upstream backend to call.
- `defaults` are applied only when the incoming request does not already set those keys.
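The defaults-merge rule can be sketched like this (a hypothetical helper for illustration, not the gateway's actual code):

```python
def apply_defaults(request_body: dict, defaults: dict) -> dict:
    """Fill in configured defaults only for keys the client did not set."""
    merged = dict(request_body)
    for key, value in defaults.items():
        merged.setdefault(key, value)  # never overwrites a client-set key
    return merged

# The client-set temperature wins; max_tokens falls back to the configured default.
body = {"model": "writer", "temperature": 0.9}
merged = apply_defaults(body, {"temperature": 0.6, "max_tokens": 256})
```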

## Discovered models

Route to a dynamically registered node that claims the role:

```yaml
models:
  reviewer:
    type: discovered
    openai_model_name: reviewer
    role: reviewer
    strategy: round_robin
```

Supported discovered-node strategies:

- `round_robin`: rotate requests across fresh matching nodes
- `random`: choose a fresh matching node at random for each request
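The two strategies behave roughly as follows (an illustrative sketch over a hypothetical set of fresh nodes, not the gateway's internal code):

```python
import itertools
import random

# Nodes currently fresh for the requested role (hypothetical example data).
nodes = ["gpu-box-1", "gpu-box-2", "gpu-box-3"]

# round_robin: rotate across the fresh matching nodes, wrapping around.
rr = itertools.cycle(nodes)
picks = [next(rr) for _ in range(4)]  # box-1, box-2, box-3, then box-1 again

# random: an independent uniform choice for each request.
pick = random.choice(nodes)
```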

## Registering nodes

Nodes register by calling `POST /v1/nodes/register`:

```json
{
  "node_id": "gpu-box-1",
  "base_url": "http://10.0.0.12:8014",
  "roles": ["reviewer", "planner"],
  "meta": {"gpu": "Tesla P40", "notes": "llama-server on GPU0"}
}
```

## Authentication

If `auth.client_api_keys` is set (non-empty), callers of `/v1/models` and `/v1/chat/completions` must provide a client API key.

If `auth.node_api_keys` is set (non-empty), node agents calling `/v1/nodes/register` and `/v1/nodes/heartbeat` must provide a node key.

Supported headers:

- Clients: `Authorization: Bearer <key>` or `X-Api-Key: <key>`
- Nodes: `Authorization: Bearer <node_key>` or `X-RoleMesh-Node-Key: <node_key>`
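Putting registration and node auth together, a node agent's call might look like the sketch below. The gateway address and key value are assumptions for illustration.

```python
import json
from urllib import request

payload = {
    "node_id": "gpu-box-1",
    "base_url": "http://10.0.0.12:8014",
    "roles": ["reviewer", "planner"],
}
req = request.Request(
    "http://127.0.0.1:8000/v1/nodes/register",  # assumption: gateway address
    data=json.dumps(payload).encode(),
    headers={
        # A node key via the dedicated header; `Authorization: Bearer` also works.
        "X-RoleMesh-Node-Key": "change-me-node-key",  # assumption: a key from auth.node_api_keys
        "Content-Type": "application/json",
    },
)
# request.urlopen(req)  # uncomment against a running gateway
```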

## Availability behavior

Configured aliases are not automatically assumed healthy.

- `GET /v1/models` probes configured upstreams and only returns aliases that are currently reachable.
- Unavailable aliases are listed separately in `rolemesh.unavailable_models`.
- `GET /ready` returns success only when the configured `default_model` is currently usable.
- Discovered nodes are only considered routable while they are fresh.
- Gateway metadata marks stale registered nodes so operators can distinguish them from healthy ones.

This is especially important when a config contains multiple optional roles but only some backends are up.
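For example, with `writer` up and `reviewer` down, a `GET /v1/models` response could look roughly like this. Only the `rolemesh.unavailable_models` field is documented above; the surrounding list shape is an assumption based on the standard OpenAI models endpoint.

```json
{
  "object": "list",
  "data": [
    {"id": "writer", "object": "model"}
  ],
  "rolemesh": {
    "unavailable_models": ["reviewer"]
  }
}
```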

For discovered-node freshness, the gateway reads the `ROLE_MESH_NODE_STALE_AFTER_S` environment variable (default: 30 seconds).
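The freshness rule amounts to a heartbeat-age check, roughly as in this hypothetical helper (not the gateway's actual code):

```python
import os

# Default window of 30 seconds, overridable via the environment.
STALE_AFTER_S = float(os.environ.get("ROLE_MESH_NODE_STALE_AFTER_S", "30"))

def is_fresh(last_heartbeat: float, now: float,
             stale_after: float = STALE_AFTER_S) -> bool:
    """A node is routable only while its last heartbeat is recent enough."""
    return (now - last_heartbeat) <= stale_after

# With a 30s window: a 10s-old heartbeat is fresh, a 60s-old one is stale.
is_fresh(1000.0, 1010.0, stale_after=30.0)  # True
is_fresh(1000.0, 1060.0, stale_after=30.0)  # False
```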

## Quick example

```yaml
version: 1
default_model: planner
auth:
  client_api_keys: ["change-me-client-key"]
  node_api_keys: ["change-me-node-key"]
models:
  planner:
    type: proxy
    openai_model_name: planner
    proxy_url: http://127.0.0.1:8011
    defaults:
      temperature: 0
      max_tokens: 128
  writer:
    type: proxy
    openai_model_name: writer
    proxy_url: http://127.0.0.1:8012
    defaults:
      temperature: 0.6
      max_tokens: 256
```