2.5 KiB

Raw Permalink Blame History

RoleMesh Gateway

RoleMesh Gateway is a lightweight OpenAI-compatible API gateway for routing chat-completions requests to multiple locally hosted LLM backends (e.g., llama.cpp llama-server) by role (planner, writer, coder, reviewer, …).

It is designed for agentic workflows that benefit from using different models for different steps, and for deployments where different machines host different models (e.g., GPU box for fast inference, big RAM CPU box for large models).

What you get

OpenAI-compatible endpoints:
- GET /v1/models
- POST /v1/chat/completions (streaming and non-streaming)
- GET /health and GET /ready
Model registry from configs/models.yaml
Optional node registration so remote machines can announce role backends to the gateway
Robust proxying with explicit httpx timeouts (no “hang forever”)
Structured logging with request IDs

Quick start (proxy mode)

Create a venv and install:

python -m venv .venv
source .venv/bin/activate
pip install -e .

Copy the example config:

cp configs/models.example.yaml configs/models.yaml

Run the gateway:

ROLE_MESH_CONFIG=configs/models.yaml uvicorn rolemesh_gateway.main:app --host 0.0.0.0 --port 8000

Smoke test:

bash scripts/smoke_test.sh http://127.0.0.1:8000

Multi-host (node registration)

If you want machines to host backends and “register” them dynamically, run a tiny node agent on each backend host (or just call the registration endpoint from your own tooling).

Gateway endpoint: POST /v1/nodes/register
Node payload describes which roles it serves and the base URL to reach its OpenAI-compatible backend.

See: docs/DEPLOYMENT.md and docs/CONFIG.md.

Status

This repository is a preliminary scaffold:

Proxying to OpenAI-compatible upstreams works.
Registration and load-selection are implemented (basic round-robin), but persistence and auth are TODOs.

License

MIT. See LICENSE.

Node Agent (per-host)

This repo also includes a RoleMesh Node Agent (rolemesh-node-agent) that can manage persistent llama.cpp servers (one per GPU) and report inventory/metrics back to the gateway.

Sample config: configs/node_agent.example.yaml
Docs: docs/NODE_AGENT.md

Safe-by-default binding

Gateway and node-agent default to binding on 127.0.0.1 to avoid accidental exposure. Bind only to private/LAN or VPN interfaces and firewall ports if you need remote access.

2.5 KiB Raw Permalink Blame History