GenieHive Translation Support
This note describes the control-plane and node configuration needed to support translation clients such as SciSiteForge.
GenieHive already exposes the core transport needed for translation:
- `POST /v1/chat/completions`
- client API keys
- role-based routing
- OpenAI-compatible upstream services
Translation support is mostly a matter of configuration discipline.
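To make the transport concrete, here is a minimal client-side sketch of a translation request through the chat endpoint. The control-plane URL, API key value, and role name are placeholders for illustration; only the endpoint path, the `X-Api-Key` header, and the OpenAI-compatible body shape come from this document.

```python
import json
import urllib.request

# Placeholder deployment settings; substitute your own.
CONTROL_PLANE = "http://127.0.0.1:8080"
API_KEY = "example-client-key"


def build_translation_request(text: str, target_lang: str) -> dict:
    """Build an OpenAI-compatible chat payload targeting the translation role."""
    return {
        # Role alias; GenieHive resolves it to an actual service.
        "model": "scientific_translator",
        "messages": [
            {"role": "user", "content": f"Translate to {target_lang}:\n\n{text}"},
        ],
        # Matches the role's request_policy body_defaults.
        "temperature": 0.1,
    }


def send(payload: dict) -> bytes:
    """POST the payload to the chat completions endpoint."""
    req = urllib.request.Request(
        f"{CONTROL_PLANE}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


payload = build_translation_request("La traduction est fidèle.", "English")
```

The client never needs to know which node or model serves the request; it only targets the role alias.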
Control Plane
The control plane should provide a translation-oriented role or directly addressable model that the client can target.
Recommended control-plane changes:
- Add a dedicated role, for example `scientific_translator`.
- Keep it as `operation: "chat"`.
- Use a conservative prompt policy that returns translation only.
- Prefer a stable, instruction-following model family.
- Keep the role in the loaded role catalog so it appears in `/v1/models` and route-resolution output.
Example role entry:
```yaml
- role_id: "scientific_translator"
  display_name: "Scientific Translator"
  description: "Translation-oriented chat route for site localization"
  operation: "chat"
  modality: "text"
  prompt_policy:
    system_prompt: "Translate faithfully. Preserve meaning, structure, citations, and technical terms. Return only the translation."
  request_policy:
    body_defaults:
      temperature: 0.1
  routing_policy:
    preferred_families: ["Qwen3", "Mistral", "Llama"]
    min_context: 8192
    require_loaded: true
```
What matters operationally:
- the role must resolve to a healthy chat service
- the role should stay loaded on a model with enough context for page-sized paragraph batches
- the control plane should not silently route translation requests to a low-context or partially loaded fallback unless that is explicitly intended
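One cheap way to catch the silent-fallback failure mode is to confirm the role is actually listed before sending traffic. A sketch, assuming the `/v1/models` listing follows the usual OpenAI list shape (`{"object": "list", "data": [{"id": ...}, ...]}`); the stubbed response here is illustrative, a real check would GET `/v1/models` with the client API key.

```python
def role_is_listed(models_response: dict, role_id: str) -> bool:
    """Return True if role_id appears in an OpenAI-style model listing."""
    return any(m.get("id") == role_id for m in models_response.get("data", []))


# Stubbed listing for illustration; fetch the real one from the control plane.
stub = {
    "object": "list",
    "data": [{"id": "scientific_translator"}, {"id": "qwen3-8b"}],
}
```

Run this check at client startup so a missing or unloaded role fails fast instead of degrading translation quality mid-job.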
Auth and Exposure
Keep the same separation used for other GenieHive clients:
- `X-Api-Key` for client requests
- `X-GenieHive-Node-Key` for node registration and heartbeats
If the control plane is exposed beyond localhost, prefer a reverse proxy and keep the upstream control port private.
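The two header names above are the whole separation; keeping them in one place avoids accidentally sending a node key on client traffic. A small sketch (key values are placeholders):

```python
# Placeholder key material; real values come from your deployment.
CLIENT_HEADERS = {"X-Api-Key": "example-client-key"}          # chat requests
NODE_HEADERS = {"X-GenieHive-Node-Key": "example-node-key"}   # registration, heartbeats


def headers_for(traffic_class: str) -> dict:
    """Select the header set for a traffic class; rejects unknown classes."""
    if traffic_class == "client":
        return dict(CLIENT_HEADERS)
    if traffic_class == "node":
        return dict(NODE_HEADERS)
    raise ValueError(f"unknown traffic class: {traffic_class}")
```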
Node Requirements
A node that is meant to serve translation traffic should expose one or more healthy chat services that can accept small, repeated requests.
Recommended node configuration:
- service kind: `chat`
- runtime: any OpenAI-compatible upstream that GenieHive can route to
- assets: a loaded instruction-following model
- observed latency and throughput: populated so scoring can prefer the right node
- `accept_requests: true`
Example service snippet:
```yaml
services:
  - service_id: "atlas-01/chat/qwen3-8b"
    kind: "chat"
    endpoint: "http://127.0.0.1:18091"
    runtime:
      engine: "llama.cpp"
      launcher: "managed"
    assets:
      - asset_id: "qwen3-8b-q4km"
        loaded: true
    state:
      health: "healthy"
      load_state: "loaded"
      accept_requests: true
    observed:
      p50_latency_ms: 900
      tokens_per_sec: 40
```
For translation, loaded state matters more than raw capacity. A node that is nominally available but not loaded is a poor default target for a localization job that will touch many pages.
Node-Side Practices
Use the node agent to keep the registry current:
- heartbeat frequently enough that the control plane sees the service as fresh
- publish loaded assets honestly
- keep queue and latency metrics current when possible
- separate translation services from other high-latency or experimental routes
If a translation model is available through multiple runtimes, prefer the one that keeps response shape stable and context handling predictable.
Routing Advice
For translation clients, the most useful route behavior is usually:
- a translation role with a stable model family preference
- `require_loaded: true`
- enough context to keep paragraph-level requests coherent
- predictable prompt policy, not aggressive prompt rewriting
That keeps the client config simple. The client can point to a role alias and let GenieHive pick the actual service.
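On the client side, "enough context for page-sized paragraph batches" usually means grouping paragraphs into request-sized chunks under a budget. A sketch: character count is a crude stand-in for tokens (a real client would use the model's tokenizer), and the default budget is illustrative.

```python
def batch_paragraphs(paragraphs: list, budget_chars: int = 6000) -> list:
    """Group paragraphs into request-sized batches under a rough size budget."""
    batches = []
    current, used = [], 0
    for p in paragraphs:
        # Start a new batch when adding this paragraph would exceed the budget.
        if current and used + len(p) > budget_chars:
            batches.append(current)
            current, used = [], 0
        current.append(p)
        used += len(p)
    if current:
        batches.append(current)
    return batches
```

Batches that fit comfortably inside `min_context` keep each request coherent and avoid truncated translations on long pages.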
What Not to Do
- Do not rely on a transient model alias unless you are willing to update the client config when the alias changes.
- Do not expose raw upstream model endpoints directly to the translation client if GenieHive is already in the path.
- Do not route translation through a node that cannot maintain enough context for the content size you expect.
Minimal Support Checklist
- translation role present in the role catalog
- client API key enabled
- node API key enabled
- at least one healthy chat service with a loaded model
- route resolution confirms the translation role resolves to that service
- client can reach the control plane or reverse proxy