GenieHive Translation Support
This note describes the control-plane and node configuration needed to support translation clients such as SciSiteForge.
GenieHive already exposes the core transport needed for translation:
- `POST /v1/chat/completions`
- client API keys
- role-based routing
- OpenAI-compatible upstream services
Translation support is mostly a matter of configuration discipline.
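To make the transport concrete, here is a minimal client-side sketch of a translation request through the chat endpoint. The control-plane URL, API key value, and role name are placeholders for illustration; only the endpoint path, the `X-Api-Key` header, and the OpenAI-compatible body shape come from this document.

```python
import json
import urllib.request

# Placeholder deployment settings; substitute your own.
CONTROL_PLANE = "http://127.0.0.1:8080"
API_KEY = "example-client-key"


def build_translation_request(text: str, target_lang: str) -> dict:
    """Build an OpenAI-compatible chat payload targeting the translation role."""
    return {
        # Role alias; GenieHive resolves it to an actual service.
        "model": "scientific_translator",
        "messages": [
            {"role": "user", "content": f"Translate to {target_lang}:\n\n{text}"},
        ],
        # Matches the role's request_policy body_defaults.
        "temperature": 0.1,
    }


def send(payload: dict) -> bytes:
    """POST the payload to the chat completions endpoint."""
    req = urllib.request.Request(
        f"{CONTROL_PLANE}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


payload = build_translation_request("La traduction est fidèle.", "English")
```

The client never needs to know which node or model serves the request; it only targets the role alias.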
Control Plane
The control plane should provide a translation-oriented role or directly addressable model that the client can target.
Recommended control-plane changes:
- Add a dedicated role, for example `scientific_translator`.
- Keep it as `operation: "chat"`.
- Use a conservative prompt policy that returns translation only.
- Prefer a stable, instruction-following model family.
- Keep the role in the loaded role catalog so it appears in `/v1/models` and route-resolution output.
Example role entry:
```yaml
- role_id: "scientific_translator"
  display_name: "Scientific Translator"
  description: "Translation-oriented chat route for site localization"
  operation: "chat"
  modality: "text"
  prompt_policy:
    system_prompt: "Translate faithfully. Preserve meaning, structure, citations, and technical terms. Return only the translation."
  request_policy:
    body_defaults:
      temperature: 0.1
  routing_policy:
    preferred_families: ["Qwen3", "Mistral", "Llama"]
    min_context: 8192
    require_loaded: true
```
What matters operationally:
- the role must resolve to a healthy chat service
- the role should stay loaded on a model with enough context for page-sized paragraph batches
- the control plane should not silently route translation requests to a low-context or partially loaded fallback unless that is explicitly intended
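One cheap way to catch the silent-fallback failure mode is to confirm the role is actually listed before sending traffic. A sketch, assuming the `/v1/models` listing follows the usual OpenAI list shape (`{"object": "list", "data": [{"id": ...}, ...]}`); the stubbed response here is illustrative, a real check would GET `/v1/models` with the client API key.

```python
def role_is_listed(models_response: dict, role_id: str) -> bool:
    """Return True if role_id appears in an OpenAI-style model listing."""
    return any(m.get("id") == role_id for m in models_response.get("data", []))


# Stubbed listing for illustration; fetch the real one from the control plane.
stub = {
    "object": "list",
    "data": [{"id": "scientific_translator"}, {"id": "qwen3-8b"}],
}
```

Run this check at client startup so a missing or unloaded role fails fast instead of degrading translation quality mid-job.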
Auth and Exposure
Keep the same separation used for other GenieHive clients:
- `X-Api-Key` for client requests
- `X-GenieHive-Node-Key` for node registration and heartbeats
If the control plane is exposed beyond localhost, prefer a reverse proxy and keep the upstream control port private.
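The two header names above are the whole separation; keeping them in one place avoids accidentally sending a node key on client traffic. A small sketch (key values are placeholders):

```python
# Placeholder key material; real values come from your deployment.
CLIENT_HEADERS = {"X-Api-Key": "example-client-key"}          # chat requests
NODE_HEADERS = {"X-GenieHive-Node-Key": "example-node-key"}   # registration, heartbeats


def headers_for(traffic_class: str) -> dict:
    """Select the header set for a traffic class; rejects unknown classes."""
    if traffic_class == "client":
        return dict(CLIENT_HEADERS)
    if traffic_class == "node":
        return dict(NODE_HEADERS)
    raise ValueError(f"unknown traffic class: {traffic_class}")
```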
Node Requirements
A node that is meant to serve translation traffic should expose one or more healthy chat services that can accept small, repeated requests.
Recommended node configuration:
- service kind: `chat`
- runtime: any OpenAI-compatible upstream that GenieHive can route to
- assets: a loaded instruction-following model
- observed latency and throughput: populated so scoring can prefer the right node
- `accept_requests: true`
Example service snippet:
```yaml
services:
  - service_id: "atlas-01/chat/qwen3-8b"
    kind: "chat"
    endpoint: "http://127.0.0.1:18091"
    runtime:
      engine: "llama.cpp"
      launcher: "managed"
    assets:
      - asset_id: "qwen3-8b-q4km"
        loaded: true
    state:
      health: "healthy"
      load_state: "loaded"
      accept_requests: true
    observed:
      p50_latency_ms: 900
      tokens_per_sec: 40
```
For translation, loaded state matters more than raw capacity. A node that is nominally available but not loaded is a poor default target for a localization job that will touch many pages.
Node-Side Practices
Use the node agent to keep the registry current:
- heartbeat frequently enough that the control plane sees the service as fresh
- publish loaded assets honestly
- keep queue and latency metrics current when possible
- separate translation services from other high-latency or experimental routes
If a translation model is available through multiple runtimes, prefer the one that keeps response shape stable and context handling predictable.
Routing Advice
For translation clients, the most useful route behavior is usually:
- a translation role with a stable model family preference
- `require_loaded: true`
- enough context to keep paragraph-level requests coherent
- predictable prompt policy, not aggressive prompt rewriting
That keeps the client config simple. The client can point to a role alias and let GenieHive pick the actual service.
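On the client side, "enough context for page-sized paragraph batches" usually means grouping paragraphs into request-sized chunks under a budget. A sketch: character count is a crude stand-in for tokens (a real client would use the model's tokenizer), and the default budget is illustrative.

```python
def batch_paragraphs(paragraphs: list, budget_chars: int = 6000) -> list:
    """Group paragraphs into request-sized batches under a rough size budget."""
    batches = []
    current, used = [], 0
    for p in paragraphs:
        # Start a new batch when adding this paragraph would exceed the budget.
        if current and used + len(p) > budget_chars:
            batches.append(current)
            current, used = [], 0
        current.append(p)
        used += len(p)
    if current:
        batches.append(current)
    return batches
```

Batches that fit comfortably inside `min_context` keep each request coherent and avoid truncated translations on long pages.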
What Not to Do
- Do not rely on a transient model alias unless you are willing to update the client config when the alias changes.
- Do not expose raw upstream model endpoints directly to the translation client if GenieHive is already in the path.
- Do not route translation through a node that cannot maintain enough context for the content size you expect.
Minimal Support Checklist
- translation role present in the role catalog
- client API key enabled
- node API key enabled
- at least one healthy chat service with a loaded model
- route resolution confirms the translation role resolves to that service
- client can reach the control plane or reverse proxy