GenieHive is a generative AI router. It presents an OpenAI API-compatible endpoint to clients and routes their requests among one or more nodes that register their running servers with the control host. Whether you run multiple LLMs on a single host or spread them across a distributed cluster, GenieHive aims to make local AI easier to actually use.
# GenieHive

GenieHive is a local-first control plane for heterogeneous generative AI services running across one or more hosts.

V1 scope:

  • chat completions
  • embeddings
  • transcription

Core goals:

  • register hosts and services
  • track health, inventory, and observed performance
  • expose a stable client-facing API
  • support direct model addressing and higher-level role addressing
  • route requests to healthy loaded services first
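To illustrate the two addressing modes above, an OpenAI-compatible client can name either a concrete model id or a higher-level role from the role catalog. This is a sketch: the base URL, API key, and model/role names below are placeholders, not values shipped with GenieHive.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8800"   # hypothetical local GenieHive endpoint
API_KEY = "change-me-client-key"     # placeholder client key

def chat_payload(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-style chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Direct model addressing: name a specific served model (id is illustrative).
direct = chat_payload("qwen2.5:7b", "Summarize this log line.")

# Role addressing: name a role and let the router pick a healthy,
# loaded service that fills it.
by_role = chat_payload("general_assistant", "Summarize this log line.")

def build_request(payload: dict) -> urllib.request.Request:
    """Assemble the HTTP request an OpenAI-compatible client would send."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
```

Either payload goes to the same `/v1/chat/completions` endpoint; the router decides whether the `model` field is a direct model id or a role name.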

Repository layout:

  • docs/architecture.md: system overview and v1 scope
  • docs/roadmap.md: current milestones and near-term priorities
  • docs/schemas.md: canonical data models
  • docs/deployment.md: intended deployment approach
  • docs/demo.md: first end-to-end control-plus-node demo flow
  • docs/llm_demo.md: detailed master/peer/client LLM demo runbook
  • docs/reverse_proxy.md: safer external exposure patterns
  • configs/: example control-plane, node, and role configs
  • scripts/: small launch and inspection helpers
  • src/geniehive_control/: control-plane package
  • src/geniehive_node/: node-agent package

There is now a documented single-machine path as well as the cluster-oriented path, so GenieHive can be exercised as a useful local router even without multiple hosts.

This repository is intended as the clean successor to narrower local gateway experiments. OpenAI-compatible routing remains important, but it is treated as one client facade within a broader cluster control-plane design.

## Development

Local development setup:

```shell
cd /home/netuser/bin/geniehive
python -m venv .venv
. .venv/bin/activate
pip install -e '.[dev]'
```

Common commands:

```shell
make test
make smoke
make health
```

Benchmark workflow:

```shell
PYTHONPATH=src python scripts/run_benchmark_workload.py \
  --base-url http://127.0.0.1:8800 \
  --api-key change-me-client-key \
  --model general_assistant \
  --workload chat.short_reasoning \
  --output /tmp/geniehive-bench.json

PYTHONPATH=src python scripts/ingest_benchmark_report.py /tmp/geniehive-bench.json \
  --base-url http://127.0.0.1:8800 \
  --api-key change-me-client-key
```
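The ingest step consumes the JSON report written by the workload runner. The report schema is not documented in this README, so the sketch below assumes only that the file holds a JSON object; the sample field names are hypothetical.

```python
import json

def summarize_report(path: str) -> dict:
    """Load a benchmark report and return its top-level keys with sizes.

    Only assumes the file contains a JSON object; the actual report
    schema produced by run_benchmark_workload.py is not documented here.
    """
    with open(path) as fh:
        report = json.load(fh)
    return {
        key: (len(value) if isinstance(value, (list, dict, str)) else value)
        for key, value in report.items()
    }

# Quick check with a stand-in report (field names are made up):
sample = {"workload": "chat.short_reasoning", "samples": [1.2, 0.9, 1.1]}
with open("/tmp/geniehive-bench-sample.json", "w") as fh:
    json.dump(sample, fh)

print(summarize_report("/tmp/geniehive-bench-sample.json"))
```

A quick look like this helps confirm a report is well-formed before handing it to `ingest_benchmark_report.py`.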

Repository conventions:

  • local runtime state lives under state/ and should not be committed
  • example configs under configs/ should remain runnable
  • operator scripts under scripts/ are part of the supported workflow
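The first convention above can be enforced with a one-line ignore rule. This is a sketch of such a rule, not a claim about what the repository's .gitignore already contains:

```gitignore
# Local runtime state (registrations, health snapshots, etc.) stays untracked.
state/
```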