Documented use of subscription GenAI or Ollama as an easier LLM path.

This commit is contained in:
welsberr 2026-03-17 18:49:14 -04:00
parent 0f905b5a22
commit aac1e5c8bc
10 changed files with 406 additions and 7 deletions

View File

@ -198,6 +198,23 @@ python -m didactopus.model_bench
It evaluates local-model adequacy for the `mentor`, `practice`, and `evaluator` roles using the MIT OCW skill bundle as grounded context.
### Easiest LLM setup paths
If you want live LLM-backed Didactopus behavior without the complexity of RoleMesh, start with one of these:
1. `ollama` for simple local use
2. `openai_compatible` for simple hosted use
3. `rolemesh` only if you need routing and multi-model orchestration
The two low-friction starting configs are:
- `configs/config.ollama.example.yaml`
- `configs/config.openai-compatible.example.yaml`
For setup details, see:
- `docs/model-provider-setup.md`
## What Is In This Repository
- `src/didactopus/`
@ -451,6 +468,7 @@ What remains heuristic or lightweight:
- [docs/roadmap.md](docs/roadmap.md)
- [docs/learner-accessibility.md](docs/learner-accessibility.md)
- [docs/local-model-benchmark.md](docs/local-model-benchmark.md)
- [docs/model-provider-setup.md](docs/model-provider-setup.md)
- [docs/course-to-pack.md](docs/course-to-pack.md)
- [docs/learning-graph.md](docs/learning-graph.md)
- [docs/agentic-learner-loop.md](docs/agentic-learner-loop.md)

View File

@ -0,0 +1,24 @@
review:
default_reviewer: "Wesley R. Elsberry"
write_promoted_pack: true
bridge:
host: "127.0.0.1"
port: 8765
registry_path: "workspace_registry.json"
default_workspace_root: "workspaces"
model_provider:
provider: "ollama"
ollama:
base_url: "http://127.0.0.1:11434/v1"
api_key: "ollama"
# Set this to a model you have already pulled with `ollama pull ...`.
default_model: "llama3.2:3b"
role_to_model:
mentor: "llama3.2:3b"
learner: "llama3.2:3b"
practice: "llama3.2:3b"
project_advisor: "llama3.2:3b"
evaluator: "llama3.2:3b"
timeout_seconds: 90.0

View File

@ -0,0 +1,26 @@
review:
default_reviewer: "Wesley R. Elsberry"
write_promoted_pack: true
bridge:
host: "127.0.0.1"
port: 8765
registry_path: "workspace_registry.json"
default_workspace_root: "workspaces"
model_provider:
provider: "openai_compatible"
openai_compatible:
# For OpenAI itself, leave this as https://api.openai.com/v1
# For another OpenAI-compatible hosted service, change the base URL and model names.
base_url: "https://api.openai.com/v1"
api_key: "set-me-via-env-or-local-config"
default_model: "gpt-4.1-mini"
role_to_model:
mentor: "gpt-4.1-mini"
learner: "gpt-4.1-mini"
practice: "gpt-4.1-mini"
project_advisor: "gpt-4.1-mini"
evaluator: "gpt-4.1-mini"
timeout_seconds: 60.0
auth_scheme: "bearer"

View File

@ -107,6 +107,31 @@ There are now two learner paths in the repo.
So the deterministic learner is still active, but it is no longer the only learner-style path shown in the repository.
## What is the easiest way to use a live LLM with Didactopus?
Start with either:
- `configs/config.ollama.example.yaml` for simple local use
- `configs/config.openai-compatible.example.yaml` for simple hosted use
RoleMesh is still supported, but it is now the advanced option for users who actually need routing and multiple backends.
The simplest local command shape is:
```bash
python -m didactopus.learner_session_demo --config configs/config.ollama.example.yaml
```
The simplest hosted command shape is:
```bash
python -m didactopus.learner_session_demo --config configs/config.openai-compatible.example.yaml
```
For the full setup notes, see:
- `docs/model-provider-setup.md`
## Can I still use it as a personal mentor even though the learner is synthetic?
Yes, if you think of the current repo as a structured learning workbench rather than a chat product.

View File

@ -0,0 +1,105 @@
# Model Provider Setup
Didactopus now supports three main model-provider paths:
- `ollama`
- easiest local setup for most single users
- `openai_compatible`
- simplest hosted setup when you want a common online API
- `rolemesh`
- more flexible routing for technically oriented users, labs, and libraries
## Recommended Order
For ease of adoption, use these in this order:
1. `ollama`
2. `openai_compatible`
3. `rolemesh`
## Option 1: Ollama
This is the easiest local path for most users.
Use:
- `configs/config.ollama.example.yaml`
Minimal setup:
1. Install Ollama.
2. Pull a model you want to use.
3. Start or verify the local Ollama service.
4. Point Didactopus at `configs/config.ollama.example.yaml`.
Example commands:
```bash
ollama pull llama3.2:3b
python -m didactopus.learner_session_demo --config configs/config.ollama.example.yaml
```
If you want a different local model, change:
- `model_provider.ollama.default_model`
- `model_provider.ollama.role_to_model`
Use one model for every role at first. Split roles only if you have a reason to do so.
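If you later decide to split roles, the override is a per-role entry in `role_to_model`; any role not listed falls back to `default_model`. A sketch of such a split (the `qwen2.5:3b` choice here is illustrative; use any model you have pulled):

```yaml
model_provider:
  provider: "ollama"
  ollama:
    default_model: "llama3.2:3b"
    role_to_model:
      mentor: "llama3.2:3b"
      practice: "qwen2.5:3b"   # a smaller model dedicated to practice-task generation
```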
## Option 2: OpenAI-compatible hosted service
This is the easiest hosted path.
Use:
- `configs/config.openai-compatible.example.yaml`
This works for:
- OpenAI itself
- any hosted service that accepts OpenAI-style `POST /v1/chat/completions`
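As a rough sketch of the request shape such a service accepts (the endpoint path and field names follow the OpenAI chat-completions convention; the base URL, model name, and key below are placeholders):

```python
import json
from urllib import request

base_url = "https://api.openai.com/v1"   # or any OpenAI-compatible host
api_key = "set-me"                        # placeholder; supply a real key

# Standard chat-completions payload: a model name plus a list of messages.
payload = {
    "model": "gpt-4.1-mini",
    "messages": [
        {"role": "system", "content": "You are a mentor."},
        {"role": "user", "content": "Outline a first practice task."},
    ],
}

req = request.Request(
    base_url.rstrip("/") + "/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
    method="POST",
)
# request.urlopen(req) would return a JSON body whose "choices" list
# holds the assistant reply under choices[0]["message"]["content"].
```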
Typical setup:
1. Create a local copy of `configs/config.openai-compatible.example.yaml`.
2. Set `base_url`, `api_key`, and `default_model`.
3. Keep one model for all roles to start with.
Example:
```bash
python -m didactopus.learner_session_demo --config configs/config.openai-compatible.example.yaml
```
## Option 3: RoleMesh Gateway
RoleMesh is still useful, but it is no longer the recommended starting point for most users.
Choose it when you need:
- role-specific routing
- multiple local or remote backends
- heterogeneous compute placement
- a shared service for a library, lab, or multi-user setup
See:
- `docs/rolemesh-integration.md`
## Which commands use the provider?
Any Didactopus path that calls the model provider can use these configurations, including:
- `python -m didactopus.learner_session_demo`
- `python -m didactopus.rolemesh_demo`
- `python -m didactopus.model_bench`
- `python -m didactopus.ocw_rolemesh_transcript_demo`
The transcript demo name still references RoleMesh because that was the original live-LLM path, but the general learner-session and benchmark flows are the easier places to start.
## Practical Advice
- Start with one model for all roles.
- Prefer smaller fast models over bigger slow ones at first.
- Use the benchmark harness before trusting a model for learner-facing guidance.
- Use RoleMesh only when you actually need routing or multi-model orchestration.

View File

@ -1,6 +1,14 @@
# RoleMesh Integration
RoleMesh Gateway is an appropriate dependency for local-LLM-backed Didactopus usage, but it should be treated as the advanced path rather than the default path for most users.
If ease of use is your priority, start with:
- `docs/model-provider-setup.md`
- `configs/config.ollama.example.yaml`
- `configs/config.openai-compatible.example.yaml`
Use RoleMesh when you need routing flexibility, multiple backends, or shared infrastructure.
## Why it fits

View File

@ -59,10 +59,29 @@ class RoleMeshProviderConfig(BaseModel):
timeout_seconds: float = 30.0
class OllamaProviderConfig(BaseModel):
base_url: str = os.getenv("DIDACTOPUS_OLLAMA_BASE_URL", "http://127.0.0.1:11434/v1")
api_key: str = os.getenv("DIDACTOPUS_OLLAMA_API_KEY", "ollama")
default_model: str = os.getenv("DIDACTOPUS_OLLAMA_DEFAULT_MODEL", "llama3.2:3b")
role_to_model: dict[str, str] = Field(default_factory=default_role_to_model)
timeout_seconds: float = 60.0
class OpenAICompatibleProviderConfig(BaseModel):
base_url: str = os.getenv("DIDACTOPUS_OPENAI_COMPAT_BASE_URL", "https://api.openai.com/v1")
api_key: str = os.getenv("DIDACTOPUS_OPENAI_COMPAT_API_KEY", "")
default_model: str = os.getenv("DIDACTOPUS_OPENAI_COMPAT_DEFAULT_MODEL", "gpt-4.1-mini")
role_to_model: dict[str, str] = Field(default_factory=default_role_to_model)
timeout_seconds: float = 60.0
auth_scheme: str = "bearer"
class ModelProviderConfig(BaseModel):
provider: str = "stub"
local: LocalProviderConfig = Field(default_factory=LocalProviderConfig)
rolemesh: RoleMeshProviderConfig = Field(default_factory=RoleMeshProviderConfig)
ollama: OllamaProviderConfig = Field(default_factory=OllamaProviderConfig)
openai_compatible: OpenAICompatibleProviderConfig = Field(default_factory=OpenAICompatibleProviderConfig)
class AppConfig(BaseModel):

View File

@ -39,6 +39,10 @@ class ModelProvider:
provider_name = self.config.provider.lower()
if provider_name == "rolemesh":
return self._generate_rolemesh(prompt, role, system_prompt, temperature, max_tokens, status_callback)
if provider_name == "ollama":
return self._generate_ollama(prompt, role, system_prompt, temperature, max_tokens, status_callback)
if provider_name == "openai_compatible":
return self._generate_openai_compatible(prompt, role, system_prompt, temperature, max_tokens, status_callback)
return self._generate_stub(prompt, role)
def _generate_stub(self, prompt: str, role: str | None) -> ModelResponse:
@ -77,28 +81,119 @@ class ModelProvider:
if max_tokens is not None:
payload["max_tokens"] = max_tokens
body = self._rolemesh_chat_completion(payload)
return self._response_from_chat_completion(body, provider="rolemesh", model_name=model_name)
def _generate_ollama(
self,
prompt: str,
role: str | None,
system_prompt: str | None,
temperature: float | None,
max_tokens: int | None,
status_callback: Callable[[str], None] | None,
) -> ModelResponse:
ollama = self.config.ollama
model_name = ollama.role_to_model.get(role or "", ollama.default_model)
if status_callback is not None:
status_callback(self.pending_notice(role, model_name))
payload = self._build_messages_payload(prompt, system_prompt, model_name, temperature, max_tokens)
body = self._chat_completion_request(
base_url=ollama.base_url,
api_key=ollama.api_key,
timeout_seconds=ollama.timeout_seconds,
payload=payload,
auth_scheme="bearer",
)
return self._response_from_chat_completion(body, provider="ollama", model_name=model_name)
def _generate_openai_compatible(
self,
prompt: str,
role: str | None,
system_prompt: str | None,
temperature: float | None,
max_tokens: int | None,
status_callback: Callable[[str], None] | None,
) -> ModelResponse:
compat = self.config.openai_compatible
model_name = compat.role_to_model.get(role or "", compat.default_model)
if status_callback is not None:
status_callback(self.pending_notice(role, model_name))
payload = self._build_messages_payload(prompt, system_prompt, model_name, temperature, max_tokens)
body = self._chat_completion_request(
base_url=compat.base_url,
api_key=compat.api_key,
timeout_seconds=compat.timeout_seconds,
payload=payload,
auth_scheme=compat.auth_scheme,
)
return self._response_from_chat_completion(body, provider="openai_compatible", model_name=model_name)
def _build_messages_payload(
self,
prompt: str,
system_prompt: str | None,
model_name: str,
temperature: float | None,
max_tokens: int | None,
) -> dict:
messages = []
if system_prompt:
messages.append({"role": "system", "content": system_prompt})
messages.append({"role": "user", "content": prompt})
payload = {
"model": model_name,
"messages": messages,
}
if temperature is not None:
payload["temperature"] = temperature
if max_tokens is not None:
payload["max_tokens"] = max_tokens
return payload
def _response_from_chat_completion(self, body: dict, *, provider: str, model_name: str) -> ModelResponse:
choices = body.get("choices", [])
if not choices:
raise RuntimeError(f"{provider} returned no choices.")
message = choices[0].get("message", {})
text = message.get("content", "")
if not isinstance(text, str):
text = str(text)
return ModelResponse(text=text, provider=provider, model_name=model_name)
def _rolemesh_chat_completion(self, payload: dict) -> dict:
rolemesh = self.config.rolemesh
return self._chat_completion_request(
base_url=rolemesh.base_url,
api_key=rolemesh.api_key,
timeout_seconds=rolemesh.timeout_seconds,
payload=payload,
auth_scheme="x-api-key",
)
def _chat_completion_request(
self,
*,
base_url: str,
api_key: str,
timeout_seconds: float,
payload: dict,
auth_scheme: str,
) -> dict:
base = base_url.rstrip("/")
url = base + "/chat/completions" if base.endswith("/v1") else base + "/v1/chat/completions"
headers = {
"Content-Type": "application/json",
}
if api_key:
if auth_scheme == "x-api-key":
headers["X-Api-Key"] = api_key
else:
headers["Authorization"] = f"Bearer {api_key}"
req = request.Request(
url,
data=json.dumps(payload).encode("utf-8"),
headers=headers,
method="POST",
)
with request.urlopen(req, timeout=timeout_seconds) as response:
return json.loads(response.read().decode("utf-8"))

View File

@ -16,3 +16,17 @@ def test_load_rolemesh_config() -> None:
assert config.model_provider.rolemesh.role_to_model["mentor"] == "planner"
assert config.model_provider.rolemesh.role_to_model["learner"] == "writer"
assert set(config.model_provider.rolemesh.role_to_model) == set(role_ids())
def test_load_ollama_config() -> None:
config = load_config(Path("configs/config.ollama.example.yaml"))
assert config.model_provider.provider == "ollama"
assert config.model_provider.ollama.base_url.endswith("/v1")
assert set(config.model_provider.ollama.role_to_model) == set(role_ids())
def test_load_openai_compatible_config() -> None:
config = load_config(Path("configs/config.openai-compatible.example.yaml"))
assert config.model_provider.provider == "openai_compatible"
assert config.model_provider.openai_compatible.base_url == "https://api.openai.com/v1"
assert set(config.model_provider.openai_compatible.role_to_model) == set(role_ids())

View File

@ -69,6 +69,71 @@ def test_rolemesh_provider_emits_pending_notice() -> None:
assert seen == ["Didactopus is evaluating the work before replying. Model: reviewer."]
def test_ollama_provider_uses_role_mapping() -> None:
config = ModelProviderConfig.model_validate(
{
"provider": "ollama",
"ollama": {
"base_url": "http://127.0.0.1:11434/v1",
"api_key": "ollama",
"default_model": "llama3.2:3b",
"role_to_model": {"mentor": "llama3.2:3b", "practice": "qwen2.5:3b"},
},
}
)
provider = ModelProvider(config)
def fake_chat(*, base_url: str, api_key: str, timeout_seconds: float, payload: dict, auth_scheme: str) -> dict:
assert base_url == "http://127.0.0.1:11434/v1"
assert api_key == "ollama"
assert payload["model"] == "qwen2.5:3b"
assert auth_scheme == "bearer"
return {"choices": [{"message": {"content": "Ollama practice response"}}]}
provider._chat_completion_request = fake_chat # type: ignore[method-assign]
response = provider.generate(
"Generate a practice task.",
role="practice",
system_prompt="System prompt",
)
assert response.provider == "ollama"
assert response.model_name == "qwen2.5:3b"
assert response.text == "Ollama practice response"
def test_openai_compatible_provider_uses_bearer_auth() -> None:
config = ModelProviderConfig.model_validate(
{
"provider": "openai_compatible",
"openai_compatible": {
"base_url": "https://api.openai.com/v1",
"api_key": "demo-key",
"default_model": "gpt-4.1-mini",
"role_to_model": {"mentor": "gpt-4.1-mini"},
"auth_scheme": "bearer",
},
}
)
provider = ModelProvider(config)
def fake_chat(*, base_url: str, api_key: str, timeout_seconds: float, payload: dict, auth_scheme: str) -> dict:
assert base_url == "https://api.openai.com/v1"
assert api_key == "demo-key"
assert payload["model"] == "gpt-4.1-mini"
assert auth_scheme == "bearer"
return {"choices": [{"message": {"content": "Hosted mentor response"}}]}
provider._chat_completion_request = fake_chat # type: ignore[method-assign]
response = provider.generate(
"Orient the learner.",
role="mentor",
system_prompt="System prompt",
)
assert response.provider == "openai_compatible"
assert response.model_name == "gpt-4.1-mini"
assert response.text == "Hosted mentor response"
def test_evaluator_prompt_requires_checking_existing_caveats() -> None:
prompt = evaluator_system_prompt().lower()
assert "before saying something is missing" in prompt