Documented use of subscription GenAI or Ollama as an easier LLM path.

parent 0f905b5a22
commit aac1e5c8bc

README.md (18 lines changed)
@@ -198,6 +198,23 @@ python -m didactopus.model_bench

It evaluates local-model adequacy for the `mentor`, `practice`, and `evaluator` roles using the MIT OCW skill bundle as grounded context.

### Easiest LLM setup paths

If you want live LLM-backed Didactopus behavior without the complexity of RoleMesh, start with one of these:

1. `ollama` for simple local use
2. `openai_compatible` for simple hosted use
3. `rolemesh` only if you need routing and multi-model orchestration

The two low-friction starting configs are:

- `configs/config.ollama.example.yaml`
- `configs/config.openai-compatible.example.yaml`

For setup details, see:

- `docs/model-provider-setup.md`

## What Is In This Repository

- `src/didactopus/`
@@ -451,6 +468,7 @@ What remains heuristic or lightweight:

- [docs/roadmap.md](docs/roadmap.md)
- [docs/learner-accessibility.md](docs/learner-accessibility.md)
- [docs/local-model-benchmark.md](docs/local-model-benchmark.md)
- [docs/model-provider-setup.md](docs/model-provider-setup.md)
- [docs/course-to-pack.md](docs/course-to-pack.md)
- [docs/learning-graph.md](docs/learning-graph.md)
- [docs/agentic-learner-loop.md](docs/agentic-learner-loop.md)
configs/config.ollama.example.yaml (new file)
@@ -0,0 +1,24 @@

review:
  default_reviewer: "Wesley R. Elsberry"
  write_promoted_pack: true

bridge:
  host: "127.0.0.1"
  port: 8765
  registry_path: "workspace_registry.json"
  default_workspace_root: "workspaces"

model_provider:
  provider: "ollama"
  ollama:
    base_url: "http://127.0.0.1:11434/v1"
    api_key: "ollama"
    # Set this to a model you have already pulled with `ollama pull ...`.
    default_model: "llama3.2:3b"
    role_to_model:
      mentor: "llama3.2:3b"
      learner: "llama3.2:3b"
      practice: "llama3.2:3b"
      project_advisor: "llama3.2:3b"
      evaluator: "llama3.2:3b"
    timeout_seconds: 90.0
configs/config.openai-compatible.example.yaml (new file)
@@ -0,0 +1,26 @@

review:
  default_reviewer: "Wesley R. Elsberry"
  write_promoted_pack: true

bridge:
  host: "127.0.0.1"
  port: 8765
  registry_path: "workspace_registry.json"
  default_workspace_root: "workspaces"

model_provider:
  provider: "openai_compatible"
  openai_compatible:
    # For OpenAI itself, leave this as https://api.openai.com/v1
    # For another OpenAI-compatible hosted service, change the base URL and model names.
    base_url: "https://api.openai.com/v1"
    api_key: "set-me-via-env-or-local-config"
    default_model: "gpt-4.1-mini"
    role_to_model:
      mentor: "gpt-4.1-mini"
      learner: "gpt-4.1-mini"
      practice: "gpt-4.1-mini"
      project_advisor: "gpt-4.1-mini"
      evaluator: "gpt-4.1-mini"
    timeout_seconds: 60.0
    auth_scheme: "bearer"
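Since the example commits only a placeholder key, one hedged pattern is to let the environment override the file value, mirroring the `DIDACTOPUS_OPENAI_COMPAT_API_KEY` variable this commit's config loader reads. A minimal standalone sketch (`effective_api_key` is an illustrative helper, not the repo's code):

```python
import os

# Illustrative helper: prefer the environment variable over the placeholder
# value committed in the example config; fall back to the file value.
def effective_api_key(config_value: str) -> str:
    return os.getenv("DIDACTOPUS_OPENAI_COMPAT_API_KEY", "") or config_value
```

This keeps real keys out of version control while leaving the example config usable as-is.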
docs/faq.md (25 lines changed)
@@ -107,6 +107,31 @@ There are now two learner paths in the repo.

So the deterministic learner is still active, but it is no longer the only learner-style path shown in the repository.

## What is the easiest way to use a live LLM with Didactopus?

Start with either:

- `configs/config.ollama.example.yaml` for simple local use
- `configs/config.openai-compatible.example.yaml` for simple hosted use

RoleMesh is still supported, but it is now the advanced option for users who actually need routing and multiple backends.

The simplest local command shape is:

```bash
python -m didactopus.learner_session_demo --config configs/config.ollama.example.yaml
```

The simplest hosted command shape is:

```bash
python -m didactopus.learner_session_demo --config configs/config.openai-compatible.example.yaml
```

For the full setup notes, see:

- `docs/model-provider-setup.md`

## Can I still use it as a personal mentor even though the learner is synthetic?

Yes, if you think of the current repo as a structured learning workbench rather than a chat product.
docs/model-provider-setup.md (new file)
@@ -0,0 +1,105 @@

# Model Provider Setup

Didactopus now supports three main model-provider paths:

- `ollama`
  - easiest local setup for most single users
- `openai_compatible`
  - simplest hosted setup when you want a common online API
- `rolemesh`
  - more flexible routing for technically oriented users, labs, and libraries

## Recommended Order

For ease of adoption, use these in this order:

1. `ollama`
2. `openai_compatible`
3. `rolemesh`

## Option 1: Ollama

This is the easiest local path for most users.

Use:

- `configs/config.ollama.example.yaml`

Minimal setup:

1. Install Ollama.
2. Pull a model you want to use.
3. Start or verify the local Ollama service.
4. Point Didactopus at `configs/config.ollama.example.yaml`.

Example commands:

```bash
ollama pull llama3.2:3b
python -m didactopus.learner_session_demo --config configs/config.ollama.example.yaml
```

If you want a different local model, change:

- `model_provider.ollama.default_model`
- `model_provider.ollama.role_to_model`

Use one model for every role at first. Split roles only if you have a reason to do so.
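For instance, swapping in a different pulled model touches only those two keys. A sketch of the override, using `qwen2.5:3b` (a model name that appears in this repo's tests) purely as an example:

```yaml
model_provider:
  ollama:
    default_model: "qwen2.5:3b"
    role_to_model:
      mentor: "qwen2.5:3b"
      learner: "qwen2.5:3b"
      practice: "qwen2.5:3b"
      project_advisor: "qwen2.5:3b"
      evaluator: "qwen2.5:3b"
```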
## Option 2: OpenAI-compatible hosted service

This is the easiest hosted path.

Use:

- `configs/config.openai-compatible.example.yaml`

This works for:

- OpenAI itself
- any hosted service that accepts OpenAI-style `POST /v1/chat/completions`

Typical setup:

1. Create a local copy of `configs/config.openai-compatible.example.yaml`.
2. Set `base_url`, `api_key`, and `default_model`.
3. Keep one model for all roles to start with.

Example:

```bash
python -m didactopus.learner_session_demo --config configs/config.openai-compatible.example.yaml
```
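The wire format behind that `POST /v1/chat/completions` contract can be sketched as follows. `build_chat_request` is a hypothetical standalone helper, not part of Didactopus; it only constructs the request object and sends nothing:

```python
import json
from urllib import request

# Sketch of the OpenAI-style chat-completions request shape the hosted path
# targets: a JSON body with a model name and a message list, bearer-authorized.
def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> request.Request:
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )

req = build_chat_request("https://api.openai.com/v1", "demo-key", "gpt-4.1-mini", "Hello")
```

Any service that accepts this shape should work with the `openai_compatible` provider.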
## Option 3: RoleMesh Gateway

RoleMesh is still useful, but it is no longer the easiest path to recommend to most users.

Choose it when you need:

- role-specific routing
- multiple local or remote backends
- heterogeneous compute placement
- a shared service for a library, lab, or multi-user setup

See:

- `docs/rolemesh-integration.md`

## Which commands use the provider?

Any Didactopus path that calls the model provider can use these configurations, including:

- `python -m didactopus.learner_session_demo`
- `python -m didactopus.rolemesh_demo`
- `python -m didactopus.model_bench`
- `python -m didactopus.ocw_rolemesh_transcript_demo`

The transcript demo name still references RoleMesh because that was the original live-LLM path, but the general learner-session and benchmark flows are the easier places to start.

## Practical Advice

- Start with one model for all roles.
- Prefer smaller fast models over bigger slow ones at first.
- Use the benchmark harness before trusting a model for learner-facing guidance.
- Use RoleMesh only when you actually need routing or multi-model orchestration.
docs/rolemesh-integration.md
@@ -1,6 +1,14 @@

# RoleMesh Integration

RoleMesh Gateway is an appropriate dependency for local-LLM-backed Didactopus usage, but it should be treated as the advanced path rather than the default path for most users.

If ease of use is your priority, start with:

- `docs/model-provider-setup.md`
- `configs/config.ollama.example.yaml`
- `configs/config.openai-compatible.example.yaml`

Use RoleMesh when you need routing flexibility, multiple backends, or shared infrastructure.

## Why it fits
@@ -59,10 +59,29 @@ class RoleMeshProviderConfig(BaseModel):

    timeout_seconds: float = 30.0


class OllamaProviderConfig(BaseModel):
    base_url: str = os.getenv("DIDACTOPUS_OLLAMA_BASE_URL", "http://127.0.0.1:11434/v1")
    api_key: str = os.getenv("DIDACTOPUS_OLLAMA_API_KEY", "ollama")
    default_model: str = os.getenv("DIDACTOPUS_OLLAMA_DEFAULT_MODEL", "llama3.2:3b")
    role_to_model: dict[str, str] = Field(default_factory=default_role_to_model)
    timeout_seconds: float = 60.0


class OpenAICompatibleProviderConfig(BaseModel):
    base_url: str = os.getenv("DIDACTOPUS_OPENAI_COMPAT_BASE_URL", "https://api.openai.com/v1")
    api_key: str = os.getenv("DIDACTOPUS_OPENAI_COMPAT_API_KEY", "")
    default_model: str = os.getenv("DIDACTOPUS_OPENAI_COMPAT_DEFAULT_MODEL", "gpt-4.1-mini")
    role_to_model: dict[str, str] = Field(default_factory=default_role_to_model)
    timeout_seconds: float = 60.0
    auth_scheme: str = "bearer"


class ModelProviderConfig(BaseModel):
    provider: str = "stub"
    local: LocalProviderConfig = Field(default_factory=LocalProviderConfig)
    rolemesh: RoleMeshProviderConfig = Field(default_factory=RoleMeshProviderConfig)
    ollama: OllamaProviderConfig = Field(default_factory=OllamaProviderConfig)
    openai_compatible: OpenAICompatibleProviderConfig = Field(default_factory=OpenAICompatibleProviderConfig)


class AppConfig(BaseModel):
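The `default_role_to_model` factory referenced above sits outside this hunk. A plausible sketch of its shape, consistent with the roles used in the example configs (this is an assumption about the factory, not the repo's actual code):

```python
# Assumed role list, taken from the example configs in this commit.
ROLE_IDS = ["mentor", "learner", "practice", "project_advisor", "evaluator"]

# Hypothetical reconstruction: a zero-argument callable, as required by
# pydantic's Field(default_factory=...), that starts every role on one model.
def default_role_to_model() -> dict[str, str]:
    return {role: "llama3.2:3b" for role in ROLE_IDS}
```

Whatever its actual body, the factory must cover every role id, since the config tests assert `set(role_to_model) == set(role_ids())`.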
@@ -39,6 +39,10 @@ class ModelProvider:

        provider_name = self.config.provider.lower()
        if provider_name == "rolemesh":
            return self._generate_rolemesh(prompt, role, system_prompt, temperature, max_tokens, status_callback)
        if provider_name == "ollama":
            return self._generate_ollama(prompt, role, system_prompt, temperature, max_tokens, status_callback)
        if provider_name == "openai_compatible":
            return self._generate_openai_compatible(prompt, role, system_prompt, temperature, max_tokens, status_callback)
        return self._generate_stub(prompt, role)

    def _generate_stub(self, prompt: str, role: str | None) -> ModelResponse:

@@ -77,28 +81,119 @@ class ModelProvider:

        if max_tokens is not None:
            payload["max_tokens"] = max_tokens
        body = self._rolemesh_chat_completion(payload)
        return self._response_from_chat_completion(body, provider="rolemesh", model_name=model_name)

    def _generate_ollama(
        self,
        prompt: str,
        role: str | None,
        system_prompt: str | None,
        temperature: float | None,
        max_tokens: int | None,
        status_callback: Callable[[str], None] | None,
    ) -> ModelResponse:
        ollama = self.config.ollama
        model_name = ollama.role_to_model.get(role or "", ollama.default_model)
        if status_callback is not None:
            status_callback(self.pending_notice(role, model_name))
        payload = self._build_messages_payload(prompt, system_prompt, model_name, temperature, max_tokens)
        body = self._chat_completion_request(
            base_url=ollama.base_url,
            api_key=ollama.api_key,
            timeout_seconds=ollama.timeout_seconds,
            payload=payload,
            auth_scheme="bearer",
        )
        return self._response_from_chat_completion(body, provider="ollama", model_name=model_name)

    def _generate_openai_compatible(
        self,
        prompt: str,
        role: str | None,
        system_prompt: str | None,
        temperature: float | None,
        max_tokens: int | None,
        status_callback: Callable[[str], None] | None,
    ) -> ModelResponse:
        compat = self.config.openai_compatible
        model_name = compat.role_to_model.get(role or "", compat.default_model)
        if status_callback is not None:
            status_callback(self.pending_notice(role, model_name))
        payload = self._build_messages_payload(prompt, system_prompt, model_name, temperature, max_tokens)
        body = self._chat_completion_request(
            base_url=compat.base_url,
            api_key=compat.api_key,
            timeout_seconds=compat.timeout_seconds,
            payload=payload,
            auth_scheme=compat.auth_scheme,
        )
        return self._response_from_chat_completion(body, provider="openai_compatible", model_name=model_name)

    def _build_messages_payload(
        self,
        prompt: str,
        system_prompt: str | None,
        model_name: str,
        temperature: float | None,
        max_tokens: int | None,
    ) -> dict:
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})
        payload = {
            "model": model_name,
            "messages": messages,
        }
        if temperature is not None:
            payload["temperature"] = temperature
        if max_tokens is not None:
            payload["max_tokens"] = max_tokens
        return payload

    def _response_from_chat_completion(self, body: dict, *, provider: str, model_name: str) -> ModelResponse:
        choices = body.get("choices", [])
        if not choices:
            raise RuntimeError(f"{provider} returned no choices.")
        message = choices[0].get("message", {})
        text = message.get("content", "")
        if not isinstance(text, str):
            text = str(text)
        return ModelResponse(text=text, provider=provider, model_name=model_name)

    def _rolemesh_chat_completion(self, payload: dict) -> dict:
        rolemesh = self.config.rolemesh
        return self._chat_completion_request(
            base_url=rolemesh.base_url,
            api_key=rolemesh.api_key,
            timeout_seconds=rolemesh.timeout_seconds,
            payload=payload,
            auth_scheme="x-api-key",
        )

    def _chat_completion_request(
        self,
        *,
        base_url: str,
        api_key: str,
        timeout_seconds: float,
        payload: dict,
        auth_scheme: str,
    ) -> dict:
        base = base_url.rstrip("/")
        url = base + "/chat/completions" if base.endswith("/v1") else base + "/v1/chat/completions"
        headers = {
            "Content-Type": "application/json",
        }
        if api_key:
            if auth_scheme == "x-api-key":
                headers["X-Api-Key"] = api_key
            else:
                headers["Authorization"] = f"Bearer {api_key}"
        req = request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers=headers,
            method="POST",
        )
        with request.urlopen(req, timeout=timeout_seconds) as response:
            return json.loads(response.read().decode("utf-8"))
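The base-URL handling in `_chat_completion_request` accepts both configured forms: a base URL that already ends in `/v1` (as in the example configs) and one that does not. A standalone restatement of that normalization, for illustration:

```python
# Restatement of the URL normalization used by _chat_completion_request:
# append "/v1/chat/completions", but avoid doubling "/v1" when the configured
# base URL already ends with it.
def chat_completions_url(base_url: str) -> str:
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        return base + "/chat/completions"
    return base + "/v1/chat/completions"
```

This is what lets the Ollama, OpenAI, and RoleMesh configs all share one request helper despite differing base-URL conventions.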
@@ -16,3 +16,17 @@ def test_load_rolemesh_config() -> None:

    assert config.model_provider.rolemesh.role_to_model["mentor"] == "planner"
    assert config.model_provider.rolemesh.role_to_model["learner"] == "writer"
    assert set(config.model_provider.rolemesh.role_to_model) == set(role_ids())


def test_load_ollama_config() -> None:
    config = load_config(Path("configs/config.ollama.example.yaml"))
    assert config.model_provider.provider == "ollama"
    assert config.model_provider.ollama.base_url.endswith("/v1")
    assert set(config.model_provider.ollama.role_to_model) == set(role_ids())


def test_load_openai_compatible_config() -> None:
    config = load_config(Path("configs/config.openai-compatible.example.yaml"))
    assert config.model_provider.provider == "openai_compatible"
    assert config.model_provider.openai_compatible.base_url == "https://api.openai.com/v1"
    assert set(config.model_provider.openai_compatible.role_to_model) == set(role_ids())
@@ -69,6 +69,71 @@ def test_rolemesh_provider_emits_pending_notice() -> None:

    assert seen == ["Didactopus is evaluating the work before replying. Model: reviewer."]


def test_ollama_provider_uses_role_mapping() -> None:
    config = ModelProviderConfig.model_validate(
        {
            "provider": "ollama",
            "ollama": {
                "base_url": "http://127.0.0.1:11434/v1",
                "api_key": "ollama",
                "default_model": "llama3.2:3b",
                "role_to_model": {"mentor": "llama3.2:3b", "practice": "qwen2.5:3b"},
            },
        }
    )
    provider = ModelProvider(config)

    def fake_chat(*, base_url: str, api_key: str, timeout_seconds: float, payload: dict, auth_scheme: str) -> dict:
        assert base_url == "http://127.0.0.1:11434/v1"
        assert api_key == "ollama"
        assert payload["model"] == "qwen2.5:3b"
        assert auth_scheme == "bearer"
        return {"choices": [{"message": {"content": "Ollama practice response"}}]}

    provider._chat_completion_request = fake_chat  # type: ignore[method-assign]
    response = provider.generate(
        "Generate a practice task.",
        role="practice",
        system_prompt="System prompt",
    )
    assert response.provider == "ollama"
    assert response.model_name == "qwen2.5:3b"
    assert response.text == "Ollama practice response"


def test_openai_compatible_provider_uses_bearer_auth() -> None:
    config = ModelProviderConfig.model_validate(
        {
            "provider": "openai_compatible",
            "openai_compatible": {
                "base_url": "https://api.openai.com/v1",
                "api_key": "demo-key",
                "default_model": "gpt-4.1-mini",
                "role_to_model": {"mentor": "gpt-4.1-mini"},
                "auth_scheme": "bearer",
            },
        }
    )
    provider = ModelProvider(config)

    def fake_chat(*, base_url: str, api_key: str, timeout_seconds: float, payload: dict, auth_scheme: str) -> dict:
        assert base_url == "https://api.openai.com/v1"
        assert api_key == "demo-key"
        assert payload["model"] == "gpt-4.1-mini"
        assert auth_scheme == "bearer"
        return {"choices": [{"message": {"content": "Hosted mentor response"}}]}

    provider._chat_completion_request = fake_chat  # type: ignore[method-assign]
    response = provider.generate(
        "Orient the learner.",
        role="mentor",
        system_prompt="System prompt",
    )
    assert response.provider == "openai_compatible"
    assert response.model_name == "gpt-4.1-mini"
    assert response.text == "Hosted mentor response"


def test_evaluator_prompt_requires_checking_existing_caveats() -> None:
    prompt = evaluator_system_prompt().lower()
    assert "before saying something is missing" in prompt