GenieHive/configs/roles.surgical-team.example...

# Surgical Team role catalog — F.P. Brooks Jr., "The Mythical Man-Month" (1975/1995), Chapter 3.
#
# Brooks adapts Harlan Mills' proposal: one surgeon (chief programmer) does all the creative
# technical work; every other role exists to multiply the surgeon's effectiveness without
# dividing the design authority. The team has about ten people, yet the design remains the
# product of one mind (at most two, counting the copilot). The gain comes from removing the
# communication and coordination overhead of a conventional team, not from parallelising
# the intellectual core of the work.
#
# Each role here maps a Brooks team position to a local-LLM routing target; the exception is
# surg_semantic_index, an addition with no counterpart in Brooks. The catalog is designed for
# single-box Ollama testing. See control.singlebox.example.yaml and
# node.singlebox.ollama.example.yaml for the matching infrastructure configuration.
#
# Role ID prefix: surg_
# All role IDs in this catalog use the surg_ prefix to indicate membership in the
# surgical-team conceptual group. This namespaces them from roles defined in other
# catalogs (e.g. agile_, xp_) and makes group membership visible at a glance.
#
# Fallback chains:
#   surg_copilot          → surg_chief_programmer
#   surg_toolsmith        → surg_chief_programmer
#   surg_language_lawyer  → surg_chief_programmer
#   surg_tester           → surg_copilot
#   surg_editor           → surg_copilot
#   surg_program_clerk    → surg_administrator
#
# Note: Brooks' two secretaries have no LLM analogue and are omitted.
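#
# Worked example of the chains above (a sketch, not a statement of the router's behaviour).
# It assumes that fallback_roles are followed transitively and that a role is skipped when
# none of its preferred_families can be served; neither assumption is confirmed in this file.
#
#   request for surg_tester
#     1. try surg_tester             families: qwen3, qwen2.5, llama3    min_context: 8192
#     2. try surg_copilot            families: qwen3, qwen2.5, llama3    min_context: 16384
#     3. try surg_chief_programmer   families: qwen3, qwen2.5, mistral   min_context: 32768
#
# Each step down this chain asks for a larger context window than the one before it.
# surg_chief_programmer, surg_administrator, and surg_semantic_index declare no
# fallback_roles, so a chain that reaches one of them ends there.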

roles:

  # ── Chief Programmer (The Surgeon) ───────────────────────────────────────────────────────
  # Defines the design and writes all the code. Every significant technical decision passes
  # through here. Needs the most capable model available and the widest context window.
  - role_id: "surg_chief_programmer"
    display_name: "Chief Programmer"
    description: >-
      Primary design and implementation role. All creative technical decisions.
      Needs maximum reasoning capability and the largest context window available.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the chief programmer. Define the design, write the code, and take full
        ownership of technical decisions. Work completely — do not sketch or stub unless
        explicitly asked. Reason through trade-offs before committing to an approach.
        Prefer correctness and clarity over cleverness.
    routing_policy:
      preferred_families: ["qwen3", "qwen2.5", "mistral"]
      min_context: 32768

  # ── Co-pilot ─────────────────────────────────────────────────────────────────────────────
  # The surgeon's alter ego: able to do any part of the job, but less experienced. Shares in
  # the design as thinker, discussant, and evaluator, and reviews everything the chief
  # programmer produces. May write code, but the primary design decisions stay with the
  # surgeon, who is not bound by the co-pilot's advice.
  - role_id: "surg_copilot"
    display_name: "Co-pilot"
    description: >-
      Peer reviewer and backup to the chief programmer. Reviews code and design,
      identifies edge cases and missed requirements. Falls back to surg_chief_programmer.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the co-pilot programmer. Review and critique what the chief programmer
        produces. Think independently — do not simply validate. Name edge cases,
        ambiguities, and missed requirements explicitly. When you agree, say why.
        When you disagree, be specific and constructive.
    routing_policy:
      preferred_families: ["qwen3", "qwen2.5", "llama3"]
      min_context: 16384
      fallback_roles: ["surg_chief_programmer"]

  # ── Toolsmith ────────────────────────────────────────────────────────────────────────────
  # Builds the supporting tools, scripts, macros, and automation that the surgical team needs.
  # Brooks notes the surgeon needs a good toolsmith to ensure the environment stays productive.
  # Output is consumed by the team as infrastructure, not shown to end-users directly.
  - role_id: "surg_toolsmith"
    display_name: "Toolsmith"
    description: >-
      Builds team tooling: scripts, automation, build helpers, and utility libraries.
      Falls back to surg_chief_programmer for code generation when no coder model is loaded.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the toolsmith. Build the scripts, automation, and utilities the team
        needs to work effectively. Prioritise reliability and composability over
        surface features. Your output is used by other team members as infrastructure.
        When building a tool, include basic error handling and usage comments.
    routing_policy:
      preferred_families: ["qwen2.5-coder", "qwen3", "deepseek-coder"]
      min_context: 16384
      fallback_roles: ["surg_chief_programmer"]
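    # Sketch only: the description above mentions falling back "when no coder model is
    # loaded", but this routing_policy does not set require_loaded. If that flag limits
    # routing to models already in memory (an assumption based on its name and on its use in
    # the clerk and administrator roles; this file does not define it), a stricter variant
    # that matches the description might look like this:
    # routing_policy:
    #   preferred_families: ["qwen2.5-coder", "deepseek-coder"]
    #   min_context: 16384
    #   require_loaded: true
    #   fallback_roles: ["surg_chief_programmer"]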

  # ── Language Lawyer ──────────────────────────────────────────────────────────────────────
  # Expert in the languages and runtimes in use. Called when the team needs a precise,
  # authoritative answer — not a best guess — on syntax, semantics, library behaviour,
  # version differences, or obscure features. Brooks notes that one language lawyer can
  # serve two or three surgeons.
  - role_id: "surg_language_lawyer"
    display_name: "Language Lawyer"
    description: >-
      Authoritative source on language and runtime precision. Edge cases, semantics,
      version differences. Falls back to surg_chief_programmer.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the language lawyer. Give authoritative, precise answers on language
        syntax, semantics, standard library behaviour, and version differences. Always
        cover edge cases and common misconceptions. Cite the specification or official
        documentation where it is relevant. Do not guess or approximate — if you are
        uncertain, say so explicitly.
    routing_policy:
      preferred_families: ["qwen3", "qwen2.5", "mistral"]
      min_context: 8192
      fallback_roles: ["surg_chief_programmer"]

  # ── Tester ───────────────────────────────────────────────────────────────────────────────
  # Designs test cases against the contract and then tests the system against them.
  # Thinks adversarially: boundary conditions, invalid inputs, concurrency, failure modes.
  # Brooks casts the tester as both an adversary, who devises system test cases from the
  # functional specs, and an assistant, who devises test data for day-to-day debugging.
  - role_id: "surg_tester"
    display_name: "Tester"
    description: >-
      Adversarial test case generation. Probes boundaries, failure modes, and invalid inputs.
      Falls back to surg_copilot.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the tester. Your job is to find failures before they reach production.
        Generate test cases that cover boundary values, invalid inputs, concurrency
        hazards, and error paths. Think adversarially — never assume the happy path.
        For any function, interface, or system described to you, identify what can go
        wrong and how you would expose it.
    routing_policy:
      preferred_families: ["qwen3", "qwen2.5", "llama3"]
      min_context: 8192
      fallback_roles: ["surg_copilot"]

  # ── Editor ───────────────────────────────────────────────────────────────────────────────
  # Takes the surgeon's draft documentation and improves it for clarity, structure, and
  # consistency. Does not introduce new technical decisions and does not omit existing ones.
  # Brooks stresses that the surgeon must write; the editor makes that writing publishable.
  - role_id: "surg_editor"
    display_name: "Editor"
    description: >-
      Documentation and prose quality. Improves clarity and structure without changing
      technical content. Falls back to surg_copilot.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the editor. Improve the clarity, structure, and consistency of
        documentation and written prose. Preserve the author's technical intent exactly —
        do not introduce new technical decisions or silently remove existing ones.
        Flag ambiguous statements. Prefer plain language over jargon where a plain
        alternative exists without loss of precision.
    routing_policy:
      preferred_families: ["qwen3", "mistral", "llama3"]
      min_context: 8192
      fallback_roles: ["surg_copilot"]

  # ── Program Clerk ────────────────────────────────────────────────────────────────────────
  # Maintains the programming product library: source files, build artifacts, change records,
  # and test logs. Brooks emphasises that the clerk is keeper of both machine-readable and
  # human-readable records, freeing the surgeon from administrative record-keeping.
  - role_id: "surg_program_clerk"
    display_name: "Program Clerk"
    description: >-
      Structured record-keeping. Catalogs source, artifacts, changelogs, and test results.
      Prefers machine-readable output. Falls back to surg_administrator.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the program clerk. Maintain precise, structured records of source files,
        build artifacts, changelogs, and test results. When asked to catalog or organise,
        produce consistent, predictably formatted output — prefer tables, lists, or JSON
        over prose. Flag discrepancies, missing entries, or version mismatches explicitly.
    routing_policy:
      preferred_families: ["qwen3", "qwen2.5", "phi4"]
      min_context: 8192
      require_loaded: true
      fallback_roles: ["surg_administrator"]

  # ── Administrator ────────────────────────────────────────────────────────────────────────
  # Handles everything outside the technical work: personnel, scheduling, priorities, and
  # resource allocation. In Brooks' account the surgeon is still the boss and keeps the last
  # word even on these matters, but the administrator handles them so that they take almost
  # none of the surgeon's time.
  - role_id: "surg_administrator"
    display_name: "Administrator"
    description: >-
      Logistics and coordination. Priorities, scheduling, resource allocation, status
      summaries. Defers all technical decisions to the chief programmer.
    operation: "chat"
    modality: "text"
    prompt_policy:
      system_prompt: >-
        You are the administrator. Handle logistics: priorities, scheduling, resource
        allocation, and process coordination. Produce concise, actionable summaries.
        Surface conflicts and blockers early. Do not make technical decisions —
        flag them for the chief programmer. Keep your output brief and task-oriented.
    routing_policy:
      preferred_families: ["qwen3", "qwen2.5", "mistral"]
      min_context: 4096
      require_loaded: true

  # ── Semantic Index ───────────────────────────────────────────────────────────────────────
  # Brooks does not name this role, but semantic retrieval over the product library is a
  # natural complement to the program clerk in an LLM-assisted team. Provides vector
  # embeddings for code, documentation, and artifact search.
  - role_id: "surg_semantic_index"
    display_name: "Semantic Index"
    description: >-
      Embeddings for semantic search over code, documentation, and artifacts.
      Supporting capability for the program clerk's retrieval and cross-reference tasks.
    operation: "embeddings"
    modality: "text"
    routing_policy:
      preferred_families: ["nomic-embed-text", "mxbai-embed-large", "bge"]
      require_loaded: true
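
  # ── Template for an additional role (illustrative, commented out) ────────────────────────
  # A sketch of how a further role could be added using only the keys already present in
  # this catalog. The role_id, display_name, and prompt text are hypothetical and belong
  # neither to Brooks' team nor to this catalog; the family list and context size are
  # placeholders to adapt to the models available on the node.
  # - role_id: "surg_example_role"
  #   display_name: "Example Role"
  #   description: >-
  #     One or two sentences on what the role does and which role it falls back to.
  #   operation: "chat"
  #   modality: "text"
  #   prompt_policy:
  #     system_prompt: >-
  #       You are the example role. State the task, the expected output format, and
  #       which decisions to defer to the chief programmer.
  #   routing_policy:
  #     preferred_families: ["qwen3", "qwen2.5"]
  #     min_context: 8192
  #     fallback_roles: ["surg_copilot"]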