Initial porting branch commit

This commit is contained in:
Diane Blackwood 2025-09-24 01:33:04 -04:00
parent d3d0b9ba83
commit d78688b486
14 changed files with 454 additions and 0 deletions

7
porting/.gitignore vendored Normal file

@@ -0,0 +1,7 @@
CONFIG.yml
*.tmp
*.bak
*.log
logs/*
!logs/.gitkeep

66
porting/README.md Normal file

@@ -0,0 +1,66 @@
# Porting Toolkit (MabeLabRS)
This directory contains scripts and prompts for LLM-assisted porting of C++ code from MABE2 to Rust.
## Quick start

1. Copy the example config and edit it:
   ```bash
   cp porting/CONFIG.example.yml porting/CONFIG.yml
   ```
2. Choose an LLM backend in `CONFIG.yml`:
   - `backend: ollama` (default host `http://localhost:11434`; model e.g. `qwen2.5-coder:32b`)
   - `backend: openai` (set the `OPENAI_API_KEY` env var)
3. Run a translation task (header → Rust traits):
   ```bash
   python3 porting/tools/translate_one.py \
     --mode header-to-traits \
     --input /path/to/MABE2/include/mabe/something.hpp \
     --skeleton crates/mabelabrs-core/src/something.rs \
     --out crates/mabelabrs-core/src/something.rs
   ```
4. Try an impl pass (C++ `.cpp` → fill the Rust skeleton):
   ```bash
   python3 porting/tools/translate_one.py \
     --mode impl-pass \
     --input /path/to/MABE2/source/something.cpp \
     --skeleton crates/mabelabrs-core/src/something.rs \
     --out crates/mabelabrs-core/src/something.rs
   ```
5. Compile and test the workspace:
   ```bash
   python3 porting/tools/compile_test.py
   ```
## Modes

- `header-to-traits`: produces Rust traits/structs with signatures and doc comments (no logic).
- `impl-pass`: fills in function bodies to match C++ semantics.
- `unit-tests`: generates unit tests (you provide a short spec).
- `review-fixit`: feeds compiler/test failures back to the model for a targeted fix.
## Logs & provenance

All LLM prompts/responses are saved under `porting/logs/` with timestamps to preserve provenance.

## Determinism reminder

Keep RNGs passed explicitly; avoid global state. See DETERMINISM.md and TRANSLATION_GLOSSARY.md at the repo root.

## Safety

- Never commit secrets; pass API keys only via environment variables.
- Generated code must compile and include tests in the same PR.
---

@@ -0,0 +1,28 @@
# Translation Playbook
This playbook outlines repeatable steps for translating C++ to Rust using LLMs.
## 0) Choose a small vertical slice
- One header + corresponding implementation.
- Confirm dependencies are already scaffolded (traits exist).
## 1) Header → Rust traits/structs (no logic)
- Input: C++ header.
- Extras: glossary, style, determinism docs, an example skeleton from a similar module.
- Output: compilable Rust module with `todo!()` bodies, doc comments preserved.
## 2) Implementation pass
- Input: C++ .cpp (or .hpp inline bodies), the Rust skeleton from step 1.
- Ask: preserve semantics; avoid global state; return `Result<_, _>` on fallible paths.
- Output: implemented functions + unit tests.
## 3) Compile & test
- Run `cargo build`, `cargo clippy -- -D warnings`, `cargo test`.
- If failures occur: feed *only* the relevant compiler error lines and the failing snippet into a `review-fixit` prompt (see the sketch after this playbook).
## 4) Parity (optional)
- If MABE2 is compiled locally, run tiny deterministic scenarios on both sides and compare key metrics (seed, steps, mean fitness); see the comparison sketch after the example scenario below.
- Record tolerances for float diffs.
## 5) PR discipline
- One module per PR; include tests and a brief README note summarizing changes and any TODOs.
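Steps 2 and 3 form a loop that can be scripted. Below is a minimal, hedged sketch of that loop; the skeleton/module paths are illustrative (taken from the README example), and the inline error filter mirrors the idea of the committed error-extraction helper rather than importing it:

```python
import subprocess

# Build once; on failure, feed only error lines back through review-fixit.
build = subprocess.run("cargo build 2>&1", shell=True,
                       capture_output=True, text=True)
if build.returncode != 0:
    # Keep only lines that look like compiler errors.
    errs = "\n".join(ln for ln in build.stdout.splitlines()
                     if "error[E" in ln or "error:" in ln)
    with open("porting/logs/last_errors.txt", "w", encoding="utf-8") as f:
        f.write(errs)
    subprocess.run(
        "python3 porting/tools/translate_one.py"
        " --mode review-fixit"
        " --input porting/logs/last_errors.txt"
        " --skeleton crates/mabelabrs-core/src/something.rs"
        " --out crates/mabelabrs-core/src/something.rs",
        shell=True, check=True,
    )
```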

@@ -0,0 +1,10 @@
{
  "seed": 12345,
  "world": {
    "topology": "grid_10x10"
  },
  "run": {
    "steps": 50
  }
}
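The playbook's optional parity step would consume a scenario like the one above on both the C++ and Rust sides. A minimal sketch of the comparison, where `run_scenario` is a hypothetical hook on each side and the tolerance is illustrative:

```python
import math

def metrics_match(cpp: dict, rust: dict, tol: float = 1e-9) -> bool:
    # run_scenario(...) on each side would return metrics keyed like the
    # playbook's, e.g. {"seed": 12345, "steps": 50, "mean_fitness": ...}.
    # Seed and step counts must match exactly; floats only within a
    # recorded tolerance.
    return (cpp["seed"] == rust["seed"]
            and cpp["steps"] == rust["steps"]
            and math.isclose(cpp["mean_fitness"], rust["mean_fitness"],
                             abs_tol=tol))
```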

@@ -0,0 +1,27 @@
System:
You are translating C++ headers from the MABE2 artificial-life framework into idiomatic, safe Rust for the MabeLabRS project. Follow the provided glossary, style, and determinism policies exactly.
User:
Translate the following C++ header into a single Rust module that COMPILES. Goals:
- Public API only (traits/structs/type aliases), no logic (use `todo!()` where needed).
- Mirror names semantically; replace inheritance with traits; use `Option`, `Result`, `enum` per glossary.
- Add `///` doc comments copied/adapted from C++ comments.
- Derives: `Debug, Clone, PartialEq, Eq` where sensible; `serde` derives for config/state structs.
- No `unsafe`. No global mutable state.
Glossary (C++→Rust):
{{GLOSSARY}}
Style:
{{STYLE}}
Determinism:
{{DETERMINISM}}
Existing Rust skeleton (if any):
{{SKELETON}}
C++ header to translate:
```cpp
{{SOURCE_CHUNK}}
```

@@ -0,0 +1,33 @@
System:
You complete Rust implementations to match C++ semantics precisely, keeping APIs stable.
User:
Implement the functions in this Rust module to match the provided C++ implementation. Requirements:
- Keep signatures and types from the skeleton.
- Preserve semantics; explicit RNG passed in where randomness is used.
- Avoid panics; return `Result` on fallible paths.
- Add/extend unit tests under `#[cfg(test)]` using given spec.
Glossary:
{{GLOSSARY}}
Style:
{{STYLE}}
Determinism:
{{DETERMINISM}}
Current Rust module:
```rust
{{SKELETON}}
```
C++ impl (reference semantics):
```cpp
{{SOURCE_CHUNK}}
```
Test spec (pseudo or C++ tests):
```text
{{TEST_SPEC}}
```
Return the full updated Rust module.

@@ -0,0 +1,18 @@
System:
You write focused, deterministic Rust tests.
User:
Write Rust unit tests for the module below, covering edge cases and invariants.
Determinism:
{{DETERMINISM}}
Module under test:
```rust
{{SKELETON}}
```
Spec (given behaviors & invariants):
```text
{{TEST_SPEC}}
```
Return only the `#[cfg(test)]` test module.

@@ -0,0 +1,18 @@
System:
You are a strict Rust reviewer fixing compilation and test failures with minimal edits.
User:
Given the Rust snippet and the **actual** compiler/test errors, propose code changes to fix them while preserving semantics and policies. Return the corrected snippet only, with no commentary.
Policies:
{{STYLE}}
{{DETERMINISM}}
Snippet:
```rust
{{SNIPPET}}
```
Errors:
```text
{{ERRORS}}
```

17
porting/tools/chunker.py Normal file

@@ -0,0 +1,17 @@
from typing import List

def chunk_text(s: str, max_chars: int) -> List[str]:
    if len(s) <= max_chars:
        return [s]
    chunks = []
    start = 0
    while start < len(s):
        end = min(len(s), start + max_chars)
        # Try to split on a newline (a cheap proxy for a function boundary);
        # if no newline exists in range, or the nearest one is too close to
        # the chunk start, hard-cut at max_chars instead.
        newline = s.rfind("\n", start, end)
        if newline == -1 or newline <= start + 1000:
            newline = end
        chunks.append(s[start:newline])
        start = newline
    return chunks
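translate_one.py imports `chunk_text` but, as committed, sends whole files to the model. A hedged sketch of how chunking could slot in for oversized sources; the `max_chars` value is illustrative:

```python
with open("/path/to/MABE2/include/mabe/something.hpp", encoding="utf-8") as f:
    src = f.read()

# Each piece would be rendered into the prompt as one {{SOURCE_CHUNK}}.
for i, piece in enumerate(chunk_text(src, max_chars=12000)):
    print(f"chunk {i}: {len(piece)} chars")
```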

28
porting/tools/compile_test.py Normal file

@@ -0,0 +1,28 @@
import subprocess
import yaml
import os
import sys

def run(cmd: str) -> int:
    print(f"$ {cmd}", flush=True)
    p = subprocess.run(cmd, shell=True)
    return p.returncode

def main():
    # Prefer the user's CONFIG.yml; fall back to the tracked example.
    cfg_path = "porting/CONFIG.yml"
    if not os.path.exists(cfg_path):
        cfg_path = "porting/CONFIG.example.yml"
    with open(cfg_path, "r", encoding="utf-8") as f:
        cfg = yaml.safe_load(f)
    # Run build, clippy, and tests in order; stop at the first failure.
    for key in ("build_cmd", "clippy_cmd", "test_cmd"):
        cmd = cfg["compile"].get(key)
        if cmd:
            rc = run(cmd)
            if rc != 0:
                sys.exit(rc)

if __name__ == "__main__":
    main()

61
porting/tools/llm_clients.py Normal file

@@ -0,0 +1,61 @@
import os
import time
import uuid
from typing import Dict, Any

import requests

class LLMClient:
    def __init__(self, cfg: Dict[str, Any], logs_dir: str):
        self.cfg = cfg
        self.logs_dir = logs_dir
        os.makedirs(self.logs_dir, exist_ok=True)

    def _log(self, role: str, content: str, uid: str):
        # One timestamped markdown file per message, for provenance.
        path = os.path.join(self.logs_dir,
                            f"{time.strftime('%Y%m%d-%H%M%S')}-{uid}-{role}.md")
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)

    def chat(self, system: str, user: str) -> str:
        uid = str(uuid.uuid4())[:8]
        backend = self.cfg.get("backend", "ollama")
        if backend == "ollama":
            # Ollama exposes an OpenAI-compatible chat endpoint under /v1.
            host = self.cfg["ollama"]["host"]
            model = self.cfg["ollama"]["model"]
            payload = {
                "model": model,
                "messages": [
                    {"role": "system", "content": system},
                    {"role": "user", "content": user},
                ],
                "stream": False,
            }
            self._log("system", system, uid)
            self._log("user", user, uid)
            r = requests.post(f"{host}/v1/chat/completions", json=payload, timeout=600)
            r.raise_for_status()
            data = r.json()
            content = data["choices"][0]["message"]["content"]
            self._log("assistant", content, uid)
            return content
        elif backend == "openai":
            import openai  # requires openai>=1.0
            if not os.environ.get("OPENAI_API_KEY"):
                raise RuntimeError("Missing OPENAI_API_KEY")
            model = self.cfg["openai"]["model"]
            messages = [
                {"role": "system", "content": system},
                {"role": "user", "content": user},
            ]
            self._log("system", system, uid)
            self._log("user", user, uid)
            client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
            resp = client.chat.completions.create(model=model, messages=messages)
            content = resp.choices[0].message.content
            self._log("assistant", content, uid)
            return content
        else:
            raise ValueError(f"Unknown backend: {backend}")
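A minimal sketch of driving `LLMClient` directly, assuming a CONFIG.yml shaped like the one the tools load; the prompt strings are illustrative:

```python
import yaml

with open("porting/CONFIG.yml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# Logs land in the configured directory, one file per message.
client = LLMClient(cfg, cfg["logs"]["dir"])
reply = client.chat(
    system="You translate C++ headers into idiomatic, safe Rust.",
    user="Translate: struct Foo { int x; };",
)
print(reply)
```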

7
porting/tools/prompt_inject.py Normal file

@@ -0,0 +1,7 @@
def render_template(tmpl_str: str, **kwargs) -> str:
    # The prompt files use {{VAR}} placeholders (not string.Template's ${VAR}),
    # so do a simple literal replacement, rendering None as an empty string.
    out = tmpl_str
    for k, v in kwargs.items():
        out = out.replace("{{" + k + "}}", "" if v is None else str(v))
    return out
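For example, with the `{{VAR}}` semantics above:

```python
tmpl = "Glossary:\n{{GLOSSARY}}\nStyle:\n{{STYLE}}"
rendered = render_template(tmpl, GLOSSARY="std::string -> String", STYLE=None)
# None renders as an empty string, so the Style section ends up blank:
assert rendered == "Glossary:\nstd::string -> String\nStyle:\n"
```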

@@ -0,0 +1,17 @@
import sys

def extract_errors(s: str, limit: int = 120):
    # Keep only lines that look like compiler or test failures.
    errs = []
    for ln in s.splitlines():
        if any(tok in ln for tok in ("error[E", "panicked at", "FAILED", "error:")):
            errs.append(ln)
            if len(errs) >= limit:
                break
    return "\n".join(errs)

if __name__ == "__main__":
    text = sys.stdin.read()
    print(extract_errors(text))
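A quick check of what the filter keeps; the compiler output below is invented for illustration:

```python
sample = """\
   Compiling mabelabrs-core v0.1.0
error[E0308]: mismatched types
  --> src/something.rs:42:9
test topology::grid ... FAILED
"""
print(extract_errors(sample))
# error[E0308]: mismatched types
# test topology::grid ... FAILED
```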

117
porting/tools/translate_one.py Normal file

@@ -0,0 +1,117 @@
#!/usr/bin/env python3
import argparse, os, sys, subprocess, yaml

from llm_clients import LLMClient
from prompt_inject import render_template
from chunker import chunk_text

def read(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

def write(path, content):
    # Guard: --out may be a bare filename with no directory component.
    d = os.path.dirname(path)
    if d:
        os.makedirs(d, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)

def load_cfg():
    cfg_path = "porting/CONFIG.yml"
    if not os.path.exists(cfg_path):
        cfg_path = "porting/CONFIG.example.yml"
    with open(cfg_path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)

def main():
    p = argparse.ArgumentParser()
    p.add_argument("--mode", required=True,
                   choices=["header-to-traits", "impl-pass", "unit-tests", "review-fixit"])
    p.add_argument("--input", required=True,
                   help="Path to C++ source (or error log for review-fixit)")
    p.add_argument("--skeleton", default=None, help="Existing Rust module to guide/patch")
    p.add_argument("--out", required=True,
                   help="Output Rust file (or tests module for unit-tests)")
    p.add_argument("--test-spec", default=None, help="Text spec or path to spec")
    args = p.parse_args()

    cfg = load_cfg()
    client = LLMClient(cfg, cfg["logs"]["dir"])
    glossary = read(cfg["project"]["glossary_path"])
    style = read(cfg["project"]["style_path"])
    determinism = read(cfg["project"]["determinism_path"])

    def load_prompt(name):
        return read(cfg["prompts"][name])

    system = ""
    if args.mode == "header-to-traits":
        tmpl = load_prompt("header_to_traits")
        src = read(args.input)
        skel = read(args.skeleton) if args.skeleton and os.path.exists(args.skeleton) else ""
        user = render_template(
            tmpl,
            GLOSSARY=glossary,
            STYLE=style,
            DETERMINISM=determinism,
            SKELETON=skel,
            SOURCE_CHUNK=src,
        )
        resp = client.chat(system, user)
        write(args.out, resp)
    elif args.mode == "impl-pass":
        tmpl = load_prompt("impl_pass")
        src = read(args.input)
        skel = read(args.skeleton) if args.skeleton and os.path.exists(args.skeleton) else ""
        test_spec = read(args.test_spec) if args.test_spec and os.path.exists(args.test_spec) else (args.test_spec or "")
        user = render_template(
            tmpl,
            GLOSSARY=glossary,
            STYLE=style,
            DETERMINISM=determinism,
            SKELETON=skel,
            SOURCE_CHUNK=src,
            TEST_SPEC=test_spec,
        )
        resp = client.chat(system, user)
        write(args.out, resp)
    elif args.mode == "unit-tests":
        tmpl = load_prompt("unit_tests")
        skel = read(args.skeleton) if args.skeleton and os.path.exists(args.skeleton) else ""
        test_spec = read(args.test_spec) if args.test_spec and os.path.exists(args.test_spec) else (args.test_spec or "")
        user = render_template(
            tmpl,
            DETERMINISM=determinism,
            SKELETON=skel,
            TEST_SPEC=test_spec,
        )
        resp = client.chat(system, user)
        # Append to an existing file, or create it:
        if os.path.exists(args.out):
            write(args.out, read(args.out) + "\n\n" + resp)
        else:
            write(args.out, resp)
    elif args.mode == "review-fixit":
        tmpl = load_prompt("review_fixit")
        errors = read(args.input)
        skel = read(args.skeleton) if args.skeleton and os.path.exists(args.skeleton) else ""
        user = render_template(
            tmpl,
            STYLE=style,
            DETERMINISM=determinism,
            SNIPPET=skel,
            ERRORS=errors,
        )
        resp = client.chat(system, user)
        write(args.out, resp)
    else:
        # Unreachable: argparse restricts --mode to the choices above.
        print(f"Unknown mode: {args.mode}", file=sys.stderr)
        sys.exit(2)

    # Compile after generation if configured.
    bcmd = cfg["compile"].get("build_cmd")
    if bcmd:
        subprocess.run(bcmd, shell=True)

if __name__ == "__main__":
    main()
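The tools above agree on a single config shape. Below is a sketch of that shape as the dict `yaml.safe_load` would return: every key is one the committed code reads, but the values and the prompt/style paths are placeholders, not the real CONFIG.example.yml (which this commit view does not show):

```python
CONFIG = {
    "backend": "ollama",  # llm_clients.py: "ollama" or "openai"
    "ollama": {"host": "http://localhost:11434", "model": "qwen2.5-coder:32b"},
    "openai": {"model": "gpt-4o"},  # model name illustrative
    "logs": {"dir": "porting/logs"},
    "project": {
        "glossary_path": "TRANSLATION_GLOSSARY.md",  # repo-root docs per the README
        "style_path": "STYLE.md",                    # path illustrative
        "determinism_path": "DETERMINISM.md",
    },
    "prompts": {  # paths illustrative; keys are the ones translate_one.py loads
        "header_to_traits": "porting/prompts/header_to_traits.md",
        "impl_pass": "porting/prompts/impl_pass.md",
        "unit_tests": "porting/prompts/unit_tests.md",
        "review_fixit": "porting/prompts/review_fixit.md",
    },
    "compile": {
        "build_cmd": "cargo build",
        "clippy_cmd": "cargo clippy -- -D warnings",
        "test_cmd": "cargo test",
    },
}
```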