After Codex fixes to test, ingestion, run, and visualization.

This commit is contained in:
welsberr 2026-03-14 17:19:15 -04:00
parent 420cdf0964
commit 272994d2f5
30 changed files with 1264 additions and 575 deletions

File diff suppressed because one or more lines are too long


@ -1,350 +0,0 @@
# Didactopus
![Didactopus mascot](artwork/didactopus-mascot.png)
**Didactopus** is a local-first AI-assisted autodidactic mastery platform for building genuine expertise through concept graphs, adaptive curriculum planning, evidence-driven mastery, Socratic mentoring, and project-based learning.
**Tagline:** *Many arms, one goal — mastery.*
## Recent revisions
### Interactive Domain Review
This revision upgrades the earlier static review scaffold into an **interactive local SPA review UI**.
The new review layer is meant to help a human curator work through draft packs created
by the ingestion pipeline and promote them into more trusted reviewed packs.
#### Why this matters
One of the practical problems with using open online course content is that the material
is often scattered, inconsistently structured, awkward to reuse, and cognitively expensive
to turn into something actionable.
Even when excellent course material exists, there is often a real **activation energy hump**
between:
- finding useful content
- extracting the structure
- organizing the concepts
- deciding what to trust
- getting a usable learning domain set up
Didactopus is meant to help overcome that hump.
Its ingestion and review pipeline should let a motivated learner or curator get from
"here is a pile of course material" to "here is a usable reviewed domain pack" with
substantially less friction.
#### What is included
- interactive React SPA review UI
- JSON-backed review state model
- curation action application
- promoted-pack export
- reviewer notes and trust-status editing
- conflict resolution support
- README and FAQ updates reflecting the activation-energy goal
- sample review data and promoted pack output
#### Core workflow
1. ingest course or topic materials into a draft pack
2. open the review UI
3. inspect concepts, conflicts, and review flags
4. edit statuses, notes, titles, descriptions, and prerequisites
5. resolve conflicts
6. export a promoted reviewed pack
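As a rough sketch, the JSON-backed review state behind this workflow can be modeled as a plain dictionary keyed by concept, with promotion gated on every concept reaching a trusted status. The field names here are illustrative assumptions, not the actual Didactopus schema:

```python
# Hypothetical review-state shape; field names are illustrative,
# not the real Didactopus review-state schema.
review_state = {
    "pack": "draft-information-theory",
    "concepts": {
        "entropy-basics": {"status": "trusted", "note": "Clear prerequisites."},
        "channel-capacity": {"status": "provisional", "note": "Needs worked examples."},
    },
}

def promotable(state: dict) -> bool:
    # A draft is promotable only once every concept is marked trusted.
    return all(c["status"] == "trusted" for c in state["concepts"].values())
```

Editing statuses in the UI then reduces the export step to a pure check over this state.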
#### Why the review UI matters for course ingestion
In practice, course ingestion is not only a parsing problem. It is a **startup friction**
problem. A person may know what they want to study, and even know that good material exists,
but still fail to start because turning raw educational material into a coherent mastery
domain is too much work.
Didactopus should reduce that work enough that getting started becomes realistic.
### Review workflow
This revision adds a **review UI / curation workflow scaffold** for generated draft packs.
The purpose is to let a human reviewer inspect draft outputs from the course/topic
ingestion pipeline, make explicit curation decisions, and promote a reviewed draft
into a more trusted domain pack.
#### What is included
- review-state schema
- draft-pack loader
- curation action model
- review decision ledger
- promoted-pack writer
- static HTML review UI scaffold
- JSON data export for the UI
- sample curated review session
- sample promoted pack output
#### Core idea
Draft packs should not move directly into trusted use.
Instead, they should pass through a curation workflow where a reviewer can:
- merge concepts
- split concepts
- edit prerequisites
- mark concepts as trusted / provisional / rejected
- resolve conflict flags
- annotate rationale
- promote a curated pack into a reviewed pack
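One way to picture the curation step: each reviewer decision becomes an explicit action applied to the draft pack, which keeps the decision ledger replayable. This is a minimal sketch with assumed action kinds and field names, not the actual curation action model:

```python
from dataclasses import dataclass, field

# Illustrative curation-action model; action kinds and pack fields
# are assumptions for this sketch.
@dataclass
class CurationAction:
    kind: str                  # "set_status", "edit_prereqs", or "merge"
    concept_id: str
    payload: dict = field(default_factory=dict)

def apply_action(pack: dict, action: CurationAction) -> None:
    concept = pack["concepts"][action.concept_id]
    if action.kind == "set_status":
        concept["status"] = action.payload["status"]
    elif action.kind == "edit_prereqs":
        concept["prerequisites"] = list(action.payload["prerequisites"])
    elif action.kind == "merge":
        # Fold another concept into this one, uniting prerequisites.
        other = pack["concepts"].pop(action.payload["merge_from"])
        concept["prerequisites"] = sorted(
            set(concept.get("prerequisites", [])) | set(other.get("prerequisites", []))
        )

pack = {"concepts": {
    "entropy": {"status": "provisional", "prerequisites": ["probability"]},
    "entropy-intro": {"status": "provisional", "prerequisites": ["counting"]},
}}
apply_action(pack, CurationAction("merge", "entropy", {"merge_from": "entropy-intro"}))
apply_action(pack, CurationAction("set_status", "entropy", {"status": "trusted"}))
```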
#### Status
This is a scaffold for a local-first workflow.
The HTML UI is static but wired to a concrete JSON review-state model so it can
later be upgraded into a richer SPA or desktop app without changing the data contracts.
### Course-to-course merger
This revision adds two major capabilities:
- **real document adapter scaffolds** for PDF, DOCX, PPTX, and HTML
- a **cross-course merger** for combining multiple course-derived packs into one stronger domain draft
These additions extend the earlier multi-source ingestion layer from "multiple files for one course"
to "multiple courses or course-like sources for one topic domain."
#### What is included
- adapter registry for:
- PDF
- DOCX
- PPTX
- HTML
- Markdown
- text
- normalized document extraction interface
- course bundle ingestion across multiple source documents
- cross-course terminology and overlap analysis
- merged topic-pack emitter
- cross-course conflict report
- example source files and example merged output
#### Design stance
This is still scaffold-level extraction. The purpose is to define stable interfaces and emitted artifacts,
not to claim perfect semantic parsing of every teaching document.
The implementation is designed so stronger parsers can later replace the stub extractors without changing
the surrounding pipeline.
### Multi-Source Course Ingestion
This revision adds a **Multi-Source Course Ingestion Layer**.
The pipeline can now accept multiple source files representing the same course or
topic domain, normalize them into a shared intermediate representation, merge them,
and emit a single draft Didactopus pack plus a conflict report.
#### Supported scaffold source types
Current scaffold adapters:
- Markdown (`.md`)
- Plain text (`.txt`)
- HTML-ish text (`.html`, `.htm`)
- Transcript text (`.transcript.txt`)
- Syllabus text (`.syllabus.txt`)
This revision is intentionally adapter-oriented, so future PDF, slide, and DOCX
adapters can be added behind the same interface.
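The adapter-oriented design can be sketched as a registry keyed by file extension, so new source types slot in behind one dispatch function. The function and field names below are illustrative, not the actual Didactopus adapter interface:

```python
from pathlib import Path
from typing import Callable

# Hypothetical adapter registry; the real adapters normalize full
# module/lesson structure rather than a flat section list.
ADAPTERS: dict[str, Callable[[str], dict]] = {}

def register(*extensions: str):
    def wrap(fn: Callable[[str], dict]):
        for ext in extensions:
            ADAPTERS[ext] = fn
        return fn
    return wrap

@register(".md")
def parse_markdown(text: str) -> dict:
    # Treat heading lines as section candidates.
    titles = [line.lstrip("# ") for line in text.splitlines() if line.startswith("#")]
    return {"kind": "markdown", "sections": titles}

@register(".txt")
def parse_plain_text(text: str) -> dict:
    return {"kind": "text", "sections": [text.splitlines()[0]] if text else []}

def adapt(path: str, text: str) -> dict:
    adapter = ADAPTERS.get(Path(path).suffix.lower())
    if adapter is None:
        raise ValueError(f"No adapter for {path}")
    return adapter(text)
```

A future PDF or DOCX adapter would only need to register its extension against the same normalized output shape.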
#### What is included
- multi-source adapter dispatch
- normalized source records
- source merge logic
- cross-source terminology conflict report
- duplicate lesson/title detection
- merged draft pack emission
- merged attribution manifest
- sample multi-source inputs
- sample merged output pack
### Course Ingestion Pipeline
This revision adds a **Course-to-Pack Ingestion Pipeline** plus a **stable rule-policy adapter layer**.
The design goal is to turn open or user-supplied course materials into draft
Didactopus domain packs without introducing a brittle external rule-engine dependency.
#### Why no third-party rule engine here?
To minimize dependency risk, this scaffold uses a small declarative rule-policy
adapter implemented in pure Python and standard-library data structures.
That gives Didactopus:
- portable rules
- inspectable rule definitions
- deterministic behavior
- zero extra runtime dependency for policy evaluation
If a stronger rule engine is needed later, this adapter can remain the stable API surface.
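The pure-Python rule-policy idea can be sketched as a list of named check functions run over a shared context. The rule names and context fields here are assumptions for illustration; the actual adapter covers more of the course structure:

```python
from dataclasses import dataclass, field
from typing import Callable

# Minimal sketch of a declarative rule policy in pure Python;
# names and signatures are illustrative, not the real adapter API.
@dataclass
class RuleContext:
    concepts: list[dict]
    review_flags: list[str] = field(default_factory=list)

Rule = tuple[str, Callable[[RuleContext], None]]

def flag_missing_prereqs(ctx: RuleContext) -> None:
    for concept in ctx.concepts:
        if not concept.get("prerequisites"):
            ctx.review_flags.append(f"{concept['id']}: no prerequisites declared")

def flag_short_titles(ctx: RuleContext) -> None:
    for concept in ctx.concepts:
        if len(concept.get("title", "")) < 4:
            ctx.review_flags.append(f"{concept['id']}: title too short")

def run_rules(ctx: RuleContext, rules: list[Rule]) -> RuleContext:
    # Rules run in order and only append flags, so evaluation is deterministic.
    for _name, check in rules:
        check(ctx)
    return ctx

ctx = run_rules(
    RuleContext(concepts=[{"id": "c1", "title": "Entropy", "prerequisites": []}]),
    [("missing-prereqs", flag_missing_prereqs), ("short-titles", flag_short_titles)],
)
```

Because rules are plain data plus functions, they stay portable and inspectable with zero extra dependencies.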
#### What is included
- normalized course schema
- Markdown/HTML-ish text ingestion adapter
- module / lesson / objective extraction
- concept candidate extraction
- prerequisite guess generation
- rule-policy adapter
- draft pack emitter
- review report generation
- sample course input
- sample generated pack outputs
### Mastery Ledger
This revision adds a **Mastery Ledger + Capability Export** layer.
The main purpose is to let Didactopus turn accumulated learner state into
portable, inspectable artifacts that can support downstream deployment,
review, orchestration, or certification-like workflows.
#### What is new
- mastery ledger data model
- capability profile export
- JSON export of mastered concepts and evaluator summaries
- Markdown export of a readable capability report
- artifact manifest for produced deliverables
- demo CLI for generating exports for an AI student or human learner
- FAQ covering how learned mastery is represented and put to work
#### Why this matters
Didactopus can now do more than guide learning. It can also emit a structured
statement of what a learner appears able to do, based on explicit concepts,
evidence, and artifacts.
That makes it easier to use Didactopus as:
- a mastery tracker
- a portfolio generator
- a deployment-readiness aid
- an orchestration input for agent routing
#### Mastery representation
A learner's mastery is represented as structured operational state, including:
- mastered concepts
- evaluator results
- evidence summaries
- weak dimensions
- attempt history
- produced artifacts
- capability export
This is stricter than a normal chat transcript or self-description.
#### Future direction
A later revision should connect the capability export with:
- formal evaluator outputs
- signed evidence ledgers
- domain-specific capability schemas
- deployment policies for agent routing
### Evaluator Pipeline
This revision introduces a **pluggable evaluator pipeline** that converts
learner attempts into structured mastery evidence.
### Agentic Learner Loop
This revision adds an **agentic learner loop** that turns Didactopus into a closed-loop mastery system prototype.
The loop can now:
- choose the next concept via the graph-aware planner
- generate a synthetic learner attempt
- score the attempt into evidence
- update mastery state
- repeat toward a target concept
This is still scaffold-level, but it is the first explicit implementation of the idea that **Didactopus can supervise not only human learners, but also AI student agents**.
## Complete overview to this point
Didactopus currently includes:
- **Domain packs** for concepts, projects, rubrics, mastery profiles, templates, and cross-pack links
- **Dependency resolution** across packs
- **Merged learning graph** generation
- **Concept graph engine** for cross-pack prerequisite reasoning, linking, pathfinding, and export
- **Adaptive learner engine** for ready, blocked, and mastered concepts
- **Evidence engine** with weighted, recency-aware, multi-dimensional mastery inference
- **Concept-specific mastery profiles** with template inheritance
- **Graph-aware planner** for utility-ranked next-step recommendations
- **Agentic learner loop** for iterative goal-directed mastery acquisition
## Agentic AI students
An AI student under Didactopus is modeled as an **agent that accumulates evidence against concept mastery criteria**.
It does not “learn” in the same sense that model weights are retrained inside Didactopus. Instead, its learned mastery is represented as:
- current mastered concept set
- evidence history
- dimension-level competence summaries
- concept-specific weak dimensions
- adaptive plan state
- optional artifacts, explanations, project outputs, and critiques it has produced
In other words, Didactopus represents mastery as a **structured operational state**, not merely a chat transcript.
That state can be put to work by:
- selecting tasks the agent is now qualified to attempt
- routing domain-relevant problems to the agent
- exposing mastered concept profiles to orchestration logic
- using evidence summaries to decide whether the agent should act, defer, or review
- exporting a mastery portfolio for downstream use
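The act/defer/review decision above can be sketched as a pure function over a concept-level evidence summary. The thresholds and field names are assumptions for the sketch, not the actual orchestration API:

```python
# Illustrative gating check: route an agent based on evidence state.
# "confidence" and "weak_dimensions" mirror the summary fields described
# in the text; the 0.8 default is an assumed threshold.
def routing_decision(summary: dict, required_confidence: float = 0.8) -> str:
    if summary["weak_dimensions"]:
        return "review"   # shore up weak dimensions before acting
    if summary["confidence"] < required_confidence:
        return "defer"    # not enough accumulated evidence yet
    return "act"

strong = {"confidence": 0.9, "weak_dimensions": []}
thin = {"confidence": 0.4, "weak_dimensions": []}
uneven = {"confidence": 0.9, "weak_dimensions": ["critique"]}
```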
## FAQ
See:
- `docs/faq.md`
## Correctness and formal knowledge components
See:
- `docs/correctness-and-knowledge-engine.md`
Short version: yes, there is a strong argument that Didactopus will eventually benefit from a more formal knowledge-engine layer, especially for domains where correctness can be stated in symbolic, logical, computational, or rule-governed terms.
A good future architecture is likely **hybrid**:
- LLM/agentic layer for explanation, synthesis, critique, and exploration
- formal knowledge engine for rule checking, constraint satisfaction, proof support, symbolic validation, and executable correctness checks
## Repository structure
```text
didactopus/
├── README.md
├── artwork/
├── configs/
├── docs/
├── examples/
├── src/didactopus/
├── tests/
└── webui/
```


@ -0,0 +1,23 @@
# Codex working notes for Didactopustry1
## Priority
Stabilize public API compatibility before adding new features.
## Compatibility policy
Older test-facing public names should remain importable even if implemented as wrappers.
## Python version
Assume Python 3.10 compatibility.
## Preferred workflow
1. Read failing tests.
2. Patch the smallest number of files.
3. Run targeted pytest modules.
4. Run full pytest.
5. Summarize by subsystem.
## Avoid
- broad renames
- deleting newer architecture
- refactors unrelated to failing tests


@ -57,5 +57,11 @@ def build_adaptive_plan(merged: MergedLearningGraph, profile: LearnerProfile, ne
        p for p in merged.project_catalog
        if set(p["prerequisites"]).issubset(profile.mastered_concepts)
    ]

    def ready_priority(concept_key: str) -> tuple[int, int, str]:
        prereqs = list(merged.graph.predecessors(concept_key))
        mastered_prereqs = sum(1 for prereq in prereqs if prereq in profile.mastered_concepts)
        return (0 if mastered_prereqs else 1, -mastered_prereqs, concept_key)

    next_best = sorted((k for k, s in status.items() if s == "ready"), key=ready_priority)[:next_limit]
    return AdaptivePlan(status, roadmap, next_best, eligible)


@ -1,5 +1,6 @@
from dataclasses import dataclass, field
from .concept_graph import ConceptGraph
from .evaluator_pipeline import (
    LearnerAttempt,
    RubricEvaluator,
@ -10,6 +11,7 @@ from .evaluator_pipeline import (
    run_pipeline,
    aggregate,
)
from .planner import PlannerWeights, rank_next_concepts
@dataclass
@ -130,3 +132,30 @@ def run_demo_agentic_loop(concepts: list[str]) -> AgenticStudentState:
        attempt = synthetic_attempt_for_concept(concept)
        integrate_attempt(state, attempt)
    return state
def run_agentic_learning_loop(
    graph: ConceptGraph,
    project_catalog: list[dict],
    target_concepts: list[str],
    weights: PlannerWeights,
    max_steps: int = 4,
) -> AgenticStudentState:
    state = AgenticStudentState()
    for _ in range(max_steps):
        ranked = rank_next_concepts(
            graph=graph,
            mastered=state.mastered_concepts,
            targets=target_concepts,
            weak_dimensions_by_concept={},
            fragile_concepts=state.evidence_state.resurfaced_concepts,
            project_catalog=project_catalog,
            weights=weights,
        )
        if not ranked:
            break
        concept = ranked[0]["concept"]
        integrate_attempt(state, synthetic_attempt_for_concept(concept))
        if set(target_concepts).issubset(state.mastered_concepts):
            break
    return state


@ -1,6 +1,12 @@
from __future__ import annotations

import os
from pathlib import Path
from typing import Any

import yaml
from pydantic import BaseModel, Field
class Settings(BaseModel):
    database_url: str = os.getenv("DIDACTOPUS_DATABASE_URL", "sqlite+pysqlite:///:memory:")
@ -9,5 +15,53 @@ class Settings(BaseModel):
    jwt_secret: str = os.getenv("DIDACTOPUS_JWT_SECRET", "change-me")
    jwt_algorithm: str = "HS256"

class ReviewConfig(BaseModel):
    default_reviewer: str = "Unknown Reviewer"
    write_promoted_pack: bool = True

class BridgeConfig(BaseModel):
    host: str = "127.0.0.1"
    port: int = 8765
    registry_path: str = "workspace_registry.json"
    default_workspace_root: str = "workspaces"

class PlatformConfig(BaseModel):
    dimension_thresholds: dict[str, float] = Field(
        default_factory=lambda: {
            "correctness": 0.8,
            "explanation": 0.75,
            "transfer": 0.7,
            "project_execution": 0.75,
            "critique": 0.7,
        }
    )
    confidence_threshold: float = 0.8

    @property
    def default_dimension_thresholds(self) -> dict[str, float]:
        return self.dimension_thresholds

class AppConfig(BaseModel):
    review: ReviewConfig = Field(default_factory=ReviewConfig)
    bridge: BridgeConfig = Field(default_factory=BridgeConfig)
    platform: PlatformConfig = Field(default_factory=PlatformConfig)

def load_settings() -> Settings:
    return Settings()

def load_config(path: str | Path) -> AppConfig:
    data = yaml.safe_load(Path(path).read_text(encoding="utf-8")) or {}
    return AppConfig.model_validate(_with_platform_defaults(data))

def _with_platform_defaults(data: dict[str, Any]) -> dict[str, Any]:
    normalized = dict(data)
    if "platform" not in normalized:
        normalized["platform"] = {}
    return normalized


@ -116,6 +116,16 @@ def parse_source_file(path: str | Path, title: str = "") -> NormalizedSourceReco
    return parse_markdown_like(text=text, title=inferred_title, source_name=p.name, source_path=str(p))

def parse_markdown_course(text: str, course_title: str, rights_note: str = "") -> NormalizedCourse:
    record = parse_markdown_like(
        text=text,
        title=course_title,
        source_name=f"{slugify(course_title)}.md",
        source_path=f"{slugify(course_title)}.md",
    )
    return merge_source_records([record], course_title=course_title, rights_note=rights_note)

def merge_source_records(records: list[NormalizedSourceRecord], course_title: str, rights_note: str = "", merge_same_named_lessons: bool = True) -> NormalizedCourse:
    modules_by_title: dict[str, Module] = {}
    for record in records:


@ -16,6 +16,11 @@ class NormalizedDocument(BaseModel):
    metadata: dict = Field(default_factory=dict)

class NormalizedSourceRecord(NormalizedDocument):
    source_name: str = ""
    modules: list["Module"] = Field(default_factory=list)

class Lesson(BaseModel):
    title: str
    body: str = ""


@ -1,2 +1,35 @@
from __future__ import annotations
from .pack_validator import load_pack_artifacts
def coverage_alignment_for_pack(source_dir):
    loaded = load_pack_artifacts(source_dir)
    if not loaded["ok"]:
        return {"warnings": [], "summary": {"coverage_warning_count": 0}}
    arts = loaded["artifacts"]
    concepts = arts["concepts"].get("concepts", []) or []
    roadmap = arts["roadmap"].get("stages", []) or []
    projects = arts["projects"].get("projects", []) or []
    covered = set()
    for stage in roadmap:
        covered.update(stage.get("concepts", []) or [])
    for project in projects:
        covered.update(project.get("prerequisites", []) or [])
    warnings = []
    for concept in concepts:
        concept_id = concept.get("id", "")
        if concept_id and concept_id not in covered:
            warnings.append(f"Concept '{concept_id}' is not covered by the roadmap or project prerequisites.")
    return {
        "warnings": warnings,
        "summary": {
            "coverage_warning_count": len(warnings),
            "concept_count": len(concepts),
            "covered_concept_count": len(covered),
        },
    }


@ -24,3 +24,12 @@ class DomainMap:
    def topological_sequence(self) -> list[str]:
        return list(nx.topological_sort(self.graph))

def build_demo_domain_map(domain_name: str) -> DomainMap:
    dmap = DomainMap(domain_name)
    dmap.add_concept(ConceptNode(name="foundations"))
    dmap.add_concept(ConceptNode(name="methods", prerequisites=["foundations"]))
    dmap.add_concept(ConceptNode(name="analysis", prerequisites=["methods"]))
    dmap.add_concept(ConceptNode(name="projects", prerequisites=["analysis"]))
    return dmap


@ -1,2 +1,43 @@
from __future__ import annotations
import re
from .pack_validator import load_pack_artifacts
def _tok(text: str) -> set[str]:
    return {part for part in re.sub(r"[^a-z0-9]+", " ", str(text).lower()).split() if part}

def evaluator_alignment_for_pack(source_dir):
    loaded = load_pack_artifacts(source_dir)
    if not loaded["ok"]:
        return {"warnings": [], "summary": {"evaluator_warning_count": 0}}
    arts = loaded["artifacts"]
    concepts = arts["concepts"].get("concepts", []) or []
    evaluator = arts.get("evaluator", {}) or {}
    dimensions = evaluator.get("dimensions", []) or []
    dimension_tokens = set().union(
        *[
            _tok(dim if isinstance(dim, str) else dim.get("name", ""))
            for dim in dimensions
        ]
    ) if dimensions else set()
    warnings = []
    for concept in concepts:
        for signal in concept.get("mastery_signals", []) or []:
            signal_tokens = _tok(signal)
            if signal_tokens and signal_tokens.isdisjoint(dimension_tokens):
                warnings.append(
                    f"Mastery signal for concept '{concept.get('id')}' is not aligned to declared evaluator dimensions."
                )
    return {
        "warnings": warnings,
        "summary": {
            "evaluator_warning_count": len(warnings),
            "dimension_count": len(dimensions),
        },
    }


@ -2,10 +2,36 @@ from __future__ import annotations
from dataclasses import dataclass, field
from .adaptive_engine import LearnerProfile

DEFAULT_TYPE_WEIGHTS = {
    "explanation": 1.0,
    "problem": 1.0,
    "transfer": 1.0,
    "project": 1.0,
}

@dataclass
class EvidenceItem:
    concept_key: str
    evidence_type: str
    score: float
    is_recent: bool = False
    rubric_dimensions: dict[str, float] = field(default_factory=dict)

@dataclass
class ConceptEvidenceSummary:
    concept_key: str
    count: int = 0
    mean_score: float = 0.0
    weighted_mean_score: float = 0.0
    total_weight: float = 0.0
    confidence: float = 0.0
    dimension_means: dict[str, float] = field(default_factory=dict)
    aggregated: dict[str, float] = field(default_factory=dict)
    weak_dimensions: list[str] = field(default_factory=list)
    mastered: bool = False
@ -14,3 +40,83 @@ class ConceptEvidenceSummary:
class EvidenceState:
    summary_by_concept: dict[str, ConceptEvidenceSummary] = field(default_factory=dict)
    resurfaced_concepts: set[str] = field(default_factory=set)

def evidence_weight(
    item: EvidenceItem,
    type_weights: dict[str, float] | None = None,
    recent_multiplier: float = 1.0,
) -> float:
    weights = type_weights or DEFAULT_TYPE_WEIGHTS
    weight = weights.get(item.evidence_type, 1.0)
    if item.is_recent:
        weight *= recent_multiplier
    return weight

def confidence_from_weight(total_weight: float) -> float:
    if total_weight <= 0:
        return 0.0
    return total_weight / (total_weight + 1.0)

def add_evidence_item(state: EvidenceState, item: EvidenceItem) -> None:
    summary = state.summary_by_concept.setdefault(item.concept_key, ConceptEvidenceSummary(concept_key=item.concept_key))
    summary.count += 1
    summary.mean_score = ((summary.mean_score * (summary.count - 1)) + item.score) / summary.count

def ingest_evidence_bundle(
    profile: LearnerProfile,
    items: list[EvidenceItem],
    mastery_threshold: float = 0.8,
    resurfacing_threshold: float = 0.55,
    confidence_threshold: float = 0.0,
    type_weights: dict[str, float] | None = None,
    recent_multiplier: float = 1.0,
    dimension_thresholds: dict[str, float] | None = None,
) -> EvidenceState:
    state = EvidenceState()
    grouped: dict[str, list[EvidenceItem]] = {}
    for item in items:
        grouped.setdefault(item.concept_key, []).append(item)
        add_evidence_item(state, item)
    for concept_key, concept_items in grouped.items():
        summary = state.summary_by_concept[concept_key]
        total_weight = sum(evidence_weight(item, type_weights, recent_multiplier) for item in concept_items)
        weighted_score = sum(item.score * evidence_weight(item, type_weights, recent_multiplier) for item in concept_items)
        summary.total_weight = total_weight
        summary.weighted_mean_score = weighted_score / total_weight if total_weight else 0.0
        summary.confidence = confidence_from_weight(total_weight)
        dimension_values: dict[str, list[float]] = {}
        for item in concept_items:
            for dim, value in item.rubric_dimensions.items():
                dimension_values.setdefault(dim, []).append(value)
        summary.dimension_means = {
            dim: sum(values) / len(values)
            for dim, values in dimension_values.items()
        }
        summary.aggregated = dict(summary.dimension_means)
        weak_dimensions: list[str] = []
        if dimension_thresholds:
            for dim, threshold in dimension_thresholds.items():
                if dim in summary.dimension_means and summary.dimension_means[dim] < threshold:
                    weak_dimensions.append(dim)
        summary.weak_dimensions = weak_dimensions
        summary.mastered = (
            summary.weighted_mean_score >= mastery_threshold
            and summary.confidence >= confidence_threshold
            and not weak_dimensions
        )
        if summary.mastered:
            profile.mastered_concepts.add(concept_key)
            state.resurfaced_concepts.discard(concept_key)
        elif concept_key in profile.mastered_concepts and summary.weighted_mean_score < resurfacing_threshold:
            profile.mastered_concepts.discard(concept_key)
            state.resurfaced_concepts.add(concept_key)
    return state


@ -13,8 +13,8 @@ def evidence_flow_ledger_for_pack(source_dir):
    concepts = arts["concepts"].get("concepts", []) or []
    roadmap = arts["roadmap"].get("stages", []) or []
    projects = arts["projects"].get("projects", []) or []
    evaluator = arts.get("evaluator", {}) or {}
    ledger = arts.get("mastery_ledger", {}) or {}
    dimensions = evaluator.get("dimensions", []) or []
    evidence_types = evaluator.get("evidence_types", []) or []


@ -3,6 +3,8 @@ from __future__ import annotations
from dataclasses import dataclass, field
from typing import Any
import networkx as nx
from .artifact_registry import PackValidationResult, topological_pack_order
from .profile_templates import resolve_mastery_profile
@ -14,15 +16,24 @@ def namespaced_concept(pack_name: str, concept_id: str) -> str:
@dataclass
class MergedLearningGraph:
    concept_data: dict[str, dict[str, Any]] = field(default_factory=dict)
    stage_catalog: list[dict[str, Any]] = field(default_factory=list)
    project_catalog: list[dict[str, Any]] = field(default_factory=list)
    load_order: list[str] = field(default_factory=list)
    graph: nx.DiGraph = field(default_factory=nx.DiGraph)

def build_merged_learning_graph(
    results: list[PackValidationResult],
    default_dimension_thresholds: dict[str, float] | None = None,
) -> MergedLearningGraph:
    merged = MergedLearningGraph()
    default_dimension_thresholds = default_dimension_thresholds or {
        "correctness": 0.8,
        "explanation": 0.75,
        "transfer": 0.7,
        "project_execution": 0.75,
        "critique": 0.7,
    }
    valid = {r.manifest.name: r for r in results if r.manifest is not None and r.is_valid}
    merged.load_order = topological_pack_order(results)
@ -36,7 +47,15 @@ def build_merged_learning_graph(
            for name, spec in result.manifest.profile_templates.items()
        }
        for concept in result.loaded_files["concepts"].concepts:
            override_key = next(
                (
                    override
                    for override in result.manifest.overrides
                    if override.split("::")[-1] == concept.id
                ),
                None,
            )
            key = override_key or namespaced_concept(pack_name, concept.id)
            resolved_profile = resolve_mastery_profile(
                concept.mastery_profile.model_dump(),
                templates,
@ -51,6 +70,24 @@ def build_merged_learning_graph(
                "mastery_signals": list(concept.mastery_signals),
                "mastery_profile": resolved_profile,
            }
        for stage in result.loaded_files["roadmap"].stages:
            merged.stage_catalog.append({
                "id": f"{pack_name}::{stage.id}",
                "pack": pack_name,
                "title": stage.title,
                "concepts": [
                    next(
                        (
                            override
                            for override in result.manifest.overrides
                            if override.split("::")[-1] == concept_id
                        ),
                        namespaced_concept(pack_name, concept_id),
                    )
                    for concept_id in stage.concepts
                ],
                "checkpoint": list(stage.checkpoint),
            })
        for project in result.loaded_files["projects"].projects:
            merged.project_catalog.append({
                "id": f"{pack_name}::{project.id}",
@ -60,4 +97,27 @@ def build_merged_learning_graph(
                "prerequisites": [namespaced_concept(pack_name, p) for p in project.prerequisites],
                "deliverables": list(project.deliverables),
            })
    for concept_key, concept in merged.concept_data.items():
        merged.graph.add_node(concept_key)
        for prereq in concept["prerequisites"]:
            if prereq in merged.concept_data:
                merged.graph.add_edge(prereq, concept_key)
    return merged

def generate_learner_roadmap(merged: MergedLearningGraph) -> list[dict[str, Any]]:
    roadmap: list[dict[str, Any]] = []
    for stage in merged.stage_catalog:
        for concept_key in stage["concepts"]:
            if concept_key not in merged.concept_data:
                continue
            concept = merged.concept_data[concept_key]
            roadmap.append({
                "stage_id": stage["id"],
                "stage_title": stage["title"],
                "concept_key": concept_key,
                "title": concept["title"],
                "pack": concept["pack"],
            })
    return roadmap


@ -0,0 +1,232 @@
from __future__ import annotations
import json
from pathlib import Path
from .agentic_loop import AgenticStudentState, integrate_attempt
from .artifact_registry import validate_pack
from .document_adapters import adapt_document
from .evaluator_pipeline import LearnerAttempt
from .graph_builder import build_concept_graph
from .mastery_ledger import (
    build_capability_profile,
    export_artifact_manifest,
    export_capability_profile_json,
    export_capability_report_markdown,
)
from .pack_emitter import build_draft_pack, write_draft_pack
from .rule_policy import RuleContext, build_default_rules, run_rules
from .topic_ingest import build_topic_bundle, document_to_course, extract_concept_candidates, merge_courses_into_topic_course
DEFAULT_RIGHTS_NOTE = (
    "Derived from MIT OpenCourseWare 6.050J Information and Entropy (Spring 2008). "
    "Retain MIT OCW attribution and applicable Creative Commons terms before redistribution."
)
DEFAULT_SKILL_TEMPLATE = """---
name: ocw-information-entropy-agent
description: Use the generated MIT OCW Information and Entropy pack, concept ordering, and learner artifacts to mentor or evaluate information-theory work.
---
# OCW Information Entropy Agent
Use this skill when the task is about tutoring, evaluating, or planning study in Information Theory using the generated MIT OCW 6.050J pack.
## Workflow
1. Read `references/generated-course-summary.md` for the pack structure and target concepts.
2. Read `references/generated-capability-summary.md` to understand what the demo AI learner already mastered.
3. Use `assets/generated/pack/` as the source of truth for concept ids, prerequisites, and mastery signals.
4. When giving guidance, preserve the pack ordering from fundamentals through coding and thermodynamics.
5. When uncertain, say which concept or prerequisite in the generated pack is underspecified.
## Outputs
- study plans grounded in the pack prerequisites
- concept explanations tied to entropy, coding, and channel capacity
- evaluation checklists using the generated capability report
- follow-up exercises that extend the existing learner artifacts
"""
def _strong_attempt(concept_key: str, title: str) -> LearnerAttempt:
    symbolic_terms = ("coding", "capacity", "information")
    artifact_type = "symbolic" if any(term in concept_key for term in symbolic_terms) else "explanation"
    content = (
        f"{title} matters because it links uncertainty to communication. "
        f"Therefore {title.lower()} = structure for reasoning about messages. "
        "One assumption is an idealized source model, one limitation is finite data, "
        "and uncertainty remains when observations are noisy."
    )
    return LearnerAttempt(
        concept=concept_key,
        artifact_type=artifact_type,
        content=content,
        metadata={"deliverable_count": 2, "artifact_name": f"{concept_key.split('::')[-1]}.md"},
    )
def _write_skill_bundle(
    skill_dir: Path,
    pack_dir: Path,
    run_dir: Path,
    concept_path: list[str],
    mastered_concepts: list[str],
) -> None:
    references_dir = skill_dir / "references"
    assets_dir = skill_dir / "assets" / "generated"
    references_dir.mkdir(parents=True, exist_ok=True)
    (skill_dir / "agents").mkdir(parents=True, exist_ok=True)
    assets_dir.mkdir(parents=True, exist_ok=True)
    (skill_dir / "SKILL.md").write_text(DEFAULT_SKILL_TEMPLATE, encoding="utf-8")
    (skill_dir / "agents" / "openai.yaml").write_text(
        "\n".join(
            [
                "display_name: OCW Information Entropy Agent",
                "short_description: Tutor and assess with the generated MIT OCW information-theory pack.",
                "default_prompt: Help me use the MIT OCW information-and-entropy pack to study or evaluate work.",
            ]
        ),
        encoding="utf-8",
    )
    summary_lines = [
        "# Generated Course Summary",
        "",
        f"- Pack dir: `{pack_dir}`",
        f"- Run dir: `{run_dir}`",
        "",
        "## Curriculum Path Used By The Demo Learner",
    ]
    summary_lines.extend(f"- {concept}" for concept in concept_path)
    summary_lines.extend(["", "## Mastered Concepts"])
    summary_lines.extend(f"- {concept}" for concept in mastered_concepts)
    (references_dir / "generated-course-summary.md").write_text("\n".join(summary_lines), encoding="utf-8")
    capability_report = run_dir / "capability_report.md"
    capability_summary = capability_report.read_text(encoding="utf-8") if capability_report.exists() else "# Capability Report\n"
    (references_dir / "generated-capability-summary.md").write_text(capability_summary, encoding="utf-8")
    pack_asset_dir = assets_dir / "pack"
    run_asset_dir = assets_dir / "run"
    pack_asset_dir.mkdir(parents=True, exist_ok=True)
    run_asset_dir.mkdir(parents=True, exist_ok=True)
    for source in pack_dir.iterdir():
        if source.is_file():
            (pack_asset_dir / source.name).write_text(source.read_text(encoding="utf-8"), encoding="utf-8")
    for source in run_dir.iterdir():
        if source.is_file():
            (run_asset_dir / source.name).write_text(source.read_text(encoding="utf-8"), encoding="utf-8")
def run_ocw_information_entropy_demo(
course_source: str | Path,
pack_dir: str | Path,
run_dir: str | Path,
skill_dir: str | Path,
) -> dict:
course_source = Path(course_source)
pack_dir = Path(pack_dir)
run_dir = Path(run_dir)
skill_dir = Path(skill_dir)
doc = adapt_document(course_source)
course = document_to_course(doc, "MIT OCW Information and Entropy")
merged = merge_courses_into_topic_course(build_topic_bundle(course.title, [course]))
merged.rights_note = DEFAULT_RIGHTS_NOTE
concepts = extract_concept_candidates(merged)
ctx = RuleContext(course=merged, concepts=concepts)
run_rules(ctx, build_default_rules())
draft = build_draft_pack(
merged,
ctx.concepts,
author="MIT OCW derived demo",
license_name="CC BY-NC-SA 4.0",
review_flags=ctx.review_flags,
conflicts=[],
)
write_draft_pack(draft, pack_dir)
validation = validate_pack(pack_dir)
if not validation.is_valid:
raise ValueError(f"Generated pack failed validation: {validation.errors}")
graph = build_concept_graph([validation], default_dimension_thresholds={
"correctness": 0.8,
"explanation": 0.75,
"transfer": 0.7,
"project_execution": 0.75,
"critique": 0.7,
})
target_key = f"{draft.pack['name']}::thermodynamics-and-entropy"
concept_path = graph.curriculum_path_to_target(set(), target_key)
state = AgenticStudentState(
learner_id="ocw-information-entropy-agent",
display_name="OCW Information Entropy Agent",
)
for concept_key in concept_path:
title = graph.graph.nodes[concept_key].get("title", concept_key.split("::")[-1])
integrate_attempt(state, _strong_attempt(concept_key, title))
profile = build_capability_profile(state, merged.title)
run_dir.mkdir(parents=True, exist_ok=True)
export_capability_profile_json(profile, str(run_dir / "capability_profile.json"))
export_capability_report_markdown(profile, str(run_dir / "capability_report.md"))
export_artifact_manifest(profile, str(run_dir / "artifact_manifest.json"))
summary = {
"course_source": str(course_source),
"pack_dir": str(pack_dir),
"skill_dir": str(skill_dir),
"review_flags": list(ctx.review_flags),
"concept_count": len(ctx.concepts),
"target_concept": target_key,
"curriculum_path": concept_path,
"mastered_concepts": sorted(state.mastered_concepts),
"artifact_count": len(state.artifacts),
}
(run_dir / "run_summary.json").write_text(json.dumps(summary, indent=2), encoding="utf-8")
_write_skill_bundle(skill_dir, pack_dir, run_dir, concept_path, summary["mastered_concepts"])
return summary
def main() -> None:
import argparse
root = Path(__file__).resolve().parents[2]
parser = argparse.ArgumentParser(description="Generate a domain pack and skill bundle from MIT OCW Information and Entropy.")
parser.add_argument(
"--course-source",
default=str(root / "examples" / "ocw-information-entropy" / "6-050j-information-and-entropy.md"),
)
parser.add_argument(
"--pack-dir",
default=str(root / "domain-packs" / "mit-ocw-information-entropy"),
)
parser.add_argument(
"--run-dir",
default=str(root / "examples" / "ocw-information-entropy-run"),
)
parser.add_argument(
"--skill-dir",
default=str(root / "skills" / "ocw-information-entropy-agent"),
)
args = parser.parse_args()
summary = run_ocw_information_entropy_demo(
course_source=args.course_source,
pack_dir=args.pack_dir,
run_dir=args.run_dir,
skill_dir=args.skill_dir,
)
print(json.dumps(summary, indent=2))
if __name__ == "__main__":
main()

View File

@ -0,0 +1,241 @@
from __future__ import annotations
import json
import re
from pathlib import Path
import yaml
def _slug_label(concept_key: str) -> str:
return concept_key.split("::", 1)[-1].replace("-", " ").title()
def _mean_score(summary: dict[str, float]) -> float:
if not summary:
return 0.0
return sum(summary.values()) / len(summary)
def build_progress_svg(run_summary: dict, capability_profile: dict, width: int = 1400, row_height: int = 74) -> str:
path = list(run_summary.get("curriculum_path", []))
mastered = set(capability_profile.get("mastered_concepts", []))
evaluator_summary = capability_profile.get("evaluator_summary_by_concept", {}) or {}
artifact_map = {item["concept"]: item for item in capability_profile.get("artifacts", [])}
height = 210 + max(1, len(path)) * row_height
parts = [
f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" viewBox="0 0 {width} {height}">',
"<style>",
"text { font-family: Arial, sans-serif; }",
".title { font-size: 32px; font-weight: 700; fill: #123047; }",
".subtitle { font-size: 16px; fill: #426173; }",
".label { font-size: 16px; font-weight: 600; fill: #163247; }",
".meta { font-size: 13px; fill: #4c6473; }",
".pill { font-size: 12px; font-weight: 700; }",
"</style>",
'<rect x="0" y="0" width="100%" height="100%" fill="#f4f1ea"/>',
'<rect x="34" y="28" width="1332" height="110" rx="24" fill="#fffdf8" stroke="#d8cdb9"/>',
'<text x="64" y="72" class="title">MIT OCW Information and Entropy: Learner Progress</text>',
f'<text x="64" y="101" class="subtitle">Target concept: {_slug_label(run_summary["target_concept"])} | '
f'Mastered {len(mastered)} of {len(path)} guided path concepts | '
f'Artifacts: {run_summary.get("artifact_count", 0)}</text>',
'<text x="64" y="124" class="subtitle">Generated from the Didactopus OCW demo run.</text>',
]
line_x = 118
first_y = 176
last_y = first_y + (len(path) - 1) * row_height if path else first_y
parts.append(f'<line x1="{line_x}" y1="{first_y}" x2="{line_x}" y2="{last_y}" stroke="#b2aa99" stroke-width="6" stroke-linecap="round"/>')
for index, concept_key in enumerate(path):
y = first_y + index * row_height
is_mastered = concept_key in mastered
summary = evaluator_summary.get(concept_key, {})
mean_score = _mean_score(summary)
node_fill = "#1d7f5f" if is_mastered else "#c78a15"
card_fill = "#fffefb" if is_mastered else "#fff8ea"
card_stroke = "#cfe1d8" if is_mastered else "#e4d1a7"
artifact = artifact_map.get(concept_key)
artifact_label = artifact["artifact_name"] if artifact else "no artifact"
parts.append(f'<circle cx="{line_x}" cy="{y}" r="17" fill="{node_fill}" stroke="#ffffff" stroke-width="4"/>')
parts.append(f'<rect x="150" y="{y - 28}" width="1180" height="56" rx="18" fill="{card_fill}" stroke="{card_stroke}"/>')
parts.append(f'<text x="178" y="{y - 4}" class="label">{index + 1}. {_slug_label(concept_key)}</text>')
parts.append(
f'<text x="178" y="{y + 18}" class="meta">{concept_key} | mean evaluator score {mean_score:.2f} | artifact {artifact_label}</text>'
)
pill_fill = "#d8efe6" if is_mastered else "#f6e2b7"
pill_text = "#1d5d47" if is_mastered else "#8a5a00"
pill_label = "MASTERED" if is_mastered else "IN PROGRESS"
parts.append(f'<rect x="1180" y="{y - 18}" width="120" height="28" rx="14" fill="{pill_fill}"/>')
parts.append(f'<text x="1240" y="{y + 1}" text-anchor="middle" class="pill" fill="{pill_text}">{pill_label}</text>')
parts.append("</svg>")
return "".join(parts)
def build_progress_html(svg: str) -> str:
return "\n".join(
[
"<!doctype html>",
"<html lang=\"en\">",
"<meta charset=\"utf-8\">",
"<title>MIT OCW Information and Entropy Learner Progress</title>",
"<body style=\"margin:0;background:#ece7dd;display:flex;justify-content:center;padding:24px;\">",
svg,
"</body>",
"</html>",
]
)
def _extract_lesson_name(description: str) -> str | None:
match = re.search(r"lesson '([^']+)'", description)
return match.group(1) if match else None
def build_full_concept_map_svg(run_summary: dict, capability_profile: dict, concepts_data: dict, width: int = 1760) -> str:
path = list(run_summary.get("curriculum_path", []))
mastered = set(capability_profile.get("mastered_concepts", []))
evaluator_summary = capability_profile.get("evaluator_summary_by_concept", {}) or {}
concepts = concepts_data.get("concepts", []) or []
pack_name = run_summary["target_concept"].split("::", 1)[0]
title_by_key = {key: _slug_label(key) for key in path}
key_by_title = {value: key for key, value in title_by_key.items()}
grouped: dict[str, list[dict]] = {key: [] for key in path}
for concept in concepts:
concept_key = f"{pack_name}::{concept['id']}"
if concept_key in grouped:
continue
lesson_name = _extract_lesson_name(str(concept.get("description", "")))
anchor = None
if concept.get("prerequisites"):
anchor = f"{pack_name}::{concept['prerequisites'][0]}"
elif lesson_name and lesson_name in key_by_title:
anchor = key_by_title[lesson_name]
else:
anchor = path[0] if path else concept_key
if anchor not in grouped:
grouped[anchor] = []
grouped[anchor].append(concept)
row_height = 124
top = 210
height = max(1200, top + max(1, len(path)) * row_height + 120)
center_x = width // 2
left_x = 280
right_x = width - 280
parts = [
f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" viewBox="0 0 {width} {height}">',
"<style>",
"text { font-family: Arial, sans-serif; }",
".title { font-size: 34px; font-weight: 700; fill: #123047; }",
".subtitle { font-size: 16px; fill: #496476; }",
".core { font-size: 17px; font-weight: 700; fill: #173042; }",
".meta { font-size: 12px; fill: #516877; }",
".small { font-size: 11px; fill: #6e7f89; }",
"</style>",
'<rect x="0" y="0" width="100%" height="100%" fill="#f2efe8"/>',
'<rect x="32" y="28" width="1696" height="120" rx="24" fill="#fffdf8" stroke="#d7cfbf"/>',
'<text x="58" y="72" class="title">MIT OCW Information and Entropy: Full Concept Map</text>',
'<text x="58" y="100" class="subtitle">Center column = guided mastery path. Side nodes = extractor spillover and auxiliary concepts grouped by lesson anchor.</text>',
'<text x="58" y="123" class="subtitle">Green = mastered path concept, blue = structured but unmastered, sand = noisy side concept.</text>',
f'<line x1="{center_x}" y1="{top}" x2="{center_x}" y2="{top + (len(path) - 1) * row_height if path else top}" stroke="#b8b09f" stroke-width="8" stroke-linecap="round"/>',
]
for index, concept_key in enumerate(path):
y = top + index * row_height
title = _slug_label(concept_key)
summary = evaluator_summary.get(concept_key, {})
mean_score = _mean_score(summary)
is_mastered = concept_key in mastered
node_fill = "#237a61" if is_mastered else "#4f86c6"
card_fill = "#f5fbf8" if is_mastered else "#eef4fb"
card_stroke = "#b9d8cc" if is_mastered else "#c8d7eb"
parts.append(f'<circle cx="{center_x}" cy="{y}" r="20" fill="{node_fill}" stroke="#ffffff" stroke-width="4"/>')
parts.append(f'<rect x="{center_x - 180}" y="{y - 31}" width="360" height="62" rx="20" fill="{card_fill}" stroke="{card_stroke}"/>')
parts.append(f'<text x="{center_x}" y="{y - 6}" text-anchor="middle" class="core">{index + 1}. {title}</text>')
parts.append(f'<text x="{center_x}" y="{y + 16}" text-anchor="middle" class="meta">{concept_key} | mean score {mean_score:.2f}</text>')
satellites = sorted(grouped.get(concept_key, []), key=lambda item: item.get("title", item.get("id", "")))
left_items = satellites[::2][:5]
right_items = satellites[1::2][:5]
for side, items in ((-1, left_items), (1, right_items)):
base_x = left_x if side < 0 else right_x
for sat_index, item in enumerate(items):
sat_y = y - 34 + sat_index * 17
sat_width = 250
rect_x = base_x - sat_width // 2
parts.append(f'<line x1="{center_x + side * 24}" y1="{y}" x2="{base_x - side * 130}" y2="{sat_y}" stroke="#d1c6ae" stroke-width="2"/>')
parts.append(f'<rect x="{rect_x}" y="{sat_y - 12}" width="{sat_width}" height="24" rx="12" fill="#fbf3de" stroke="#e3cf99"/>')
parts.append(
f'<text x="{base_x}" y="{sat_y + 4}" text-anchor="middle" class="small">{item.get("title", item.get("id", ""))}</text>'
)
hidden_count = max(0, len(satellites) - 10)
if hidden_count:
parts.append(f'<text x="{right_x + 170}" y="{y + 42}" class="small">+{hidden_count} more side concepts</text>')
parts.append("</svg>")
return "".join(parts)
def render_ocw_progress_visualization(run_dir: str | Path, out_svg: str | Path | None = None, out_html: str | Path | None = None) -> dict[str, str]:
run_dir = Path(run_dir)
run_summary = json.loads((run_dir / "run_summary.json").read_text(encoding="utf-8"))
capability_profile = json.loads((run_dir / "capability_profile.json").read_text(encoding="utf-8"))
svg = build_progress_svg(run_summary, capability_profile)
out_svg = Path(out_svg) if out_svg is not None else run_dir / "learner_progress.svg"
out_html = Path(out_html) if out_html is not None else run_dir / "learner_progress.html"
out_svg.write_text(svg, encoding="utf-8")
out_html.write_text(build_progress_html(svg), encoding="utf-8")
return {"svg": str(out_svg), "html": str(out_html)}
def render_ocw_full_concept_map(
run_dir: str | Path,
pack_dir: str | Path,
out_svg: str | Path | None = None,
out_html: str | Path | None = None,
) -> dict[str, str]:
run_dir = Path(run_dir)
pack_dir = Path(pack_dir)
run_summary = json.loads((run_dir / "run_summary.json").read_text(encoding="utf-8"))
capability_profile = json.loads((run_dir / "capability_profile.json").read_text(encoding="utf-8"))
concepts_data = yaml.safe_load((pack_dir / "concepts.yaml").read_text(encoding="utf-8")) or {}
svg = build_full_concept_map_svg(run_summary, capability_profile, concepts_data)
out_svg = Path(out_svg) if out_svg is not None else run_dir / "learner_progress_full_map.svg"
out_html = Path(out_html) if out_html is not None else run_dir / "learner_progress_full_map.html"
out_svg.write_text(svg, encoding="utf-8")
out_html.write_text(build_progress_html(svg), encoding="utf-8")
return {"svg": str(out_svg), "html": str(out_html)}
def main() -> None:
import argparse
root = Path(__file__).resolve().parents[2]
parser = argparse.ArgumentParser(description="Render learner progress visualization for the OCW Information and Entropy demo.")
parser.add_argument(
"--run-dir",
default=str(root / "examples" / "ocw-information-entropy-run"),
)
parser.add_argument(
"--pack-dir",
default=str(root / "domain-packs" / "mit-ocw-information-entropy"),
)
parser.add_argument("--out-svg", default=None)
parser.add_argument("--out-html", default=None)
parser.add_argument("--full-map", action="store_true")
args = parser.parse_args()
if args.full_map:
outputs = render_ocw_full_concept_map(args.run_dir, args.pack_dir, args.out_svg, args.out_html)
else:
outputs = render_ocw_progress_visualization(args.run_dir, args.out_svg, args.out_html)
print(json.dumps(outputs, indent=2))
if __name__ == "__main__":
main()
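The visualization module above builds its SVG output by plain string assembly rather than through a drawing library. A minimal self-contained sketch of the same pattern, using made-up step labels and the two node colors from `build_progress_svg` (not the real pack data):

```python
def tiny_progress_svg(steps: list[tuple[str, bool]], row_height: int = 40) -> str:
    """Render (label, mastered) steps as a vertical SVG timeline string."""
    height = 40 + len(steps) * row_height
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="400" height="{height}">']
    for index, (label, mastered) in enumerate(steps):
        y = 40 + index * row_height
        # green = mastered, amber = in progress (same palette as build_progress_svg)
        fill = "#1d7f5f" if mastered else "#c78a15"
        parts.append(f'<circle cx="30" cy="{y}" r="8" fill="{fill}"/>')
        parts.append(f'<text x="50" y="{y + 4}">{index + 1}. {label}</text>')
    parts.append("</svg>")
    return "".join(parts)

svg = tiny_progress_svg([("Bits And Codes", True), ("Entropy", False)])
```

String assembly keeps the renderer dependency-free and makes the output trivially embeddable in the HTML wrapper, at the cost of manual coordinate arithmetic.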

View File

@ -6,7 +6,14 @@ import yaml
from .course_schema import NormalizedCourse, ConceptCandidate, DraftPack
def build_draft_pack(
course: NormalizedCourse,
concepts: list[ConceptCandidate],
author: str,
license_name: str,
review_flags: list[str],
conflicts: list[str] | None = None,
) -> DraftPack:
pack_name = course.title.lower().replace(" ", "-")
pack = {
"name": pack_name,
@ -76,7 +83,7 @@ def build_draft_pack(course: NormalizedCourse, concepts: list[ConceptCandidate],
rubrics=rubrics,
review_report=review_flags,
attribution=attribution,
conflicts=conflicts or [],
)

View File

@ -3,6 +3,10 @@ from pathlib import Path
import yaml
REQUIRED_FILES = ["pack.yaml", "concepts.yaml", "roadmap.yaml", "projects.yaml", "rubrics.yaml"]
OPTIONAL_FILES = {
"evaluator": "evaluator.yaml",
"mastery_ledger": "mastery_ledger.yaml",
}
def _safe_load_yaml(path: Path, errors: list[str], label: str):
try:
@ -30,6 +34,9 @@ def load_pack_artifacts(source_dir: str | Path) -> dict:
roadmap_data = _safe_load_yaml(source / "roadmap.yaml", errors, "roadmap.yaml")
projects_data = _safe_load_yaml(source / "projects.yaml", errors, "projects.yaml")
rubrics_data = _safe_load_yaml(source / "rubrics.yaml", errors, "rubrics.yaml")
optional_data = {}
for key, filename in OPTIONAL_FILES.items():
optional_data[key] = _safe_load_yaml(source / filename, errors, filename) if (source / filename).exists() else {}
return {
"ok": len(errors) == 0,
"errors": errors,
@ -41,6 +48,7 @@ def load_pack_artifacts(source_dir: str | Path) -> dict:
"roadmap": roadmap_data,
"projects": projects_data,
"rubrics": rubrics_data,
**optional_data,
},
}

View File

@ -1,2 +1,42 @@
from __future__ import annotations
from .pack_validator import load_pack_artifacts
def path_quality_for_pack(source_dir):
loaded = load_pack_artifacts(source_dir)
if not loaded["ok"]:
return {"warnings": [], "summary": {"path_warning_count": 0}}
arts = loaded["artifacts"]
concepts = arts["concepts"].get("concepts", []) or []
roadmap = arts["roadmap"].get("stages", []) or []
projects = arts["projects"].get("projects", []) or []
assessed = set()
warnings = []
for stage in roadmap:
stage_concepts = stage.get("concepts", []) or []
checkpoint = stage.get("checkpoint", []) or []
if checkpoint:
assessed.update(stage_concepts)
else:
warnings.append(f"Stage '{stage.get('id', 'unknown')}' has no checkpoint.")
for project in projects:
assessed.update(project.get("prerequisites", []) or [])
for concept in concepts:
concept_id = concept.get("id", "")
if concept_id and concept_id not in assessed:
warnings.append(f"Concept '{concept_id}' is not visibly assessed in roadmap checkpoints or project prerequisites.")
return {
"warnings": warnings,
"summary": {
"path_warning_count": len(warnings),
"stage_count": len(roadmap),
"project_count": len(projects),
},
}
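The path-quality check above reduces to simple set arithmetic: a concept counts as assessed if it appears in a stage that has a checkpoint, or in some project's prerequisites. A standalone sketch of that core logic, with made-up stage and project data:

```python
def unassessed_concepts(concepts: list[dict], stages: list[dict], projects: list[dict]) -> list[str]:
    """Return concept ids covered by neither a checkpointed stage nor a project prerequisite."""
    assessed: set[str] = set()
    for stage in stages:
        if stage.get("checkpoint"):  # stages without a checkpoint do not count as assessment
            assessed.update(stage.get("concepts", []))
    for project in projects:
        assessed.update(project.get("prerequisites", []))
    return [concept["id"] for concept in concepts if concept["id"] not in assessed]

gaps = unassessed_concepts(
    [{"id": "c1"}, {"id": "c2"}],
    [{"id": "s1", "concepts": ["c1"], "checkpoint": ["quiz"]}],
    [],
)  # only c2 is left unassessed
```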

View File

@ -1,4 +1,6 @@
from __future__ import annotations
from datetime import datetime
from .learner_state import LearnerState, EvidenceEvent, MasteryRecord
def apply_evidence(
@ -29,3 +31,14 @@ def apply_evidence(
rec.last_updated = event.timestamp
state.history.append(event)
return state
def decay_confidence(state: LearnerState, now_timestamp: str, daily_decay: float = 0.01) -> LearnerState:
now = datetime.fromisoformat(now_timestamp)
for record in state.records:
if not record.last_updated:
continue
updated = datetime.fromisoformat(record.last_updated)
elapsed_days = max(0.0, (now - updated).total_seconds() / 86400.0)
record.confidence = max(0.0, record.confidence * ((1.0 - daily_decay) ** elapsed_days))
return state
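`decay_confidence` applies plain exponential decay: confidence is multiplied by `(1 - daily_decay)` per elapsed day, with fractional days allowed and the result clamped at zero. The same arithmetic on raw ISO timestamps, as a self-contained sketch:

```python
from datetime import datetime

def decayed_confidence(confidence: float, last_updated: str, now: str, daily_decay: float = 0.01) -> float:
    """Multiply confidence by (1 - daily_decay) per elapsed day; clamp at zero."""
    delta = datetime.fromisoformat(now) - datetime.fromisoformat(last_updated)
    elapsed_days = max(0.0, delta.total_seconds() / 86400.0)
    return max(0.0, confidence * ((1.0 - daily_decay) ** elapsed_days))

value = decayed_confidence(0.8, "2026-01-01T00:00:00", "2026-01-11T00:00:00")  # ten days of decay
```

With the default one-percent daily decay, ten days shrink 0.8 to roughly 0.72; records with no `last_updated` are left untouched by the real implementation.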

View File

@ -32,3 +32,14 @@ def export_promoted_pack(session: ReviewSession, outdir: str | Path) -> None:
(outdir / "concepts.yaml").write_text(yaml.safe_dump({"concepts": concepts}, sort_keys=False), encoding="utf-8")
(outdir / "review_ledger.json").write_text(json.dumps(session.model_dump(), indent=2), encoding="utf-8")
(outdir / "license_attribution.json").write_text(json.dumps(session.draft_pack.attribution, indent=2), encoding="utf-8")
def export_review_ui_data(session: ReviewSession, outdir: str | Path) -> None:
outdir = Path(outdir)
outdir.mkdir(parents=True, exist_ok=True)
payload = {
"reviewer": session.reviewer,
"draft_pack": session.draft_pack.model_dump(),
"ledger": [entry.model_dump() for entry in session.ledger],
}
(outdir / "review_data.json").write_text(json.dumps(payload, indent=2), encoding="utf-8")

View File

@ -28,6 +28,8 @@ def document_to_course(doc: NormalizedDocument, course_title: str) -> Normalized
lessons = []
for section in doc.sections:
body = section.body.strip()
if not body:
continue
lines = body.splitlines()
objectives = []
exercises = []

View File

@ -1,12 +1,12 @@
from __future__ import annotations
from pathlib import Path
from datetime import datetime, timezone
import json, shutil
from .review_schema import WorkspaceMeta, WorkspaceRegistry
from .import_validator import preview_draft_pack_import
def utc_now() -> str:
return datetime.now(timezone.utc).isoformat()
class WorkspaceManager:
def __init__(self, registry_path: str | Path, default_workspace_root: str | Path) -> None:

View File

@ -18,3 +18,15 @@ def test_detects_uncovered_concepts(tmp_path: Path) -> None:
)
result = coverage_alignment_for_pack(tmp_path)
assert any("c2" in w for w in result["warnings"])
def test_covered_concepts_do_not_warn(tmp_path: Path) -> None:
make_pack(
tmp_path,
"concepts:\n - id: c1\n title: Foundations\n description: enough description here\n mastery_signals: [Explain foundations]\n",
"stages:\n - id: s1\n title: One\n concepts: [c1]\n checkpoint: [Explain foundations]\n",
"projects:\n - id: p1\n title: Project\n prerequisites: [c1]\n deliverables: [short report]\n",
"rubrics:\n - id: r1\n title: Basic\n criteria: [correctness]\n",
)
result = coverage_alignment_for_pack(tmp_path)
assert result["warnings"] == []

View File

@ -18,3 +18,14 @@ def test_detects_uncovered_mastery_signals(tmp_path: Path) -> None:
"dimensions:\n - name: typography\n description: page polish\n")
result = evaluator_alignment_for_pack(tmp_path)
assert any('Mastery signal' in w for w in result['warnings'])
def test_matching_dimension_suppresses_warning(tmp_path: Path) -> None:
make_pack(tmp_path,
"concepts:\n - id: c1\n title: Foundations\n description: enough description here\n mastery_signals: [Explain foundations clearly]\n",
"stages:\n - id: s1\n title: One\n concepts: [c1]\n checkpoint: [Explain foundations]\n",
"projects:\n - id: p1\n title: Project\n prerequisites: [c1]\n deliverables: [report]\n",
"rubrics:\n - id: r1\n title: Basic\n criteria: [correctness]\n",
"dimensions:\n - name: explain\n description: explanation quality\n")
result = evaluator_alignment_for_pack(tmp_path)
assert result["warnings"] == []

View File

@ -22,3 +22,18 @@ def test_detects_missing_dimension_mapping(tmp_path: Path) -> None:
)
result = evidence_flow_ledger_for_pack(tmp_path)
assert any("dimension" in w.lower() for w in result["warnings"])
def test_optional_artifacts_absent_still_returns_summary(tmp_path: Path) -> None:
make_pack(
tmp_path,
"concepts:\n - id: c1\n title: Foundations\n description: enough description here\n mastery_signals: [Explain foundations clearly]\n",
"stages:\n - id: s1\n title: One\n concepts: [c1]\n checkpoint: [exercise]\n",
"projects:\n - id: p1\n title: Project\n prerequisites: [c1]\n deliverables: [report]\n",
"rubrics:\n - id: r1\n title: Basic\n criteria: [correctness]\n",
"",
"",
)
result = evidence_flow_ledger_for_pack(tmp_path)
assert "summary" in result
assert isinstance(result["warnings"], list)

View File

@ -0,0 +1,19 @@
from pathlib import Path
from didactopus.ocw_information_entropy_demo import run_ocw_information_entropy_demo
def test_ocw_information_entropy_demo_generates_pack_and_skill(tmp_path: Path) -> None:
root = Path(__file__).resolve().parents[1]
summary = run_ocw_information_entropy_demo(
course_source=root / "examples" / "ocw-information-entropy" / "6-050j-information-and-entropy.md",
pack_dir=tmp_path / "pack",
run_dir=tmp_path / "run",
skill_dir=tmp_path / "skill",
)
assert (tmp_path / "pack" / "pack.yaml").exists()
assert (tmp_path / "run" / "capability_profile.json").exists()
assert (tmp_path / "skill" / "SKILL.md").exists()
assert summary["target_concept"].endswith("thermodynamics-and-entropy")
assert summary["mastered_concepts"]

View File

@ -0,0 +1,33 @@
from pathlib import Path
from didactopus.ocw_progress_viz import render_ocw_full_concept_map, render_ocw_progress_visualization
def test_render_ocw_progress_visualization(tmp_path: Path) -> None:
root = Path(__file__).resolve().parents[1]
outputs = render_ocw_progress_visualization(
root / "examples" / "ocw-information-entropy-run",
tmp_path / "progress.svg",
tmp_path / "progress.html",
)
assert Path(outputs["svg"]).exists()
assert Path(outputs["html"]).exists()
assert "learner_progress" not in Path(outputs["svg"]).read_text(encoding="utf-8")
assert "MASTERED" in Path(outputs["svg"]).read_text(encoding="utf-8")
def test_render_ocw_full_concept_map(tmp_path: Path) -> None:
root = Path(__file__).resolve().parents[1]
outputs = render_ocw_full_concept_map(
root / "examples" / "ocw-information-entropy-run",
root / "domain-packs" / "mit-ocw-information-entropy",
tmp_path / "full.svg",
tmp_path / "full.html",
)
svg = Path(outputs["svg"]).read_text(encoding="utf-8")
assert Path(outputs["svg"]).exists()
assert Path(outputs["html"]).exists()
assert "Full Concept Map" in svg
assert "extractor spillover" in svg

View File

@ -18,3 +18,14 @@ def test_detects_checkpoint_and_unassessed_issues(tmp_path: Path) -> None:
result = path_quality_for_pack(tmp_path)
assert any("no checkpoint" in w.lower() for w in result["warnings"])
assert any("not visibly assessed" in w.lower() for w in result["warnings"])
def test_assessed_path_has_no_warnings(tmp_path: Path) -> None:
make_pack(
tmp_path,
"concepts:\n - id: c1\n title: Foundations\n description: foundations description enough\n prerequisites: []\n",
"stages:\n - id: s1\n title: One\n concepts: [c1]\n checkpoint: [quiz]\n",
"projects:\n - id: p1\n title: Project\n prerequisites: [c1]\n deliverables: [report]\n",
)
result = path_quality_for_pack(tmp_path)
assert result["warnings"] == []

View File

@ -24,3 +24,11 @@ def test_extract_concepts(tmp_path: Path) -> None:
course = document_to_course(doc, "Topic")
concepts = extract_concept_candidates(course)
assert len(concepts) >= 1
def test_document_to_course_skips_empty_sections(tmp_path: Path) -> None:
a = tmp_path / "a.md"
a.write_text("# T\n\n## Empty\n\n### Filled\nBody.", encoding="utf-8")
doc = adapt_document(a)
course = document_to_course(doc, "Topic")
assert [lesson.title for lesson in course.modules[0].lessons] == ["Filled"]