diff --git a/README.md b/README.md index 0297756..54525e0 100644 --- a/README.md +++ b/README.md @@ -1,319 +1,148 @@ # Didactopus -**Didactopus** is a local‑first AI‑assisted autodidactic mastery platform designed to help motivated learners achieve **true expertise** in a chosen domain. +**Didactopus** is a local-first AI-assisted autodidactic mastery platform for building genuine expertise through concept graphs, adaptive curriculum planning, evidence-driven mastery, Socratic mentoring, and project-based learning. -![Didactopus mascot](artwork/didactopus-mascot.png) +**Tagline:** *Many arms, one goal — mastery.* -The system combines: +## Complete overview to this point -• domain knowledge graphs -• mastery‑based learning models -• evidence‑driven assessment -• Socratic mentoring -• adaptive curriculum generation -• project‑based evaluation +Didactopus is designed to support both **human learners** and, potentially, **agentic AI students** that use the same mastery infrastructure to become competent in a target domain. -Didactopus is designed for **serious learning**, not shallow answer generation. +The current architecture includes: -Its core philosophy is: +- **Domain packs** for contributed concepts, projects, rubrics, and mastery profiles +- **Dependency resolution** across packs +- **Merged learning graph** generation +- **Adaptive learner engine** that identifies ready, blocked, and mastered concepts +- **Evidence engine** with weighted, recency-aware, multi-dimensional mastery inference +- **Concept-specific mastery profiles** with template inheritance +- **Concept graph engine** for cross-pack prerequisite reasoning, concept linking, pathfinding, and graph export -> AI should function as a mentor, evaluator, and guide — not a substitute for thinking. 
+## Core philosophy ---- +Didactopus assumes that real expertise is built through: -# Project Goals +- explanation +- problem solving +- transfer +- critique +- project execution -Didactopus aims to enable learners to: +The AI layer should function as a **mentor, evaluator, and curriculum partner**, not an answer vending machine. -• build deep conceptual understanding -• practice reasoning and explanation -• complete real projects demonstrating competence -• identify weak areas through evidence‑based feedback -• progress through mastery rather than time spent +## Domain packs -The platform is particularly suitable for: +Knowledge enters the system through versioned, shareable **domain packs**. Each pack can contribute: -• autodidacts -• researchers entering new fields -• students supplementing formal education -• interdisciplinary learners -• AI‑assisted self‑study programs - ---- - -# Key Architectural Concepts - -## Domain Packs - -Knowledge is distributed as **domain packs** contributed by the community. - -Each pack can include: - -- concept definitions -- prerequisite graphs -- learning roadmaps +- concepts +- prerequisites +- learning stages - projects - rubrics - mastery profiles +- profile templates +- cross-pack concept links -Example packs: +## Concept graph engine -``` -domain-packs/ - statistics-foundations - bayes-extension - applied-inference -``` +This revision implements a concept graph engine with: -Domain packs are validated, dependency‑checked, and merged into a **unified learning graph**. +- prerequisite reasoning across packs +- cross-pack concept linking +- semantic concept similarity hooks +- automatic curriculum pathfinding +- visualization export for mastery graphs ---- +Concepts are namespaced as `pack-name::concept-id`. 
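Note on the namespacing convention above: a minimal sketch of how `pack-name::concept-id` keys could be built and resolved. The body of `namespaced_concept` is not shown in this diff, so the version here is an assumption based on how project ids are formed (`f"{pack_name}::{project.id}"`). `split_concept_key` and `resolve_reference` are hypothetical helpers that mirror the `"::" in link.source_concept` check in `graph_builder.py`; they are not part of the repository.

```python
SEPARATOR = "::"


def namespaced_concept(pack_name: str, concept_id: str) -> str:
    """Build a graph-wide key such as 'bayes-extension::posterior' (assumed body)."""
    return f"{pack_name}{SEPARATOR}{concept_id}"


def split_concept_key(key: str) -> tuple[str, str]:
    """Hypothetical helper: split a namespaced key back into (pack, concept)."""
    pack_name, sep, concept_id = key.partition(SEPARATOR)
    if not sep or not pack_name or not concept_id:
        raise ValueError(f"not a namespaced concept key: {key!r}")
    return pack_name, concept_id


def resolve_reference(pack_name: str, reference: str) -> str:
    """Hypothetical helper: qualify a bare concept id with its own pack name."""
    if SEPARATOR in reference:
        # Already namespaced, e.g. a cross-pack link target.
        return reference
    return namespaced_concept(pack_name, reference)
```

Under this reading, a bare id in a pack's own files (a prerequisite or a link source) resolves against the declaring pack, while cross-pack link targets are expected to arrive fully qualified.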
-# Learning Graph +### Cross-pack links -Didactopus merges all installed packs into a directed concept graph: +Domain packs may declare conceptual links such as: -``` -Concept A → Concept B → Concept C -``` +- `equivalent_to` +- `related_to` +- `extends` +- `depends_on` -Edges represent prerequisites. +These links enable Didactopus to reason across pack boundaries rather than treating each pack as an isolated island. -The system then generates: +### Semantic similarity -• adaptive learning roadmaps -• next-best concepts to study -• projects unlocked by prerequisite completion +A semantic similarity layer is included as a hook for: ---- +- token overlap similarity +- future embedding-based similarity +- future ontology and LLM-assisted concept alignment -# Evidence‑Driven Mastery +### Curriculum pathfinding -Concept mastery is **inferred from evidence**, not declared. +The concept graph engine supports: -Evidence types include: +- prerequisite chains +- shortest dependency paths +- next-ready concept discovery +- reachability analysis +- curriculum path generation from a learner’s mastery state to a target concept -• explanations -• problem solutions -• transfer tasks -• project deliverables +### Visualization -Evidence contributes weighted scores that determine: +Graphs can be exported to: -• mastery state -• learner confidence -• weak dimensions requiring further practice +- Graphviz DOT +- Cytoscape-style JSON ---- +## Evidence-driven mastery -# Multi‑Dimensional Mastery +Mastery is inferred from evidence such as: -Didactopus tracks multiple competence dimensions: +- explanations +- problem solutions +- transfer tasks +- project artifacts -| Dimension | Meaning | -|---|---| -| correctness | accurate reasoning | -| explanation | ability to explain clearly | -| transfer | ability to apply knowledge | -| project_execution | ability to build artifacts | -| critique | ability to detect errors and assumptions | +Evidence is: -Different concepts can require different 
combinations of these dimensions. +- weighted by type +- optionally up-weighted for recency +- summarized by competence dimension +- compared against concept-specific mastery profiles ---- +## Multi-dimensional mastery -# Concept Mastery Profiles +Current dimensions include: -Concepts define **mastery profiles** specifying: +- `correctness` +- `explanation` +- `transfer` +- `project_execution` +- `critique` -• required dimensions -• threshold overrides +Different concepts can require different subsets of these dimensions. -Example: +## Agentic AI students -```yaml -mastery_profile: - required_dimensions: - - correctness - - transfer - - critique - dimension_threshold_overrides: - transfer: 0.8 - critique: 0.8 -``` +Didactopus is also architecturally suitable for **AI learner agents**. ---- +An agentic AI student could: -# Mastery Profile Inheritance +1. ingest domain packs +2. traverse the concept graph +3. generate explanations and answers +4. attempt practice tasks +5. critique model outputs +6. complete simulated projects +7. accumulate evidence +8. advance only when concept-specific mastery criteria are satisfied -This revision adds **profile templates** so packs can define reusable mastery models. +## Repository structure -Example: - -```yaml -profile_templates: - foundation_concept: - required_dimensions: - - correctness - - explanation - - critique_concept: - required_dimensions: - - correctness - - transfer - - critique -``` - -Concepts can reference templates: - -```yaml -mastery_profile: - template: critique_concept -``` - -This allows domain packs to remain concise while maintaining consistent evaluation standards. 
- ---- - -# Adaptive Learning Engine - -The adaptive engine computes: - -• which concepts are ready to study -• which are blocked by prerequisites -• which are already mastered -• which projects are available - -Output includes: - -``` -next_best_concepts -eligible_projects -adaptive_learning_roadmap -``` - ---- - -# Evidence Engine - -The evidence engine: - -• aggregates learner evidence -• computes weighted scores -• tracks confidence -• identifies weak competence dimensions -• updates mastery status - -Later weak performance can **resurface concepts for review**. - ---- - -# Socratic Mentor - -Didactopus includes a mentor layer that: - -• asks probing questions -• challenges reasoning -• generates practice tasks -• proposes projects - -Models can run locally (recommended) or via remote APIs. - ---- - -# Agentic AI Students - -Didactopus is also suitable for **AI‑driven learning agents**. - -A future architecture may include: - -``` -Didactopus Core - │ - ├─ Human Learner - └─ AI Student Agent -``` - -An AI student could: - -1. read domain packs -2. attempt practice tasks -3. produce explanations -4. critique model outputs -5. complete simulated projects -6. accumulate evidence -7. 
progress through the mastery graph - -Such agents could be used for: - -• automated curriculum testing -• benchmarking AI reasoning -• synthetic expert generation -• evaluation of model capabilities - -Didactopus therefore supports both: - -• human learners -• agentic AI learners - ---- - -# Project Structure - -``` +```text didactopus/ - adaptive_engine/ - artifact_registry/ - evidence_engine/ - learning_graph/ - mentor/ - practice/ - project_advisor/ +├── README.md +├── artwork/ +├── configs/ +├── docs/ +├── domain-packs/ +├── src/didactopus/ +└── tests/ ``` - -Additional directories: - -``` -configs/ -docs/ -domain-packs/ -tests/ -artwork/ -``` - ---- - -# Current Status - -Implemented: - -✓ domain pack validation -✓ dependency resolution -✓ learning graph merge -✓ adaptive roadmap generation -✓ evidence‑driven mastery -✓ multi‑dimensional competence tracking -✓ concept‑specific mastery profiles -✓ profile template inheritance - -Planned next phases: - -• curriculum optimization algorithms -• active‑learning task generation -• automated project evaluation -• distributed pack registry -• visualization tools for learning graphs - ---- - -# Philosophy - -Didactopus is built around a simple principle: - -> Mastery requires thinking, explaining, testing, and building — not merely receiving answers. - -AI can accelerate the process, but genuine learning remains an **active intellectual endeavor**. 
- ---- - -**Didactopus — many arms, one goal: mastery.** diff --git a/configs/config.example.yaml b/configs/config.example.yaml index ff7dd24..86617eb 100644 --- a/configs/config.example.yaml +++ b/configs/config.example.yaml @@ -4,25 +4,9 @@ model_provider: backend: ollama endpoint: http://localhost:11434 model_name: llama3.1:8b - remote: - enabled: false - provider_name: none - endpoint: "" - model_name: "" platform: - verification_required: true - require_learner_explanations: true - permit_direct_answers: false - resurfacing_threshold: 0.55 - confidence_threshold: 0.8 - evidence_weights: - explanation: 1.0 - problem: 1.5 - project: 2.5 - transfer: 2.0 - recent_evidence_multiplier: 1.35 - dimension_thresholds: + default_dimension_thresholds: correctness: 0.8 explanation: 0.75 transfer: 0.7 @@ -32,4 +16,3 @@ platform: artifacts: local_pack_dirs: - domain-packs - allow_third_party_packs: true diff --git a/docs/concept-graph-engine.md b/docs/concept-graph-engine.md new file mode 100644 index 0000000..3716bed --- /dev/null +++ b/docs/concept-graph-engine.md @@ -0,0 +1,22 @@ +# Concept Graph Engine + +The concept graph engine provides the backbone for Didactopus. + +## Features in this revision + +- prerequisite reasoning across packs +- cross-pack concept linking +- semantic similarity scoring hook +- curriculum pathfinding +- visualization export + +## Edge types + +The engine distinguishes between: +- `prerequisite` +- `equivalent_to` +- `related_to` +- `extends` +- `depends_on` + +Only prerequisite edges are used for strict learning-order pathfinding. diff --git a/domain-packs/applied-inference/concepts.yaml b/domain-packs/applied-inference/concepts.yaml index 2dc7d10..f9f1c50 100644 --- a/domain-packs/applied-inference/concepts.yaml +++ b/domain-packs/applied-inference/concepts.yaml @@ -1,6 +1,10 @@ concepts: - id: model-checking title: Model Checking + description: Critiquing assumptions, fit, and implications of a probabilistic model. 
prerequisites: [] mastery_signals: - compare model assumptions + - critique a simple inference model + mastery_profile: + template: critique_heavy diff --git a/domain-packs/applied-inference/pack.yaml b/domain-packs/applied-inference/pack.yaml index 6931c59..f3668d8 100644 --- a/domain-packs/applied-inference/pack.yaml +++ b/domain-packs/applied-inference/pack.yaml @@ -4,7 +4,7 @@ version: 0.2.0 schema_version: "1" didactopus_min_version: 0.1.0 didactopus_max_version: 0.9.99 -description: Simple applied inference pack. +description: Applied inference pack emphasizing transfer and critique. author: Wesley R. Elsberry license: MIT dependencies: @@ -12,3 +12,19 @@ dependencies: min_version: 0.1.0 max_version: 1.0.0 overrides: [] +profile_templates: + critique_heavy: + required_dimensions: + - correctness + - transfer + - critique + dimension_threshold_overrides: + transfer: 0.78 + critique: 0.73 +cross_pack_links: + - source_concept: model-checking + target_concept: bayes-extension::posterior + relation: extends + - source_concept: model-checking + target_concept: foundations-statistics::probability-basics + relation: related_to diff --git a/domain-packs/bayes-extension/concepts.yaml b/domain-packs/bayes-extension/concepts.yaml index 935e9a4..e9ca90b 100644 --- a/domain-packs/bayes-extension/concepts.yaml +++ b/domain-packs/bayes-extension/concepts.yaml @@ -1,12 +1,27 @@ concepts: - id: prior title: Prior + description: A probability distribution representing knowledge before evidence. prerequisites: [] mastery_signals: - explain a prior distribution + - compare reasonable priors + mastery_profile: + template: bayes_concept + - id: posterior title: Posterior + description: Updated beliefs after conditioning on observed evidence. 
prerequisites: - prior mastery_signals: - explain updating beliefs + - compare prior and posterior distributions + mastery_profile: + required_dimensions: + - correctness + - explanation + - transfer + - critique + dimension_threshold_overrides: + critique: 0.78 diff --git a/domain-packs/bayes-extension/pack.yaml b/domain-packs/bayes-extension/pack.yaml index eed6c70..0652945 100644 --- a/domain-packs/bayes-extension/pack.yaml +++ b/domain-packs/bayes-extension/pack.yaml @@ -12,3 +12,18 @@ dependencies: min_version: 1.0.0 max_version: 1.9.99 overrides: [] +profile_templates: + bayes_concept: + required_dimensions: + - correctness + - explanation + - transfer + dimension_threshold_overrides: + transfer: 0.74 +cross_pack_links: + - source_concept: prior + target_concept: foundations-statistics::probability-basics + relation: depends_on + - source_concept: posterior + target_concept: applied-inference::model-checking + relation: related_to diff --git a/domain-packs/foundations-statistics/concepts.yaml b/domain-packs/foundations-statistics/concepts.yaml index fe41c25..125399e 100644 --- a/domain-packs/foundations-statistics/concepts.yaml +++ b/domain-packs/foundations-statistics/concepts.yaml @@ -1,12 +1,26 @@ concepts: - id: descriptive-statistics title: Descriptive Statistics + description: Core summaries of distributions, central tendency, and spread. prerequisites: [] mastery_signals: - - explain central tendency + - explain mean median and variance + - summarize a simple dataset + mastery_profile: + template: foundation_concept + - id: probability-basics title: Probability Basics + description: Basic event probability and conditional probability. 
prerequisites: - descriptive-statistics mastery_signals: - explain event probability + - calculate simple conditional probability + mastery_profile: + required_dimensions: + - correctness + - explanation + - transfer + dimension_threshold_overrides: + transfer: 0.72 diff --git a/domain-packs/foundations-statistics/pack.yaml b/domain-packs/foundations-statistics/pack.yaml index 3ab1ec7..b08fc6b 100644 --- a/domain-packs/foundations-statistics/pack.yaml +++ b/domain-packs/foundations-statistics/pack.yaml @@ -9,3 +9,11 @@ author: Wesley R. Elsberry license: MIT dependencies: [] overrides: [] +profile_templates: + foundation_concept: + required_dimensions: + - correctness + - explanation + dimension_threshold_overrides: + explanation: 0.75 +cross_pack_links: [] diff --git a/pyproject.toml b/pyproject.toml index d503f20..67da187 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -10,8 +10,11 @@ readme = "README.md" requires-python = ">=3.10" license = {text = "MIT"} authors = [{name = "Wesley R. 
Elsberry"}] -dependencies = ["pydantic>=2.7", "pyyaml>=6.0", "networkx>=3.2"] - +dependencies = [ + "pydantic>=2.7", + "pyyaml>=6.0", + "networkx>=3.2", +] [project.optional-dependencies] dev = ["pytest>=8.0", "ruff>=0.6"] diff --git a/src/didactopus/artifact_registry.py b/src/didactopus/artifact_registry.py index a5e9466..bf0bc55 100644 --- a/src/didactopus/artifact_registry.py +++ b/src/didactopus/artifact_registry.py @@ -46,25 +46,32 @@ def validate_pack(pack_dir: str | Path) -> PackValidationResult: result.errors.append(f"missing required file: {filename}") if result.errors: return result + try: result.manifest = PackManifest.model_validate(_load_yaml(pack_path / "pack.yaml")) - if not _version_in_range(DIDACTOPUS_VERSION, result.manifest.didactopus_min_version, result.manifest.didactopus_max_version): + if not _version_in_range( + DIDACTOPUS_VERSION, + result.manifest.didactopus_min_version, + result.manifest.didactopus_max_version, + ): result.errors.append( - f"incompatible with Didactopus core version {DIDACTOPUS_VERSION}; supported range is " - f"{result.manifest.didactopus_min_version}..{result.manifest.didactopus_max_version}" + f"incompatible with Didactopus core version {DIDACTOPUS_VERSION}; " + f"supported range is {result.manifest.didactopus_min_version}..{result.manifest.didactopus_max_version}" ) + result.loaded_files["concepts"] = ConceptsFile.model_validate(_load_yaml(pack_path / "concepts.yaml")) result.loaded_files["roadmap"] = RoadmapFile.model_validate(_load_yaml(pack_path / "roadmap.yaml")) result.loaded_files["projects"] = ProjectsFile.model_validate(_load_yaml(pack_path / "projects.yaml")) result.loaded_files["rubrics"] = RubricsFile.model_validate(_load_yaml(pack_path / "rubrics.yaml")) except Exception as exc: result.errors.append(str(exc)) + result.is_valid = not result.errors return result def discover_domain_packs(base_dirs: list[str | Path]) -> list[PackValidationResult]: - results = [] + results: list[PackValidationResult] = [] for 
base_dir in base_dirs: base = Path(base_dir) if not base.exists(): @@ -75,7 +82,7 @@ def discover_domain_packs(base_dirs: list[str | Path]) -> list[PackValidationRes def check_pack_dependencies(results: list[PackValidationResult]) -> list[str]: - errors = [] + errors: list[str] = [] manifest_by_name = {r.manifest.name: r.manifest for r in results if r.manifest is not None} for result in results: if result.manifest is None: diff --git a/src/didactopus/artifact_schemas.py b/src/didactopus/artifact_schemas.py index 602dac4..aedd100 100644 --- a/src/didactopus/artifact_schemas.py +++ b/src/didactopus/artifact_schemas.py @@ -8,6 +8,23 @@ class DependencySpec(BaseModel): max_version: str = "9999.9999.9999" +class MasteryProfileSpec(BaseModel): + template: str | None = None + required_dimensions: list[str] = Field(default_factory=list) + dimension_threshold_overrides: dict[str, float] = Field(default_factory=dict) + + +class CrossPackLinkSpec(BaseModel): + source_concept: str + target_concept: str + relation: str + + +class ProfileTemplateSpec(BaseModel): + required_dimensions: list[str] = Field(default_factory=list) + dimension_threshold_overrides: dict[str, float] = Field(default_factory=dict) + + class PackManifest(BaseModel): name: str display_name: str @@ -20,13 +37,17 @@ class PackManifest(BaseModel): license: str = "unspecified" dependencies: list[DependencySpec] = Field(default_factory=list) overrides: list[str] = Field(default_factory=list) + profile_templates: dict[str, ProfileTemplateSpec] = Field(default_factory=dict) + cross_pack_links: list[CrossPackLinkSpec] = Field(default_factory=list) class ConceptEntry(BaseModel): id: str title: str + description: str = "" prerequisites: list[str] = Field(default_factory=list) mastery_signals: list[str] = Field(default_factory=list) + mastery_profile: MasteryProfileSpec = Field(default_factory=MasteryProfileSpec) class ConceptsFile(BaseModel): diff --git a/src/didactopus/concept_graph.py 
b/src/didactopus/concept_graph.py new file mode 100644 index 0000000..bed554b --- /dev/null +++ b/src/didactopus/concept_graph.py @@ -0,0 +1,94 @@ +from __future__ import annotations + +from dataclasses import dataclass, field +from pathlib import Path +from typing import Any +import json +import networkx as nx + +REL_PREREQ = "prerequisite" +REL_EQUIVALENT = "equivalent_to" +REL_RELATED = "related_to" +REL_EXTENDS = "extends" +REL_DEPENDS = "depends_on" + + +@dataclass +class ConceptGraph: + graph: nx.MultiDiGraph = field(default_factory=nx.MultiDiGraph) + + def add_concept(self, concept_key: str, metadata: dict[str, Any] | None = None) -> None: + self.graph.add_node(concept_key, **(metadata or {})) + + def add_edge(self, source: str, target: str, relation: str) -> None: + self.graph.add_edge(source, target, relation=relation) + + def add_prerequisite(self, prereq: str, concept: str) -> None: + self.add_edge(prereq, concept, REL_PREREQ) + + def add_cross_link(self, source: str, target: str, relation: str) -> None: + self.add_edge(source, target, relation) + + def prerequisite_subgraph(self) -> nx.DiGraph: + g = nx.DiGraph() + for node, data in self.graph.nodes(data=True): + g.add_node(node, **data) + for u, v, data in self.graph.edges(data=True): + if data.get("relation") == REL_PREREQ: + g.add_edge(u, v) + return g + + def prerequisites(self, concept: str) -> list[str]: + return list(self.prerequisite_subgraph().predecessors(concept)) + + def prerequisite_chain(self, concept: str) -> list[str]: + return list(nx.ancestors(self.prerequisite_subgraph(), concept)) + + def dependents(self, concept: str) -> list[str]: + return list(self.prerequisite_subgraph().successors(concept)) + + def learning_path(self, start: str, target: str) -> list[str] | None: + try: + return nx.shortest_path(self.prerequisite_subgraph(), start, target) + except nx.NetworkXNoPath: + return None + + def curriculum_path_to_target(self, mastered: set[str], target: str) -> list[str]: + pg = self.prerequisite_subgraph() +
needed = set(nx.ancestors(pg, target)) | {target} + ordered = [n for n in nx.topological_sort(pg) if n in needed] + return [n for n in ordered if n not in mastered] + + def ready_concepts(self, mastered: set[str]) -> list[str]: + pg = self.prerequisite_subgraph() + ready = [] + for node in pg.nodes: + if node in mastered: + continue + if set(pg.predecessors(node)).issubset(mastered): + ready.append(node) + return ready + + def related_concepts(self, concept: str, relation_types: set[str] | None = None) -> list[str]: + relation_types = relation_types or {REL_EQUIVALENT, REL_RELATED, REL_EXTENDS, REL_DEPENDS} + found = [] + for _, v, data in self.graph.out_edges(concept, data=True): + if data.get("relation") in relation_types: + found.append(v) + return found + + def export_graphviz(self, path: str) -> None: + lines = ["digraph Didactopus {"] + for node in self.graph.nodes: + lines.append(f' "{node}";') + for u, v, data in self.graph.edges(data=True): + lines.append(f' "{u}" -> "{v}" [label="{data.get("relation", "")}"];') + lines.append("}") + Path(path).write_text("\n".join(lines), encoding="utf-8") + + def export_cytoscape_json(self, path: str) -> None: + data = { + "nodes": [{"data": {"id": n, **attrs}} for n, attrs in self.graph.nodes(data=True)], + "edges": [{"data": {"source": u, "target": v, **attrs}} for u, v, attrs in self.graph.edges(data=True)], + } + Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8") diff --git a/src/didactopus/config.py b/src/didactopus/config.py index a8e0740..a6bdee8 100644 --- a/src/didactopus/config.py +++ b/src/didactopus/config.py @@ -9,35 +9,13 @@ class ProviderEndpoint(BaseModel): model_name: str = "llama3.1:8b" -class RemoteProvider(BaseModel): - enabled: bool = False - provider_name: str = "none" - endpoint: str = "" - model_name: str = "" - - class ModelProviderConfig(BaseModel): mode: str = Field(default="local_first") local: ProviderEndpoint = Field(default_factory=ProviderEndpoint) - remote: RemoteProvider 
= Field(default_factory=RemoteProvider) class PlatformConfig(BaseModel): - verification_required: bool = True - require_learner_explanations: bool = True - permit_direct_answers: bool = False - resurfacing_threshold: float = 0.55 - confidence_threshold: float = 0.8 - evidence_weights: dict[str, float] = Field( - default_factory=lambda: { - "explanation": 1.0, - "problem": 1.5, - "project": 2.5, - "transfer": 2.0, - } - ) - recent_evidence_multiplier: float = 1.35 - dimension_thresholds: dict[str, float] = Field( + default_dimension_thresholds: dict[str, float] = Field( default_factory=lambda: { "correctness": 0.8, "explanation": 0.75, @@ -50,7 +28,6 @@ class PlatformConfig(BaseModel): class ArtifactConfig(BaseModel): local_pack_dirs: list[str] = Field(default_factory=lambda: ["domain-packs"]) - allow_third_party_packs: bool = True class AppConfig(BaseModel): diff --git a/src/didactopus/graph_builder.py b/src/didactopus/graph_builder.py new file mode 100644 index 0000000..f4593b5 --- /dev/null +++ b/src/didactopus/graph_builder.py @@ -0,0 +1,49 @@ +from __future__ import annotations + +from .artifact_registry import PackValidationResult +from .concept_graph import ConceptGraph +from .learning_graph import build_merged_learning_graph, namespaced_concept +from .semantic_similarity import concept_similarity + + +def build_concept_graph( + results: list[PackValidationResult], + default_dimension_thresholds: dict[str, float], +) -> ConceptGraph: + merged = build_merged_learning_graph(results, default_dimension_thresholds) + + graph = ConceptGraph() + for concept_key, data in merged.concept_data.items(): + graph.add_concept(concept_key, data) + + for concept_key, data in merged.concept_data.items(): + for prereq in data["prerequisites"]: + if prereq in merged.concept_data: + graph.add_prerequisite(prereq, concept_key) + + for result in results: + if result.manifest is None or not result.is_valid: + continue + pack_name = result.manifest.name + for link in 
result.manifest.cross_pack_links: + source = link.source_concept if "::" in link.source_concept else namespaced_concept(pack_name, link.source_concept) + target = link.target_concept + if source in graph.graph.nodes and target in graph.graph.nodes: + graph.add_cross_link(source, target, link.relation) + + return graph + + +def suggest_semantic_links(graph: ConceptGraph, minimum_similarity: float = 0.35) -> list[tuple[str, str, float]]: + concepts = list(graph.graph.nodes(data=True)) + found = [] + for i in range(len(concepts)): + key_a, data_a = concepts[i] + for j in range(i + 1, len(concepts)): + key_b, data_b = concepts[j] + if key_a.split("::")[0] == key_b.split("::")[0]: + continue + sim = concept_similarity(data_a, data_b) + if sim >= minimum_similarity: + found.append((key_a, key_b, sim)) + return sorted(found, key=lambda x: x[2], reverse=True) diff --git a/src/didactopus/learning_graph.py b/src/didactopus/learning_graph.py index eac81a7..337a04e 100644 --- a/src/didactopus/learning_graph.py +++ b/src/didactopus/learning_graph.py @@ -2,9 +2,9 @@ from __future__ import annotations from dataclasses import dataclass, field from typing import Any -import networkx as nx from .artifact_registry import PackValidationResult, topological_pack_order +from .profile_templates import resolve_mastery_profile def namespaced_concept(pack_name: str, concept_id: str) -> str: @@ -13,38 +13,44 @@ def namespaced_concept(pack_name: str, concept_id: str) -> str: @dataclass class MergedLearningGraph: - graph: nx.DiGraph = field(default_factory=nx.DiGraph) concept_data: dict[str, dict[str, Any]] = field(default_factory=dict) project_catalog: list[dict[str, Any]] = field(default_factory=list) load_order: list[str] = field(default_factory=list) -def build_merged_learning_graph(results: list[PackValidationResult]) -> MergedLearningGraph: +def build_merged_learning_graph( + results: list[PackValidationResult], + default_dimension_thresholds: dict[str, float], +) -> MergedLearningGraph: 
merged = MergedLearningGraph() valid = {r.manifest.name: r for r in results if r.manifest is not None and r.is_valid} merged.load_order = topological_pack_order(results) for pack_name in merged.load_order: result = valid[pack_name] + templates = { + name: { + "required_dimensions": list(spec.required_dimensions), + "dimension_threshold_overrides": dict(spec.dimension_threshold_overrides), + } + for name, spec in result.manifest.profile_templates.items() + } for concept in result.loaded_files["concepts"].concepts: key = namespaced_concept(pack_name, concept.id) + resolved_profile = resolve_mastery_profile( + concept.mastery_profile.model_dump(), + templates, + default_dimension_thresholds, + ) merged.concept_data[key] = { "id": concept.id, "title": concept.title, + "description": concept.description, "pack": pack_name, - "prerequisites": list(concept.prerequisites), + "prerequisites": [namespaced_concept(pack_name, p) for p in concept.prerequisites], "mastery_signals": list(concept.mastery_signals), + "mastery_profile": resolved_profile, } - merged.graph.add_node(key) - - for pack_name in merged.load_order: - result = valid[pack_name] - for concept in result.loaded_files["concepts"].concepts: - concept_key = namespaced_concept(pack_name, concept.id) - for prereq in concept.prerequisites: - prereq_key = namespaced_concept(pack_name, prereq) - if prereq_key in merged.graph: - merged.graph.add_edge(prereq_key, concept_key) for project in result.loaded_files["projects"].projects: merged.project_catalog.append({ "id": f"{pack_name}::{project.id}", diff --git a/src/didactopus/main.py b/src/didactopus/main.py index 24b0394..9134b53 100644 --- a/src/didactopus/main.py +++ b/src/didactopus/main.py @@ -2,141 +2,73 @@ import argparse import os from pathlib import Path -from .adaptive_engine import LearnerProfile, build_adaptive_plan -from .artifact_registry import ( - check_pack_dependencies, - detect_dependency_cycles, - discover_domain_packs, - topological_pack_order, -) 
+from .artifact_registry import check_pack_dependencies, detect_dependency_cycles, discover_domain_packs from .config import load_config -from .evidence_engine import EvidenceItem, ingest_evidence_bundle -from .learning_graph import build_merged_learning_graph -from .mentor import generate_socratic_prompt -from .model_provider import ModelProvider -from .practice import generate_practice_task -from .project_advisor import suggest_capstone +from .graph_builder import build_concept_graph, suggest_semantic_links def build_parser() -> argparse.ArgumentParser: - parser = argparse.ArgumentParser(description="Didactopus multi-dimensional mastery scaffold") - parser.add_argument("--domain", required=True) - parser.add_argument("--goal", required=True) - parser.add_argument( - "--config", - default=os.environ.get("DIDACTOPUS_CONFIG", "configs/config.example.yaml"), - ) + parser = argparse.ArgumentParser(description="Didactopus concept graph engine") + parser.add_argument("--target", default="bayes-extension::posterior") + parser.add_argument("--mastered", nargs="*", default=[]) + parser.add_argument("--export-dot", default="") + parser.add_argument("--export-cytoscape", default="") + parser.add_argument("--config", default=os.environ.get("DIDACTOPUS_CONFIG", "configs/config.example.yaml")) return parser def main() -> None: args = build_parser().parse_args() config = load_config(Path(args.config)) - provider = ModelProvider(config.model_provider) - packs = discover_domain_packs(config.artifacts.local_pack_dirs) - dependency_errors = check_pack_dependencies(packs) - cycles = detect_dependency_cycles(packs) + results = discover_domain_packs(config.artifacts.local_pack_dirs) + dep_errors = check_pack_dependencies(results) + cycles = detect_dependency_cycles(results) - print("== Didactopus ==") - print("Many arms, one goal — mastery.") - print() - - if dependency_errors: - print("== Dependency Errors ==") - for err in dependency_errors: + if dep_errors: + print("Dependency 
errors:") + for err in dep_errors: print(f"- {err}") - print() - if cycles: - print("== Dependency Cycles ==") + print("Dependency cycles:") for cycle in cycles: - print(f"- cycle: {' -> '.join(cycle)}") + print(f"- {' -> '.join(cycle)}") return - print("== Pack Load Order ==") - for name in topological_pack_order(packs): - print(f"- {name}") + graph = build_concept_graph(results, config.platform.default_dimension_thresholds) + mastered = set(args.mastered) + + print("== Didactopus Concept Graph Engine ==") + print(f"concepts: {len(graph.graph.nodes)}") + print(f"edges: {len(graph.graph.edges)}") print() - - merged = build_merged_learning_graph(packs) - profile = LearnerProfile( - learner_id="demo-learner", - display_name="Demo Learner", - goals=[args.goal], - mastered_concepts=set(), - hide_mastered=True, - ) - - evidence_items = [ - EvidenceItem( - concept_key="foundations-statistics::descriptive-statistics", - evidence_type="project", - score=0.88, - is_recent=True, - rubric_dimensions={ - "correctness": 0.9, - "explanation": 0.83, - "transfer": 0.79, - "project_execution": 0.88, - "critique": 0.74, - }, - notes="Strong integrated performance.", - ), - EvidenceItem( - concept_key="bayes-extension::prior", - evidence_type="problem", - score=0.68, - is_recent=True, - rubric_dimensions={ - "correctness": 0.75, - "explanation": 0.62, - "transfer": 0.55, - "critique": 0.58, - }, - notes="Knows some basics, weak transfer and critique.", - ), - ] - - evidence_state = ingest_evidence_bundle( - profile=profile, - items=evidence_items, - resurfacing_threshold=config.platform.resurfacing_threshold, - confidence_threshold=config.platform.confidence_threshold, - type_weights=config.platform.evidence_weights, - recent_multiplier=config.platform.recent_evidence_multiplier, - dimension_thresholds=config.platform.dimension_thresholds, - ) - - plan = build_adaptive_plan(merged, profile) - - print("== Multi-Dimensional Evidence Summary ==") - for concept_key, summary in 
evidence_state.summary_by_concept.items(): - print( - f"- {concept_key}: weighted_mean={summary.weighted_mean_score:.2f}, " - f"confidence={summary.confidence:.2f}, mastered={summary.mastered}" - ) - if summary.dimension_means: - dims = ", ".join(f"{k}={v:.2f}" for k, v in sorted(summary.dimension_means.items())) - print(f" * dimensions: {dims}") - if summary.weak_dimensions: - print(f" * weak dimensions: {', '.join(summary.weak_dimensions)}") + print(f"Target concept: {args.target}") + print("Prerequisite chain:") + for item in sorted(graph.prerequisite_chain(args.target)): + print(f"- {item}") print() - - print("== Mastered Concepts ==") - if profile.mastered_concepts: - for concept_key in sorted(profile.mastered_concepts): - print(f"- {concept_key}") - else: - print("- none yet") + print("Curriculum path from current mastery:") + for item in graph.curriculum_path_to_target(mastered, args.target): + print(f"- {item}") print() - - print("== Next Best Concepts ==") - for concept in plan.next_best_concepts: - print(f"- {concept}") + print("Ready concepts:") + for item in graph.ready_concepts(mastered): + print(f"- {item}") print() + print("Declared related concepts for target:") + for item in graph.related_concepts(args.target): + print(f"- {item}") + print() + print("Suggested semantic links:") + for a, b, score in suggest_semantic_links(graph, minimum_similarity=0.10)[:8]: + print(f"- {a} <-> {b} : {score:.2f}") - focus_concept = "bayes-extension::prior" - weak_dims = evidence_state.summary_by_concept.get(focus_concept).weak_dimensions if focus_concept in evidence_state.summary_by_concept else [] - print(generate_socratic_prompt(provider, focus_concept, weak_dims)) - print(generate_practice_task(provider, focus_concept, weak_dims)) - print(suggest_capstone(provider, args.domain)) + if args.export_dot: + graph.export_graphviz(args.export_dot) + print(f"Exported Graphviz DOT to {args.export_dot}") + if args.export_cytoscape: + 
graph.export_cytoscape_json(args.export_cytoscape) + print(f"Exported Cytoscape JSON to {args.export_cytoscape}") + + +if __name__ == "__main__": + main() diff --git a/src/didactopus/model_provider.py b/src/didactopus/model_provider.py index 3f77bd5..037a7ce 100644 --- a/src/didactopus/model_provider.py +++ b/src/didactopus/model_provider.py @@ -13,10 +13,6 @@ class ModelProvider: def __init__(self, config: ModelProviderConfig) -> None: self.config = config - def describe(self) -> str: - local = self.config.local - return f"mode={self.config.mode}, local={local.backend}:{local.model_name}" - def generate(self, prompt: str) -> ModelResponse: local = self.config.local preview = prompt.strip().replace("\n", " ")[:120] diff --git a/src/didactopus/profile_templates.py b/src/didactopus/profile_templates.py index d640fb0..27406ae 100644 --- a/src/didactopus/profile_templates.py +++ b/src/didactopus/profile_templates.py @@ -1,34 +1,39 @@ -from dataclasses import dataclass -from typing import Dict, List +from typing import Any -@dataclass -class ProfileTemplate: - name: str - required_dimensions: List[str] - dimension_threshold_overrides: Dict[str, float] - - -def resolve_mastery_profile(concept_profile, templates, default_profile): - if concept_profile is None: - return default_profile - - template_name = concept_profile.get("template") - if template_name: - base = templates.get(template_name, default_profile) - profile = { - "required_dimensions": list(base.required_dimensions), - "dimension_threshold_overrides": dict(base.dimension_threshold_overrides), - } +def resolve_mastery_profile( + concept_profile: dict[str, Any] | None, + templates: dict[str, dict[str, Any]], + default_thresholds: dict[str, float], +) -> dict[str, Any]: + default_profile = { + "required_dimensions": list(default_thresholds.keys()), + "dimension_threshold_overrides": {}, + } + if not concept_profile: + effective = dict(default_profile) else: - profile = default_profile.copy() + template_name = 
concept_profile.get("template") + if template_name and template_name in templates: + tmpl = templates[template_name] + effective = { + "required_dimensions": list(tmpl.get("required_dimensions", default_profile["required_dimensions"])), + "dimension_threshold_overrides": dict(tmpl.get("dimension_threshold_overrides", {})), + } + else: + effective = dict(default_profile) - if "required_dimensions" in concept_profile: - profile["required_dimensions"] = concept_profile["required_dimensions"] + if concept_profile.get("required_dimensions"): + effective["required_dimensions"] = list(concept_profile["required_dimensions"]) + if concept_profile.get("dimension_threshold_overrides"): + effective["dimension_threshold_overrides"].update( + concept_profile["dimension_threshold_overrides"] + ) - if "dimension_threshold_overrides" in concept_profile: - profile["dimension_threshold_overrides"].update( - concept_profile["dimension_threshold_overrides"] - ) - - return profile + thresholds = dict(default_thresholds) + thresholds.update(effective["dimension_threshold_overrides"]) + return { + "required_dimensions": effective["required_dimensions"], + "dimension_threshold_overrides": dict(effective["dimension_threshold_overrides"]), + "effective_thresholds": {dim: thresholds[dim] for dim in effective["required_dimensions"] if dim in thresholds}, + } diff --git a/src/didactopus/semantic_similarity.py b/src/didactopus/semantic_similarity.py new file mode 100644 index 0000000..bbcbeed --- /dev/null +++ b/src/didactopus/semantic_similarity.py @@ -0,0 +1,37 @@ +from collections import Counter +import math + + +def _tokenize(text: str) -> list[str]: + cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text) + return [tok for tok in cleaned.split() if tok] + + +def token_cosine_similarity(text_a: str, text_b: str) -> float: + tokens_a = _tokenize(text_a) + tokens_b = _tokenize(text_b) + if not tokens_a or not tokens_b: + return 0.0 + ca = Counter(tokens_a) + cb = 
Counter(tokens_b) + shared = set(ca) & set(cb) + dot = sum(ca[t] * cb[t] for t in shared) + na = math.sqrt(sum(v * v for v in ca.values())) + nb = math.sqrt(sum(v * v for v in cb.values())) + if na == 0 or nb == 0: + return 0.0 + return dot / (na * nb) + + +def concept_similarity(concept_a: dict, concept_b: dict) -> float: + text_a = " ".join([ + concept_a.get("title", ""), + concept_a.get("description", ""), + " ".join(concept_a.get("mastery_signals", [])), + ]) + text_b = " ".join([ + concept_b.get("title", ""), + concept_b.get("description", ""), + " ".join(concept_b.get("mastery_signals", [])), + ]) + return token_cosine_similarity(text_a, text_b) diff --git a/tests/test_concept_graph.py b/tests/test_concept_graph.py new file mode 100644 index 0000000..0c6be7c --- /dev/null +++ b/tests/test_concept_graph.py @@ -0,0 +1,44 @@ +from didactopus.artifact_registry import discover_domain_packs +from didactopus.config import load_config +from didactopus.graph_builder import build_concept_graph, suggest_semantic_links + + +def test_concept_graph_builds() -> None: + config = load_config("configs/config.example.yaml") + results = discover_domain_packs(["domain-packs"]) + graph = build_concept_graph(results, config.platform.default_dimension_thresholds) + assert "foundations-statistics::probability-basics" in graph.graph.nodes + assert "bayes-extension::posterior" in graph.graph.nodes + + +def test_prerequisite_path() -> None: + config = load_config("configs/config.example.yaml") + results = discover_domain_packs(["domain-packs"]) + graph = build_concept_graph(results, config.platform.default_dimension_thresholds) + path = graph.learning_path("bayes-extension::prior", "bayes-extension::posterior") + assert path == ["bayes-extension::prior", "bayes-extension::posterior"] + + +def test_curriculum_path_to_target() -> None: + config = load_config("configs/config.example.yaml") + results = discover_domain_packs(["domain-packs"]) + graph = build_concept_graph(results, 
config.platform.default_dimension_thresholds) + path = graph.curriculum_path_to_target(set(), "bayes-extension::posterior") + assert "bayes-extension::prior" in path + assert "bayes-extension::posterior" in path + + +def test_declared_cross_pack_links_exist() -> None: + config = load_config("configs/config.example.yaml") + results = discover_domain_packs(["domain-packs"]) + graph = build_concept_graph(results, config.platform.default_dimension_thresholds) + related = graph.related_concepts("bayes-extension::posterior") + assert "applied-inference::model-checking" in related + + +def test_semantic_link_suggestions() -> None: + config = load_config("configs/config.example.yaml") + results = discover_domain_packs(["domain-packs"]) + graph = build_concept_graph(results, config.platform.default_dimension_thresholds) + suggestions = suggest_semantic_links(graph, minimum_similarity=0.10) + assert len(suggestions) >= 1 diff --git a/tests/test_graph_exports.py b/tests/test_graph_exports.py new file mode 100644 index 0000000..caade35 --- /dev/null +++ b/tests/test_graph_exports.py @@ -0,0 +1,19 @@ +from pathlib import Path +from didactopus.artifact_registry import discover_domain_packs +from didactopus.config import load_config +from didactopus.graph_builder import build_concept_graph + + +def test_exports(tmp_path: Path) -> None: + config = load_config("configs/config.example.yaml") + results = discover_domain_packs(["domain-packs"]) + graph = build_concept_graph(results, config.platform.default_dimension_thresholds) + + dot_path = tmp_path / "graph.dot" + json_path = tmp_path / "graph.json" + + graph.export_graphviz(str(dot_path)) + graph.export_cytoscape_json(str(json_path)) + + assert dot_path.exists() + assert json_path.exists() diff --git a/tests/test_profile_templates.py b/tests/test_profile_templates.py new file mode 100644 index 0000000..57837de --- /dev/null +++ b/tests/test_profile_templates.py @@ -0,0 +1,18 @@ +from didactopus.profile_templates import 
resolve_mastery_profile + + +def test_template_resolution() -> None: + templates = { + "foundation": { + "required_dimensions": ["correctness", "explanation"], + "dimension_threshold_overrides": {"explanation": 0.8}, + } + } + resolved = resolve_mastery_profile( + {"template": "foundation"}, + templates, + {"correctness": 0.8, "explanation": 0.75, "transfer": 0.7}, + ) + assert resolved["required_dimensions"] == ["correctness", "explanation"] + assert resolved["effective_thresholds"]["correctness"] == 0.8 + assert resolved["effective_thresholds"]["explanation"] == 0.8
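
Review note: the test above covers only the template-only case. The sketch below is a condensed, standalone re-statement of the layering that `resolve_mastery_profile` in this diff appears to implement (platform defaults, then template, then concept-level settings). It is not the module itself: the name `resolve_profile_sketch` and the example values are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any


def resolve_profile_sketch(
    concept_profile: dict[str, Any] | None,
    templates: dict[str, dict[str, Any]],
    default_thresholds: dict[str, float],
) -> dict[str, Any]:
    """Hypothetical condensed version of the resolution logic in the diff."""
    profile = concept_profile or {}

    # Layer 1: platform defaults. Every default dimension is required.
    required = list(default_thresholds)
    overrides: dict[str, float] = {}

    # Layer 2: a named template, used only if the name is set and known.
    name = profile.get("template")
    tmpl = templates.get(name) if name else None
    if tmpl is not None:
        required = list(tmpl.get("required_dimensions", required))
        overrides = dict(tmpl.get("dimension_threshold_overrides", {}))

    # Layer 3: concept-level settings take precedence over the template.
    if profile.get("required_dimensions"):
        required = list(profile["required_dimensions"])
    overrides.update(profile.get("dimension_threshold_overrides") or {})

    # Effective thresholds: defaults patched by overrides, limited to
    # the dimensions that are actually required.
    thresholds = {**default_thresholds, **overrides}
    return {
        "required_dimensions": required,
        "dimension_threshold_overrides": overrides,
        "effective_thresholds": {
            dim: thresholds[dim] for dim in required if dim in thresholds
        },
    }


templates = {
    "foundation": {
        "required_dimensions": ["correctness", "explanation"],
        "dimension_threshold_overrides": {"explanation": 0.8},
    }
}
defaults = {"correctness": 0.8, "explanation": 0.75, "transfer": 0.7}

# A concept that uses the template and also raises one threshold itself.
resolved = resolve_profile_sketch(
    {"template": "foundation", "dimension_threshold_overrides": {"correctness": 0.9}},
    templates,
    defaults,
)
print(resolved["effective_thresholds"])
```

Under these assumptions, a concept-level override for `correctness` is applied on top of the template's `explanation` override, and `transfer` is dropped from the effective thresholds because the template does not require it. A test for this combined case might be a useful addition alongside `test_template_resolution`.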