Added evaluator loop

This commit is contained in:
welsberr 2026-03-13 05:33:58 -04:00
parent dd0cc9fd08
commit 1035213470
15 changed files with 475 additions and 411 deletions

README.md

@@ -6,188 +6,76 @@
 **Tagline:** *Many arms, one goal — mastery.*
-## This revision
+## Recent revisions
-This revision adds a **graph-aware planning layer** that connects the concept graph engine to the adaptive and evidence engines.
+This revision introduces a **pluggable evaluator pipeline** that converts
+learner attempts into structured mastery evidence.
-The new planner selects the next concepts to study using a utility function that considers:
+The prior revision added an **agentic learner loop** that turns Didactopus into a closed-loop mastery system prototype.
-- prerequisite readiness
-- distance to learner target concepts
-- weakness in competence dimensions
-- project availability
-- review priority for fragile concepts
-- semantic neighborhood around learner goals
+The loop can now:
-## Why this matters
+- choose the next concept via the graph-aware planner
+- generate a synthetic learner attempt
+- score the attempt into evidence
+- update mastery state
+- repeat toward a target concept
-Up to this point, Didactopus could:
-- build concept graphs
-- identify ready concepts
-- infer mastery from evidence
+This is still scaffold-level, but it is the first explicit implementation of the idea that **Didactopus can supervise not only human learners, but also AI student agents**.
-But it still needed a better mechanism for choosing **what to do next**.
+## Complete overview to this point
-The graph-aware planner begins to solve that by ranking candidate concepts according to learner-specific utility instead of using unlocked prerequisites alone.
-## Current architecture overview
-Didactopus now includes:
+Didactopus currently includes:
 - **Domain packs** for concepts, projects, rubrics, mastery profiles, templates, and cross-pack links
 - **Dependency resolution** across packs
 - **Merged learning graph** generation
-- **Concept graph engine** with cross-pack links, similarity hooks, pathfinding, and visualization export
-- **Adaptive learner engine** for ready/blocked/mastered concept states
+- **Concept graph engine** for cross-pack prerequisite reasoning, linking, pathfinding, and export
+- **Adaptive learner engine** for ready, blocked, and mastered concepts
 - **Evidence engine** with weighted, recency-aware, multi-dimensional mastery inference
 - **Concept-specific mastery profiles** with template inheritance
 - **Graph-aware planner** for utility-ranked next-step recommendations
-## Planning utility
-The current planner computes a score per candidate concept using:
-- readiness bonus
-- target-distance bonus
-- weak-dimension bonus
-- fragile-concept review bonus
-- project-unlock bonus
-- semantic-similarity bonus
-These terms are transparent and configurable.
+- **Agentic learner loop** for iterative goal-directed mastery acquisition
 ## Agentic AI students
-This planner also strengthens the case for **AI student agents** that use Didactopus as a structured mastery environment.
+An AI student under Didactopus is modeled as an **agent that accumulates evidence against concept mastery criteria**.
-An AI student could:
+It does not “learn” in the same sense that model weights are retrained inside Didactopus. Instead, its learned mastery is represented as:
-1. inspect the graph
-2. choose the next concept via the planner
-3. attempt tasks
-4. generate evidence
-5. update mastery state
-6. repeat until a target expertise profile is reached
+- current mastered concept set
+- evidence history
+- dimension-level competence summaries
+- concept-specific weak dimensions
+- adaptive plan state
+- optional artifacts, explanations, project outputs, and critiques it has produced
-This makes Didactopus useful as both:
-- a learning platform
-- a benchmark harness for agentic expertise growth
+In other words, Didactopus represents mastery as a **structured operational state**, not merely a chat transcript.
-## Core philosophy
+That state can be put to work by:
-Didactopus assumes that real expertise is built through:
+- selecting tasks the agent is now qualified to attempt
+- routing domain-relevant problems to the agent
+- exposing mastered concept profiles to orchestration logic
+- using evidence summaries to decide whether the agent should act, defer, or review
+- exporting a mastery portfolio for downstream use
-- explanation
-- problem solving
-- transfer
-- critique
-- project execution
+## FAQ
-The AI layer should function as a **mentor, evaluator, and curriculum partner**, not an answer vending machine.
+See:
+- `docs/faq.md`
-## Domain packs
+## Correctness and formal knowledge components
-Knowledge enters the system through versioned, shareable **domain packs**. Each pack can contribute:
+See:
+- `docs/correctness-and-knowledge-engine.md`
-- concepts
-- prerequisites
-- learning stages
-- projects
-- rubrics
-- mastery profiles
-- profile templates
-- cross-pack concept links
+Short version: yes, there is a strong argument that Didactopus will eventually benefit from a more formal knowledge-engine layer, especially for domains where correctness can be stated in symbolic, logical, computational, or rule-governed terms.
-## Concept graph engine
+A good future architecture is likely **hybrid**:
-This revision implements a concept graph engine with:
-- prerequisite reasoning across packs
-- cross-pack concept linking
-- semantic concept similarity hooks
-- automatic curriculum pathfinding
-- visualization export for mastery graphs
-Concepts are namespaced as `pack-name::concept-id`.
-### Cross-pack links
-Domain packs may declare conceptual links such as:
-- `equivalent_to`
-- `related_to`
-- `extends`
-- `depends_on`
-These links enable Didactopus to reason across pack boundaries rather than treating each pack as an isolated island.
-### Semantic similarity
-A semantic similarity layer is included as a hook for:
-- token overlap similarity
-- future embedding-based similarity
-- future ontology and LLM-assisted concept alignment
-### Curriculum pathfinding
-The concept graph engine supports:
-- prerequisite chains
-- shortest dependency paths
-- next-ready concept discovery
-- reachability analysis
-- curriculum path generation from a learners mastery state to a target concept
-### Visualization
-Graphs can be exported to:
-- Graphviz DOT
-- Cytoscape-style JSON
-## Evidence-driven mastery
-Mastery is inferred from evidence such as:
-- explanations
-- problem solutions
-- transfer tasks
-- project artifacts
-Evidence is:
-- weighted by type
-- optionally up-weighted for recency
-- summarized by competence dimension
-- compared against concept-specific mastery profiles
-## Multi-dimensional mastery
-Current dimensions include:
-- `correctness`
-- `explanation`
-- `transfer`
-- `project_execution`
-- `critique`
-Different concepts can require different subsets of these dimensions.
-## Agentic AI students
-Didactopus is also architecturally suitable for **AI learner agents**.
-An agentic AI student could:
-1. ingest domain packs
-2. traverse the concept graph
-3. generate explanations and answers
-4. attempt practice tasks
-5. critique model outputs
-6. complete simulated projects
-7. accumulate evidence
-8. advance only when concept-specific mastery criteria are satisfied
+- LLM/agentic layer for explanation, synthesis, critique, and exploration
+- formal knowledge engine for rule checking, constraint satisfaction, proof support, symbolic validation, and executable correctness checks
 ## Repository structure
@@ -201,3 +89,11 @@ didactopus/
 ├── src/didactopus/
 └── tests/
 ```
+
+# Didactopus
+
+Didactopus is an AI-assisted autodidactic mastery platform based on
+concept graphs, mastery evidence, and evaluator-driven correctness.
+
+This revision introduces a **pluggable evaluator pipeline** that converts
+learner attempts into structured mastery evidence.


@@ -0,0 +1,24 @@
# Agentic Learner Loop
The agentic learner loop is the first closed-loop prototype for AI-student behavior in Didactopus.
## Current loop
1. Inspect current mastery state
2. Ask graph-aware planner for next best concept
3. Produce synthetic attempt
4. Score attempt into evidence
5. Update mastery state
6. Repeat until target is reached or iteration budget ends
## Important limitation
The current implementation is a scaffold. The learner attempt is synthetic and deterministic, not a true external model call with robust domain evaluation.
## Why it still matters
It establishes the orchestration pattern for:
- planner-guided concept selection
- evidence accumulation
- mastery updates
- goal-directed progression


@@ -0,0 +1,87 @@
# Correctness Evaluation and the Case for a Knowledge Engine
## Question
Is there a need for a more formal knowledge-engine component in Didactopus?
## Answer
Probably yes, in at least some target domains.
The current evidence and mastery layers are useful, but they remain fundamentally evaluation orchestrators. They can aggregate evidence, compare it to thresholds, and guide learning. What they cannot yet do, in a principled way, is guarantee correctness when the domain itself has strong formal structure.
## Why a formal layer may be needed
Some domains support correctness checks that are not merely stylistic or heuristic.
Examples:
- algebraic manipulation
- probability identities
- code execution and tests
- type checking
- formal logic
- graph constraints
- unit analysis
- finite-state or rule-based systems
- regulatory checklists with explicit conditions
In those cases, LLM-style evaluation should not be the only correctness mechanism.
## Recommended architecture
A future Didactopus should likely use a hybrid stack:
### 1. Generative / agentic layer
Responsible for:
- explanation
- synthesis
- dialogue
- critique
- problem decomposition
- exploratory hypothesis generation
### 2. Formal knowledge engine
Responsible for:
- executable validation
- symbolic checking
- proof obligations
- rule application
- constraint checking
- test execution
- ontology-backed consistency checks
## Possible forms of knowledge engine
Depending on domain, the formal component might include:
- theorem provers
- CAS systems
- unit and dimension analyzers
- typed AST analyzers
- code test harnesses
- Datalog or rule engines
- OWL/RDF/description logic tooling
- Bayesian network or probabilistic programming validators
- DSL interpreters for domain constraints
## Where it fits in Didactopus
The knowledge engine would sit beneath the evidence layer.
Possible flow:
1. learner produces an answer, explanation, proof sketch, program, or model
2. Didactopus extracts evaluable claims or artifacts
3. formal engine checks what it can check
4. agentic evaluator interprets the result and turns it into evidence
5. mastery state updates accordingly
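Under those assumptions, steps 3-4 might look like running an executable check against a learner-submitted artifact and converting the verdict into evidence. This is a sketch, not the actual Didactopus interface; the function names, the concept key `algebra::squares`, and the evidence format are all hypothetical:

```python
# Sketch: executable validation (step 3) feeding the evidence layer (step 4).
# All names and formats here are illustrative assumptions.

def formal_check(submission: str, tests: list[tuple[int, int]]) -> bool:
    """Execute a learner-submitted function against known cases.
    A real system would sandbox this instead of calling exec() directly."""
    namespace: dict = {}
    exec(submission, namespace)
    fn = namespace["square"]
    return all(fn(x) == expected for x, expected in tests)

def to_evidence(concept: str, passed: bool) -> dict:
    """Step 4: an evaluator interprets the formal verdict as dimension scores."""
    return {"concept": concept,
            "dimensions": {"correctness": 1.0 if passed else 0.2}}

submission = "def square(x):\n    return x * x\n"
passed = formal_check(submission, [(2, 4), (3, 9)])
evidence = to_evidence("algebra::squares", passed)
print(evidence)  # {'concept': 'algebra::squares', 'dimensions': {'correctness': 1.0}}
```

The key property is that the correctness score comes from execution, not from a fluency judgment.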
## Why this matters for AI students
For agentic AI learners especially, formal validation is important because it reduces the risk that a fluent but incorrect model is credited with mastery.
## Conclusion
Didactopus does not strictly require a formal knowledge engine to be useful. But for many serious domains, adding one would materially improve:
- correctness
- trustworthiness
- transfer assessment
- deployment readiness


@@ -0,0 +1,18 @@
# Evaluator Pipeline
The evaluator pipeline converts learner attempts into mastery evidence.
Flow:
1. learner attempt
2. evaluators score attempt
3. scores aggregated by dimension
4. mastery evidence updated
Evaluator types:
- rubric
- code/test
- symbolic rule
- critique
- portfolio
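The four-step flow can be illustrated end to end with toy evaluators and the per-dimension averaging step. This is a self-contained sketch mirroring the pipeline's shape; the real evaluator classes live in the evaluator module added by this commit:

```python
# Sketch of the evaluator pipeline: attempt -> evaluators -> per-dimension means.
from dataclasses import dataclass

@dataclass
class Result:
    evaluator_name: str
    dimensions: dict[str, float]

def run_pipeline(attempt: str, evaluators) -> list[Result]:
    """Step 2: every evaluator scores the same attempt."""
    return [e(attempt) for e in evaluators]

def aggregate(results: list[Result]) -> dict[str, float]:
    """Step 3: average each dimension over every evaluator that scored it."""
    totals: dict[str, float] = {}
    counts: dict[str, int] = {}
    for r in results:
        for dim, value in r.dimensions.items():
            totals[dim] = totals.get(dim, 0.0) + value
            counts[dim] = counts.get(dim, 0) + 1
    return {dim: totals[dim] / counts[dim] for dim in totals}

# Two toy evaluators that both score "correctness" (values chosen so the
# means are exact in floating point).
rubric = lambda a: Result("rubric", {"correctness": 0.75, "explanation": 0.5})
code_test = lambda a: Result("code_test", {"correctness": 0.25})

scores = aggregate(run_pipeline("learner attempt text", [rubric, code_test]))
print(scores)  # {'correctness': 0.5, 'explanation': 0.5}
```

Note that `explanation` is averaged over one evaluator only: dimensions are pooled per evaluator that reports them, not padded with zeros.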

docs/faq.md

@@ -0,0 +1,65 @@
# FAQ
## What is Didactopus?
Didactopus is a mastery-oriented learning infrastructure that uses concept graphs, evidence-based assessment, and adaptive planning to support serious learning.
## Is this just a tutoring chatbot?
No. The intended architecture is broader than tutoring. Didactopus maintains explicit representations of:
- concepts
- prerequisites
- mastery criteria
- evidence
- learner state
- planning priorities
## How is an AI student's learned mastery represented?
An AI student's learned mastery is represented as structured state, not just conversation history.
Important elements include:
- mastered concept set
- evidence records
- dimension-level competence summaries
- weak-dimension lists
- project eligibility
- target-progress state
- produced artifacts and critiques
## Does Didactopus fine-tune the AI model?
Not in the current design. Didactopus supervises and evaluates a learner agent, but it does not itself retrain foundation model weights.
## Then how is the AI student “ready to work”?
Readiness is operationalized by the mastery state. An AI student is ready for a class of tasks when:
- relevant concepts are mastered
- confidence is high enough
- weak dimensions are acceptable for the target task
- prerequisite and project evidence support deployment
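Hypothetically, that readiness check could be expressed as a predicate over the mastery state. The field names, thresholds, and the idea of "blocking dimensions" below are illustrative assumptions, not the actual Didactopus schema:

```python
# Sketch: readiness operationalized as a predicate over mastery state.
# Field names and thresholds are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class MasteryState:
    mastered: set[str] = field(default_factory=set)
    confidence: dict[str, float] = field(default_factory=dict)
    weak_dimensions: dict[str, list[str]] = field(default_factory=dict)

def ready_for(state: MasteryState, required: set[str],
              min_confidence: float = 0.8,
              blocking_dims: frozenset = frozenset({"correctness"})) -> bool:
    """Ready when every required concept is mastered with enough
    confidence and no blocking dimension is still weak."""
    for concept in required:
        if concept not in state.mastered:
            return False
        if state.confidence.get(concept, 0.0) < min_confidence:
            return False
        if blocking_dims & set(state.weak_dimensions.get(concept, [])):
            return False
    return True

state = MasteryState(
    mastered={"bayes::prior", "bayes::posterior"},
    confidence={"bayes::prior": 0.9, "bayes::posterior": 0.85},
    weak_dimensions={"bayes::posterior": ["transfer"]},
)
print(ready_for(state, {"bayes::prior", "bayes::posterior"}))  # True
```

Here a weak `transfer` dimension does not block deployment, but a weak `correctness` dimension would; which dimensions block which task classes is exactly the kind of policy the orchestration layer would own.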
## Could mastered state be exported?
Yes. A future implementation should support export of:
- concept mastery ledgers
- evidence portfolios
- competence profiles
- project artifacts
- domain-specific capability summaries
## Is human learning treated the same way?
The same conceptual framework applies to both human and AI learners, though interfaces and evidence sources differ.
## What is the difference between mastery and model knowledge?
A model may contain latent knowledge or pattern familiarity. Didactopus mastery is narrower and stricter: it is evidence-backed demonstrated competence with respect to explicit concepts and criteria.
## Why not use only embeddings and LLM judgments?
Because correctness, especially in formal domains, often needs stronger guarantees than plausibility. That is why Didactopus may eventually need hybrid symbolic or executable validation components.
## Can Didactopus work offline?
Yes, that is a primary design goal. The architecture is local-first and can be paired with local model serving and locally stored domain packs.


@@ -0,0 +1,92 @@
from __future__ import annotations

from dataclasses import dataclass, field

from .planner import rank_next_concepts, PlannerWeights
from .evidence_engine import EvidenceState, ConceptEvidenceSummary


@dataclass
class AgenticStudentState:
    mastered_concepts: set[str] = field(default_factory=set)
    evidence_state: EvidenceState = field(default_factory=EvidenceState)
    attempt_history: list[dict] = field(default_factory=list)


def synthetic_attempt_for_concept(concept: str) -> dict:
    if "descriptive-statistics" in concept:
        weak = []
        mastered = True
    elif "probability-basics" in concept:
        weak = ["transfer"]
        mastered = False
    elif "prior" in concept:
        weak = ["explanation", "transfer"]
        mastered = False
    elif "posterior" in concept:
        weak = ["critique", "transfer"]
        mastered = False
    elif "model-checking" in concept:
        weak = ["critique"]
        mastered = False
    else:
        weak = ["correctness"]
        mastered = False
    return {"concept": concept, "mastered": mastered, "weak_dimensions": weak}


def integrate_attempt(state: AgenticStudentState, attempt: dict) -> None:
    concept = attempt["concept"]
    summary = ConceptEvidenceSummary(
        concept_key=concept,
        weak_dimensions=list(attempt["weak_dimensions"]),
        mastered=bool(attempt["mastered"]),
    )
    state.evidence_state.summary_by_concept[concept] = summary
    if summary.mastered:
        state.mastered_concepts.add(concept)
        state.evidence_state.resurfaced_concepts.discard(concept)
    else:
        if concept in state.mastered_concepts:
            state.mastered_concepts.remove(concept)
            state.evidence_state.resurfaced_concepts.add(concept)
    state.attempt_history.append(attempt)


def run_agentic_learning_loop(
    graph,
    project_catalog: list[dict],
    target_concepts: list[str],
    weights: PlannerWeights,
    max_steps: int = 5,
) -> AgenticStudentState:
    state = AgenticStudentState()
    for _ in range(max_steps):
        weak_dimensions_by_concept = {
            key: summary.weak_dimensions
            for key, summary in state.evidence_state.summary_by_concept.items()
        }
        fragile = set(state.evidence_state.resurfaced_concepts)
        ranked = rank_next_concepts(
            graph=graph,
            mastered=state.mastered_concepts,
            targets=target_concepts,
            weak_dimensions_by_concept=weak_dimensions_by_concept,
            fragile_concepts=fragile,
            project_catalog=project_catalog,
            weights=weights,
        )
        if not ranked:
            break
        chosen = ranked[0]["concept"]
        attempt = synthetic_attempt_for_concept(chosen)
        integrate_attempt(state, attempt)
        if all(target in state.mastered_concepts for target in target_concepts):
            break
    return state


@@ -38,21 +38,9 @@ class ConceptGraph:
            g.add_edge(u, v)
        return g

    def prerequisites(self, concept: str) -> list[str]:
        return list(self.prerequisite_subgraph().predecessors(concept))

    def prerequisite_chain(self, concept: str) -> list[str]:
        return list(nx.ancestors(self.prerequisite_subgraph(), concept))

    def dependents(self, concept: str) -> list[str]:
        return list(self.prerequisite_subgraph().successors(concept))

    def learning_path(self, start: str, target: str) -> list[str] | None:
        try:
            return nx.shortest_path(self.prerequisite_subgraph(), start, target)
        except nx.NetworkXNoPath:
            return None

    def curriculum_path_to_target(self, mastered: set[str], target: str) -> list[str]:
        pg = self.prerequisite_subgraph()
        needed = set(nx.ancestors(pg, target)) | {target}


@@ -24,9 +24,24 @@ class PlannerConfig(BaseModel):
    semantic_similarity_weight: float = 1.0

class EvidenceConfig(BaseModel):
    resurfacing_threshold: float = 0.55
    confidence_threshold: float = 0.8
    evidence_weights: dict[str, float] = Field(
        default_factory=lambda: {
            "explanation": 1.0,
            "problem": 1.5,
            "project": 2.5,
            "transfer": 2.0,
        }
    )
    recent_evidence_multiplier: float = 1.35

class AppConfig(BaseModel):
    platform: PlatformConfig = Field(default_factory=PlatformConfig)
    planner: PlannerConfig = Field(default_factory=PlannerConfig)
    evidence: EvidenceConfig = Field(default_factory=EvidenceConfig)

def load_config(path: str | Path) -> AppConfig:


@@ -0,0 +1,72 @@
from dataclasses import dataclass, field

@dataclass
class LearnerAttempt:
    concept: str
    artifact_type: str
    content: str
    metadata: dict = field(default_factory=dict)

@dataclass
class EvaluatorResult:
    evaluator_name: str
    dimensions: dict
    passed: bool | None = None
    notes: str = ""

class RubricEvaluator:
    name = "rubric"

    def evaluate(self, attempt: LearnerAttempt):
        explanation = 0.85 if len(attempt.content) > 40 else 0.55
        correctness = 0.80 if "because" in attempt.content.lower() else 0.65
        return EvaluatorResult(self.name,
                               {"correctness": correctness,
                                "explanation": explanation})

class CodeTestEvaluator:
    name = "code_test"

    def evaluate(self, attempt: LearnerAttempt):
        passed = "return" in attempt.content
        score = 0.9 if passed else 0.35
        return EvaluatorResult(self.name,
                               {"correctness": score,
                                "project_execution": score},
                               passed=passed)

class SymbolicRuleEvaluator:
    name = "symbolic_rule"

    def evaluate(self, attempt: LearnerAttempt):
        passed = "=" in attempt.content
        score = 0.88 if passed else 0.4
        return EvaluatorResult(self.name,
                               {"correctness": score},
                               passed=passed)

class CritiqueEvaluator:
    name = "critique"

    def evaluate(self, attempt: LearnerAttempt):
        markers = ["assumption", "bias", "limitation", "weakness"]
        found = sum(m in attempt.content.lower() for m in markers)
        score = min(1.0, 0.35 + 0.15 * found)
        return EvaluatorResult(self.name, {"critique": score})

class PortfolioEvaluator:
    name = "portfolio"

    def evaluate(self, attempt: LearnerAttempt):
        count = int(attempt.metadata.get("deliverable_count", 1))
        score = min(1.0, 0.5 + 0.1 * count)
        return EvaluatorResult(self.name,
                               {"project_execution": score,
                                "transfer": max(0.4, score - 0.1)})

def run_pipeline(attempt, evaluators):
    return [e.evaluate(attempt) for e in evaluators]

def aggregate(results):
    totals = {}
    counts = {}
    for r in results:
        for d, v in r.dimensions.items():
            totals[d] = totals.get(d, 0) + v
            counts[d] = counts.get(d, 0) + 1
    return {d: totals[d] / counts[d] for d in totals}


@@ -1,170 +1,16 @@
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Literal

from .adaptive_engine import LearnerProfile

EvidenceType = Literal["explanation", "problem", "project", "transfer"]
MASTERY_DIMENSIONS = ["correctness", "explanation", "transfer", "project_execution", "critique"]

@dataclass
class EvidenceItem:
    concept_key: str
    evidence_type: EvidenceType
    score: float
    notes: str = ""
    is_recent: bool = False
    rubric_dimensions: dict[str, float] = field(default_factory=dict)

@dataclass
class ConceptEvidenceSummary:
    concept_key: str
    count: int = 0
    weighted_mean_score: float = 0.0
    total_weight: float = 0.0
    confidence: float = 0.0
    dimension_means: dict[str, float] = field(default_factory=dict)
    weak_dimensions: list[str] = field(default_factory=list)
    mastered: bool = False

@dataclass
class EvidenceState:
    evidence_by_concept: dict[str, list[EvidenceItem]] = field(default_factory=dict)
    summary_by_concept: dict[str, ConceptEvidenceSummary] = field(default_factory=dict)
    resurfaced_concepts: set[str] = field(default_factory=set)

def clamp_score(score: float) -> float:
    return max(0.0, min(1.0, score))

def evidence_weight(item: EvidenceItem, type_weights: dict[str, float], recent_multiplier: float) -> float:
    base = type_weights.get(item.evidence_type, 1.0)
    return base * (recent_multiplier if item.is_recent else 1.0)

def confidence_from_weight(total_weight: float) -> float:
    return total_weight / (total_weight + 1.0) if total_weight > 0 else 0.0

def recompute_concept_summary(
    concept_key: str,
    items: list[EvidenceItem],
    type_weights: dict[str, float],
    recent_multiplier: float,
    dimension_thresholds: dict[str, float],
    confidence_threshold: float,
) -> ConceptEvidenceSummary:
    weighted_score_sum = 0.0
    total_weight = 0.0
    dim_totals: dict[str, float] = {}
    dim_weights: dict[str, float] = {}
    for item in items:
        item.score = clamp_score(item.score)
        w = evidence_weight(item, type_weights, recent_multiplier)
        weighted_score_sum += item.score * w
        total_weight += w
        for dim, value in item.rubric_dimensions.items():
            v = clamp_score(value)
            dim_totals[dim] = dim_totals.get(dim, 0.0) + v * w
            dim_weights[dim] = dim_weights.get(dim, 0.0) + w
    dimension_means = {
        dim: dim_totals[dim] / dim_weights[dim]
        for dim in dim_totals
        if dim_weights[dim] > 0
    }
    confidence = confidence_from_weight(total_weight)
    weak_dimensions = []
    for dim, threshold in dimension_thresholds.items():
        if dim in dimension_means and dimension_means[dim] < threshold:
            weak_dimensions.append(dim)
    mastered = (
        confidence >= confidence_threshold
        and all(
            (dim in dimension_means and dimension_means[dim] >= threshold)
            for dim, threshold in dimension_thresholds.items()
            if dim in dimension_means
        )
        and len(dimension_means) > 0
    )
    return ConceptEvidenceSummary(
        concept_key=concept_key,
        count=len(items),
        weighted_mean_score=(weighted_score_sum / total_weight) if total_weight > 0 else 0.0,
        total_weight=total_weight,
        confidence=confidence,
        dimension_means=dimension_means,
        weak_dimensions=sorted(weak_dimensions),
        mastered=mastered,
    )

def add_evidence_item(
    state: EvidenceState,
    item: EvidenceItem,
    type_weights: dict[str, float],
    recent_multiplier: float,
    dimension_thresholds: dict[str, float],
    confidence_threshold: float,
) -> None:
    item.score = clamp_score(item.score)
    state.evidence_by_concept.setdefault(item.concept_key, []).append(item)
    state.summary_by_concept[item.concept_key] = recompute_concept_summary(
        item.concept_key,
        state.evidence_by_concept[item.concept_key],
        type_weights,
        recent_multiplier,
        dimension_thresholds,
        confidence_threshold,
    )

def update_profile_mastery_from_evidence(
    profile: LearnerProfile,
    state: EvidenceState,
    resurfacing_threshold: float,
) -> None:
    for concept_key, summary in state.summary_by_concept.items():
        if summary.mastered:
            profile.mastered_concepts.add(concept_key)
            state.resurfaced_concepts.discard(concept_key)
        elif concept_key in profile.mastered_concepts and summary.weighted_mean_score < resurfacing_threshold:
            profile.mastered_concepts.remove(concept_key)
            state.resurfaced_concepts.add(concept_key)

def ingest_evidence_bundle(
    profile: LearnerProfile,
    items: list[EvidenceItem],
    resurfacing_threshold: float,
    confidence_threshold: float,
    type_weights: dict[str, float],
    recent_multiplier: float,
    dimension_thresholds: dict[str, float],
) -> EvidenceState:
    state = EvidenceState()
    for item in items:
        add_evidence_item(
            state,
            item,
            type_weights,
            recent_multiplier,
            dimension_thresholds,
            confidence_threshold,
        )
    update_profile_mastery_from_evidence(
        profile=profile,
        state=state,
        resurfacing_threshold=resurfacing_threshold,
    )
    return state


@@ -2,18 +2,18 @@ import argparse
 import os
 from pathlib import Path
+from .agentic_loop import run_agentic_learning_loop
 from .artifact_registry import check_pack_dependencies, detect_dependency_cycles, discover_domain_packs
 from .config import load_config
-from .graph_builder import build_concept_graph, suggest_semantic_links
-from .planner import PlannerWeights, rank_next_concepts
+from .graph_builder import build_concept_graph
+from .learning_graph import build_merged_learning_graph
+from .planner import PlannerWeights

 def build_parser() -> argparse.ArgumentParser:
-    parser = argparse.ArgumentParser(description="Didactopus graph-aware planner")
+    parser = argparse.ArgumentParser(description="Didactopus agentic learner loop")
     parser.add_argument("--target", default="bayes-extension::posterior")
     parser.add_argument("--mastered", nargs="*", default=[])
-    parser.add_argument("--export-dot", default="")
-    parser.add_argument("--export-cytoscape", default="")
+    parser.add_argument("--steps", type=int, default=5)
     parser.add_argument("--config", default=os.environ.get("DIDACTOPUS_CONFIG", "configs/config.example.yaml"))
     return parser
@@ -35,30 +35,13 @@ def main() -> None:
         print(f"- {' -> '.join(cycle)}")
         return
+    merged = build_merged_learning_graph(results, config.platform.default_dimension_thresholds)
     graph = build_concept_graph(results, config.platform.default_dimension_thresholds)
-    mastered = set(args.mastered)
-    weak_dimensions_by_concept = {
-        "bayes-extension::prior": ["explanation", "transfer"],
-    }
-    fragile_concepts = {"bayes-extension::prior"}
-    ranked = rank_next_concepts(
+    state = run_agentic_learning_loop(
         graph=graph,
-        mastered=mastered,
-        targets=[args.target],
-        weak_dimensions_by_concept=weak_dimensions_by_concept,
-        fragile_concepts=fragile_concepts,
-        project_catalog=[
-            {
-                "id": "bayes-extension::bayes-mini-project",
-                "prerequisites": ["bayes-extension::prior"],
-            },
-            {
-                "id": "applied-inference::inference-project",
-                "prerequisites": ["applied-inference::model-checking"],
-            },
-        ],
+        project_catalog=merged.project_catalog,
+        target_concepts=[args.target],
         weights=PlannerWeights(
             readiness_bonus=config.planner.readiness_bonus,
             target_distance_weight=config.planner.target_distance_weight,
@@ -67,36 +50,21 @@ def main() -> None:
             project_unlock_bonus=config.planner.project_unlock_bonus,
             semantic_similarity_weight=config.planner.semantic_similarity_weight,
         ),
+        max_steps=args.steps,
     )
-    print("== Didactopus Graph-Aware Planner ==")
-    print(f"Target concept: {args.target}")
+    print("== Didactopus Agentic Learner Loop ==")
+    print(f"Target: {args.target}")
+    print(f"Steps executed: {len(state.attempt_history)}")
     print()
-    print("Curriculum path from current mastery:")
-    for item in graph.curriculum_path_to_target(mastered, args.target):
+    print("Mastered concepts:")
+    if state.mastered_concepts:
+        for item in sorted(state.mastered_concepts):
            print(f"- {item}")
+    else:
+        print("- none")
     print()
-    print("Ready concepts:")
-    for item in graph.ready_concepts(mastered):
-        print(f"- {item}")
-    print()
-    print("Ranked next concepts:")
-    for item in ranked:
-        print(f"- {item['concept']}: {item['score']:.2f}")
-        for name, value in item["components"].items():
-            print(f" * {name}: {value:.2f}")
-    print()
-    print("Suggested semantic links:")
-    for a, b, score in suggest_semantic_links(graph, minimum_similarity=0.10)[:8]:
-        print(f"- {a} <-> {b} : {score:.2f}")
-    if args.export_dot:
-        graph.export_graphviz(args.export_dot)
-        print(f"Exported Graphviz DOT to {args.export_dot}")
-    if args.export_cytoscape:
-        graph.export_cytoscape_json(args.export_cytoscape)
-        print(f"Exported Cytoscape JSON to {args.export_cytoscape}")
+    print("Attempt history:")
+    for item in state.attempt_history:
+        weak = ", ".join(item["weak_dimensions"]) if item["weak_dimensions"] else "none"
+        print(f"- {item['concept']}: mastered={item['mastered']}, weak={weak}")
 if __name__ == "__main__":
     main()


@@ -22,7 +22,8 @@ def _distance_bonus(graph: ConceptGraph, concept: str, targets: list[str]) -> float:
     best = inf
     for target in targets:
         try:
-            dist = len(__import__("networkx").shortest_path(pg, concept, target)) - 1
+            import networkx as nx
+            dist = len(nx.shortest_path(pg, concept, target)) - 1
             best = min(best, dist)
         except Exception:
             continue
@@ -32,11 +33,7 @@ def _distance_bonus(graph: ConceptGraph, concept: str, targets: list[str]) -> float:
 def _project_unlock_bonus(concept: str, project_catalog: list[dict]) -> float:
-    count = 0
-    for project in project_catalog:
-        if concept in project.get("prerequisites", []):
-            count += 1
-    return float(count)
+    return float(sum(1 for project in project_catalog if concept in project.get("prerequisites", [])))
 def _semantic_bonus(graph: ConceptGraph, concept: str, targets: list[str]) -> float:
@@ -90,11 +87,7 @@ def rank_next_concepts(
         score += semantic
         components["semantic_similarity"] = semantic
-        ranked.append({
-            "concept": concept,
-            "score": score,
-            "components": components,
-        })
+        ranked.append({"concept": concept, "score": score, "components": components})
     ranked.sort(key=lambda item: item["score"], reverse=True)
     return ranked


@@ -22,6 +22,7 @@ def resolve_mastery_profile(
         }
     else:
         effective = dict(default_profile)
+    if concept_profile.get("required_dimensions"):
         effective["required_dimensions"] = list(concept_profile["required_dimensions"])
     if concept_profile.get("dimension_threshold_overrides"):


@@ -0,0 +1,23 @@
from didactopus.agentic_loop import run_agentic_learning_loop
from didactopus.artifact_registry import discover_domain_packs
from didactopus.config import load_config
from didactopus.graph_builder import build_concept_graph
from didactopus.learning_graph import build_merged_learning_graph
from didactopus.planner import PlannerWeights


def test_agentic_loop_runs() -> None:
    config = load_config("configs/config.example.yaml")
    results = discover_domain_packs(["domain-packs"])
    merged = build_merged_learning_graph(results, config.platform.default_dimension_thresholds)
    graph = build_concept_graph(results, config.platform.default_dimension_thresholds)
    state = run_agentic_learning_loop(
        graph=graph,
        project_catalog=merged.project_catalog,
        target_concepts=["bayes-extension::posterior"],
        weights=PlannerWeights(),
        max_steps=4,
    )
    assert len(state.attempt_history) >= 1


@@ -1,14 +1,6 @@
 from didactopus.artifact_registry import discover_domain_packs
 from didactopus.config import load_config
-from didactopus.graph_builder import build_concept_graph, suggest_semantic_links
-
-def test_concept_graph_builds() -> None:
-    config = load_config("configs/config.example.yaml")
-    results = discover_domain_packs(["domain-packs"])
-    graph = build_concept_graph(results, config.platform.default_dimension_thresholds)
-    assert "foundations-statistics::probability-basics" in graph.graph.nodes
-    assert "bayes-extension::posterior" in graph.graph.nodes
+from didactopus.graph_builder import build_concept_graph

 def test_curriculum_path_to_target() -> None:
@@ -18,19 +10,3 @@ def test_curriculum_path_to_target() -> None:
     path = graph.curriculum_path_to_target(set(), "bayes-extension::posterior")
     assert "bayes-extension::prior" in path
     assert "bayes-extension::posterior" in path
-
-def test_declared_cross_pack_links_exist() -> None:
-    config = load_config("configs/config.example.yaml")
-    results = discover_domain_packs(["domain-packs"])
-    graph = build_concept_graph(results, config.platform.default_dimension_thresholds)
-    related = graph.related_concepts("bayes-extension::posterior")
-    assert "applied-inference::model-checking" in related
-
-def test_semantic_link_suggestions() -> None:
-    config = load_config("configs/config.example.yaml")
-    results = discover_domain_packs(["domain-packs"])
-    graph = build_concept_graph(results, config.platform.default_dimension_thresholds)
-    suggestions = suggest_semantic_links(graph, minimum_similarity=0.10)
-    assert len(suggestions) >= 1