diff --git a/README.md b/README.md
index e3aaf4d..4d76382 100644
--- a/README.md
+++ b/README.md
@@ -2,349 +2,219 @@
 ![Didactopus mascot](artwork/didactopus-mascot.png)
-**Didactopus** is a local-first AI-assisted autodidactic mastery platform for building genuine expertise through concept graphs, adaptive curriculum planning, evidence-driven mastery, Socratic mentoring, and project-based learning.
+Didactopus is a local-first Python codebase for turning educational source material into structured learning domains, evaluating learner progress against those domains, and exporting review, mastery, and skill artifacts.
-**Tagline:** *Many arms, one goal — mastery.*
+At a high level, the repository does five things:
-## Recent revisions
+1. Ingest source material such as Markdown, text, HTML, PDF-ish text, DOCX-ish text, and PPTX-ish text into normalized course/topic structures.
+2. Distill those structures into draft domain packs with concepts, prerequisites, roadmaps, projects, attribution, and review flags.
+3. Validate, review, and promote those draft packs through a workspace-backed review flow.
+4. Build merged learning graphs, rank next concepts, accumulate learner evidence, and export capability profiles.
+5. Demonstrate end-to-end flows, including an MIT OCW Information and Entropy demo that produces a pack, learner outputs, a reusable skill bundle, and progress visualizations.
-### Interactive Domain review
+## What Is In This Repository
-This revision upgrades the earlier static review scaffold into an **interactive local SPA review UI**.
+- `src/didactopus/`
+  The application and library code.
+- `tests/`
+  The automated test suite.
+- `domain-packs/`
+  Example and generated domain packs.
+- `examples/`
+  Sample source inputs and generated outputs.
+- `skills/`
+  Repo-local skill bundles generated from knowledge products.
+- `webui/`
+  A local review/workbench frontend scaffold.
+- `docs/`
+  Focused design and workflow notes.
-The new review layer is meant to help a human curator work through draft packs created
-by the ingestion pipeline and promote them into more trusted reviewed packs.
+## Core Workflows
-## Why this matters
+### 1. Course and topic ingestion
-One of the practical problems with using open online course contents is that the material
-is often scattered, inconsistently structured, awkward to reuse, and cognitively expensive
-to turn into something actionable.
+The ingestion path converts source documents into `NormalizedDocument`, `NormalizedCourse`, and `TopicBundle` objects, then emits a draft pack.
-Even when excellent course material exists, there is often a real **activation energy hump**
-between:
+Main modules:
-- finding useful content
-- extracting the structure
-- organizing the concepts
-- deciding what to trust
-- getting a usable learning domain set up
+- `didactopus.document_adapters`
+- `didactopus.course_ingest`
+- `didactopus.topic_ingest`
+- `didactopus.rule_policy`
+- `didactopus.pack_emitter`
-Didactopus is meant to help overcome that hump.
+Primary outputs:
-Its ingestion and review pipeline should let a motivated learner or curator get from
-"here is a pile of course material" to "here is a usable reviewed domain pack" with
-substantially less friction.
+- `pack.yaml`
+- `concepts.yaml`
+- `roadmap.yaml`
+- `projects.yaml`
+- `rubrics.yaml`
+- `review_report.md`
+- `conflict_report.md`
+- `license_attribution.json`
-## What is included
+### 2. Review and workspace management
-- interactive React SPA review UI
-- JSON-backed review state model
-- curation action application
-- promoted-pack export
-- reviewer notes and trust-status editing
-- conflict resolution support
-- README and FAQ updates reflecting the activation-energy goal
-- sample review data and promoted pack output
+Draft packs can be brought into review workspaces, edited, and promoted to reviewed packs.
-## Core workflow
+Main modules:
-1. ingest course or topic materials into a draft pack
-2. open the review UI
-3. inspect concepts, conflicts, and review flags
-4. edit statuses, notes, titles, descriptions, and prerequisites
-5. resolve conflicts
-6. export a promoted reviewed pack
+- `didactopus.review_schema`
+- `didactopus.review_loader`
+- `didactopus.review_actions`
+- `didactopus.review_export`
+- `didactopus.workspace_manager`
+- `didactopus.review_bridge`
+- `didactopus.review_bridge_server`
-## Why the review UI matters for course ingestion
+Key capabilities:
-In practice, course ingestion is not only a parsing problem. It is a **startup friction**
-problem. A person may know what they want to study, and even know that good material exists,
-but still fail to start because turning raw educational material into a coherent mastery
-domain is too much work.
+- create and list workspaces
+- preview draft-pack imports
+- import draft packs with overwrite checks
+- save review actions
+- export promoted packs
+- export `review_data.json` for the frontend
-Didactopus should reduce that work enough that getting started becomes realistic.
+### 3. Learning graph and planning
+Validated packs can be merged into a namespaced DAG, including explicit overrides and stage/project catalogs.
+Main modules:
-### Review workflow
+- `didactopus.artifact_registry`
+- `didactopus.learning_graph`
+- `didactopus.graph_builder`
+- `didactopus.concept_graph`
+- `didactopus.planner`
+- `didactopus.adaptive_engine`
-This revision adds a **review UI / curation workflow scaffold** for generated draft packs.
+Key capabilities:
-The purpose is to let a human reviewer inspect draft outputs from the course/topic
-ingestion pipeline, make explicit curation decisions, and promote a reviewed draft
-into a more trusted domain pack.
+- dependency validation for packs
+- merged prerequisite DAG construction
+- roadmap generation from merged stages
+- graph-aware next-concept ranking
+- adaptive plan generation from current mastery
-#### What is included
+### 4. Evidence, mastery, and capability export
-- review-state schema
-- draft-pack loader
-- curation action model
-- review decision ledger
-- promoted-pack writer
-- static HTML review UI scaffold
-- JSON data export for the UI
-- sample curated review session
-- sample promoted pack output
+Learner progress is represented as evidence summaries plus exported capability artifacts.
-#### Core idea
+Main modules:
-Draft packs should not move directly into trusted use.
-Instead, they should pass through a curation workflow where a reviewer can:
+- `didactopus.evidence_engine`
+- `didactopus.evaluator_pipeline`
+- `didactopus.progression_engine`
+- `didactopus.mastery_ledger`
+- `didactopus.knowledge_export`
-- merge concepts
-- split concepts
-- edit prerequisites
-- mark concepts as trusted / provisional / rejected
-- resolve conflict flags
-- annotate rationale
-- promote a curated pack into a reviewed pack
+Key capabilities:
-#### Status
+- weighted evidence ingestion
+- confidence estimation
+- multidimensional mastery checks
+- resurfacing weak concepts
+- capability profile JSON export
+- markdown capability reports
+- artifact manifests
-This is a scaffold for a local-first workflow.
-The HTML UI is static but wired to a concrete JSON review-state model so it can
-later be upgraded into a richer SPA or desktop app without changing the data contracts.
+### 5. Agentic learner demos and visualization
-### Course-to-course merger
+The repository includes deterministic agentic demos rather than a live external model integration.
-This revision adds two major capabilities:
+Main modules:
-- **real document adapter scaffolds** for PDF, DOCX, PPTX, and HTML
-- a **cross-course merger** for combining multiple course-derived packs into one stronger domain draft
+- `didactopus.agentic_loop`
+- `didactopus.ocw_information_entropy_demo`
+- `didactopus.ocw_progress_viz`
-These additions extend the earlier multi-source ingestion layer from "multiple files for one course"
-to "multiple courses or course-like sources for one topic domain."
+Generated demo artifacts:
-## What is included
+- `domain-packs/mit-ocw-information-entropy/`
+- `examples/ocw-information-entropy-run/`
+- `skills/ocw-information-entropy-agent/`
-- adapter registry for:
-  - PDF
-  - DOCX
-  - PPTX
-  - HTML
-  - Markdown
-  - text
-- normalized document extraction interface
-- course bundle ingestion across multiple source documents
-- cross-course terminology and overlap analysis
-- merged topic-pack emitter
-- cross-course conflict report
-- example source files and example merged output
+## Quick Start
-## Design stance
+### Install
-This is still scaffold-level extraction. The purpose is to define stable interfaces and emitted artifacts,
-not to claim perfect semantic parsing of every teaching document.
-
-The implementation is designed so stronger parsers can later replace the stub extractors without changing
-the surrounding pipeline.
-
-
-### Multi-Source Course Ingestion
-
-This revision adds a **Multi-Source Course Ingestion Layer**.
-
-The pipeline can now accept multiple source files representing the same course or
-topic domain, normalize them into a shared intermediate representation, merge them,
-and emit a single draft Didactopus pack plus a conflict report.
-#### Supported scaffold source types
-
-Current scaffold adapters:
-- Markdown (`.md`)
-- Plain text (`.txt`)
-- HTML-ish text (`.html`, `.htm`)
-- Transcript text (`.transcript.txt`)
-- Syllabus text (`.syllabus.txt`)
-
-This revision is intentionally adapter-oriented, so future PDF, slide, and DOCX
-adapters can be added behind the same interface.
-
-#### What is included
-
-- multi-source adapter dispatch
-- normalized source records
-- source merge logic
-- cross-source terminology conflict report
-- duplicate lesson/title detection
-- merged draft pack emission
-- merged attribution manifest
-- sample multi-source inputs
-- sample merged output pack
-
-
-### Course Ingestion Pipeline
-
-This revision adds a **Course-to-Pack Ingestion Pipeline** plus a **stable rule-policy adapter layer**.
-
-The design goal is to turn open or user-supplied course materials into draft
-Didactopus domain packs without introducing a brittle external rule-engine dependency.
-
-#### Why no third-party rule engine here?
-
-To minimize dependency risk, this scaffold uses a small declarative rule-policy
-adapter implemented in pure Python and standard-library data structures.
-
-That gives Didactopus:
-- portable rules
-- inspectable rule definitions
-- deterministic behavior
-- zero extra runtime dependency for policy evaluation
-
-If a stronger rule engine is needed later, this adapter can remain the stable API surface.
-
-#### What is included
-
-- normalized course schema
-- Markdown/HTML-ish text ingestion adapter
-- module / lesson / objective extraction
-- concept candidate extraction
-- prerequisite guess generation
-- rule-policy adapter
-- draft pack emitter
-- review report generation
-- sample course input
-- sample generated pack outputs
-
-
-### Mastery Ledger
-
-This revision adds a **Mastery Ledger + Capability Export** layer.
-
-The main purpose is to let Didactopus turn accumulated learner state into
-portable, inspectable artifacts that can support downstream deployment,
-review, orchestration, or certification-like workflows.
-
-#### What is new
-
-- mastery ledger data model
-- capability profile export
-- JSON export of mastered concepts and evaluator summaries
-- Markdown export of a readable capability report
-- artifact manifest for produced deliverables
-- demo CLI for generating exports for an AI student or human learner
-- FAQ covering how learned mastery is represented and put to work
-
-#### Why this matters
-
-Didactopus can now do more than guide learning. It can also emit a structured
-statement of what a learner appears able to do, based on explicit concepts,
-evidence, and artifacts.
-
-That makes it easier to use Didactopus as:
-- a mastery tracker
-- a portfolio generator
-- a deployment-readiness aid
-- an orchestration input for agent routing
-
-#### Mastery representation
-
-A learner's mastery is represented as structured operational state, including:
-
-- mastered concepts
-- evaluator results
-- evidence summaries
-- weak dimensions
-- attempt history
-- produced artifacts
-- capability export
-
-This is stricter than a normal chat transcript or self-description.
-
-#### Future direction
-
-A later revision should connect the capability export with:
-- formal evaluator outputs
-- signed evidence ledgers
-- domain-specific capability schemas
-- deployment policies for agent routing
-
-
-### Evaluator Pipeline
-
-This revision introduces a **pluggable evaluator pipeline** that converts
-learner attempts into structured mastery evidence.
-
-### Agentic Learner Loop
-
-This revision adds an **agentic learner loop** that turns Didactopus into a closed-loop mastery system prototype.
-
-The loop can now:
-
-- choose the next concept via the graph-aware planner
-- generate a synthetic learner attempt
-- score the attempt into evidence
-- update mastery state
-- repeat toward a target concept
-
-This is still scaffold-level, but it is the first explicit implementation of the idea that **Didactopus can supervise not only human learners, but also AI student agents**.
-
-## Complete overview to this point
-
-Didactopus currently includes:
-
-- **Domain packs** for concepts, projects, rubrics, mastery profiles, templates, and cross-pack links
-- **Dependency resolution** across packs
-- **Merged learning graph** generation
-- **Concept graph engine** for cross-pack prerequisite reasoning, linking, pathfinding, and export
-- **Adaptive learner engine** for ready, blocked, and mastered concepts
-- **Evidence engine** with weighted, recency-aware, multi-dimensional mastery inference
-- **Concept-specific mastery profiles** with template inheritance
-- **Graph-aware planner** for utility-ranked next-step recommendations
-- **Agentic learner loop** for iterative goal-directed mastery acquisition
-
-## Agentic AI students
-
-An AI student under Didactopus is modeled as an **agent that accumulates evidence against concept mastery criteria**.
-
-It does not “learn” in the same sense that model weights are retrained inside Didactopus. Instead, its learned mastery is represented as:
-
-- current mastered concept set
-- evidence history
-- dimension-level competence summaries
-- concept-specific weak dimensions
-- adaptive plan state
-- optional artifacts, explanations, project outputs, and critiques it has produced
-
-In other words, Didactopus represents mastery as a **structured operational state**, not merely a chat transcript.
-
-That state can be put to work by:
-
-- selecting tasks the agent is now qualified to attempt
-- routing domain-relevant problems to the agent
-- exposing mastered concept profiles to orchestration logic
-- using evidence summaries to decide whether the agent should act, defer, or review
-- exporting a mastery portfolio for downstream use
-
-## FAQ
-
-See:
-- `docs/faq.md`
-
-## Correctness and formal knowledge components
-
-See:
-- `docs/correctness-and-knowledge-engine.md`
-
-Short version: yes, there is a strong argument that Didactopus will eventually benefit from a more formal knowledge-engine layer, especially for domains where correctness can be stated in symbolic, logical, computational, or rule-governed terms.
-
-A good future architecture is likely **hybrid**:
-
-- LLM/agentic layer for explanation, synthesis, critique, and exploration
-- formal knowledge engine for rule checking, constraint satisfaction, proof support, symbolic validation, and executable correctness checks
-
-## Repository structure
-
-
-```text
-didactopus/
-├── README.md
-├── artwork/
-├── configs/
-├── docs/
-├── examples/
-├── src/didactopus/
-├── tests/
-└── webui/
+```bash
+pip install -e .
 ```
+
+### Run tests
+
+```bash
+pytest
+```
+
+### Generate the MIT OCW demo pack, learner outputs, and skill bundle
+
+```bash
+python -m didactopus.ocw_information_entropy_demo
+```
+
+This writes:
+
+- `domain-packs/mit-ocw-information-entropy/`
+- `examples/ocw-information-entropy-run/`
+- `skills/ocw-information-entropy-agent/`
+
+### Render learner progress visualizations
+
+Path-focused view:
+
+```bash
+python -m didactopus.ocw_progress_viz
+```
+
+Full concept map including noisy non-path concepts:
+
+```bash
+python -m didactopus.ocw_progress_viz --full-map
+```
+
+### Run the review bridge server
+
+```bash
+python -m didactopus.review_bridge_server
+```
+
+The default config file is `configs/config.example.yaml`.
+
+## Current State
+
+This repository is functional, but parts of it remain intentionally heuristic.
+
+What is solid:
+
+- pack validation and dependency checks
+- review-state export and workspace import flow
+- merged learning graph construction
+- weighted evidence and capability exports
+- deterministic agentic demo runs
+- generated skill bundles and progress visualizations
+
+What remains heuristic or lightweight:
+
+- document adapters for binary formats are simplified text adapters
+- concept extraction can produce noisy candidate terms
+- evaluator outputs are heuristic rather than formal assessments
+- the agentic learner loop uses synthetic attempts
+- the frontend and bridge flow are local-first scaffolds, not a hosted product
+
+## Recommended Reading
+
+- [docs/course-to-pack.md](docs/course-to-pack.md)
+- [docs/learning-graph.md](docs/learning-graph.md)
+- [docs/agentic-learner-loop.md](docs/agentic-learner-loop.md)
+- [docs/mastery-ledger.md](docs/mastery-ledger.md)
+- [docs/workspace-manager.md](docs/workspace-manager.md)
+- [docs/interactive-review-ui.md](docs/interactive-review-ui.md)
+- [docs/faq.md](docs/faq.md)
+
+## MIT OCW Demo Notes
+
+The MIT OCW Information and Entropy demo is grounded in the MIT OpenCourseWare course page and selected unit/readings metadata, then converted into a local course source file for reproducible ingestion. The resulting generated pack and learner outputs are intentionally reviewable rather than presented as authoritative course mirrors.
diff --git a/docs/course-to-pack.md b/docs/course-to-pack.md
index d679395..e50beb1 100644
--- a/docs/course-to-pack.md
+++ b/docs/course-to-pack.md
@@ -1,35 +1,80 @@
-# Course-to-Pack Ingestion Pipeline
+# Course-to-Pack Pipeline
-The course-to-pack pipeline transforms educational material into Didactopus-native artifacts.
+The course-to-pack pipeline turns source material into a Didactopus draft domain pack.
-## Inputs
+## Current code path
-Typical sources:
-- syllabus text
-- lesson outlines
-- markdown notes
-- HTML course pages
-- assignment sheets
-- quiz prompts
-- lecture transcripts
+The main building blocks are:
-## Normalized intermediate structure
+- `didactopus.document_adapters`
+  Normalize source files into `NormalizedDocument`.
+- `didactopus.topic_ingest` and `didactopus.course_ingest`
+  Build `NormalizedCourse` data and extract concept candidates.
+- `didactopus.rule_policy`
+  Apply deterministic cleanup and heuristic rules.
+- `didactopus.pack_emitter`
+  Emit pack files and review/conflict artifacts.
-The pipeline builds a `NormalizedCourse` object containing:
-- title
-- source metadata
-- modules
-- lessons
-- learning objectives
-- exercises
-- key terms
-- project prompts
+## Supported source types
-## Rule-policy adapter
+The repository currently accepts:
-The pipeline includes a small rule layer for stable policy transforms such as:
-- suggest prerequisites from ordering
-- merge repeated key-term candidates
-- flag modules with no exercises
-- flag concepts with weak evidence of distinctness
-- suggest project concepts from capstone markers
+- Markdown
+- plain text
+- HTML
+- PDF-ish text
+- DOCX-ish text
+- PPTX-ish text
+
+Binary-format adapters are interface-stable but still intentionally simple.
+
+## Intermediate structures
+
+The ingestion path works through these data shapes:
+
+- `NormalizedDocument`
+- `NormalizedCourse`
+- `TopicBundle`
+- `ConceptCandidate`
+- `DraftPack`
+
+## Current emitted artifacts
+
+The pack emitter writes:
+
+- `pack.yaml`
+- `concepts.yaml`
+- `roadmap.yaml`
+- `projects.yaml`
+- `rubrics.yaml`
+- `review_report.md`
+- `conflict_report.md`
+- `license_attribution.json`
+
+## Rule layer
+
+The current default rules:
+
+- infer prerequisites from content order
+- merge duplicate concept candidates by title
+- flag modules that look project-like
+- flag modules or concepts with weak extracted assessment signals
+
+These rules are intentionally small and deterministic. They are meant to be easy to inspect and patch.
+
+## Known limitations
+
+- title-cased phrases can still become noisy concept candidates
+- extracted mastery signals remain weak for many source styles
+- project extraction is conservative
+- document parsing for PDF/DOCX/PPTX is still lightweight
+
+## Reference demo
+
+The end-to-end reference flow in this repository is:
+
+```bash
+python -m didactopus.ocw_information_entropy_demo
+```
+
+That command ingests the MIT OCW Information and Entropy source file in `examples/ocw-information-entropy/`, emits a draft pack into `domain-packs/mit-ocw-information-entropy/`, runs a deterministic agentic learner over the generated path, and writes downstream skill/visualization artifacts.
diff --git a/docs/draft-pack-import.md b/docs/draft-pack-import.md
index 62d4436..3af144d 100644
--- a/docs/draft-pack-import.md
+++ b/docs/draft-pack-import.md
@@ -1,36 +1,32 @@
 # Draft-Pack Import Workflow
-The draft-pack import workflow bridges ingestion output and review workspace setup.
+Draft-pack import connects ingestion output to the review workspace flow.
-## Why it exists
+## Current behavior
-Without import support, users still have to manually:
-- locate a generated draft pack
-- create a workspace
-- copy files into the right directory
-- reopen the review tool
+The import path currently supports:
-That is exactly the kind of startup friction Didactopus is supposed to reduce.
+- import preview through `preview_draft_pack_import(...)`
+- overwrite detection before import
+- workspace creation when the target workspace does not exist yet
+- copying the source pack into `workspace/draft_pack/`
+- workspace-recency updates after import
-## Current scaffold
+## Main modules
-This revision adds:
-- import API endpoint
-- workspace-manager copy/import operation
-- UI controls for creating a workspace and importing a draft pack path
+- `didactopus.import_validator`
+- `didactopus.workspace_manager`
+- `didactopus.review_bridge_server`
-## Import behavior
+## Bridge endpoints
-The current scaffold:
-- creates the target workspace if needed
-- copies the source draft-pack directory into `workspace/draft_pack/`
-- updates workspace metadata
-- allows the workspace to be opened immediately afterward
+The bridge server exposes:
-## Future work
+- `/api/workspaces/import-preview`
+- `/api/workspaces/import`
-- file picker integration
-- import validation
-- overwrite protection / confirmation
-- pack schema validation before import
-- duplicate import detection
+These endpoints return structured success/error payloads for missing source directories, invalid source packs, or overwrite conflicts.
+
+## Why it matters
+
+The import flow removes a manual step between "I generated a draft pack" and "I can now review it in a managed workspace."
diff --git a/docs/faq.md b/docs/faq.md
index d4b8bde..ca9b1ff 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -1,24 +1,117 @@
 # FAQ
-## Why add semantic QA?
+## What is Didactopus, in one sentence?
-Because a pack can be structurally valid and still be awkward or misleading as a
-learning domain.
+Didactopus turns educational material into structured learning packs, then uses graphs, evidence, and review workflows to support human or AI learning against those packs.
-## How does this help with the activation-energy problem?
+## Is this a packaged application or a research/workbench repository?
-It catches likely high-level issues earlier, so users do not have to discover
-them only after they have already committed to review or study.
+It is a workbench-style repository with runnable code, tests, example packs, generated outputs, and local-first review/demo flows.
-## Does semantic QA prove that a pack is good?
+## What is a domain pack?
-No. It is a heuristic curation aid.
+A domain pack is the unit Didactopus uses to represent a learning domain. In practice it is a directory containing:
-## What kinds of problems can it flag?
+- `pack.yaml`
+- `concepts.yaml`
+- `roadmap.yaml`
+- `projects.yaml`
+- `rubrics.yaml`
-Examples:
-- duplicate or near-duplicate concepts
-- over-broad concepts
-- abrupt stage transitions
-- weak prerequisite structure
-- descriptions that are too similar to each other
+Generated packs may also include review, conflict, and attribution artifacts.
+
+## What is the difference between a draft pack and a reviewed pack?
+
+A draft pack is an ingestion output. A reviewed pack is a pack that has been loaded into the review workflow, edited or triaged by a reviewer, and exported again with review metadata applied.
+
+## What does the workspace manager do?
+
+It keeps review work organized. The current implementation supports:
+
+- create workspace
+- list workspaces
+- touch/open recent workspaces
+- preview draft-pack import
+- import draft packs into `workspace/draft_pack/`
+- overwrite checks before replacing an existing draft pack
+
+## Does Didactopus really ingest PDF, DOCX, and PPTX files?
+
+Yes, but conservatively. Those adapters currently normalize text in a simplified way. They exist to stabilize the interface and surrounding workflow rather than to claim production-grade document parsing.
+
+## Does the agentic learner call an external LLM?
+
+No. The current agentic learner paths are deterministic and synthetic. They are meant to exercise the orchestration pattern, evaluator pipeline, mastery updates, capability export, and visualization flow without requiring an external model service.
+
+## What is the current evidence model?
+
+The evidence engine supports:
+
+- evidence items grouped by concept
+- per-type weighting
+- optional recency weighting
+- confidence derived from accumulated evidence mass
+- dimension-level summaries
+- resurfacing when recent weak evidence drags mastery below threshold
+
+## What does the capability export contain?
+
+The exported capability profile includes:
+
+- learner identity
+- target domain
+- mastered concepts
+- weak dimensions by concept
+- evaluator summaries by concept
+- artifact records
+
+The main export formats are JSON, Markdown, and an artifact manifest.
+
+## What is the MIT OCW Information and Entropy demo?
+
+It is the repo's current end-to-end reference flow. Running:
+
+```bash
+python -m didactopus.ocw_information_entropy_demo
+```
+
+generates:
+
+- a new pack in `domain-packs/mit-ocw-information-entropy/`
+- learner outputs in `examples/ocw-information-entropy-run/`
+- a repo-local skill bundle in `skills/ocw-information-entropy-agent/`
+
+## What visualizations exist today?
+
+The OCW demo currently generates two visualization modes:
+
+- a guided-path learner progress view
+- a full concept map that also surfaces noisy non-path concepts
+
+You can render them with:
+
+```bash
+python -m didactopus.ocw_progress_viz
+python -m didactopus.ocw_progress_viz --full-map
+```
+
+## Is the generated content free of extractor noise?
+
+No. The current extractors can still emit noisy candidate concepts, especially from title-cased phrases embedded in lesson text. That is why review flags, workspace review, and promotion flows are first-class parts of the project.
+
+## How should I think about validation versus QA?
+
+Validation is structural: required files, schemas, references, duplicates, dependencies.
+
+QA is heuristic: coverage alignment, evaluator alignment, path quality, semantic QA, and related diagnostics that try to surface likely quality problems before or during review.
+
+## Where should I start reading if I want the full project overview?
+
+Start with:
+
+- `README.md`
+- `docs/course-to-pack.md`
+- `docs/learning-graph.md`
+- `docs/mastery-ledger.md`
+- `docs/workspace-manager.md`
+- `docs/interactive-review-ui.md`
diff --git a/docs/interactive-review-ui.md b/docs/interactive-review-ui.md
index 743fef9..99786f6 100644
--- a/docs/interactive-review-ui.md
+++ b/docs/interactive-review-ui.md
@@ -1,34 +1,52 @@
 # Interactive Review UI
-This revision introduces a React-based local SPA for reviewing draft packs.
+The review UI is the local-first front end for inspecting and promoting draft packs.
-## Goals
+## What the current code supports
-- reduce curation friction
-- make review decisions explicit
-- allow pack promotion after inspection
-- preserve provenance and review rationale
+The UI is backed by a JSON review model and a local bridge server.
-## Features in this scaffold
+Key files and modules:
-- concept list with editable fields
-- trust status editing
-- concept notes editing
+- `webui/`
+- `didactopus.review_schema`
+- `didactopus.review_loader`
+- `didactopus.review_actions`
+- `didactopus.review_export`
+- `didactopus.review_bridge`
+- `didactopus.review_bridge_server`
+
+## Data flow
+
+1. A draft pack is loaded into a review session.
+2. The review bridge serves workspace and session data.
+3. The frontend consumes `review_data.json`-style payloads.
+4. Review actions update concept status, notes, and prerequisites.
+5. The promoted pack export writes reviewed pack files and a review ledger.
+
+## Current feature set
+
+- concept list and detail views
+- editable concept status
+- note editing
 - prerequisite editing
-- conflict visibility and resolution
-- promoted-pack export generation in-browser logic
+- save/load via the review bridge
+- export of promoted pack artifacts
+- workspace create/open/import support through the bridge server
-## Data model
+## Outputs
-The SPA loads `review_data.json` and can emit:
-- updated review state
-- review ledger entries
-- promoted concepts payload
+The review export layer currently writes:
-## Next steps
+- `review_session.json`
+- `review_data.json`
+- promoted `pack.yaml`
+- promoted `concepts.yaml`
+- `review_ledger.json`
+- `license_attribution.json`
-- file open/save integration
-- conflict filtering
-- merge/split concept actions in UI
-- richer diff views
-- domain-pack validation from the UI
+## Limits
+
+- the UI remains local-first and file-backed
+- merge/split concept actions are not deeply modeled yet
+- richer diff and conflict tooling is still future work
diff --git a/docs/learning-graph.md b/docs/learning-graph.md
index 0b4f536..4e86102 100644
--- a/docs/learning-graph.md
+++ b/docs/learning-graph.md
@@ -1,33 +1,61 @@
 # Merged Learning Graph
-## Purpose
+The merged learning graph is the pack-composition layer that connects validated domain packs into one learner-facing prerequisite model.
-The merged learning graph is the first learner-facing composite model built from multiple domain packs.
+## What the code builds today
-## Features in this revision
+`didactopus.learning_graph.build_merged_learning_graph(...)` produces a `MergedLearningGraph` containing:
-- namespaced concept keys: `pack-name::concept-id`
-- merged prerequisite DAG
-- stage catalog across packs
-- project catalog across packs
-- optional overrides for previously defined concepts
+- `concept_data`
+- `stage_catalog`
+- `project_catalog`
+- `load_order`
+- `graph`
-## Override model
+Concept keys are namespaced as:
-A pack manifest may include:
+```text
+pack-name::concept-id
+```
+
+## Inputs
+
+The merged graph is built from validated `PackValidationResult` objects, typically discovered through `didactopus.artifact_registry.discover_domain_packs(...)`.
+
+## Overrides
+
+Pack manifests can explicitly replace a previously defined concept through:
 ```yaml
 overrides:
   - foundations-statistics::descriptive-statistics
 ```
-If a pack defines concept `descriptive-statistics` and lists the namespaced target above, it may replace that concept in the merged graph.
+If the overriding pack defines `descriptive-statistics`, the merged graph will store that concept under the overridden namespaced key.
-That is intentionally explicit and conservative.
+## Current exported behaviors -## Future work +The learning-graph and graph-builder layers currently support: -- merged rubric graph -- stage dependency inference -- learner-specific subgraph extraction -- adaptive sequencing from the merged DAG +- merged prerequisite DAG construction +- namespaced prerequisite edges +- stage catalog aggregation +- project catalog aggregation +- concept-graph export through `didactopus.graph_builder` +- learner roadmap generation through `generate_learner_roadmap(...)` + +## Relationship to the concept graph + +`didactopus.graph_builder.build_concept_graph(...)` takes the merged graph inputs and produces the learner-facing `ConceptGraph`, which powers: + +- curriculum path extraction +- ready concept detection +- semantic-link suggestions +- planner scoring + +## Known limitations + +- stage dependencies are still implicit rather than separately modeled +- cross-pack links are supported but still lightweight +- roadmap generation is pack-derived rather than learner-personalized +- richer graph export/visualization is still evolving diff --git a/docs/mit-ocw-notes.md b/docs/mit-ocw-notes.md index 9b35b27..eb40bfc 100644 --- a/docs/mit-ocw-notes.md +++ b/docs/mit-ocw-notes.md @@ -1,14 +1,32 @@ # MIT OpenCourseWare Notes -MIT OpenCourseWare publishes material under CC BY-NC-SA 4.0 on its terms page, while also warning -that some external or third-party linked content may be excluded from that license. +MIT OpenCourseWare material is a good fit for Didactopus demos, but it needs explicit attribution and license handling. -That means a Didactopus ingestion pipeline should not simply mark an entire pack as reusable without nuance. 
+## Current handling in this repository -Recommended handling: -- record MIT OCW course pages as licensed sources -- record individual excluded items explicitly when identified -- preserve the license URL in source metadata -- record whether Didactopus generated an adaptation -- generate an attribution artifact automatically -- propagate a noncommercial/sharealike flag in pack metadata when derived content is redistributed +The MIT OCW Information and Entropy demo stores: + +- a local derived source file in `examples/ocw-information-entropy/` +- attribution and rights notes in the generated pack +- generated learner outputs in `examples/ocw-information-entropy-run/` +- a repo-local skill bundle in `skills/ocw-information-entropy-agent/` + +## License handling stance + +MIT OpenCourseWare course content is generally distributed under CC BY-NC-SA 4.0, with the important caveat that linked or third-party materials may not always be covered. + +That means Didactopus should: + +- preserve MIT OCW attribution +- keep a rights note in generated artifacts +- treat redistributable derived packs as reviewable outputs rather than unquestioned mirrors +- preserve noncommercial and share-alike implications when applicable + +## Practical guidance + +When building from MIT OCW sources: + +- record the course page and any unit/resource pages used +- separate core MIT OCW material from excluded third-party items if they appear +- keep generated pack content clearly marked as adapted/derived +- include attribution artifacts with the emitted pack diff --git a/docs/ui_visualization_notes.md b/docs/ui_visualization_notes.md index 8dbd5f9..e28e1d9 100644 --- a/docs/ui_visualization_notes.md +++ b/docs/ui_visualization_notes.md @@ -1,33 +1,50 @@ -# UI Visualization Notes +# Visualization Notes -## Review workbench -Main panes: +This repository now has two concrete visualization paths and a broader set of future UI ideas. -1. Candidate queue -2. Candidate detail -3. 
Evidence/provenance panel -4. Promotion actions -5. Related synthesis suggestions +## Current implemented visualization -## Synthesis map -Features: -- zoomable concept supergraph -- accepted vs proposed links -- cross-pack color coding -- cluster highlighting -- filter by score, pack, theme +The MIT OCW Information and Entropy demo produces: -## Promotion dashboard -Views: -- pack improvement queue -- curriculum draft queue -- skill bundle queue -- archive browser +- `examples/ocw-information-entropy-run/learner_progress.svg` +- `examples/ocw-information-entropy-run/learner_progress.html` +- `examples/ocw-information-entropy-run/learner_progress_full_map.svg` +- `examples/ocw-information-entropy-run/learner_progress_full_map.html` -## Learner-facing synthesis hints -The learner view should be selective and helpful, not noisy. +These are rendered by `didactopus.ocw_progress_viz`. -Good uses: -- “This concept may connect to another pack you know.” -- “An analogy from another topic may help here.” -- “Learners like you often benefit from this bridge concept.” +### Path-focused view + +Shows: + +- guided curriculum path +- mastered versus in-progress state +- per-concept mean evaluator score +- produced artifact name + +### Full concept map + +Shows: + +- the same guided path in the center column +- side concepts grouped around their anchor lesson or prerequisite +- extractor spillover as informative context instead of hiding it + +## Existing SVG/render helpers + +The repository also includes generic SVG frame helpers in: + +- `didactopus.export_svg` +- `didactopus.render_bundle` + +Those are useful for frame-based graph rendering pipelines, but the OCW learner-progress visualizations are currently rendered directly as standalone SVG/HTML artifacts. 
+ +## Future UI directions + +Useful next steps would be: + +- pack DAG views with filtering by mastered/weak/noisy state +- review-workbench graph overlays +- stage-aware roadmap views +- side-by-side before/after review diffs +- animation or frame exports for learner progression over time diff --git a/docs/workspace-manager.md b/docs/workspace-manager.md index 475a23f..9c6b87c 100644 --- a/docs/workspace-manager.md +++ b/docs/workspace-manager.md @@ -1,22 +1,51 @@ # Workspace Manager -The workspace manager provides project-level organization for Didactopus review work. +The workspace manager organizes review work around draft packs. ## Why it exists -Without a workspace layer, users still have to manually track: -- which draft packs exist -- where they live -- which one is currently being reviewed -- which ones have promoted outputs +Without a workspace layer, users have to manually track: -That creates unnecessary friction. +- generated draft-pack directories +- which draft is currently active in review +- where review exports belong +- whether an import would overwrite existing work -## Features in this scaffold +The current code reduces that friction by giving review work a registry and import lifecycle. + +## Current implementation + +`didactopus.workspace_manager.WorkspaceManager` currently supports: + +- `create_workspace(...)` +- `list_workspaces()` +- `get_workspace(...)` +- `touch_recent(...)` +- `preview_import(...)` +- `import_draft_pack(...)` + +The registry is stored as JSON and the default workspace root is configurable through `configs/config.example.yaml`. + +## Import behavior + +Draft-pack import currently: + +- validates source-pack availability through preview logic +- reports whether overwrite will be required +- creates the target workspace if needed +- copies the source draft pack into `workspace/draft_pack/` +- updates registry metadata and recency ordering + +If the target workspace already exists, import requires `allow_overwrite=True`. 
+ +## Bridge integration + +The review bridge server exposes workspace operations through local HTTP endpoints, including: -- workspace registry file -- create workspace - list workspaces -- open a specific workspace -- track recent workspaces -- expose these through a local bridge API +- create workspace +- open workspace +- import preview +- import draft pack + +These endpoints are used to connect ingestion outputs to the review workflow without manual file shuffling.
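Reviewer note on `docs/workspace-manager.md`: the import behavior listed in this change (preview, overwrite check, copy into `workspace/draft_pack/`, registry and recency update) can be sketched roughly as below. This is a minimal illustration only, not the actual `WorkspaceManager` code. The function names `preview_import_sketch` and `import_draft_pack_sketch`, their parameters, and the registry JSON layout (`workspaces` and `recent` keys) are all assumptions made for the example.

```python
import json
import shutil
from pathlib import Path


def preview_import_sketch(source_dir: Path, workspace_dir: Path) -> dict:
    """Hypothetical preview: is the source pack present, and would import overwrite?"""
    return {
        "source_ok": source_dir.is_dir(),
        "overwrite_required": workspace_dir.exists(),
    }


def import_draft_pack_sketch(
    source_dir: Path,
    workspace_root: Path,
    workspace_id: str,
    registry_path: Path,
    allow_overwrite: bool = False,
) -> Path:
    """Hypothetical import flow mirroring the documented steps."""
    workspace_dir = workspace_root / workspace_id
    preview = preview_import_sketch(source_dir, workspace_dir)
    if not preview["source_ok"]:
        raise FileNotFoundError(f"draft pack not found: {source_dir}")
    if preview["overwrite_required"] and not allow_overwrite:
        raise FileExistsError(f"workspace exists, pass allow_overwrite=True: {workspace_id}")

    # Create the workspace if needed, then replace any earlier draft copy.
    workspace_dir.mkdir(parents=True, exist_ok=True)
    target = workspace_dir / "draft_pack"
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(source_dir, target)

    # Update registry metadata and move this workspace to the front of the recents.
    # The JSON layout here is an assumption, not the repository's real schema.
    if registry_path.exists():
        registry = json.loads(registry_path.read_text())
    else:
        registry = {"workspaces": {}, "recent": []}
    registry["workspaces"][workspace_id] = {"path": str(workspace_dir)}
    registry["recent"] = [workspace_id] + [w for w in registry["recent"] if w != workspace_id]
    registry_path.parent.mkdir(parents=True, exist_ok=True)
    registry_path.write_text(json.dumps(registry, indent=2))
    return target
```

Running the preview before any file operation keeps the overwrite decision explicit, which matches the documented requirement that importing into an existing workspace needs `allow_overwrite=True`.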