Document Notebook operating model

2026-05-08 13:39:51 -04:00 · 2026-05-08 13:39:51 -04:00 · dd109ec93c
parent 086215489f
commit dd109ec93c
3 changed files with 292 additions and 10 deletions
--- a/docs/foundation-notebook-inception-pilot.md
+++ b/docs/foundation-notebook-inception-pilot.md
@ -55,6 +55,26 @@ This is a better first region than a broad "history of evolutionary thought"
 pilot because it stresses concept navigation without forcing huge historical
 scope immediately.
 ## Pilot revision from execution
 The original pilot choice was good enough to start, but the actual run showed
 that the stable Notebook center should be broader than a narrow topic label.
 In practice, the pilot worked better after shifting toward a hub such as:
 - `Evolutionary Dynamics of Populations`
 The operational lesson is:
 - narrow topics are still useful as first-ring and second-ring nodes
 - but the primary Notebook page works better when anchored on a broad
  explanatory hub
 - bibliography topics alone are too thin to serve as the main Notebook center
 This should inform future Notebook inception work. Start with a manageable
 region, but expect the durable center to be a broad explanatory hub rather than
 the narrow starting label.
 ## Pilot workspace
 Create one stable workspace outside the library root, for example:
@ -259,6 +279,23 @@ For this first pilot, Foundation Notebook inception should mean:
 If all six are true, Notebook inception has happened even if public publishing
 and richer UI are still pending.
 ## Additional lesson from the pilot
 Inception is only the beginning. The pilot also showed that a strong Notebook
 depends on extraction classes beyond ordinary claims.
 The most useful additions were:
 - definitions
 - qualifications
 - constraints
 - contrasts/distinctions
 - quote candidates
 - source-role weighting
 Those should be treated as part of the normal Notebook operating model, not as
 optional polish after inception.
 ## Expected artifact inventory
 At minimum, the first successful inception run should leave:
--- a/docs/notebook-operating-model.md
+++ b/docs/notebook-operating-model.md
@ -0,0 +1,180 @@
 # Notebook Operating Model
 This note records what the Foundation Notebook pilot changed about how
 Didactopus should be understood.
 The main conclusion is that the Notebook is not just another output format. It
 is the durable knowledge layer between raw source-grounding work and
 learner-facing products.
 Didactopus should therefore operate with three layers:
 1. source-grounded substrate
 2. Notebook knowledge layer
 3. learner-facing and workbench-facing products derived from that layer
 ## 1. Source-grounded substrate
 This is the ingestion and review layer:
 - `doclift`
 - Wolfe-guided source discovery and local-corpus selection
 - `GroundRecall`
 - `CiteGeist`
 Its job is not only to collect sources, but to preserve enough structure to
 support later explanation, review, and public accountability.
 The pilot showed that this layer needs more than raw topic labels and extracted
 claims. It needs to preserve:
 - source role
 - concept neighborhood hints
 - terminology
 - scope conditions
 - contrasts
 - quote candidates
 - bibliographic support
 ## 2. Notebook knowledge layer
 The Notebook is the durable concept-network representation.
 It should be treated as primary for knowledge organization, but as
 supplemental to the final learner workflow. Learners need not consume the
 Notebook directly as their main experience, but Didactopus should derive its
 best learner products from it.
 The pilot showed that the Notebook should be:
 - hub-first rather than topic-label-first
 - neighborhood-oriented rather than article-oriented
 - distinction-aware rather than summary-only
 - source-grounded but normally paraphrastic in public rendering
 The most useful hub in the pilot was not a narrow topic like
 `population biology`, but a broader explanatory center such as
 `Evolutionary Dynamics of Populations`.
 That shift matters. The Notebook should preserve explanatory structure such as:
 - populations and variation
 - inheritance and mutation
 - selection and drift
 - adaptation and accommodation
 - organism-environment interaction
 - common descent and divergence
 ## 3. Derived products
 Didactopus should derive multiple product types from the same Notebook layer:
 - learner workbench views
 - guided lessons and learning paths
 - mentor/practice/evaluator session grounding
 - review workbench artifacts
 - public Notebook pages
 - argumentation/workbench bundles
 These products should not collapse into one another.
 Different renderings need different rules:
 - Notebook rendering:
  preserve concept structure, source trails, and review context
 - Workbench rendering:
  surface definitions, caveats, distinctions, and quote candidates
 - Public exposition:
  stay paraphrastic by default, mark all quotations, and show source citations
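One way to keep these rendering rules separate is to state, per rendering, which fields a record must carry. The following is a minimal sketch only: the field names, the `REQUIRED_FIELDS` table, and the `missing_fields` helper are hypothetical illustrations derived from the list above, not part of any existing Didactopus API.

```python
# Hypothetical field names, one tuple per rendering described above.
REQUIRED_FIELDS = {
    "notebook": ("concept_structure", "source_trail", "review_context"),
    "workbench": ("definitions", "caveats", "distinctions", "quote_candidates"),
    "public": ("paraphrase", "marked_quotations", "citations"),
}


def missing_fields(record: dict, rendering: str) -> list[str]:
    """Return the required fields that `record` lacks for this rendering."""
    return [name for name in REQUIRED_FIELDS[rendering] if name not in record]


# A record that is only partly ready for the workbench rendering.
draft = {"definitions": [], "caveats": []}
gaps = missing_fields(draft, "workbench")
```

A check like this lets each rendering fail on its own terms instead of sharing one permissive schema.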
 ## Required extraction classes
 The pilot made it clear that Didactopus needs more than “claim extraction”.
 The durable extraction classes should include:
 - explanatory claims
 - definitions
 - qualifications
 - constraints
 - contrasts and distinctions
 - quote candidates
 - source-trail and bibliographic support
 - learner-significance cues
 The distinction layer is especially important for learning. Many concepts are
 best learned not as isolated statements but as structured contrasts:
 - `A vs B`
 - `A does not imply B`
 - `B can occur without A`
 - `A is one mechanism among several`
 For the evolution pilot, this includes distinctions such as:
 - selection versus drift
 - adaptation versus accommodation
 - heredity versus epigenetic inheritance
 - short-term response versus long-run evolutionary change
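The contrast patterns above could be stored as small typed records rather than free text. This is a sketch under assumptions: the `DistinctionKind` and `Distinction` names are hypothetical, and only the four patterns and the pilot pairs listed above come from this note.

```python
from dataclasses import dataclass
from enum import Enum


class DistinctionKind(Enum):
    # Templates mirror the four contrast patterns listed in this note.
    VERSUS = "{a} vs {b}"
    DOES_NOT_IMPLY = "{a} does not imply {b}"
    CAN_OCCUR_WITHOUT = "{b} can occur without {a}"
    ONE_MECHANISM_AMONG_SEVERAL = "{a} is one mechanism among several"


@dataclass(frozen=True)
class Distinction:
    kind: DistinctionKind
    a: str
    b: str = ""  # not used by ONE_MECHANISM_AMONG_SEVERAL

    def render(self) -> str:
        """Fill the pattern template with the two concept labels."""
        return self.kind.value.format(a=self.a, b=self.b)


# Pilot distinctions named in this note, all of the "A vs B" kind.
pilot_distinctions = [
    Distinction(DistinctionKind.VERSUS, "selection", "drift"),
    Distinction(DistinctionKind.VERSUS, "adaptation", "accommodation"),
]
```

Typing the pattern separately from the concept labels would let a workbench group or filter distinctions by kind.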
 ## Source-role weighting
 The pilot also showed that not all sources do the same work.
 Didactopus should preserve source-role weighting so later products can choose
 better supporting material for the task at hand.
 At minimum, sources should be classifiable as:
 - overview
 - mechanism
 - nuance
 - controversy
 - argumentation
 Short web captures were often good enough for overview and argumentation.
 Wolfe-selected local textbook material was substantially better for nuance,
 qualification, and constraint extraction.
 That means source selection should not be treated as neutral. The system should
 prefer different source roles for different downstream tasks.
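Role-aware selection could be sketched as a per-task weighting over the five roles. The task names, the weights, and the `rank_sources` helper below are illustrative assumptions. The only part grounded in the pilot is that textbook-style material served nuance and constraint work better than short web captures.

```python
from dataclasses import dataclass

SOURCE_ROLES = ("overview", "mechanism", "nuance", "controversy", "argumentation")

# Hypothetical task -> role weights; the numbers are placeholders, not measured values.
ROLE_WEIGHTS = {
    "introduce_concept": {"overview": 1.0, "mechanism": 0.5},
    "extract_constraints": {"nuance": 1.0, "mechanism": 0.5},
}


@dataclass(frozen=True)
class Source:
    title: str
    role: str  # expected to be one of SOURCE_ROLES


def rank_sources(sources: list[Source], task: str) -> list[Source]:
    """Order sources by how well their role fits the task; ties keep input order."""
    weights = ROLE_WEIGHTS[task]
    return sorted(sources, key=lambda s: weights.get(s.role, 0.0), reverse=True)


candidates = [
    Source("short web capture", "overview"),
    Source("local textbook chapter", "nuance"),
]
ranked = rank_sources(candidates, "extract_constraints")
```

Here the textbook chapter ranks first for constraint extraction, while the same candidates ranked for `introduce_concept` would put the web capture first.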
 ## Secondary products are not accidental
 Definitions, constraints, qualifications, and quote candidates should be
 treated as first-class secondary products, not as incidental by-products of
 review.
 These secondary products matter because they support:
 - explanation quality
 - misconception prevention
 - learner revision
 - source-grounded argumentation workflows
 - public accountability
 The pilot showed that a strong Notebook/workbench flow depends heavily on these
 secondary lanes.
 ## Citation and quotation policy
 The public-facing rule is simple:
 - quotes must stay marked and attributed
 - public prose should normally be paraphrastic
 - unmarked source wording is not acceptable in public Notebook exposition
 This should remain explicit in both workbench and publication paths.
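A rough automated check for the third rule might look for runs of source wording that appear in public prose outside quotation marks. This is a heuristic sketch, not an existing Didactopus check: the `unmarked_overlaps` name, the six-word window, and the assumption that marked quotes use straight double quotes are all illustrative, and the sample sentences are made up.

```python
import re


def _words(text: str) -> list[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unmarked_overlaps(public_text: str, source_text: str, n: int = 6) -> list[str]:
    """Return n-word runs shared with the source that sit outside double quotes."""
    # Drop marked quotations first so attributed quotes are not flagged.
    unquoted = re.sub(r'"[^"]*"', " ", public_text)
    src = _words(source_text)
    source_ngrams = {tuple(src[i:i + n]) for i in range(len(src) - n + 1)}
    pub = _words(unquoted)
    return [
        " ".join(pub[i:i + n])
        for i in range(len(pub) - n + 1)
        if tuple(pub[i:i + n]) in source_ngrams
    ]


# Made-up sample text, for illustration only.
source = "natural selection acts on heritable variation within populations over many generations"
marked = 'The source states: "natural selection acts on heritable variation within populations over many generations".'
unmarked = "In short, natural selection acts on heritable variation within populations today."
```

With these samples, `unmarked_overlaps(marked, source)` finds nothing, while `unmarked_overlaps(unmarked, source)` reports the copied runs. A real check would also need to handle block quotes and other quote styles.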
 ## Operational implications
 Near-term Didactopus work should therefore prioritize:
 1. Notebook-centered concept organization
 2. first-class distinction modeling
 3. source-role-aware retrieval and ranking
 4. first-class secondary products
 5. separate rendering contracts for Notebook, workbench, and public exposition
 The Notebook is not the only Didactopus output. It is the durable center that
 lets the other outputs stay grounded, explainable, and pedagogically useful.
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@ -24,6 +24,10 @@ Near-term scope:
 - extend the session flow beyond one short interaction
 - make scientific virtues operational in the session loop by separating observation from interpretation, preserving uncertainty, and rewarding justified revision
 - replace stubbed provider output in learner-facing pilot flows with configured real model backends where available
 - make learner-facing guidance explicitly distinction-aware:
  - `A vs B`
  - `A does not imply B`
  - `B can occur without A`
 Current code anchors:
@ -101,6 +105,8 @@ Target features:
 - current concept and why-it-matters view
 - prerequisite chain and supporting lessons
 - grounded source excerpts
 - definitions, constraints, and qualifications view
 - quote candidates and source-trail view for argumentation workflows
 - active practice task
 - evaluator feedback
 - recommended next step
@ -172,8 +178,62 @@ Target features:
 - lesson and source-fragment references in explanations
 - explicit distinction between cited source support and model inference
 - easier inspection of concept-to-source provenance
 - explicit quote marking and attribution in any public-facing output
 - no unmarked source wording in public Notebook exposition
-### 8. Pack quality, review, and concept-graph curation improvements
+### 8. Notebook-centered knowledge layer
 Status: planned
 Why it matters:
 - The Foundation Notebook pilot suggests that Didactopus needs one durable
  concept-network representation between raw source grounding and learner-facing
  products.
 - Topic labels alone are too weak; broad explanatory hubs and first-ring
  concept neighborhoods work better.
 - The Notebook is the right place to preserve definitions, constraints,
  qualifications, and contrasts.
 - The pilot also suggests that the Notebook is the durable center between raw
  source-grounding work and learner-facing products, not just a supplemental
  static page format.
 Target features:
 - hub-first concept organization
 - first-ring and second-ring concept neighborhoods
 - first-class distinction modeling:
  - `A vs B`
  - `A does not imply B`
  - `B can occur without A`
 - support for source-role weighting:
  - overview
  - mechanism
  - nuance
  - controversy
  - argumentation
 - support for learner-significance cues so explanation and practice can answer
  “why does this distinction matter?”
 - Notebook-adjacent secondary products:
  - definitions
  - qualifications
  - constraints
  - quote candidates
 - separate rendering rules for Notebook, workbench, and public exposition
 Immediate next steps:
 - promote the Foundation Notebook pilot conclusions into the stable design
  model for Didactopus
 - prefer broad explanatory hubs over narrow topic labels when organizing new
  Notebook regions
 - make source-role-aware retrieval available to learner workbench flows
 - treat secondary products as first-class review/export outputs rather than
  incidental metadata
 - connect Notebook concept neighborhoods more directly to learner-session
  grounding and practice generation
 ### 9. Pack quality, review, and concept-graph curation improvements
 Status: planned
@ -190,7 +250,7 @@ Target features:
 - stronger review support for noisy or broad concepts
 - improved source coverage QA
-### 9. Incremental re-ingestion and course updates
+### 10. Incremental re-ingestion and course updates
 Status: planned
@ -206,7 +266,7 @@ Target features:
 - graph and pack diffs
 - preservation of learner evidence across source updates
-### 10. Richer multimodal and notation support
+### 11. Richer multimodal and notation support
 Status: longer-term
@ -232,15 +292,20 @@ Examples:
 - Treat scientific virtues as operational principles: encourage curiosity, honesty about evidence, skepticism toward weak claims, attentiveness to caveats, and revision when the evidence changes.
 - Separate observation from interpretation in learner-facing guidance so the system does not blur grounded support with model inference.
 - Frame revision as progress rather than as failure, especially in mentor and evaluator feedback.
 - Preserve distinctions, caveats, and scope conditions as learning assets rather
  than treating them as noise.
 - Treat the Notebook as the durable knowledge layer, but not as the only
  learner-facing representation.
 ## Suggested Implementation Sequence
 1. Strengthen `didactopus.learner_session` into the standard session backend.
 2. Fold the learner-workbench pilot into that backend without losing its stronger study-state framing.
-3. Replace stubbed learner-workbench provider output with a configured real model backend.
+3. Add a Notebook-centered operating layer with hub concepts, distinctions, and secondary products.
-4. Ground the `evidence-trail` pilot in richer source fragments and persisted learner state.
+4. Replace stubbed learner-workbench provider output with a configured real model backend.
-5. Build a small model-benchmark harness around the unified learner backend.
+5. Ground the `evidence-trail` pilot and future Notebook pilots in richer source fragments, definitions, constraints, and persisted learner state.
-6. Add accessible learner HTML and text-first outputs.
+6. Build a small model-benchmark harness around the unified learner backend.
-7. Add local TTS and STT support to the same session flow.
+7. Add accessible learner HTML and text-first outputs.
-8. Expand adaptive practice and diagnostics.
+8. Add local TTS and STT support to the same session flow.
-9. Improve review, impact analysis, and incremental update support.
+9. Expand adaptive practice and diagnostics.
 10. Improve review, impact analysis, and incremental update support.