Foundation Notebook Inception Pilot
This note turns the broader Notebook pipeline into one concrete first run.
It answers a narrower question than evo-edu-notebook-pipeline.md:
- what is the first pilot region,
- what repos and commands are already ready,
- what exact artifacts should be produced,
- and what still blocks calling the Notebook "incepted".
Current status
The stack is already past pure planning.
Implemented now:
- CiteGeist can export Notebook-ready topic bibliography bundles with export-notebook-topic.
- GroundRecall can export groundrecall_query_bundle.json for one concept.
- Didactopus can build notebook_page.json directly from a GroundRecall concept or bundle.
- Didactopus pack emission can already carry Notebook-facing artifacts.
Not yet done:
- one named pilot source workspace
- one reviewed pilot concept region carried end to end on real local sources
- one first published Notebook page candidate built from those real sources
So the missing step is no longer "invent Notebook machinery". The missing step is "run one reproducible pilot from provisioned local sources".
Chosen first pilot
Use:
natural selection and adaptation
Reasons:
- it is already represented in the Notebook page tests and example graph structure:
  - natural-selection
  - variation
  - adaptation
  - common descent
- it is narrow enough for a first pass
- it is central enough that the Notebook navigation model is meaningful
- it can draw on both textbook and web-corpus sources without needing an enormous bibliography first
This is a better first region than a broad "history of evolutionary thought" pilot because it stresses concept navigation without forcing huge historical scope immediately.
Pilot workspace
Create one stable workspace outside the library root, for example:
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/
Recommended layout:
natural-selection/
  README.md
  manifests/
    source-manifest.yaml
  sources/
    textbooks/
    web/
    bibliographies/
  normalized/
    doclift/
  citegeist/
  groundrecall/
  didactopus/
  publish/
This keeps the library as upstream source storage while making the Notebook run reproducible in one project-local tree.
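The recommended layout can be scaffolded with a few lines of Python. This is a minimal sketch, not part of any of the existing tools: the function name scaffold_workspace is hypothetical, the directory names come from the layout above, and the demo runs against a throwaway temporary directory rather than the real pilot path.

```python
from pathlib import Path
import tempfile

# Directory names are taken from the recommended layout above.
PILOT_DIRS = [
    "manifests",
    "sources/textbooks",
    "sources/web",
    "sources/bibliographies",
    "normalized/doclift",
    "citegeist",
    "groundrecall",
    "didactopus",
    "publish",
]


def scaffold_workspace(root: Path) -> list[Path]:
    """Create the pilot directory tree under root; safe to re-run."""
    created = []
    for rel in PILOT_DIRS:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    # Create empty placeholder files only if they do not exist yet.
    for rel in ("README.md", "manifests/source-manifest.yaml"):
        (root / rel).touch(exist_ok=True)
    return created


if __name__ == "__main__":
    # Demo against a temporary directory, not the real pilot workspace.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "natural-selection"
        scaffold_workspace(root)
        for p in sorted(root.rglob("*")):
            print(p.relative_to(root))
```

Because mkdir and touch are both called with exist_ok=True, re-running the scaffold on a partly provisioned workspace leaves existing files untouched.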
Minimum pilot sources
Start with a deliberately small set.
Textbook side
Choose 1 to 2 textbook sections on natural selection/adaptation from the local library root:
/mnt/CIFS/pengolodh/Docs/Library
The exact textbooks can be finalized during provisioning, but the likely first choices from the existing plan are:
- Futuyma, Evolutionary Biology
- Pianka, Evolutionary Ecology
Web corpus side
Provision one small local snapshot from an evolution-focused corpus such as:
- TalkOrigins Archive
- Panda's Thumb
For inception, prefer a small curated subset over a full corpus mirror.
Bibliography seed side
Use:
- one local .bib seed if available
- bibliography material extracted from the chosen textbook sections
- any relevant TalkOrigins bibliography fragments
Inception steps
Step 0. Provision the workspace
Deliverables:
- sources/textbooks/
- sources/web/
- sources/bibliographies/
- manifests/source-manifest.yaml
Completion check:
- every pilot input is copied or symlinked into the workspace
- the manifest names source type, origin, and local path
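The completion check above could be automated along these lines. This is a sketch under stated assumptions: the field names source_type, origin, and local_path are hypothetical (this note only says the manifest names those three things), and the entries are assumed to be already loaded from source-manifest.yaml into a list of dicts by whatever YAML parser the project uses.

```python
from pathlib import Path
import tempfile

# Hypothetical field names; this note only says the manifest names
# source type, origin, and local path.
REQUIRED_FIELDS = ("source_type", "origin", "local_path")


def manifest_problems(entries: list[dict], workspace: Path) -> list[str]:
    """Return readable problems; an empty list means the check passes."""
    problems = []
    for i, entry in enumerate(entries):
        missing = [f for f in REQUIRED_FIELDS if not entry.get(f)]
        if missing:
            problems.append(f"entry {i}: missing {', '.join(missing)}")
            continue
        # exists() follows symlinks, so a dangling symlink is reported too.
        if not (workspace / entry["local_path"]).exists():
            problems.append(
                f"entry {i}: {entry['local_path']} not found in workspace"
            )
    return problems


if __name__ == "__main__":
    # Demo with one made-up entry in a temporary workspace.
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        (ws / "sources" / "web").mkdir(parents=True)
        (ws / "sources" / "web" / "snapshot.html").write_text("placeholder")
        demo = [{
            "source_type": "web",
            "origin": "example origin",
            "local_path": "sources/web/snapshot.html",
        }]
        print(manifest_problems(demo, ws))
```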
Step 1. Normalize textbook/web material
Use doclift for textbook-like source material where it helps.
Expected output root:
normalized/doclift/
Representative command pattern:
cd /home/netuser/bin/doclift
PYTHONPATH=src .venv/bin/python -m doclift.cli convert-dir \
/path/to/pilot-source-dir \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/normalized/doclift
Completion check:
- a deterministic normalized bundle exists
- markdown and sidecars are present where applicable
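One way to test the "deterministic" part of this check is to fingerprint the output tree and compare fingerprints across two normalization runs on the same input. The sketch below assumes that deterministic means byte-identical output files, which is an interpretation rather than something doclift is documented here to guarantee; tree_digest is a hypothetical helper, not a doclift function.

```python
import hashlib
from pathlib import Path
import tempfile


def tree_digest(root: Path) -> str:
    """SHA-256 over relative paths and file contents, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        # Separators keep path bytes and content bytes from running together.
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


if __name__ == "__main__":
    # Demo: two trees with identical content give the same digest.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for d in (Path(a), Path(b)):
            (d / "chapter.md").write_text("# Natural selection\n")
        print(tree_digest(Path(a)) == tree_digest(Path(b)))
```

If the two digests differ between runs, the usual suspects are embedded timestamps or unstable ordering in sidecar files.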
Step 2. Build the bibliography substrate
Use CiteGeist for the Notebook bibliography layer.
Representative command pattern:
cd /home/netuser/bin/CiteGeist
PYTHONPATH=src .venv/bin/python -m citegeist --db \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/citegeist/library.sqlite3 \
ingest /path/to/pilot.bib
Then export the Notebook topic bibliography bundle once the pilot topic exists:
cd /home/netuser/bin/CiteGeist
PYTHONPATH=src .venv/bin/python -m citegeist --db \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/citegeist/library.sqlite3 \
export-notebook-topic natural-selection --output-dir \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/citegeist/notebook-bundle
Completion check:
- notebook_topic_bundle.json
- notebook_topic_bibliography.bib
Step 3. Import and review canonical concepts in GroundRecall
The first real review target should be the concept neighborhood around natural-selection.
Expected output root:
groundrecall/store/
Completion check:
- reviewed concept for natural-selection
- at least a small connected concept neighborhood
- supporting observations and source artifacts retained
Step 4. Export the Notebook concept bundle
Use GroundRecall export:
PYTHONPATH=/home/netuser/bin/GroundRecall/src python -m groundrecall.export \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/groundrecall/store \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/groundrecall/export \
--pack-ready-concept natural-selection
Completion check:
groundrecall_query_bundle.json
Step 5. Build the Notebook page artifact
Use the direct Didactopus wrapper:
didactopus notebook-page-groundrecall \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/groundrecall/store \
natural-selection \
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/didactopus/notebook-page
Completion check:
- groundrecall_query_bundle.json
- notebook_page.json
The page artifact should already include:
- concept summary
- graph navigation buckets
- supporting sources
- supporting excerpts
- review context
- illustration opportunities
- suggested next actions
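A quick structural check of the page artifact might look like the sketch below. The key names are placeholders, one per item in the list above; the real notebook_page.json schema is defined by Didactopus and may use different names or nesting, so the list should be adjusted after inspecting a real artifact.

```python
import json

# Placeholder key names, one per section listed above. The real
# notebook_page.json schema may use different names.
ASSUMED_SECTION_KEYS = [
    "concept_summary",
    "graph_navigation",
    "supporting_sources",
    "supporting_excerpts",
    "review_context",
    "illustration_opportunities",
    "suggested_next_actions",
]


def missing_sections(page: dict, keys=ASSUMED_SECTION_KEYS) -> list[str]:
    """Return the keys that are absent or empty in the page artifact."""
    return [k for k in keys if not page.get(k)]


if __name__ == "__main__":
    # Demo with an inline, made-up page that has only two sections filled.
    demo_page = json.loads(
        '{"concept_summary": "placeholder", "supporting_sources": ["s1"]}'
    )
    print(missing_sections(demo_page))
```

Treating an empty list or empty string as missing is a deliberate choice here: a page with a "supporting_sources" key but no sources should not count as passing the check.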
Step 6. Decide whether inception is complete
For this first pilot, Foundation Notebook inception should mean:
- one stable pilot workspace exists
- one real pilot concept region is provisioned locally
- one reviewed GroundRecall concept neighborhood exists
- one groundrecall_query_bundle.json exists for that concept
- one notebook_page.json exists from real reviewed sources
- one Notebook bibliography bundle exists for the same region
If all six are true, Notebook inception has happened even if public publishing and richer UI are still pending.
Expected artifact inventory
At minimum, the first successful inception run should leave:
natural-selection/
  manifests/source-manifest.yaml
  normalized/doclift/...
  citegeist/library.sqlite3
  citegeist/notebook-bundle/notebook_topic_bundle.json
  citegeist/notebook-bundle/notebook_topic_bibliography.bib
  groundrecall/store/...
  groundrecall/export/groundrecall_query_bundle.json
  didactopus/notebook-page/notebook_page.json
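The inventory above translates directly into a small existence check. This is a sketch, assuming the two entries that end in "..." only need to be non-empty directories; missing_artifacts is a hypothetical helper and checks presence only, not whether the content was actually reviewed.

```python
from pathlib import Path
import tempfile

# File paths are copied from the artifact inventory above.
EXPECTED_FILES = [
    "manifests/source-manifest.yaml",
    "citegeist/library.sqlite3",
    "citegeist/notebook-bundle/notebook_topic_bundle.json",
    "citegeist/notebook-bundle/notebook_topic_bibliography.bib",
    "groundrecall/export/groundrecall_query_bundle.json",
    "didactopus/notebook-page/notebook_page.json",
]
# Inventory entries ending in "..." are treated as non-empty directories.
EXPECTED_NONEMPTY_DIRS = ["normalized/doclift", "groundrecall/store"]


def missing_artifacts(root: Path) -> list[str]:
    """Return inventory entries that are absent under the workspace root."""
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
    for rel in EXPECTED_NONEMPTY_DIRS:
        directory = root / rel
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(rel + "/")
    return missing


if __name__ == "__main__":
    # Demo: an empty temporary workspace reports every entry as missing.
    with tempfile.TemporaryDirectory() as tmp:
        for item in missing_artifacts(Path(tmp)):
            print("missing:", item)
```

An empty result means the files are in place; whether the concept neighborhood was genuinely reviewed still has to be judged by a person.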
Recommended immediate next actions
- Create the pilot workspace directory and source-manifest.yaml.
- Pick the exact textbook sections and one small web snapshot.
- Run one small doclift normalization pass.
- Seed one CiteGeist pilot database.
- Build one real GroundRecall concept neighborhood for natural-selection.
- Export the first real notebook_page.json.
Bottom line
The Foundation Notebook is now blocked more by pilot execution than by missing infrastructure. The first real threshold is not "build more Notebook code". It is "produce the first real Notebook page artifact from provisioned, reviewed, local sources".