# Foundation Notebook Inception Pilot

This note turns the broader Notebook pipeline into one concrete first run.

It answers a narrower question than
[evo-edu-notebook-pipeline.md](./evo-edu-notebook-pipeline.md):

- what is the first pilot region,
- which repos and commands are ready now,
- what exact artifacts should be produced,
- and what still blocks calling the Notebook "incepted".

## Current status

The stack is already past pure planning.

Implemented now:

- `CiteGeist` can export Notebook-ready topic bibliography bundles with
  `export-notebook-topic`.
- `GroundRecall` can export `groundrecall_query_bundle.json` for one concept.
- `Didactopus` can build `notebook_page.json` directly from a GroundRecall
  concept or bundle.
- `Didactopus` pack emission can already carry Notebook-facing artifacts.

Not yet done:

- one named pilot source workspace
- one reviewed pilot concept region carried end to end on real local sources
- one first published Notebook page candidate built from those real sources

So the missing step is no longer "invent Notebook machinery". The missing step
is "run one reproducible pilot from provisioned local sources".

## Chosen first pilot

Use:

- `natural selection and adaptation`

Reasons:

- it is already represented in the Notebook page tests and example graph
  structure:
  - `natural-selection`
  - `variation`
  - `adaptation`
  - `common descent`
- it is narrow enough for a first pass
- it is central enough that the Notebook navigation model is meaningful
- it can draw on both textbook and web-corpus sources without needing an
  enormous bibliography first

This is a better first region than a broad "history of evolutionary thought"
pilot because it stresses concept navigation without forcing huge historical
scope immediately.

## Pilot workspace

Create one stable workspace outside the library root, for example:

```text
/mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/
```

Recommended layout:

```text
natural-selection/
  README.md
  manifests/
    source-manifest.yaml
  sources/
    textbooks/
    web/
    bibliographies/
  normalized/
    doclift/
  citegeist/
  groundrecall/
  didactopus/
  publish/
```

This keeps the library as upstream source storage while making the Notebook run
reproducible in one project-local tree.

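The recommended layout can be scaffolded with a few lines of shell. The sketch
below is one possible way to do it, not part of any of the tools named in this
note: `scaffold_pilot_workspace` is a hypothetical helper name, and the choice
to create empty placeholder files for `README.md` and `source-manifest.yaml` is
an assumption.

```bash
# Sketch: create the recommended pilot workspace layout under a given root.
# scaffold_pilot_workspace is a hypothetical helper, not part of any tool here.
set -eu

scaffold_pilot_workspace() {
  local root="$1"
  mkdir -p \
    "$root/manifests" \
    "$root/sources/textbooks" \
    "$root/sources/web" \
    "$root/sources/bibliographies" \
    "$root/normalized/doclift" \
    "$root/citegeist" \
    "$root/groundrecall" \
    "$root/didactopus" \
    "$root/publish"
  # Create placeholder files only when absent, so re-runs keep existing edits.
  [ -e "$root/README.md" ] || : > "$root/README.md"
  [ -e "$root/manifests/source-manifest.yaml" ] \
    || : > "$root/manifests/source-manifest.yaml"
}
```

For the pilot this would be called once with the workspace path given above,
for example
`scaffold_pilot_workspace /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection`.
Because `mkdir -p` and the existence checks are idempotent, running it again
should leave an already provisioned workspace unchanged.
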
## Minimum pilot sources

Start with a deliberately small set.

### Textbook side

Choose one or two textbook sections on natural selection/adaptation from the
local library root:

- `/mnt/CIFS/pengolodh/Docs/Library`

The exact textbooks can be finalized during provisioning, but the likely first
choices from the existing plan are:

- Futuyma, `Evolutionary Biology`
- Pianka, `Evolutionary Ecology`

### Web corpus side

Provision one small local snapshot from an evolution-focused corpus such as:

- TalkOrigins Archive
- Panda's Thumb

For inception, prefer a small curated subset over a full corpus mirror.

### Bibliography seed side

Use:

- one local `.bib` seed if available
- bibliography material extracted from the chosen textbook sections
- any relevant TalkOrigins bibliography fragments

## Inception steps

### Step 0. Provision the workspace

Deliverables:

- `sources/textbooks/`
- `sources/web/`
- `sources/bibliographies/`
- `manifests/source-manifest.yaml`

Completion check:

- every pilot input is copied or symlinked into the workspace
- the manifest names source type, origin, and local path

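The Step 0 completion check can be made mechanical. The sketch below assumes a
very simple manifest shape, since this note only says the manifest names source
type, origin, and local path: the YAML keys `sources`, `type`, `origin`, and
`path` are assumptions, as are the helper names `add_manifest_entry` and
`check_manifest_paths`. Paths are taken to be relative to the workspace root.

```bash
# Sketch only: the YAML keys (sources, type, origin, path) are an assumed schema.
set -eu

# Append one source entry to the manifest, writing the list header on first use.
add_manifest_entry() {
  local manifest="$1" type="$2" origin="$3" path="$4"
  [ -s "$manifest" ] || printf 'sources:\n' > "$manifest"
  printf '  - type: %s\n    origin: "%s"\n    path: %s\n' \
    "$type" "$origin" "$path" >> "$manifest"
}

# Fail if any manifest path (relative to the workspace root) does not resolve.
# Dangling symlinks count as missing because -e follows links.
check_manifest_paths() {
  local root="$1" missing=0 rel
  local manifest="$root/manifests/source-manifest.yaml"
  [ -f "$manifest" ] || { echo "no manifest: $manifest" >&2; return 1; }
  while IFS= read -r rel; do
    [ -n "$rel" ] || continue
    if [ ! -e "$root/$rel" ]; then
      echo "missing source: $rel" >&2
      missing=1
    fi
  done <<EOF
$(sed -n 's/^ *path: *//p' "$manifest")
EOF
  return "$missing"
}
```

A call such as
`add_manifest_entry manifests/source-manifest.yaml bibliography "local seed" sources/bibliographies/pilot.bib`
(with `pilot.bib` as a placeholder file name) records one input, and
`check_manifest_paths <workspace-root>` then returns non-zero if any recorded
path is absent. The line-based `sed` parsing only works for this flat shape; a
richer manifest would need a real YAML parser.
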
### Step 1. Normalize textbook/web material

Use `doclift` for textbook-like source material where it helps.

Expected output root:

```text
normalized/doclift/
```

Representative command pattern:

```bash
cd /home/netuser/bin/doclift
PYTHONPATH=src .venv/bin/python -m doclift.cli convert-dir \
  /path/to/pilot-source-dir \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/normalized/doclift
```

Completion check:

- a deterministic normalized bundle exists
- markdown and sidecars are present where applicable

### Step 2. Build the bibliography substrate

Use `CiteGeist` for the Notebook bibliography layer.

Representative command pattern:

```bash
cd /home/netuser/bin/CiteGeist
PYTHONPATH=src .venv/bin/python -m citegeist --db \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/citegeist/library.sqlite3 \
  ingest /path/to/pilot.bib
```

Then export the Notebook topic bibliography bundle once the pilot topic exists:

```bash
cd /home/netuser/bin/CiteGeist
PYTHONPATH=src .venv/bin/python -m citegeist --db \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/citegeist/library.sqlite3 \
  export-notebook-topic natural-selection --output-dir \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/citegeist/notebook-bundle
```

Completion check:

- `notebook_topic_bundle.json`
- `notebook_topic_bibliography.bib`

### Step 3. Import and review canonical concepts in GroundRecall

The first real review target should be the concept neighborhood around
`natural-selection`.

Expected output root:

```text
groundrecall/store/
```

Completion check:

- reviewed concept for `natural-selection`
- at least a small connected concept neighborhood
- supporting observations and source artifacts retained

### Step 4. Export the Notebook concept bundle

Use `GroundRecall` export:

```bash
PYTHONPATH=/home/netuser/bin/GroundRecall/src python -m groundrecall.export \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/groundrecall/store \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/groundrecall/export \
  --pack-ready-concept natural-selection
```

Completion check:

- `groundrecall_query_bundle.json`

### Step 5. Build the Notebook page artifact

Use the direct `Didactopus` wrapper:

```bash
didactopus notebook-page-groundrecall \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/groundrecall/store \
  natural-selection \
  /mnt/CIFS/pengolodh/Docs/Projects/evo-edu-notebook-pilot/natural-selection/didactopus/notebook-page
```

Completion check:

- `groundrecall_query_bundle.json`
- `notebook_page.json`

The page artifact should already include:

- concept summary
- graph navigation buckets
- supporting sources
- supporting excerpts
- review context
- illustration opportunities
- suggested next actions

### Step 6. Decide whether inception is complete

For this first pilot, Foundation Notebook inception should mean:

1. one stable pilot workspace exists
2. one real pilot concept region is provisioned locally
3. one reviewed `GroundRecall` concept neighborhood exists
4. one `groundrecall_query_bundle.json` exists for that concept
5. one `notebook_page.json` exists from real reviewed sources
6. one Notebook bibliography bundle exists for the same region

If all six are true, Notebook inception has happened even if public publishing
and richer UI are still pending.

## Expected artifact inventory

At minimum, the first successful inception run should leave:

```text
natural-selection/
  manifests/source-manifest.yaml
  normalized/doclift/...
  citegeist/library.sqlite3
  citegeist/notebook-bundle/notebook_topic_bundle.json
  citegeist/notebook-bundle/notebook_topic_bibliography.bib
  groundrecall/store/...
  groundrecall/export/groundrecall_query_bundle.json
  didactopus/notebook-page/notebook_page.json
```

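The inventory above lends itself to a quick presence check after a pilot run.
The following is a minimal sketch under stated assumptions:
`check_inception_artifacts` is a hypothetical helper name, and treating "file
exists and is non-empty" (or "directory has at least one entry") as success is
a proxy chosen here. It does not validate file contents or confirm that
concepts were actually reviewed.

```bash
# Sketch: report which expected inception artifacts are missing or empty.
# check_inception_artifacts is a hypothetical helper; paths come from the
# inventory in this note and are relative to the workspace root.
set -eu

check_inception_artifacts() {
  local root="$1" missing=0 rel
  # Files that must exist and be non-empty.
  for rel in \
    manifests/source-manifest.yaml \
    citegeist/library.sqlite3 \
    citegeist/notebook-bundle/notebook_topic_bundle.json \
    citegeist/notebook-bundle/notebook_topic_bibliography.bib \
    groundrecall/export/groundrecall_query_bundle.json \
    didactopus/notebook-page/notebook_page.json
  do
    if [ ! -s "$root/$rel" ]; then
      echo "missing or empty: $rel" >&2
      missing=1
    fi
  done
  # Directories that must exist and contain at least one entry.
  for rel in normalized/doclift groundrecall/store; do
    if [ -z "$(ls -A "$root/$rel" 2>/dev/null)" ]; then
      echo "missing or empty directory: $rel" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Called as `check_inception_artifacts <workspace-root>`, it prints each gap to
standard error and returns non-zero if anything is absent, which makes it easy
to run as a last step before deciding whether the six inception criteria hold.
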
## Recommended immediate next actions

1. Create the pilot workspace directory and `source-manifest.yaml`.
2. Pick the exact textbook sections and one small web snapshot.
3. Run one small `doclift` normalization pass.
4. Seed one `CiteGeist` pilot database.
5. Build one real `GroundRecall` concept neighborhood for `natural-selection`.
6. Export the first real `notebook_page.json`.

## Bottom line

The Foundation Notebook is now blocked more by pilot execution than by missing
infrastructure. The first real threshold is not "build more Notebook code". It
is "produce the first real Notebook page artifact from provisioned, reviewed,
local sources".