Compare commits: e4eaf52393...1ae42ec2c4 (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 1ae42ec2c4 | |
| | c0fe9de6f0 | |

@@ -62,6 +62,7 @@ The initial repo includes:
 
 Example applications live alongside the core package rather than defining it. Current examples include:
 
+- a comprehensive CLI cookbook in [examples/cli/README.md](./examples/cli/README.md);
 - a topic-only bootstrap workflow for `artificial life` in [examples/artificial-life/README.md](./examples/artificial-life/README.md);
 - the TalkOrigins bibliography pipeline under [`citegeist.examples.talkorigins`](./src/citegeist/examples/talkorigins.py) with a usage guide in [examples/talkorigins/README.md](./examples/talkorigins/README.md).
 
@@ -150,6 +151,8 @@ PYTHONPATH=src .venv/bin/python -m citegeist --db library.sqlite3 harvest-oai ht
 PYTHONPATH=src .venv/bin/python -m citegeist --db library.sqlite3 export --output reviewed.bib
 ```
 
+For a fuller option-by-option CLI cookbook, see [examples/cli/README.md](./examples/cli/README.md).
+
 For live-source development, prefer fixture-backed or cache-backed source clients so resolver and expansion work can be exercised repeatedly without re-hitting upstream APIs on every run.
 
 ## Example Application
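
Review note: a context line in the hunk above recommends cache-backed source clients for live-source development. The following is a minimal sketch of that idea, not citegeist's implementation. `CachedFetcher`, `fake_upstream`, and the one-JSON-file-per-URL cache layout are all hypothetical names and choices made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


class CachedFetcher:
    """Hypothetical wrapper: cache JSON payloads on disk, keyed by URL."""

    def __init__(self, fetch: Callable[[str], dict], cache_dir: Path) -> None:
        self._fetch = fetch  # the live (or fixture) fetch function
        self._cache_dir = cache_dir
        self._cache_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, url: str) -> Path:
        # Hash the URL so any URL maps to a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self._cache_dir / f"{digest}.json"

    def get(self, url: str) -> dict:
        path = self._path_for(url)
        if path.exists():
            # Cache hit: answer from disk without calling upstream.
            return json.loads(path.read_text(encoding="utf-8"))
        payload = self._fetch(url)
        path.write_text(json.dumps(payload), encoding="utf-8")
        return payload


calls: list[str] = []


def fake_upstream(url: str) -> dict:
    """Stand-in for a live API call; records every invocation."""
    calls.append(url)
    return {"url": url, "title": "Artificial Life"}


with tempfile.TemporaryDirectory() as tmp:
    fetcher = CachedFetcher(fake_upstream, Path(tmp) / "cache")
    first = fetcher.get("https://example.edu/works/1")
    second = fetcher.get("https://example.edu/works/1")
```

In this sketch the second `get` is served from disk, so repeated resolver or expansion runs would not re-hit the upstream source.
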
@@ -0,0 +1,641 @@
+# CLI Examples
+
+This guide gives example invocations for the `citegeist` CLI, including the major option combinations for each command.
+
+Where a topic is named, this guide uses:
+
+- topic phrase: `artificial life`
+- topic slug: `artificial-life`
+- topic name: `Artificial life`
+
+Assume:
+
+```bash
+cd citegeist
+export PYTHONPATH=src
+```
+
+## Global Option
+
+Use a non-default database path:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topics
+```
+
+## Ingest
+
+Basic ingest:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib
+```
+
+Set initial review status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib --status reviewed
+```
+
+Set a provenance label:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib --source-label "examples/artificial-life/references.bib"
+```
+
+Use both ingest options together:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib --status draft --source-label "manual-import:artificial-life"
+```
+
+## Search
+
+Basic search:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 search "artificial life"
+```
+
+Limit the number of matches:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 search "artificial life" --limit 5
+```
+
+Restrict search to one topic slice:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 search "artificial life" --topic artificial-life
+```
+
+## Show
+
+Show one entry:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1
+```
+
+List entries:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 show --limit 10
+```
+
+Include provenance:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1 --provenance
+```
+
+Include conflicts:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1 --conflicts
+```
+
+Use both:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1 --provenance --conflicts
+```
+
+## Export
+
+Export the whole library:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export
+```
+
+Export selected citation keys:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export langton1989artificial1 bedau2003artificial2
+```
+
+Write BibTeX to a file:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export --output artificial-life.bib
+```
+
+## Entry Review
+
+Set review status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 set-status langton1989artificial1 reviewed
+```
+
+Resolve field conflicts:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 resolve-conflicts langton1989artificial1 title accepted
+```
+
+Reject a conflict instead:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 resolve-conflicts langton1989artificial1 title rejected
+```
+
+Apply the latest proposed conflict value:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 apply-conflict langton1989artificial1 title
+```
+
+## Extract
+
+Extract draft BibTeX from plaintext:
+
+```bash
+.venv/bin/python -m citegeist extract references.txt
+```
+
+Write extracted BibTeX to a file:
+
+```bash
+.venv/bin/python -m citegeist extract references.txt --output extracted-artificial-life.bib
+```
+
+## Resolve
+
+Resolve one or more entries against remote metadata:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 resolve langton1989artificial1 bedau2003artificial2
+```
+
+## Graph Traversal
+
+Basic traversal:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1
+```
+
+Use multiple relation filters:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --relation cites --relation cited_by
+```
+
+Set traversal depth:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --depth 2
+```
+
+Filter by target review status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --review-status reviewed
+```
+
+Show only unresolved targets:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --missing-only
+```
+
+Render DOT instead of traversal rows:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --format dot
+```
+
+Render node/edge JSON for visualization:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --format json-graph
+```
+
+Write graph output to a file:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --depth 2 --format dot --output artificial-life.dot
+```
+
+## Graph Viewer
+
+Render a standalone HTML page from a `json-graph` export:
+
+```bash
+.venv/bin/python -m citegeist graph-view artificial-life.json --output artificial-life.html
+```
+
+Set the HTML page title:
+
+```bash
+.venv/bin/python -m citegeist graph-view artificial-life.json --output artificial-life.html --title "Artificial Life Graph"
+```
+
+## Graph Expansion
+
+Expand from one or more seed entries:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1
+```
+
+Choose the source:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1 --source openalex
+```
+
+Choose relation direction:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1 --source openalex --relation cited_by
+```
+
+Limit discoveries per seed:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1 --source openalex --limit 10
+```
+
+## Topic Expansion
+
+Basic topic expansion from stored topic metadata:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life
+```
+
+Override the topic phrase:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --topic-phrase "artificial life alife artificial organisms"
+```
+
+Choose source and relation:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --source openalex --relation cited_by
+```
+
+Control seed and discovery limits:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --seed-limit 10 --per-seed-limit 5
+```
+
+Restrict to trusted seed entries:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --seed-key langton1989artificial1 --seed-key bedau2003artificial2
+```
+
+Raise or lower the topic assignment threshold:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --min-relevance 0.3
+```
+
+Preview without writing:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --preview
+```
+
+## Topic Phrase Storage
+
+Set a stored topic phrase:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 set-topic-phrase artificial-life "artificial life alife artificial organisms complex systems evolution simulation"
+```
+
+Clear a stored topic phrase:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 set-topic-phrase artificial-life --clear
+```
+
+## OAI-PMH Harvesting
+
+Inspect a repository:
+
+```bash
+.venv/bin/python -m citegeist discover-oai https://example.edu/oai
+```
+
+Harvest with default metadata prefix:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai
+```
+
+Use an alternate metadata prefix:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --metadata-prefix mods
+```
+
+Restrict to a set:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --set artificial-life
+```
+
+Harvest a date range:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --from 2024-01-01 --until 2024-12-31
+```
+
+Limit harvested records and set review status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --limit 10 --status draft
+```
+
+## Bootstrap
+
+Seed from a BibTeX file:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --seed-bib artificial-life.bib
+```
+
+Seed from a topic phrase alone:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life"
+```
+
+Use both a seed `.bib` and a topic phrase:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --seed-bib artificial-life.bib --topic "artificial life"
+```
+
+Store topic metadata while bootstrapping:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 \
+  bootstrap \
+  --topic "artificial life" \
+  --topic-slug artificial-life \
+  --topic-name "Artificial life" \
+  --store-topic-phrase "artificial life alife artificial organisms complex systems evolution simulation"
+```
+
+Control topic-search candidate count:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --topic-limit 10
+```
+
+Control how many topic candidates are actually committed:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --topic-commit-limit 5
+```
+
+Disable immediate expansion:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --no-expand
+```
+
+Preview without writing:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --preview
+```
+
+Set review status for imported entries:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --status reviewed
+```
+
+## Batch Bootstrap
+
+Run a JSON batch file:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 bootstrap-batch artificial-life.json
+```
+
+## Topic Phrase Review Workflow
+
+Apply topic phrases directly:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 apply-topic-phrases topic-phrases.json
+```
+
+Stage topic phrase suggestions:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 stage-topic-phrases topic-phrases.json
+```
+
+Review one staged phrase:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrase artificial-life accepted
+```
+
+Add notes while reviewing:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrase artificial-life accepted --notes "good fit for topic expansion"
+```
+
+Override the accepted phrase while reviewing:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrase artificial-life accepted --phrase "artificial life alife artificial organisms autonomous agents"
+```
+
+Apply review decisions in bulk:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrases topic-phrase-review.json
+```
+
+List staged phrase reviews:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topic-phrase-reviews
+```
+
+Filter review rows by status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topic-phrase-reviews --phrase-review-status pending
+```
+
+Export an editable review template:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews
+```
+
+Limit exported review rows:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews --limit 10
+```
+
+Filter exported rows by status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews --phrase-review-status rejected
+```
+
+Write the review template to a file:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews --output topic-phrase-review.json
+```
+
+## Topic Inspection
+
+List topics:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topics
+```
+
+Limit topic rows:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topics --limit 20
+```
+
+Filter topics by phrase review status:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topics --phrase-review-status pending
+```
+
+List entries for a topic:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topic-entries artificial-life
+```
+
+Limit topic entries:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 topic-entries artificial-life --limit 25
+```
+
+Export one topic slice as BibTeX:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export-topic artificial-life
+```
+
+Write the topic slice to a file:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 export-topic artificial-life --output artificial-life-topic.bib
+```
+
+## TalkOrigins Example Commands
+
+Scrape the example corpus:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out
+```
+
+Override the source URL:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --base-url https://www.talkorigins.org/origins/biblio/
+```
+
+Limit topics and entries:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --limit-topics 5 --limit-entries-per-topic 20
+```
+
+Resolve seeds, ingest immediately, and keep expansion disabled:
+
+```bash
+.venv/bin/python -m citegeist --db library.sqlite3 example-talkorigins-scrape talkorigins-out --resolve-seeds --ingest --no-expand
+```
+
+Disable snapshot reuse:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --no-resume
+```
+
+Control generated bootstrap defaults:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --topic-limit 10 --topic-commit-limit 5 --status draft
+```
+
+Validate the generated manifest:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-validate talkorigins-out/talkorigins_manifest.json
+```
+
+Suggest phrases from the corpus:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-suggest-phrases talkorigins-out/talkorigins_manifest.json --topic abiogenesis --limit 10 --output topic-phrases.json
+```
+
+Inspect duplicate clusters:
+
+```bash
+.venv/bin/python -m citegeist example-talkorigins-duplicates talkorigins-out/talkorigins_manifest.json --limit 20 --min-count 2 --match origin --topic abiogenesis --preview --weak-only
+```
+
+Ingest the reconstructed corpus:
+
+```bash
+.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-ingest talkorigins-out/talkorigins_manifest.json --status draft
+```
+
+Disable deduplication during example ingest:
+
+```bash
+.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-ingest talkorigins-out/talkorigins_manifest.json --no-dedupe
+```
+
+Enrich weak canonical entries:
+
+```bash
+.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-enrich talkorigins-out/talkorigins_manifest.json --limit 20 --min-count 2 --match origin --topic abiogenesis --status enriched
+```
+
+Apply enrichment and allow unsafe search matches for experiments:
+
+```bash
+.venv/bin/python -m citegeist --db talkorigins-copy.sqlite3 example-talkorigins-enrich talkorigins-out/talkorigins_manifest.json --apply --allow-unsafe-search-matches
+```
+
+Export a review artifact:
+
+```bash
+.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-review talkorigins-out/talkorigins_manifest.json --limit 20 --min-count 2 --match origin --topic abiogenesis --output talkorigins-review.json
+```
+
+Apply curated corrections:
+
+```bash
+.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-apply-corrections talkorigins-out/talkorigins_manifest.json talkorigins-corrections.json --status reviewed
+```
+
+## Notes
+
+- Some commands depend on live source access.
+- For topic-oriented examples, use preview mode before committing changes when possible.
+- The older TalkOrigins alias commands remain available, but the example-prefixed names are the preferred surface.
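
Review note: the `bootstrap` invocations in the cookbook above can also be assembled programmatically, in the same list form that the CLI test later in this diff passes to `main`. The sketch below assumes a hypothetical helper, `bootstrap_argv`, whose name and parameters are not part of citegeist. It uses only flags that appear in the guide, and it only builds the argument list without running anything.

```python
from typing import Optional


def bootstrap_argv(
    db: str,
    *,
    topic: Optional[str] = None,
    seed_bib: Optional[str] = None,
    topic_limit: Optional[int] = None,
    topic_commit_limit: Optional[int] = None,
    preview: bool = False,
    expand: bool = True,
) -> list[str]:
    """Hypothetical helper: build an argument list for `citegeist bootstrap`."""
    # The global --db option precedes the subcommand, as in the guide.
    argv = ["--db", db, "bootstrap"]
    if seed_bib is not None:
        argv += ["--seed-bib", seed_bib]
    if topic is not None:
        argv += ["--topic", topic]
    if topic_limit is not None:
        argv += ["--topic-limit", str(topic_limit)]
    if topic_commit_limit is not None:
        argv += ["--topic-commit-limit", str(topic_commit_limit)]
    if not expand:
        argv.append("--no-expand")
    if preview:
        argv.append("--preview")
    return argv


preview_args = bootstrap_argv(
    "library.sqlite3",
    topic="artificial life",
    topic_commit_limit=50,
    preview=True,
)
```

A list like `preview_args` could then be appended to `[sys.executable, "-m", "citegeist"]` for `subprocess.run`, assuming the environment described under "Assume" in the guide.
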
@@ -15,6 +15,10 @@ class BootstrapResult:
     origin: str
     created: bool
     score: float = 0.0
+    title: str = ""
+    author: str = ""
+    year: str = ""
+    abstract: str = ""
 
 
 class Bootstrapper:
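
Review note: the hunk above adds four optional string fields to `BootstrapResult`. The sketch below shows the resulting shape under two assumptions that this hunk does not show directly: that the class is a plain `@dataclass`, and that `citation_key` is its first field with type `str` (inferred from how the other hunks and the tests construct it).

```python
from dataclasses import dataclass


@dataclass
class BootstrapResult:
    # Assumed first field; the hunk starts below it.
    citation_key: str
    origin: str
    created: bool
    score: float = 0.0
    # Fields added by this change; all default to empty strings.
    title: str = ""
    author: str = ""
    year: str = ""
    abstract: str = ""


# Pre-existing positional call style keeps working because of the defaults.
legacy = BootstrapResult("langton1989artificial1", "seed_bibtex", True)

# New call style carrying candidate metadata for preview output.
rich = BootstrapResult(
    "openalexw123",
    "topic",
    True,
    score=4.0,
    title="Artificial Life and Adaptive Behavior",
    year="1989",
)
```

Since every new field has a default, call sites that pass only the key, origin, and created flag do not need to change.
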
@@ -57,7 +61,17 @@ class Bootstrapper:
                     review_status=review_status,
                 )
                 seed_keys.append(entry.citation_key)
-                results.append(BootstrapResult(entry.citation_key, "seed_bibtex", created))
+                results.append(
+                    BootstrapResult(
+                        entry.citation_key,
+                        "seed_bibtex",
+                        created,
+                        title=entry.fields.get("title", ""),
+                        author=entry.fields.get("author", ""),
+                        year=entry.fields.get("year", ""),
+                        abstract=entry.fields.get("abstract", ""),
+                    )
+                )
 
         if topic:
             if not preview_only and (topic_slug or topic_name or topic_phrase):
@@ -67,7 +81,8 @@ class Bootstrapper:
                     source_type="bootstrap",
                     expansion_phrase=topic_phrase or topic,
                 )
-            ranked_candidates = self._topic_candidates(topic, seed_keys, topic_limit)
+            candidate_limit = max(topic_limit, topic_commit_limit or 0)
+            ranked_candidates = self._topic_candidates(topic, seed_keys, candidate_limit)
             if topic_commit_limit is not None:
                 ranked_candidates = ranked_candidates[:topic_commit_limit]
 
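
Review note: the hunk above widens the candidate fetch so that a `topic_commit_limit` larger than `topic_limit` is no longer silently capped. The sketch below reproduces only the limit arithmetic on a pre-ranked list. `select_candidates` and its `widen` switch are hypothetical names for illustration, and the list slice stands in for `_topic_candidates`.

```python
from typing import Optional


def select_candidates(
    ranked: list[str],
    topic_limit: int,
    topic_commit_limit: Optional[int],
    *,
    widen: bool = True,
) -> list[str]:
    """Illustrative fetch-then-trim logic; not the citegeist method itself."""
    if widen:
        # New behavior: fetch enough to satisfy the larger of the two limits.
        # `or 0` turns an unset (None) commit limit into 0 for max().
        candidate_limit = max(topic_limit, topic_commit_limit or 0)
    else:
        # Old behavior: always fetch exactly topic_limit candidates.
        candidate_limit = topic_limit
    fetched = ranked[:candidate_limit]  # stand-in for the ranked search
    if topic_commit_limit is not None:
        fetched = fetched[:topic_commit_limit]
    return fetched


ranked = [f"rank{i}" for i in range(1, 8)]  # rank1 .. rank7

old = select_candidates(ranked, 5, 7, widen=False)  # capped at topic_limit
new = select_candidates(ranked, 5, 7)               # honors the commit limit
unset = select_candidates(ranked, 5, None)          # no commit limit given
```

With `topic_limit=5` and `topic_commit_limit=7`, the old arithmetic yields five candidates and the new one yields seven, which is the case the added test near the end of this diff exercises.
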
@@ -82,7 +97,18 @@ class Bootstrapper:
                     review_status=review_status,
                 )
                 seed_keys.append(entry.citation_key)
-                results.append(BootstrapResult(entry.citation_key, "topic", created, score=score))
+                results.append(
+                    BootstrapResult(
+                        entry.citation_key,
+                        "topic",
+                        created,
+                        score=score,
+                        title=entry.fields.get("title", ""),
+                        author=entry.fields.get("author", ""),
+                        year=entry.fields.get("year", ""),
+                        abstract=entry.fields.get("abstract", ""),
+                    )
+                )
 
         if expand and not preview_only:
             expanded_keys = list(dict.fromkeys(seed_keys))
@@ -87,6 +87,44 @@ def test_bootstrap_cli_accepts_seed_and_topic(tmp_path):
     assert exit_code == 0
 
 
+def test_bootstrap_cli_preview_outputs_candidate_metadata(tmp_path, capsys):
+    from unittest.mock import patch
+    from citegeist.bootstrap import BootstrapResult
+
+    database = tmp_path / "library.sqlite3"
+    with patch("citegeist.cli.Bootstrapper.bootstrap") as mocked_bootstrap:
+        mocked_bootstrap.return_value = [
+            BootstrapResult(
+                citation_key="openalexw123",
+                origin="topic",
+                created=True,
+                score=4.0,
+                title="Artificial Life and Adaptive Behavior",
+                author="Langton, Christopher G.",
+                year="1989",
+                abstract="A foundational overview of artificial life systems.",
+            )
+        ]
+        exit_code = main(
+            [
+                "--db",
+                str(database),
+                "bootstrap",
+                "--topic",
+                "artificial life",
+                "--preview",
+                "--topic-commit-limit",
+                "50",
+            ]
+        )
+
+    assert exit_code == 0
+    payload = capsys.readouterr().out
+    assert "Artificial Life and Adaptive Behavior" in payload
+    assert "Langton, Christopher G." in payload
+    assert "A foundational overview of artificial life systems." in payload
+
+
 def test_bootstrap_ranks_and_deduplicates_topic_candidates():
     store = BibliographyStore()
     try:
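
Review note: the new CLI test above patches `Bootstrapper.bootstrap` so that `main` prints canned results instead of querying live sources. The sketch below shows the same pattern in a self-contained form. The `Bootstrapper` and `main` defined here are stand-ins, not the citegeist ones, and `patch.object` replaces the string target used in the test because there is no importable module in this sketch.

```python
import io
from contextlib import redirect_stdout
from unittest.mock import patch


class Bootstrapper:
    """Stand-in class; a real implementation would query live sources."""

    def bootstrap(self, topic: str) -> list[dict]:
        raise RuntimeError("would hit the network")


def main(argv: list[str]) -> int:
    """Stand-in entry point that prints one line per bootstrap result."""
    for item in Bootstrapper().bootstrap(argv[0]):
        print(f"{item['title']} ({item['year']})")
    return 0


buffer = io.StringIO()
with patch.object(Bootstrapper, "bootstrap") as mocked_bootstrap, redirect_stdout(buffer):
    # Canned result returned in place of a live lookup.
    mocked_bootstrap.return_value = [
        {"title": "Artificial Life and Adaptive Behavior", "year": "1989"}
    ]
    exit_code = main(["artificial life"])

output = buffer.getvalue()
```

Patching the method on the class means every instance created inside `main` sees the mock, which is why the test does not need access to the `Bootstrapper` object that the CLI constructs.
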
@@ -140,6 +178,7 @@ def test_bootstrap_preview_does_not_write_to_database():
         results = bootstrapper.bootstrap(store, topic="graph topic", expand=False, preview_only=True)
 
         assert [item.citation_key for item in results] == ["preview2024graph"]
+        assert results[0].title == "Preview Graph Topic"
         assert store.get_entry("preview2024graph") is None
     finally:
         store.close()
@@ -173,3 +212,50 @@ def test_bootstrap_topic_commit_limit_restricts_persisted_candidates():
         assert store.get_entry("rank2") is None
     finally:
         store.close()
+
+
+def test_bootstrap_preview_uses_topic_commit_limit_when_larger_than_topic_limit():
+    store = BibliographyStore()
+    try:
+        bootstrapper = Bootstrapper()
+        from citegeist import BibEntry
+
+        bootstrapper.resolver.search_openalex = lambda topic, limit=5: [  # type: ignore[method-assign]
+            BibEntry(
+                entry_type="article",
+                citation_key=f"rank{index}",
+                fields={
+                    "title": f"Preview Topic Result {index}",
+                    "author": f"Author, {index}",
+                    "year": f"20{index:02d}",
+                    "abstract": f"Abstract {index}",
+                },
+            )
+            for index in range(1, 8)
+        ][:limit]
+        bootstrapper.resolver.search_crossref = lambda topic, limit=5: []  # type: ignore[method-assign]
+        bootstrapper.resolver.search_datacite = lambda topic, limit=5: []  # type: ignore[method-assign]
+
+        results = bootstrapper.bootstrap(
+            store,
+            topic="graph topic",
+            expand=False,
+            preview_only=True,
+            topic_limit=5,
+            topic_commit_limit=7,
+        )
+
+        assert [item.citation_key for item in results] == [
+            "rank1",
+            "rank2",
+            "rank3",
+            "rank4",
+            "rank5",
+            "rank6",
+            "rank7",
+        ]
+        assert results[0].author == "Author, 1"
+        assert results[0].year == "2001"
+        assert results[0].abstract == "Abstract 1"
+    finally:
+        store.close()