CLI Examples

This guide gives example invocations for the citegeist CLI, including the major option combinations for each command.

Where a topic is named, this guide uses:

  • topic phrase: artificial life
  • topic slug: artificial-life
  • topic name: Artificial life

The examples assume a virtual environment at .venv and that you have first run:

cd citegeist
export PYTHONPATH=src

Setup

Purpose: point commands at the right database before doing anything else.

Global Option

Use a non-default database path:

.venv/bin/python -m citegeist --db library.sqlite3 topics
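
Almost every example in this guide repeats the same interpreter path and --db value. A small shell function can cut that repetition. This is only a sketch: cg, CITEGEIST_PYTHON, and CITEGEIST_DB are names made up for this guide, not something citegeist defines.

```shell
# Hypothetical helper: cg, CITEGEIST_PYTHON, and CITEGEIST_DB are illustrative
# names invented for this guide; citegeist itself does not read them.
cg() {
  # Run citegeist with a default interpreter and database path,
  # forwarding the subcommand and its options unchanged.
  "${CITEGEIST_PYTHON:-.venv/bin/python}" -m citegeist \
    --db "${CITEGEIST_DB:-library.sqlite3}" "$@"
}
```

With this in place, `cg topics` runs the same command as the example above. The function always passes --db. This guide shows some commands (extract, graph-view, discover-oai) without that option, so call those directly if passing --db to them turns out to matter.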

Build And Inspect A Library

Purpose: ingest records, search them, inspect them, and export them.

Ingest

Basic ingest:

.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib

Set initial review status:

.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib --status reviewed

Set a provenance label:

.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib --source-label "examples/artificial-life/references.bib"

Use both ingest options together:

.venv/bin/python -m citegeist --db library.sqlite3 ingest references.bib --status draft --source-label "manual-import:artificial-life"
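
When several BibTeX files need ingesting, the two options above can be applied in a loop so each record keeps its originating file as provenance. A minimal sketch: ingest_all and CITEGEIST_PYTHON are hypothetical names, and using the file path as the source label is a convention borrowed from the earlier example, not a requirement.

```shell
# Hypothetical helper: ingest every file given as an argument as draft records,
# labelling each batch with the path it came from.
ingest_all() {
  local py="${CITEGEIST_PYTHON:-.venv/bin/python}"  # assumed override, not a citegeist setting
  local bib
  for bib in "$@"; do
    # Stop at the first failure so later files are not ingested on top of an error.
    "$py" -m citegeist --db library.sqlite3 ingest "$bib" \
      --status draft --source-label "$bib" || return 1
  done
}
```

Usage might look like `ingest_all examples/artificial-life/*.bib`.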

Search

Basic search:

.venv/bin/python -m citegeist --db library.sqlite3 search "artificial life"

Limit the number of matches:

.venv/bin/python -m citegeist --db library.sqlite3 search "artificial life" --limit 5

Restrict search to one topic slice:

.venv/bin/python -m citegeist --db library.sqlite3 search "artificial life" --topic artificial-life

Show

Show one entry:

.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1

List entries:

.venv/bin/python -m citegeist --db library.sqlite3 show --limit 10

Include provenance:

.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1 --provenance

Include conflicts:

.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1 --conflicts

Use both:

.venv/bin/python -m citegeist --db library.sqlite3 show langton1989artificial1 --provenance --conflicts

Export

Export the whole library:

.venv/bin/python -m citegeist --db library.sqlite3 export

Export selected citation keys:

.venv/bin/python -m citegeist --db library.sqlite3 export langton1989artificial1 bedau2003artificial2

Write BibTeX to a file:

.venv/bin/python -m citegeist --db library.sqlite3 export --output artificial-life.bib

Include DOI-only placeholder records in a broad export:

.venv/bin/python -m citegeist --db library.sqlite3 export --include-stubs --output artificial-life.bib

Review And Clean Metadata

Purpose: inspect merge conflicts, apply corrections, and enrich incomplete records.

Entry Review

Set review status:

.venv/bin/python -m citegeist --db library.sqlite3 set-status langton1989artificial1 reviewed

Resolve field conflicts:

.venv/bin/python -m citegeist --db library.sqlite3 resolve-conflicts langton1989artificial1 title accepted

Reject a conflict instead:

.venv/bin/python -m citegeist --db library.sqlite3 resolve-conflicts langton1989artificial1 title rejected

Apply the latest proposed conflict value:

.venv/bin/python -m citegeist --db library.sqlite3 apply-conflict langton1989artificial1 title

Extract

Extract draft BibTeX from plaintext:

.venv/bin/python -m citegeist extract references.txt

Write extracted BibTeX to a file:

.venv/bin/python -m citegeist extract references.txt --output extracted-artificial-life.bib

Resolve

Resolve one or more entries against remote metadata:

.venv/bin/python -m citegeist --db library.sqlite3 resolve langton1989artificial1 bedau2003artificial2

Preview DOI-bearing placeholder records before enriching them:

.venv/bin/python -m citegeist --db library.sqlite3 resolve-stubs --doi-only --preview --limit 25

Enrich DOI-bearing placeholder records inside one topic slice:

.venv/bin/python -m citegeist --db library.sqlite3 resolve-stubs --doi-only --topic artificial-life --limit 25

Preview all current @misc entries with DOIs, not just placeholder-like stubs:

.venv/bin/python -m citegeist --db library.sqlite3 resolve-stubs --doi-only --all-misc --preview --limit 25

Re-enrich all current @misc entries with DOIs:

.venv/bin/python -m citegeist --db library.sqlite3 resolve-stubs --doi-only --all-misc --limit 25

Explore Citation Graphs

Purpose: traverse citation edges, export graph data, and render quick visualizations.

Graph Traversal

Basic traversal:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1

Use multiple relation filters:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --relation cites --relation cited_by

Set traversal depth:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --depth 2

Filter by target review status:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --review-status reviewed

Show only unresolved targets:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --missing-only

Render DOT instead of traversal rows:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --format dot

Render node/edge JSON for visualization:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --format json-graph

Write graph output to a file:

.venv/bin/python -m citegeist --db library.sqlite3 graph langton1989artificial1 --depth 2 --format dot --output artificial-life.dot
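
If Graphviz is installed, the DOT file can probably be rendered with its dot tool. This sketch assumes the --format dot output is ordinary Graphviz input, which this guide does not confirm. render_dot is a hypothetical helper, not a citegeist command.

```shell
# Hypothetical helper: render a DOT file to SVG with Graphviz.
# $1: DOT file, e.g. the one written by `graph --format dot --output ...`
# $2: SVG file to create (defaults to the input name with .svg)
render_dot() {
  local dot_file="$1"
  local svg_file="${2:-${dot_file%.dot}.svg}"
  # Fail early with a readable message when Graphviz is missing.
  if ! command -v dot >/dev/null 2>&1; then
    echo "Graphviz 'dot' not found on PATH" >&2
    return 1
  fi
  dot -Tsvg "$dot_file" -o "$svg_file"
}
```

For example, `render_dot artificial-life.dot` would write artificial-life.svg next to the input file.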

Graph Viewer

Render a standalone HTML page from a json-graph export:

.venv/bin/python -m citegeist graph-view artificial-life.json --output artificial-life.html

Set the HTML page title:

.venv/bin/python -m citegeist graph-view artificial-life.json --output artificial-life.html --title "Artificial Life Graph"
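
The viewer examples take artificial-life.json as input. One plausible way to produce it is to combine --format json-graph with --output, as the DOT example in the previous section does. A sketch under that assumption: graph_to_html and CITEGEIST_PYTHON are hypothetical names.

```shell
# Hypothetical helper: export a json-graph file for one seed entry, then
# render it as a standalone HTML page.
# $1: citation key to start from
# $2: file name stem shared by the .json and .html outputs
graph_to_html() {
  local key="$1" stem="$2"
  local py="${CITEGEIST_PYTHON:-.venv/bin/python}"  # assumed override, not a citegeist setting
  # Only build the HTML page if the graph export succeeded.
  "$py" -m citegeist --db library.sqlite3 graph "$key" \
    --format json-graph --output "$stem.json" &&
    "$py" -m citegeist graph-view "$stem.json" --output "$stem.html"
}
```

Calling `graph_to_html langton1989artificial1 artificial-life` would then leave artificial-life.json and artificial-life.html in the current directory.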

Graph Expansion

Expand from one or more seed entries:

.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1

Choose the source:

.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1 --source openalex

Choose relation direction:

.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1 --source openalex --relation cited_by

Limit discoveries per seed:

.venv/bin/python -m citegeist --db library.sqlite3 expand langton1989artificial1 --source openalex --limit 10

Build A Topic-Centered Bibliography

Purpose: create, expand, inspect, and export a topic slice such as artificial life.

Topic Expansion

Basic topic expansion from stored topic metadata:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life

Override the topic phrase:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --topic-phrase "artificial life alife artificial organisms"

Choose source and relation:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --source openalex --relation cited_by

Control seed and discovery limits:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --seed-limit 10 --per-seed-limit 5

Restrict to trusted seed entries:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --seed-key langton1989artificial1 --seed-key bedau2003artificial2

Raise or lower the topic assignment threshold:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --min-relevance 0.3

Preview without writing:

.venv/bin/python -m citegeist --db library.sqlite3 expand-topic artificial-life --preview
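
Preview and the real run take the same options, so the two can be chained behind a confirmation prompt. A minimal sketch: expand_topic_with_preview and CITEGEIST_PYTHON are hypothetical names, and the wrapper relies on --preview writing nothing, as described above.

```shell
# Hypothetical helper: show the preview, ask for confirmation, then repeat
# the same expand-topic call without --preview.
# $1: topic slug; remaining arguments are passed through to expand-topic.
expand_topic_with_preview() {
  local slug="$1"
  shift
  local py="${CITEGEIST_PYTHON:-.venv/bin/python}"  # assumed override, not a citegeist setting
  local answer
  "$py" -m citegeist --db library.sqlite3 expand-topic "$slug" "$@" --preview || return 1
  printf 'Apply these changes? [y/N] ' >&2
  read -r answer
  case "$answer" in
    y|Y) "$py" -m citegeist --db library.sqlite3 expand-topic "$slug" "$@" ;;
    *)   echo "Skipped; only the preview was run." >&2 ;;
  esac
}
```

For example, `expand_topic_with_preview artificial-life --min-relevance 0.3` previews with the chosen threshold and applies it only after a y answer.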

Topic Phrase Storage

Set a stored topic phrase:

.venv/bin/python -m citegeist --db library.sqlite3 set-topic-phrase artificial-life "artificial life alife artificial organisms complex systems evolution simulation"

Clear a stored topic phrase:

.venv/bin/python -m citegeist --db library.sqlite3 set-topic-phrase artificial-life --clear

Topic Inspection

List topics:

.venv/bin/python -m citegeist --db library.sqlite3 topics

Limit topic rows:

.venv/bin/python -m citegeist --db library.sqlite3 topics --limit 20

Filter topics by phrase review status:

.venv/bin/python -m citegeist --db library.sqlite3 topics --phrase-review-status pending

List entries for a topic:

.venv/bin/python -m citegeist --db library.sqlite3 topic-entries artificial-life

Limit topic entries:

.venv/bin/python -m citegeist --db library.sqlite3 topic-entries artificial-life --limit 25

Export one topic slice as BibTeX:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic artificial-life

Write the topic slice to a file:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic artificial-life --output artificial-life-topic.bib

Include DOI-only placeholder records in the topic export:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic artificial-life --include-stubs --output artificial-life-topic.bib

Bootstrap

Seed from a BibTeX file:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --seed-bib artificial-life.bib

Seed from a topic phrase alone:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life"

Use both a seed .bib and a topic phrase:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --seed-bib artificial-life.bib --topic "artificial life"

Store topic metadata while bootstrapping:

.venv/bin/python -m citegeist --db library.sqlite3 \
  bootstrap \
  --topic "artificial life" \
  --topic-slug artificial-life \
  --topic-name "Artificial life" \
  --store-topic-phrase "artificial life alife artificial organisms complex systems evolution simulation"

Control topic-search candidate count:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --topic-limit 10

Control how many topic candidates are actually committed:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --topic-commit-limit 5

Disable immediate expansion:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --no-expand

Preview without writing:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --preview

Set review status for imported entries:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap --topic "artificial life" --status reviewed

Batch Bootstrap

Run a JSON batch file:

.venv/bin/python -m citegeist --db library.sqlite3 bootstrap-batch artificial-life.json

Topic Phrase Review Workflow

Apply topic phrases directly:

.venv/bin/python -m citegeist --db library.sqlite3 apply-topic-phrases topic-phrases.json

Stage topic phrase suggestions:

.venv/bin/python -m citegeist --db library.sqlite3 stage-topic-phrases topic-phrases.json

Review one staged phrase:

.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrase artificial-life accepted

Add notes while reviewing:

.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrase artificial-life accepted --notes "good fit for topic expansion"

Override the accepted phrase while reviewing:

.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrase artificial-life accepted --phrase "artificial life alife artificial organisms autonomous agents"

Apply review decisions in bulk:

.venv/bin/python -m citegeist --db library.sqlite3 review-topic-phrases topic-phrase-review.json

List staged phrase reviews:

.venv/bin/python -m citegeist --db library.sqlite3 topic-phrase-reviews

Filter review rows by status:

.venv/bin/python -m citegeist --db library.sqlite3 topic-phrase-reviews --phrase-review-status pending

Export an editable review template:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews

Limit exported review rows:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews --limit 10

Filter exported rows by status:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews --phrase-review-status rejected

Write the review template to a file:

.venv/bin/python -m citegeist --db library.sqlite3 export-topic-phrase-reviews --output topic-phrase-review.json
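
Read together, the commands in this section suggest a staged workflow: stage suggestions, export the pending rows, edit the file, then apply it in bulk. The ordering is inferred from the command descriptions, not from documented behavior, and combining --phrase-review-status with --output is assumed to work as each flag does alone. phrase_review_round and CITEGEIST_PYTHON are hypothetical names.

```shell
# Hypothetical helper: one round of staged topic phrase review.
# $1: JSON file of phrase suggestions
# $2: review template to write and edit (defaults to topic-phrase-review.json)
phrase_review_round() {
  local suggestions="$1"
  local template="${2:-topic-phrase-review.json}"
  local py="${CITEGEIST_PYTHON:-.venv/bin/python}"  # assumed override, not a citegeist setting
  # 1. Stage the suggested phrases instead of applying them directly.
  "$py" -m citegeist --db library.sqlite3 stage-topic-phrases "$suggestions" || return 1
  # 2. Export the pending rows as an editable template.
  "$py" -m citegeist --db library.sqlite3 export-topic-phrase-reviews \
    --phrase-review-status pending --output "$template" || return 1
  # 3. Record decisions by hand in the exported file.
  ${EDITOR:-vi} "$template" || return 1
  # 4. Apply the edited decisions in bulk.
  "$py" -m citegeist --db library.sqlite3 review-topic-phrases "$template"
}
```

A round would start with `phrase_review_round topic-phrases.json`. Running topic-phrase-reviews afterwards shows the recorded review rows.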

Harvest External Repositories

Purpose: inspect and harvest OAI-PMH repositories into the library.

OAI-PMH Harvesting

Inspect a repository:

.venv/bin/python -m citegeist discover-oai https://example.edu/oai

Harvest with default metadata prefix:

.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai

Use an alternate metadata prefix:

.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --metadata-prefix mods

Restrict to a set:

.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --set artificial-life

Harvest a date range:

.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --from 2024-01-01 --until 2024-12-31

Limit harvested records and set review status:

.venv/bin/python -m citegeist --db library.sqlite3 harvest-oai https://example.edu/oai --limit 10 --status draft

Work Through Example Corpora

Purpose: run the repo's example workflows without treating them as the core product surface.

TalkOrigins Example Commands

Scrape the example corpus:

.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out

Override the source URL:

.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --base-url https://www.talkorigins.org/origins/biblio/

Limit topics and entries:

.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --limit-topics 5 --limit-entries-per-topic 20

Resolve seeds, ingest immediately, and keep expansion disabled:

.venv/bin/python -m citegeist --db library.sqlite3 example-talkorigins-scrape talkorigins-out --resolve-seeds --ingest --no-expand

Disable snapshot reuse:

.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --no-resume

Control generated bootstrap defaults:

.venv/bin/python -m citegeist example-talkorigins-scrape talkorigins-out --topic-limit 10 --topic-commit-limit 5 --status draft

Validate the generated manifest:

.venv/bin/python -m citegeist example-talkorigins-validate talkorigins-out/talkorigins_manifest.json

Suggest phrases from the corpus:

.venv/bin/python -m citegeist example-talkorigins-suggest-phrases talkorigins-out/talkorigins_manifest.json --topic abiogenesis --limit 10 --output topic-phrases.json

Inspect duplicate clusters:

.venv/bin/python -m citegeist example-talkorigins-duplicates talkorigins-out/talkorigins_manifest.json --limit 20 --min-count 2 --match origin --topic abiogenesis --preview --weak-only

Ingest the reconstructed corpus:

.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-ingest talkorigins-out/talkorigins_manifest.json --status draft

Disable deduplication during example ingest:

.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-ingest talkorigins-out/talkorigins_manifest.json --no-dedupe

Enrich weak canonical entries:

.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-enrich talkorigins-out/talkorigins_manifest.json --limit 20 --min-count 2 --match origin --topic abiogenesis --status enriched

Apply enrichment and allow unsafe search matches for experiments:

.venv/bin/python -m citegeist --db talkorigins-copy.sqlite3 example-talkorigins-enrich talkorigins-out/talkorigins_manifest.json --apply --allow-unsafe-search-matches

Export a review artifact:

.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-review talkorigins-out/talkorigins_manifest.json --limit 20 --min-count 2 --match origin --topic abiogenesis --output talkorigins-review.json

Apply curated corrections:

.venv/bin/python -m citegeist --db talkorigins.sqlite3 example-talkorigins-apply-corrections talkorigins-out/talkorigins_manifest.json talkorigins-corrections.json --status reviewed

Notes

  • Some commands, such as resolve, expand, and harvest-oai, query remote sources and need network access.
  • For topic-oriented examples, run with --preview before committing changes wherever the command supports it.
  • The older TalkOrigins alias commands remain available, but the example-prefixed names are the preferred surface.