EcoSpecies-Atlas
Modern open-source follow-on to the legacy EcoSpecies application, built to ingest historical Species Life History materials and evolve into a maintainable public research platform.
Repository layout
- apps/api: Python API and import logic
- apps/web: public web UI served by nginx
- docs: migration survey and roadmap
- scripts: Compose runtime scripts for bootstrapping the container-managed Python environment
- Docker named volume python_venv: container-managed Python virtual environment
- Docker named volume pip_cache: pip cache for container bootstrapping
- Docker named volume postgres_data: PostgreSQL data directory
- var/sqlite: host fallback for local non-Compose verification
Runtime model
Docker Compose owns all runtime dependencies:
- PostgreSQL runs in a container with a Docker-managed named volume
- Python services run in python:3.12-slim
- the Python virtual environment is created in a Docker-managed volume mounted at /workspace/.docker/venv
- dependencies are installed from apps/api/requirements.txt inside that virtual environment
- the legacy corpus is mounted read-only from a sibling directory, defaulting to ../legacy-corpus
No host Python packages are required for the Compose workflow.
Start the stack
cd EcoSpecies-Atlas
docker compose up
Endpoints:
- web UI: http://localhost:8080
- API: http://localhost:8000
- PostgreSQL: localhost:5432
- liveness: /healthz
- readiness: /readyz
- auth session: /api/auth/session
- editor status: /api/editor/status (requires editor or admin)
- editor species list: /api/editor/species (requires editor or admin)
- editor workflow detail/update: /api/editor/species/<slug>/workflow (requires editor or admin)
- editor species detail: /api/editor/species/<slug> (requires editor or admin)
- editor editorial update: /api/editor/species/<slug>/editorial (requires editor or admin)
- editor section detail/update: /api/editor/species/<slug>/sections/<position> (requires editor or admin)
- editor audit history: /api/editor/species/<slug>/audit (requires editor or admin)
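The editor routes listed above all follow one path pattern, which a small client helper can capture. The following is a minimal sketch, not code from this repository: the function names are hypothetical, the slug `red-drum` is only an illustration, and treating `<position>` as an integer is an assumption.

```python
from urllib.parse import quote

# Roles that the endpoint list says may use the editor routes.
EDITOR_ROLES = frozenset({"editor", "admin"})

# Per-species sub-resources named in the endpoint list ("" is the detail view).
SPECIES_RESOURCES = frozenset({"", "workflow", "editorial", "audit"})


def editor_species_path(slug: str, resource: str = "") -> str:
    """Return the editor API path for one species (hypothetical helper)."""
    if resource not in SPECIES_RESOURCES:
        raise ValueError(f"unknown editor resource: {resource!r}")
    # Percent-encode the slug so unusual characters cannot alter the path.
    path = f"/api/editor/species/{quote(slug, safe='')}"
    return f"{path}/{resource}" if resource else path


def editor_section_path(slug: str, position: int) -> str:
    """Return the path for one section (integer position is an assumption)."""
    return f"{editor_species_path(slug)}/sections/{position}"


def can_use_editor_routes(role: str) -> bool:
    """Return True for roles the endpoint list allows on editor routes."""
    return role in EDITOR_ROLES
```

For example, `editor_species_path("red-drum", "audit")` yields `/api/editor/species/red-drum/audit`, which can be appended to the API base URL.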
The app can also be published under a URL prefix. A reverse-proxy deployment can publish the app at a host and path such as:
ECOSPECIES_HOSTNAME=example.org
ECOSPECIES_BASE_PATH=/apps/ecospecies
When the site is served below a path prefix, the frontend derives its API base from the current page URL and nginx serves both the UI and proxied API under that same prefix.
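The prefix behavior can be illustrated with a short sketch of how an API URL might be derived from a page URL. This is an assumption-laden illustration in Python rather than the frontend's actual JavaScript: `api_url` and `normalize_base_path` are hypothetical names, and the real frontend may infer the prefix from the page location instead of receiving it as an argument.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_base_path(base_path: str) -> str:
    """Return the prefix with one leading slash and no trailing slash."""
    trimmed = base_path.strip("/")
    return f"/{trimmed}" if trimmed else ""


def api_url(page_url: str, base_path: str, endpoint: str) -> str:
    """Build an API URL on the same origin and prefix as the current page."""
    if not endpoint.startswith("/"):
        raise ValueError("endpoint must start with '/'")
    parts = urlsplit(page_url)
    prefix = normalize_base_path(base_path)
    # Refuse to build a URL when the page is not actually under the prefix.
    under_prefix = parts.path == prefix or parts.path.startswith(prefix + "/")
    if prefix and not under_prefix:
        raise ValueError(f"{page_url!r} is not served below {prefix!r}")
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, prefix + endpoint, "", ""))
```

With the example values above, a page at `https://example.org/apps/ecospecies/index.html` maps `/api/auth/session` to `https://example.org/apps/ecospecies/api/auth/session`.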
If those host ports are already in use, override them when starting Compose, for example:
ECOSPECIES_API_PORT=18000 ECOSPECIES_WEB_PORT=18080 docker compose up
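A helper script that checks the stack needs to honor the same port overrides. The sketch below assumes that the default host ports are 8000 and 8080 (taken from the endpoint list) and that a client reads the same two environment variables; `local_urls` and `probe` are hypothetical names, not part of this repository.

```python
import os
import urllib.error
import urllib.request
from typing import Mapping


def local_urls(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return local base URLs, honoring the Compose port override variables."""
    api_port = env.get("ECOSPECIES_API_PORT", "8000")
    web_port = env.get("ECOSPECIES_WEB_PORT", "8080")
    return {
        "api": f"http://localhost:{api_port}",
        "web": f"http://localhost:{web_port}",
    }


def probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error.
        return False


def check_stack(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Probe the liveness and readiness endpoints on the API port."""
    api = local_urls(env)["api"]
    return {path: probe(api + path) for path in ("/healthz", "/readyz")}
```

Calling `check_stack()` after `docker compose up` returns a mapping such as `{"/healthz": ..., "/readyz": ...}` whose values depend on whether the stack is reachable.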
Host-visible state
The following state is bind-mounted and visible on the host:
- source code and docs in this repo
- optional SQLite fallback database in var/sqlite
PostgreSQL data, the Python virtualenv, and the pip cache live in Docker named volumes instead (see Notes).
Automated checks
Repository-host CI runs the repository-layer tests and the stubbed browser smoke test on pushes and change requests.
Contributor workflow guidance is in CONTRIBUTING.md.
When hosted on Forgejo, the current GitHub-compatible workflow layout can still be used. Forgejo Actions will look for workflows in .forgejo/workflows, and if that directory is absent it will fall back to .github/workflows. A Forgejo-native template is provided at .forgejo/workflows/ci.yml.template; copy it to .forgejo/workflows/ci.yml only after adapting the runner label and action source policy for the target instance. The activation checklist is in docs/forgejo-activation.md.
Run the repository-layer test suite with:
./scripts/check-api-tests.sh
Run the browser-level editor smoke test with:
./scripts/check-ui-smoke.sh
Run the browser-level smoke test against the real Compose stack with:
./scripts/check-ui-stack-smoke.sh
Run a bounded citation backfill pass with:
./scripts/run-citation-backfill.sh
The wrapper runs inside ecospecies-api, keeps a rotating cursor in var/citation-backfill.cursor, and skips a run if another backfill is already active.
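The two behaviors described here, a rotating cursor and skipping when another run is active, can be sketched as follows. This is not the wrapper's implementation: it assumes the cursor is an integer offset into an ordered list of records, uses a hypothetical lock file next to the cursor, and relies on `fcntl.flock`, which is available on Unix only.

```python
import fcntl
from pathlib import Path
from typing import Optional


def read_cursor(path: Path) -> int:
    """Return the stored offset, or 0 if the file is missing or unreadable."""
    try:
        return int(path.read_text().strip() or 0)
    except (FileNotFoundError, ValueError):
        return 0


def next_batch(total: int, cursor: int, batch_size: int) -> tuple[list[int], int]:
    """Return the record indices for this pass and the cursor for the next one."""
    if total <= 0:
        return [], 0
    start = cursor % total
    count = min(batch_size, total)
    # Wrap around the end of the list so every record is eventually visited.
    indices = [(start + i) % total for i in range(count)]
    return indices, (start + count) % total


def run_once(state_dir: Path, total: int, batch_size: int) -> Optional[list[int]]:
    """Run one bounded pass, or return None if another pass holds the lock."""
    state_dir.mkdir(parents=True, exist_ok=True)
    lock_path = state_dir / "citation-backfill.lock"  # hypothetical file name
    cursor_path = state_dir / "citation-backfill.cursor"
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock; fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        indices, new_cursor = next_batch(total, read_cursor(cursor_path), batch_size)
        # A real pass would process the records at `indices` here.
        cursor_path.write_text(f"{new_cursor}\n")
        return indices  # closing the file releases the lock
```

Because the lock is released when the file is closed, a crashed run does not leave a stale lock behind, which is one reason to prefer `flock` over a marker file in a sketch like this.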
Notes
- The importer seeds PostgreSQL from the legacy text corpus before the API starts and now synchronizes by slug instead of truncating the full dataset.
- Species missing from a later import payload are archived instead of deleted. Public endpoints hide archived records; editor endpoints can still inspect them.
- The editor species list supports active, all, and archived client-side filtering so archived records remain manageable in the UI.
- Editors can also archive or unarchive species explicitly from the editorial controls, with audit history recorded alongside other editorial changes.
- The API also supports a host-local SQLite fallback for direct verification when ECOSPECIES_DATABASE_URL is unset.
- PostgreSQL, the Python virtualenv, and the pip cache use Docker named volumes because bind-mounted runtime state is not reliable on CIFS-backed workspaces like this one.
- Initial editor auth uses ECOSPECIES_AUTH_TOKENS in the format token:username:role[,token2:username2:role2], where role is viewer, editor, or admin.
- Editorial workflow state is persisted per species with draft, review, and published statuses. Public endpoints return only published records; editor endpoints can inspect and update all records.
- Editors can curate top-level metadata and section content from the web UI, and every editorial or section change is recorded in per-species audit history.
- Citation backfill can be scheduled externally, such as with a nightly cron job that runs ./scripts/run-citation-backfill.sh. Use ECOSPECIES_BACKFILL_LOG_DIR if logs should go somewhere other than var/logs.
- Unresolved citation enrichment now still refreshes the locally parsed BibTeX and normalized citation text, so parser improvements propagate even without a remote metadata match.
- Summary authoring guidance for future FLELMR-compatible records is in docs/flelmr-authoring.md.
- Legacy survey and roadmap artifacts are in docs/.
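The ECOSPECIES_AUTH_TOKENS format described in the notes above can be parsed in a few lines. The sketch below is an illustration, not the API's own parser: `TokenEntry` and `parse_auth_tokens` are hypothetical names, and it assumes tokens and usernames contain neither `:` nor `,`.

```python
from dataclasses import dataclass

# Roles named in the notes above.
VALID_ROLES = ("viewer", "editor", "admin")


@dataclass(frozen=True)
class TokenEntry:
    """The user and role that one token maps to (hypothetical type)."""
    username: str
    role: str


def parse_auth_tokens(raw: str) -> dict[str, TokenEntry]:
    """Parse 'token:username:role[,token2:username2:role2]' into a lookup table."""
    entries: dict[str, TokenEntry] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty value or a trailing comma
        parts = item.split(":")
        if len(parts) != 3 or not all(parts):
            # Do not echo the item itself, since it contains a secret token.
            raise ValueError("expected entries of the form token:username:role")
        token, username, role = parts
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role {role!r} for user {username!r}")
        if token in entries:
            raise ValueError(f"duplicate token for user {username!r}")
        entries[token] = TokenEntry(username, role)
    return entries
```

A caller could then look up a presented token in the returned mapping and compare the entry's role against the roles an endpoint requires. How the token is transmitted by clients is not stated in this README, so that part is left out.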
Governance And Operations
The repository host, such as GitHub or Forgejo, is used for source control, change requests, code review, and CI checks. It is not part of the application runtime.
EcoSpecies-Atlas itself runs through Docker Compose, the Python API, nginx, and PostgreSQL. Import jobs, editor workflows, and browser access all depend on the application stack, not on the repository host.
In practice, the repository host is responsible for change management:
- storing the code, docs, workflow definitions, and test harnesses
- running CI checks on pushes and change requests
- supporting review and merge workflows
In practice, the application stack is responsible for operations:
- serving the web UI and API
- persisting editorial and import state
- running imports and editor workflows