From 4f36070a7fd8f7dba28204a5570fb66315e9ee75 Mon Sep 17 00:00:00 2001 From: welsberr Date: Sat, 11 Apr 2026 07:20:57 -0400 Subject: [PATCH] Add biological overview and Nunney analysis docs --- README.md | 136 +++++++++++++++--------- docs/NUNNEY_ANALYSIS.md | 224 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 309 insertions(+), 51 deletions(-) create mode 100644 docs/NUNNEY_ANALYSIS.md diff --git a/README.md b/README.md index 84e7e9f..b209d92 100644 --- a/README.md +++ b/README.md @@ -1,55 +1,97 @@ # renunney -Clean working repository for: +Working repository for replication and reanalysis of Leonard Nunney's 2003 +cost-of-substitution simulations. -- faithful replication of Leonard Nunney's 2003 cost-of-substitution results, -- orchestration of distributed sweep runs, -- later migration to a faster Rust-backed worker. +## Biological Question -## Current Scope +Nunney's paper asks how rapidly an adapting population can track a moving +environment before extinction becomes likely. In this model: -This repository was bootstrapped from earlier work in: +- the selective optimum moves steadily through time, +- adaptation requires repeated allelic substitutions across one or more loci, +- population growth is density-dependent, +- offspring survival falls as genotypes lag behind the moving optimum, +- and "cost of substitution" is summarized by the smallest environmental-change + interval `T` that still allows persistence. -- [`../collaborations/to_ptbc/evc/cost_of_substitution`](/mnt/CIFS/pengolodh/Docs/Projects/collaborations/to_ptbc/evc/cost_of_substitution) +Smaller `T` means faster environmental change and a harder adaptive problem. -That earlier tree remains useful as provenance and historical context. The -Track 1 runtime and orchestration stack now live in `renunney`. +## Two Approaches -`renunney` provides: +This project treats the problem in two separate ways. 
-- a clean git repo, -- a stable working directory layout, -- a local orchestration CLI and library, -- local paper-scale Figure 1 submission configs, -- a local Track 1 runner and config/API layer, -- a local Track 1 analysis layer for tracking summaries and loci-regression, -- a local Track 1 threshold/search layer for Nunney-style threshold checks, -- a local Track 1 simulation kernel, -- a local Track 1 report generator, -- a local Track 1 extinction-model data layer, -- a local Track 1 dataset generator, -- a local Track 1 fit layer, -- a Makefile for common tasks, -- migration notes for pulling code into this repo in stages. +Track 1: Nunney-faithful replication + +- reconstruct the published simulation and threshold heuristic as closely as + possible, +- preserve historically relevant assumptions even when they are inefficient or + awkward, +- and use that result as the baseline for replication and criticism. + +Track 2: modern replacement + +- build a cleaner and faster simulator around the same biological question, +- define threshold explicitly rather than through Nunney's heuristic, +- and use a more performant implementation path, likely including Rust. + +The current repo is centered on Track 1, with orchestration intended to support +both tracks later. + +## Nunney's Approach + +The paper-level model combines four pieces: + +- constant environmental change, +- genotype survival `w_i` as a function of distance from the moving optimum, +- density-dependent female fecundity `f`, +- and Mendelian transmission with mutation. + +The published threshold procedure is heuristic rather than inferential: Nunney +searches for the smallest `T` with no extinctions in 20 runs, then checks +nearby larger values. That historical rule is preserved in Track 1 because it +is part of the claimed result structure. 
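
The search rule described above can be sketched in a few lines. This is an illustrative sketch, not the repo's threshold/search layer: `find_threshold`, `goes_extinct`, and the candidate grid are hypothetical names, and the 1.02/1.05/1.10 multipliers are the nearby larger values listed in `docs/NUNNEY_ANALYSIS.md`. The extra retesting of borderline cases is omitted.

```python
import random
from typing import Callable, Iterable, Optional, Sequence

# Hypothetical signature: one stochastic run at interval T, True if extinct.
ExtinctionRun = Callable[[float, random.Random], bool]


def no_extinctions(goes_extinct: ExtinctionRun, t: float, runs: int,
                   rng: random.Random) -> bool:
    """True when none of `runs` replicate simulations at interval t go extinct."""
    return not any(goes_extinct(t, rng) for _ in range(runs))


def find_threshold(goes_extinct: ExtinctionRun,
                   candidates: Iterable[float],
                   runs: int = 20,
                   multipliers: Sequence[float] = (1.02, 1.05, 1.10),
                   rng: Optional[random.Random] = None) -> Optional[float]:
    """Smallest candidate T that passes at T and at each nearby larger value."""
    rng = rng or random.Random(0)
    for t in sorted(candidates):  # search from below
        if not no_extinctions(goes_extinct, t, runs, rng):
            continue
        if all(no_extinctions(goes_extinct, t * m, runs, rng)
               for m in multipliers):
            return t
    return None  # no candidate satisfied the rule


# Toy stand-in for a simulation: extinction whenever T is below 50.
toy_run = lambda t, rng: t < 50
threshold = find_threshold(toy_run, range(40, 61))
```

Passing the simulation in as a callable keeps the search protocol separate from the simulation kernel, so the rule itself can be tested without running any paper-scale simulation.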
+ +## This Repo's Approach + +`renunney` turns that work into a clean, testable stack: + +- local Track 1 simulation kernel, +- local threshold/search layer, +- local analysis, reporting, extinction-dataset, and fitting layers, +- local orchestration CLI and SQLite job registry, +- local paper-scale Figure 1 configs, +- and a Makefile for common operational tasks. + +The repo was bootstrapped from earlier work in +[`../collaborations/to_ptbc/evc/cost_of_substitution`](/mnt/CIFS/pengolodh/Docs/Projects/collaborations/to_ptbc/evc/cost_of_substitution), +which remains useful as provenance and historical context, but the Track 1 +runtime now lives in `renunney`. + +## Key Docs + +- [docs/MIGRATION.md](/mnt/CIFS/pengolodh/Docs/Projects/renunney/docs/MIGRATION.md) +- [docs/WORKFLOW.md](/mnt/CIFS/pengolodh/Docs/Projects/renunney/docs/WORKFLOW.md) +- [docs/NUNNEY_ANALYSIS.md](/mnt/CIFS/pengolodh/Docs/Projects/renunney/docs/NUNNEY_ANALYSIS.md) ## Layout - `docs/` - - project and migration notes + project, migration, and paper-analysis notes - `config/` - - configuration templates and examples + configuration templates and paper-scale treatment configs - `runs/state/` - - SQLite registries and persistent orchestration state + SQLite registries and persistent orchestration state - `runs/results/` - - result artifacts collected by orchestration + result artifacts collected by orchestration - `runs/scratch/` - - local worker scratch and cache files + local worker scratch and cache files - `src/renunney/` - - future in-repo Python package and migration target + in-repo Python package - `scripts/` - - local CLI entrypoints + local CLI entrypoints - `tests/` - - local verification for migrated boundaries + local verification ## Start @@ -59,18 +101,18 @@ Initialize the local run directories and SQLite registry: make init ``` +Run one local Track 1 simulation: + +```bash +make track1-sim-smoke +``` + Submit a paper-scale Figure 1 treatment: ```bash make submit-figure1-m10 ``` -Run 
one local Track 1 simulation through the migrated runner/API boundary: - -```bash -make track1-sim-smoke -``` - Run one worker loop locally: ```bash @@ -85,17 +127,9 @@ make collate-figure1 ## Status -The current state is split: +The Track 1 runtime and orchestration stack are now local to `renunney`. The +next major step is no longer migration of Track 1 code; it is either: -- orchestration control plane: local to `renunney` -- Track 1 runner and config/API layer: local to `renunney` -- Track 1 analysis layer: local to `renunney` -- Track 1 threshold/search layer: local to `renunney` -- Track 1 simulation kernel: local to `renunney` -- Track 1 report generator: local to `renunney` -- Track 1 extinction-model data layer: local to `renunney` -- Track 1 dataset generator: local to `renunney` -- Track 1 fit layer: local to `renunney` - -This repo is now the clean operational entry point for the Track 1 runtime and -its orchestration stack. +- hardening multi-host orchestration, +- organizing publication-quality replication outputs, +- or starting the Rust-backed Track 2 path. diff --git a/docs/NUNNEY_ANALYSIS.md b/docs/NUNNEY_ANALYSIS.md new file mode 100644 index 0000000..89a49ab --- /dev/null +++ b/docs/NUNNEY_ANALYSIS.md @@ -0,0 +1,224 @@ +# Nunney Analysis + +Updated: 2026-04-11 + +## Purpose + +This note gives a compact in-repo analysis of Nunney's main claims, equations, +and reported results, and how the current replication effort interprets them. + +It is not a complete paper summary. Its job is to make the scientific and +implementation targets explicit enough that code and results can be reviewed +against them. 
+
+Primary paper:
+
+- `nunney_cost_of_substitution_anz40-185.pdf`
+
+Related internal notes:
+
+- [COST_OF_SUBSTITUTION.md](/mnt/CIFS/pengolodh/Docs/Projects/collaborations/to_ptbc/evc/cost_of_substitution/COST_OF_SUBSTITUTION.md)
+- [TRACK1_BASELINE.md](/mnt/CIFS/pengolodh/Docs/Projects/collaborations/to_ptbc/evc/cost_of_substitution/TRACK1_BASELINE.md)
+- [PAPER_CODE_ALIGNMENT.md](/mnt/CIFS/pengolodh/Docs/Projects/collaborations/to_ptbc/evc/cost_of_substitution/PAPER_CODE_ALIGNMENT.md)
+
+## Core Claim
+
+Nunney's central claim is that the rate of environmental change a population
+can tolerate depends on the cost of repeated adaptive substitutions, and that
+this cost can be decomposed into:
+
+- a fixed component, and
+- a per-locus component.
+
+The paper presents simulation evidence that both components depend on mutation
+supply, summarized by `M = 2Ku`, and that extinction occurs when the
+environment changes too quickly for the population to keep pace.
+
+## Model Structure
+
+The paper describes four interacting components:
+
+1. constant environmental change
+2. genotype-dependent survival relative to the moving optimum
+3. density-dependent female fecundity
+4. Mendelian transmission with mutation
+
+The adaptive problem is staged so that one substitution is needed every `T`
+generations at each selected locus. Smaller `T` means faster change and thus a
+more demanding environment.
+
+## Main Equations
+
+### Growth/Fecundity
+
+Nunney uses:
+
+```text
+R = 2 exp(r)
+```
+
+and:
+
+```text
+f = 2 exp(r * (1 - (N/K)^(1/r)))
+```
+
+Interpretation:
+
+- `R` is the density-independent net reproductive rate.
+- `f` is density-dependent female fecundity.
+- fecundity is genotype-independent; selection enters through survival.
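
A minimal numerical sketch of the two fecundity expressions above, assuming plain Python floats. The function names are illustrative and are not taken from the repo's simulation kernel. It checks the two limiting cases implied by the formulas: `f` equals `R` at `N = 0` and equals 2 at `N = K`.

```python
import math


def max_reproductive_rate(r: float) -> float:
    """Density-independent rate R = 2 exp(r)."""
    return 2.0 * math.exp(r)


def female_fecundity(n: float, k: float, r: float) -> float:
    """Density-dependent fecundity f = 2 exp(r * (1 - (N/K)^(1/r)))."""
    if r <= 0 or k <= 0 or n < 0:
        raise ValueError("expected r > 0, K > 0, and N >= 0")
    return 2.0 * math.exp(r * (1.0 - (n / k) ** (1.0 / r)))


# Illustrative parameter values only (not taken from the paper).
r, k = 0.5, 1000.0
f_empty = female_fecundity(0.0, k, r)         # equals R
f_at_capacity = female_fecundity(k, k, r)     # equals 2
f_overshoot = female_fecundity(2.0 * k, k, r) # falls below 2
```

Because `f` takes only `N`, `K`, and `r`, genotype never enters the function, which matches the point that selection acts through survival rather than fecundity.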
+
+### Fitness / Offspring Survival
+
+The key selection equation is:
+
+```text
+w_i = exp(-(r/n) * Σ_j (Av_ij - t/T)^2)
+```
+
+where:
+
+- `Av_ij` is the mean allelic value at locus `j` in genotype `i`
+- `t/T` is the moving optimum on the allele-value scale
+- `n` is the number of loci
+
+Interpretation:
+
+- survival declines as genotype means lag the moving optimum,
+- the factor `r/n` scales selection intensity across different numbers of
+  loci,
+- and the Gaussian form makes tracking lag the central state variable.
+
+### Mutation Supply
+
+The paper uses `u` as the mutation-rate parameter and `M = 2Ku` as a derived
+population-level mutation-supply quantity for comparing treatments.
+
+Important consequence for replication:
+
+- `u` is the paper-native input,
+- `M` is a derived comparison variable,
+- and the simulation must expose mutation across both diploid strands for the
+  `M = 2Ku` interpretation to make sense.
+
+## Threshold Claim
+
+Nunney's reported threshold is not a mathematically defined extinction
+probability threshold. It is a simulation-search heuristic:
+
+- find the lowest `T` with no extinctions in 20 runs,
+- search from below,
+- then require no extinction at `1.02T`, `1.05T`, and `1.10T`,
+- with extra retesting in borderline cases.
+
+This matters because the paper's "threshold" mixes:
+
+- biological persistence,
+- stochastic variation,
+- and the search protocol itself.
+
+That is acceptable for Track 1 replication, but it is one of the main reasons
+Track 2 exists.
+
+## Claimed Result Structure
+
+The paper's most important reported pattern is that threshold cost can be
+regressed on number of loci:
+
+```text
+C = C0 + n C1
+```
+
+where:
+
+- `C0` is the fixed cost component,
+- `C1` is the per-locus cost component,
+- and both are analyzed as functions of mutation supply `M`.
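
One way to recover `C0` and `C1` from a locus sweep is an ordinary least-squares line. The sketch below assumes OLS, since the paper's exact fitting procedure is not restated in this note. `fit_cost_regression` and the input values are hypothetical: the numbers are synthetic, chosen to lie exactly on a line, and are not results from the paper or from this repo.

```python
from statistics import fmean
from typing import Sequence, Tuple


def fit_cost_regression(loci: Sequence[float],
                        costs: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of C = C0 + n * C1; returns (C0, C1)."""
    if len(loci) != len(costs) or len(loci) < 2:
        raise ValueError("need at least two paired (n, C) observations")
    n_bar, c_bar = fmean(loci), fmean(costs)
    sxx = sum((n - n_bar) ** 2 for n in loci)
    if sxx == 0:
        raise ValueError("loci counts must not all be identical")
    sxy = sum((n - n_bar) * (c - c_bar) for n, c in zip(loci, costs))
    c1 = sxy / sxx            # slope: per-locus cost component
    c0 = c_bar - c1 * n_bar   # intercept: fixed cost component
    return c0, c1


# Synthetic sweep constructed as C = 3.0 + 0.5 * n (assumed values).
loci = [1, 2, 4, 8]
costs = [3.0 + 0.5 * n for n in loci]
c0, c1 = fit_cost_regression(loci, costs)
```

Fitting one such line per mutation-supply treatment would give `C0` and `C1` as functions of `M`, which is the structure described above.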
+
+Figure 1 and Table 1 are therefore not just descriptive outputs; they are the
+main statistical structure the replication must recover if the implementation is
+faithful.
+
+## What Must Be Reproduced
+
+A credible Track 1 replication should reproduce, or clearly fail to reproduce,
+all of the following:
+
+- the paper's parameter framing in terms of `u`, `K`, `R`, `T`, and derived `M`
+- the threshold-search behavior over repeated stochastic runs
+- the locus-sweep regression structure `C = C0 + n C1`
+- the directional effect of mutation supply on fixed and per-locus cost
+- the extinction/non-extinction boundary under the published search rule
+
+## Key Ambiguities In The Paper
+
+Several implementation details are underdetermined by the paper text and must
+be treated explicitly as reconstruction choices:
+
+- exact generation update order
+- exact stochastic law for realized births
+- exact mutation operator over the allele set
+- exact practical allele-state truncation for finite runs
+- exact sex realization rule
+- exact extinction condition in code
+
+These do not make replication impossible, but they mean "faithful replication"
+is always conditional on a documented reconstruction policy.
+
+## Current Replication Reading
+
+The present Track 1 implementation uses the following interpretation:
+
+- integer allele states with finite truncation tied to the run horizon
+- lottery polygyny with one male sampled per female reproductive event
+- births drawn stochastically from fecundity
+- offspring survival governed by `w_i`
+- extinction on zero population or absence of one sex
+- explicit reporting of `f`, mean `w`, `f*w`, mutation supply, allele tracking,
+  and extinction timing
+
+This is intended to stay as close as possible to the paper while making the
+reconstruction auditable.
+
+## Main Scientific Risks
+
+The main ways the replication could still diverge from the paper are:
+
+1. the wrong stochastic realization of fecundity and survival
+2. an off-by-one time alignment in `t` versus offspring evaluation
+3. mutation semantics that do not match the paper's effective `M = 2Ku`
+   treatment
+4. a threshold-search implementation that is formally similar to Nunney's but
+   operationally too permissive or too strict
+
+These are exactly the points where current diagnostics and paper-scale runs
+should be examined.
+
+## How This Effort Differs From Nunney
+
+Nunney's paper presents the biological argument through a simulation workflow.
+This repo separates that into layers:
+
+- simulation kernel
+- threshold search
+- analysis and reporting
+- dataset generation
+- extinction fitting
+- orchestration
+
+That separation does not change the Track 1 target; it makes it inspectable.
+
+## Current Bottom Line
+
+The project should treat Nunney's paper as making three distinct deliverables
+necessary:
+
+1. a faithful historical reconstruction of the published simulation and search
+   rule
+2. a clear statement of where the paper is underdetermined
+3. a modern replacement path that keeps the biological question while replacing
+   the threshold heuristic and performance limitations
+
+Track 1 in `renunney` now addresses the first two. Track 2 remains the next
+major scientific and engineering step.