# Doclift Claim Tournament

This benchmark is a small evaluation harness for comparing multiple doclift prose-claim extraction strategies before changing the default GroundRecall import behavior.

Current tracks:

- `conservative`: prefers higher precision and sentence-level claims.
- `broad`: allows paragraph-level claims and shorter sentence candidates to improve recall.

Judge criteria (a sketch of a ranking key consistent with these criteria appears below):

- maximize F1 against the benchmark gold claims
- prefer higher recall when F1 ties
- penalize meta or identity-claim noise
- prefer predicted claim counts close to the gold-set size

Fixture location:

- `tests/fixtures/doclift_claim_eval/`

Primary entrypoint (a hedged usage sketch appears below):

- `groundrecall.doclift_claim_tournament.evaluate_doclift_claim_tracks(...)`

This is intentionally small and deterministic. It is meant to support an iterative tournament workflow, not to serve as a full evaluation platform by itself.
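To make the judge criteria concrete, here is a minimal sketch of a ranking key that orders tracks the way the criteria describe. The names `TrackResult`, `rank_key`, and `pick_winner` are illustrative only and are not part of the harness API.

```python
from dataclasses import dataclass


@dataclass
class TrackResult:
    name: str
    f1: float
    recall: float
    meta_noise: int        # count of meta / identity-claim predictions
    predicted_count: int
    gold_count: int


def rank_key(r: TrackResult) -> tuple[float, float, int, int]:
    # Ascending sort on this key means: higher F1 first, then higher
    # recall on F1 ties, then fewer noisy claims, then a predicted
    # claim count closer to the gold-set size.
    count_drift = abs(r.predicted_count - r.gold_count)
    return (-r.f1, -r.recall, r.meta_noise, count_drift)


def pick_winner(results: list[TrackResult]) -> TrackResult:
    return min(results, key=rank_key)


# Illustrative numbers only: the two tracks tie on F1, so the higher
# recall of "broad" decides the tie.
winner = pick_winner([
    TrackResult("conservative", f1=0.71, recall=0.63,
                meta_noise=1, predicted_count=38, gold_count=40),
    TrackResult("broad", f1=0.71, recall=0.69,
                meta_noise=3, predicted_count=55, gold_count=40),
])
assert winner.name == "broad"
```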
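A hedged usage sketch of the primary entrypoint follows. This section names the function but not its signature, so the keyword arguments and the returned shape below are assumptions, not the documented API.

```python
from pathlib import Path

from groundrecall.doclift_claim_tournament import evaluate_doclift_claim_tracks

results = evaluate_doclift_claim_tracks(
    tracks=["conservative", "broad"],                       # assumed parameter
    fixture_dir=Path("tests/fixtures/doclift_claim_eval"),  # assumed parameter
)

# Assumed result shape: a mapping from track name to its metrics.
for track, metrics in results.items():
    print(track, metrics)
```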