# Porting plan (from SymbolicRegression.jl concepts)
This document maps major features/design choices to Rust equivalents and notes feasibility.
## Search & representation
- **Representation:** tree/graph AST with typed ops. Rust enums for ops and nodes. Feasible (done in MVP).
- **Search:** genetic programming (GP) with tournament selection, crossover and mutation, plus Pareto archiving. Extensible to age-fitness and lexicase selection.
- **Constants:** fit constants with Levenberg-Marquardt (LM) or variable projection via `argmin`/`nalgebra`. Start with LM; add autodiff for local gradients later.
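
The representation and selection bullets above can be sketched as follows. This is a minimal illustration, not the MVP's actual types: the operator set, the names `Expr`, `BinOp`, `UnOp`, `Lcg`, and `tournament_select` are all hypothetical, and the tiny linear congruential generator only stands in for a real RNG crate so the sketch needs nothing beyond `std`.

```rust
/// Hypothetical minimal operator sets; a real port would carry more ops.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BinOp { Add, Sub, Mul, Div }

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum UnOp { Neg, Sin, Cos }

/// Expression tree: leaves are constants or feature indices.
#[derive(Clone, Debug, PartialEq)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Unary(UnOp, Box<Expr>),
    Binary(BinOp, Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Scalar evaluation on one row of features.
    pub fn eval(&self, x: &[f64]) -> f64 {
        match self {
            Expr::Const(c) => *c,
            Expr::Var(i) => x[*i],
            Expr::Unary(op, a) => {
                let a = a.eval(x);
                match op {
                    UnOp::Neg => -a,
                    UnOp::Sin => a.sin(),
                    UnOp::Cos => a.cos(),
                }
            }
            Expr::Binary(op, a, b) => {
                let (a, b) = (a.eval(x), b.eval(x));
                match op {
                    BinOp::Add => a + b,
                    BinOp::Sub => a - b,
                    BinOp::Mul => a * b,
                    BinOp::Div => a / b,
                }
            }
        }
    }

    /// Node count, used here as the complexity measure (an assumption).
    pub fn complexity(&self) -> usize {
        match self {
            Expr::Const(_) | Expr::Var(_) => 1,
            Expr::Unary(_, a) => 1 + a.complexity(),
            Expr::Binary(_, a, b) => 1 + a.complexity() + b.complexity(),
        }
    }
}

/// Stand-in RNG (a plain LCG) so the sketch has no external dependencies.
pub struct Lcg(pub u64);

impl Lcg {
    /// Pseudo-random index in `0..n`; `n` must be non-zero.
    pub fn next_index(&mut self, n: usize) -> usize {
        self.0 = self.0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % n
    }
}

/// Tournament selection: draw `k` random individuals (with replacement)
/// and return the index of the one with the lowest error.
/// `errors` must be non-empty.
pub fn tournament_select(errors: &[f64], k: usize, rng: &mut Lcg) -> usize {
    let mut best = rng.next_index(errors.len());
    for _ in 1..k {
        let challenger = rng.next_index(errors.len());
        if errors[challenger] < errors[best] {
            best = challenger;
        }
    }
    best
}
```

Larger `k` raises selection pressure; `k = 1` degenerates to uniform random selection.
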
## Evaluation performance
- **Julia's fused loops & SIMD:** replicate with `std::simd` (currently nightly-only) and hand-rolled evaluators. Benchmark and specialize hot ops.
- **Parallelism:** use `rayon` for population evaluation; threading is optional on WASM, where it relies on `SharedArrayBuffer`.
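
As a rough sketch of the two ideas above, the block below evaluates a tree column-wise over all rows (tight element-wise loops that the compiler may auto-vectorize) and scores a population across threads. To stay dependency-free it uses `std::thread::scope` in place of `rayon`; with `rayon` the chunking would presumably collapse into a parallel iterator. The reduced `Expr` type and the function names are hypothetical.

```rust
use std::thread;

/// Reduced, hypothetical expression type for this sketch.
#[derive(Clone, Debug)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Evaluate `e` for all rows at once. `cols[j][i]` is feature `j` of row `i`;
/// every column is assumed to hold exactly `n_rows` values.
pub fn eval_batch(e: &Expr, cols: &[Vec<f64>], n_rows: usize) -> Vec<f64> {
    match e {
        Expr::Const(c) => vec![*c; n_rows],
        Expr::Var(j) => cols[*j].clone(),
        Expr::Add(a, b) => {
            let mut out = eval_batch(a, cols, n_rows);
            let rhs = eval_batch(b, cols, n_rows);
            // Simple element-wise loop over contiguous data.
            for (o, r) in out.iter_mut().zip(&rhs) {
                *o += *r;
            }
            out
        }
        Expr::Mul(a, b) => {
            let mut out = eval_batch(a, cols, n_rows);
            let rhs = eval_batch(b, cols, n_rows);
            for (o, r) in out.iter_mut().zip(&rhs) {
                *o *= *r;
            }
            out
        }
    }
}

/// Mean squared error of `e` against targets `y`.
pub fn mse(e: &Expr, cols: &[Vec<f64>], y: &[f64]) -> f64 {
    let pred = eval_batch(e, cols, y.len());
    pred.iter().zip(y).map(|(p, t)| (p - t).powi(2)).sum::<f64>() / y.len() as f64
}

/// Score every individual, splitting the population into one chunk per thread.
pub fn score_population(
    pop: &[Expr],
    cols: &[Vec<f64>],
    y: &[f64],
    n_threads: usize,
) -> Vec<f64> {
    let n_threads = n_threads.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk = ((pop.len() + n_threads - 1) / n_threads).max(1);
    let mut scores = vec![0.0; pop.len()];
    thread::scope(|s| {
        for (exprs, out) in pop.chunks(chunk).zip(scores.chunks_mut(chunk)) {
            s.spawn(move || {
                for (e, o) in exprs.iter().zip(out.iter_mut()) {
                    *o = mse(e, cols, y);
                }
            });
        }
    });
    scores
}
```

This version allocates a fresh buffer per node, which is fine for a sketch; a tuned evaluator would likely reuse buffers or compile the tree to a flat instruction list first.
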
## Simplification & pruning
- **Local rules:** identities (e.g., `x+0→x`, `x*1→x`, `sin(0)→0`).
- **Global rewriting:** integrate `egg` as optional feature for equality saturation + cost-based extraction.
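
The three local identities listed above can be applied in a single bottom-up pass. The sketch below assumes a small hypothetical `Expr` type and covers only those rules (in both operand orders); it is not a replacement for the `egg`-based global rewriting.

```rust
/// Reduced, hypothetical expression type for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Sin(Box<Expr>),
}

/// True if `e` is the literal constant `v`.
fn is_const(e: &Expr, v: f64) -> bool {
    matches!(e, Expr::Const(c) if *c == v)
}

/// Bottom-up simplification: children are simplified first, so identities
/// exposed by an inner rewrite are caught by the enclosing node.
pub fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Add(a, b) => {
            let (a, b) = (simplify(*a), simplify(*b));
            if is_const(&a, 0.0) {
                b // 0 + x -> x
            } else if is_const(&b, 0.0) {
                a // x + 0 -> x
            } else {
                Expr::Add(Box::new(a), Box::new(b))
            }
        }
        Expr::Mul(a, b) => {
            let (a, b) = (simplify(*a), simplify(*b));
            if is_const(&a, 1.0) {
                b // 1 * x -> x
            } else if is_const(&b, 1.0) {
                a // x * 1 -> x
            } else {
                Expr::Mul(Box::new(a), Box::new(b))
            }
        }
        Expr::Sin(a) => {
            let a = simplify(*a);
            if is_const(&a, 0.0) {
                Expr::Const(0.0) // sin(0) -> 0
            } else {
                Expr::Sin(Box::new(a))
            }
        }
        leaf => leaf,
    }
}
```

For example, `x0 * 1 + sin(0)` reduces to `x0`: the product and the sine are rewritten first, which exposes `x0 + 0` at the root.
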
## Multi-objective
- Maintain Pareto front on (error, complexity). Provide knobs for complexity penalty and max size/depth.
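
A minimal sketch of maintaining that front, assuming both objectives are minimized. `Candidate`, `dominates`, `insert`, and `penalized` are hypothetical names, and the linear `error + lambda * complexity` score is only one plausible form for the complexity-penalty knob.

```rust
/// One point in objective space; a real archive would also hold the tree.
#[derive(Clone, Debug, PartialEq)]
pub struct Candidate {
    pub error: f64,
    pub complexity: usize,
}

/// `a` dominates `b` if it is no worse on both objectives
/// and strictly better on at least one.
pub fn dominates(a: &Candidate, b: &Candidate) -> bool {
    a.error <= b.error
        && a.complexity <= b.complexity
        && (a.error < b.error || a.complexity < b.complexity)
}

/// Try to add `c` to the front. Returns `false` if `c` is dominated by
/// (or identical to) an existing member; otherwise evicts every member
/// that `c` dominates and returns `true`.
pub fn insert(front: &mut Vec<Candidate>, c: Candidate) -> bool {
    if front.iter().any(|m| dominates(m, &c) || *m == c) {
        return false;
    }
    front.retain(|m| !dominates(&c, m));
    front.push(c);
    true
}

/// Scalarized score for ranking; `lambda` is the complexity penalty.
pub fn penalized(c: &Candidate, lambda: f64) -> f64 {
    c.error + lambda * c.complexity as f64
}
```

Each insertion is linear in the front size, which should be adequate while the front stays bounded by the maximum allowed complexity.
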
## Export & interop
- **Python:** PyO3 module `symreg_rs` with sklearn-like API (`fit`, `predict`, `score`).
- **WASM:** thin bindgen façade for browser demos; avoid heavy deps.
- **SymPy/LaTeX:** stringifier to SymPy code, plus LaTeX pretty-printing.
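
The stringifier could look roughly like the sketch below, which walks a small hypothetical `Expr` type and emits either SymPy-parsable text or LaTeX. The variable naming (`x0`, `x_{0}`) and the choice to parenthesize every binary node instead of tracking operator precedence are assumptions made for brevity.

```rust
/// Reduced, hypothetical expression type for this sketch.
#[derive(Clone, Debug)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
    Sin(Box<Expr>),
}

/// Render as a string that SymPy can parse; binary nodes are always
/// parenthesized, so no precedence handling is needed.
pub fn to_sympy(e: &Expr) -> String {
    match e {
        Expr::Const(c) => format!("{}", c),
        Expr::Var(i) => format!("x{}", i),
        Expr::Add(a, b) => format!("({} + {})", to_sympy(a), to_sympy(b)),
        Expr::Mul(a, b) => format!("({} * {})", to_sympy(a), to_sympy(b)),
        Expr::Div(a, b) => format!("({} / {})", to_sympy(a), to_sympy(b)),
        Expr::Sin(a) => format!("sin({})", to_sympy(a)),
    }
}

/// Render as LaTeX. Sums are wrapped in \left( \right) so they stay
/// grouped inside products; fractions group their operands visually.
pub fn to_latex(e: &Expr) -> String {
    match e {
        Expr::Const(c) => format!("{}", c),
        Expr::Var(i) => format!("x_{{{}}}", i),
        Expr::Add(a, b) => {
            format!("\\left({} + {}\\right)", to_latex(a), to_latex(b))
        }
        Expr::Mul(a, b) => format!("{} \\cdot {}", to_latex(a), to_latex(b)),
        Expr::Div(a, b) => {
            format!("\\frac{{{}}}{{{}}}", to_latex(a), to_latex(b))
        }
        Expr::Sin(a) => format!("\\sin\\left({}\\right)", to_latex(a)),
    }
}
```

For `sin(x0) / (x1 + 2)` this yields `(sin(x0) / (x1 + 2))` for SymPy and `\frac{\sin\left(x_{0}\right)}{\left(x_{1} + 2\right)}` for LaTeX. A precedence-aware printer would drop the redundant parentheses.
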
## Test & benchmark
- Reuse standard SRBench datasets where licensing permits; include the synthetic Friedman1 dataset.
- Add criterion benches for evaluator and variation operators.
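
For the synthetic case, a Friedman1 generator is small enough to sketch here. The target is the standard Friedman #1 function, `10 sin(pi x0 x1) + 20 (x2 - 0.5)^2 + 10 x3 + 5 x4`, over features drawn uniformly from [0, 1); any features beyond the first five are irrelevant by construction. This version is noise-free, and the `Lcg` type is a stand-in for a proper RNG crate, so both are assumptions of the sketch.

```rust
use std::f64::consts::PI;

/// Noise-free Friedman #1 target; only the first five features matter.
pub fn friedman1(x: &[f64]) -> f64 {
    10.0 * (PI * x[0] * x[1]).sin()
        + 20.0 * (x[2] - 0.5).powi(2)
        + 10.0 * x[3]
        + 5.0 * x[4]
}

/// Stand-in RNG (a plain LCG) so the sketch has no external dependencies.
pub struct Lcg(pub u64);

impl Lcg {
    /// Pseudo-random value in [0, 1), built from the top 53 bits of the state.
    pub fn next_f64(&mut self) -> f64 {
        self.0 = self.0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Generate `n_rows` samples with `n_features >= 5` uniform features each.
/// Returns row-major features and the matching targets.
pub fn make_friedman1(
    n_rows: usize,
    n_features: usize,
    seed: u64,
) -> (Vec<Vec<f64>>, Vec<f64>) {
    assert!(n_features >= 5, "Friedman1 needs at least five features");
    let mut rng = Lcg(seed);
    let mut xs = Vec::with_capacity(n_rows);
    let mut ys = Vec::with_capacity(n_rows);
    for _ in 0..n_rows {
        let row: Vec<f64> = (0..n_features).map(|_| rng.next_f64()).collect();
        ys.push(friedman1(&row));
        xs.push(row);
    }
    (xs, ys)
}
```

A fixed seed keeps the dataset reproducible across benchmark runs; additive Gaussian noise could be layered on top of the targets if a noisy variant is wanted.
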
## Milestones
1. MVP (this scaffold): scalar eval, GP loop, Python & WASM hello world.
2. Vectorized evaluation + `rayon` parallel population evaluation; constant fitting.
3. Simplifier + export; sklearn-compatible estimator.
4. E-graph integration; advanced search strategies; docs + notebooks.