Porting plan (from SymbolicRegression.jl concepts)

This document maps the major features and design choices of SymbolicRegression.jl to Rust equivalents and notes the feasibility of each.

Search & representation

  • Representation: tree/graph AST with typed ops. Rust enums for ops and nodes. Feasible (done in MVP).
  • Search: genetic programming (GP) with tournament selection, crossover and mutation, plus Pareto archiving. Extensible to age-fitness Pareto and lexicase selection.
  • Constants: Levenberg-Marquardt (LM) or variable projection via argmin/nalgebra. Start with LM; later add autodiff for local gradients.
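
As a rough illustration of the "Rust enums for ops and nodes" idea, here is a minimal sketch of a tree AST with a scalar evaluator. The names `UnaryOp`, `BinaryOp`, `Expr`, `eval`, and `size` are hypothetical and are not taken from the actual scaffold.

```rust
/// Unary operators (hypothetical set, chosen for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UnaryOp {
    Sin,
    Cos,
    Neg,
}

/// Binary operators (hypothetical set, chosen for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinaryOp {
    Add,
    Sub,
    Mul,
    Div,
}

/// Expression tree: leaves are constants or feature indices.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Unary(UnaryOp, Box<Expr>),
    Binary(BinaryOp, Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Evaluate the tree on a single row of features.
    pub fn eval(&self, x: &[f64]) -> f64 {
        match self {
            Expr::Const(c) => *c,
            Expr::Var(i) => x[*i],
            Expr::Unary(op, a) => {
                let a = a.eval(x);
                match op {
                    UnaryOp::Sin => a.sin(),
                    UnaryOp::Cos => a.cos(),
                    UnaryOp::Neg => -a,
                }
            }
            Expr::Binary(op, a, b) => {
                let (a, b) = (a.eval(x), b.eval(x));
                match op {
                    BinaryOp::Add => a + b,
                    BinaryOp::Sub => a - b,
                    BinaryOp::Mul => a * b,
                    BinaryOp::Div => a / b,
                }
            }
        }
    }

    /// Node count, usable as a simple complexity measure.
    pub fn size(&self) -> usize {
        match self {
            Expr::Const(_) | Expr::Var(_) => 1,
            Expr::Unary(_, a) => 1 + a.size(),
            Expr::Binary(_, a, b) => 1 + a.size() + b.size(),
        }
    }
}
```

Keeping the operators in their own `Copy` enums means mutation can swap an operator without touching the children, and `size` gives the search loop a cheap complexity measure.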

Evaluation performance

  • Julia's fused loops & SIMD: replicate with std::simd and hand-rolled evaluators. Benchmark and specialize hot ops.
  • Parallelism: use rayon for population evaluation; in WASM, thread support is optional (via the SharedArrayBuffer route).
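
The two points above could be prototyped with the standard library alone before committing to std::simd or rayon. The sketch below is an assumption-laden stand-in: it evaluates each expression over whole columns in tight loops (which the compiler may auto-vectorize) and splits the population across `std::thread::scope` workers in place of rayon. All names (`Expr`, `eval_batch`, `mse`, `eval_population`) are hypothetical.

```rust
use std::thread;

#[derive(Debug, Clone)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Sin(Box<Expr>),
}

/// Evaluate over all rows at once; `cols[j][i]` is feature j of row i.
pub fn eval_batch(e: &Expr, cols: &[Vec<f64>], n_rows: usize) -> Vec<f64> {
    match e {
        Expr::Const(c) => vec![*c; n_rows],
        Expr::Var(j) => cols[*j].clone(),
        Expr::Add(a, b) => {
            let mut out = eval_batch(a, cols, n_rows);
            let rhs = eval_batch(b, cols, n_rows);
            // One tight loop per operator over contiguous buffers.
            for (o, r) in out.iter_mut().zip(&rhs) {
                *o += *r;
            }
            out
        }
        Expr::Mul(a, b) => {
            let mut out = eval_batch(a, cols, n_rows);
            let rhs = eval_batch(b, cols, n_rows);
            for (o, r) in out.iter_mut().zip(&rhs) {
                *o *= *r;
            }
            out
        }
        Expr::Sin(a) => {
            let mut out = eval_batch(a, cols, n_rows);
            for o in out.iter_mut() {
                *o = o.sin();
            }
            out
        }
    }
}

/// Mean squared error between predictions and targets.
pub fn mse(pred: &[f64], y: &[f64]) -> f64 {
    let sum: f64 = pred.iter().zip(y).map(|(p, t)| (p - t).powi(2)).sum();
    sum / y.len() as f64
}

/// Score every individual, splitting the population across scoped threads.
/// Results come back in the same order as `pop`.
pub fn eval_population(
    pop: &[Expr],
    cols: &[Vec<f64>],
    y: &[f64],
    n_threads: usize,
) -> Vec<f64> {
    let n_threads = n_threads.max(1);
    let chunk = ((pop.len() + n_threads - 1) / n_threads).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = pop
            .chunks(chunk)
            .map(|part| {
                s.spawn(move || {
                    part.iter()
                        .map(|e| mse(&eval_batch(e, cols, y.len()), y))
                        .collect::<Vec<f64>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

This version allocates a buffer per node, which a real evaluator would likely avoid by reusing scratch space. It is meant only to show the column-wise layout and the shape of a parallel population pass.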

Simplification & pruning

  • Local rules: identities (e.g., x+0→x, x*1→x, sin(0)→0).
  • Global rewriting: integrate egg as optional feature for equality saturation + cost-based extraction.
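
The local rules listed above can be sketched as a bottom-up rewrite pass. This is a minimal illustration with a hypothetical `Expr` type and `simplify` function; it covers only the listed identities plus constant folding, and it does not show the egg-based global rewriting.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Sin(Box<Expr>),
}

/// Simplify children first, then apply local identities at this node.
pub fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Add(a, b) => match (simplify(*a), simplify(*b)) {
            // Constant folding.
            (Expr::Const(x), Expr::Const(y)) => Expr::Const(x + y),
            // 0 + x -> x and x + 0 -> x.
            (Expr::Const(z), other) if z == 0.0 => other,
            (other, Expr::Const(z)) if z == 0.0 => other,
            (a, b) => Expr::Add(Box::new(a), Box::new(b)),
        },
        Expr::Mul(a, b) => match (simplify(*a), simplify(*b)) {
            (Expr::Const(x), Expr::Const(y)) => Expr::Const(x * y),
            // 1 * x -> x and x * 1 -> x.
            (Expr::Const(o), other) if o == 1.0 => other,
            (other, Expr::Const(o)) if o == 1.0 => other,
            (a, b) => Expr::Mul(Box::new(a), Box::new(b)),
        },
        Expr::Sin(a) => match simplify(*a) {
            // Folding sin of a constant covers sin(0) -> 0.
            Expr::Const(c) => Expr::Const(c.sin()),
            a => Expr::Sin(Box::new(a)),
        },
        leaf => leaf,
    }
}
```

The rule x*0→0 is deliberately left out of this sketch because it is not valid when x evaluates to NaN or infinity.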

Multi-objective

  • Maintain Pareto front on (error, complexity). Provide knobs for complexity penalty and max size/depth.
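
One simple way to maintain such a front is to reject any candidate that an existing member dominates and to evict members that the new candidate dominates. The sketch below assumes both objectives are minimized; `Candidate`, `dominates`, and `insert` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub error: f64,
    pub complexity: usize,
}

/// `a` dominates `b` if it is no worse on both objectives
/// and strictly better on at least one.
pub fn dominates(a: &Candidate, b: &Candidate) -> bool {
    a.error <= b.error
        && a.complexity <= b.complexity
        && (a.error < b.error || a.complexity < b.complexity)
}

/// Try to add `c` to the front. Returns `true` if it was kept.
pub fn insert(front: &mut Vec<Candidate>, c: Candidate) -> bool {
    // Reject dominated candidates and exact duplicates.
    if front.iter().any(|m| dominates(m, &c) || *m == c) {
        return false;
    }
    // Evict every member the new candidate dominates.
    front.retain(|m| !dominates(&c, m));
    front.push(c);
    true
}
```

Insertion is linear in the size of the front, which is usually small because complexity is bounded by the maximum size or depth setting.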

Export & interop

  • Python: PyO3 module symreg_rs with sklearn-like API (fit, predict, score).
  • WASM: thin wasm-bindgen façade for browser demos; avoid heavy deps.
  • SymPy/LaTeX: stringifier to SymPy code, plus LaTeX pretty-printing.
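
For the SymPy/LaTeX item, a stringifier can be a pair of recursive functions over the AST. The sketch below is illustrative only: `Expr`, `to_sympy`, and `to_latex` are hypothetical names, variables are rendered as `x0`, `x1`, ... by assumption, and parentheses are emitted conservatively rather than by precedence.

```rust
#[derive(Debug, Clone)]
pub enum Expr {
    Const(f64),
    Var(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
    Sin(Box<Expr>),
}

/// Render as a Python expression that SymPy can parse,
/// assuming symbols named x0, x1, ... are defined.
pub fn to_sympy(e: &Expr) -> String {
    match e {
        Expr::Const(c) => format!("{c}"),
        Expr::Var(i) => format!("x{i}"),
        Expr::Add(a, b) => format!("({} + {})", to_sympy(a), to_sympy(b)),
        Expr::Mul(a, b) => format!("({} * {})", to_sympy(a), to_sympy(b)),
        Expr::Div(a, b) => format!("({} / {})", to_sympy(a), to_sympy(b)),
        Expr::Sin(a) => format!("sin({})", to_sympy(a)),
    }
}

/// Render as LaTeX math.
pub fn to_latex(e: &Expr) -> String {
    match e {
        Expr::Const(c) => format!("{c}"),
        Expr::Var(i) => format!("x_{{{i}}}"),
        Expr::Add(a, b) => {
            format!("\\left({} + {}\\right)", to_latex(a), to_latex(b))
        }
        Expr::Mul(a, b) => format!("{} \\cdot {}", to_latex(a), to_latex(b)),
        Expr::Div(a, b) => {
            format!("\\frac{{{}}}{{{}}}", to_latex(a), to_latex(b))
        }
        Expr::Sin(a) => format!("\\sin\\left({}\\right)", to_latex(a)),
    }
}
```

Emitting a string and letting SymPy parse it keeps the Rust side free of any Python dependency; the PyO3 module could expose these functions as methods on the fitted estimator.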

Test & benchmark

  • Reuse standard SRBench datasets where licensing permits; include Friedman1 synthetic.
  • Add criterion benches for evaluator and variation operators.
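
The Friedman1 synthetic dataset is easy to generate in-process. Its standard target is y = 10 sin(π x₀ x₁) + 20 (x₂ − 0.5)² + 10 x₃ + 5 x₄ plus noise, with inputs uniform on [0, 1). The sketch below omits the noise term and uses a tiny linear congruential generator as a stand-in for a real RNG crate; `Lcg`, `friedman1_target`, and `make_friedman1` are hypothetical names.

```rust
use std::f64::consts::PI;

/// Minimal LCG, used only to keep the sketch dependency-free.
/// A real test suite would more likely use a proper RNG crate.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    /// Uniform sample in [0, 1) from the top 53 bits of the state.
    pub fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Noise-free Friedman1 target; only the first five features matter.
pub fn friedman1_target(x: &[f64]) -> f64 {
    10.0 * (PI * x[0] * x[1]).sin()
        + 20.0 * (x[2] - 0.5).powi(2)
        + 10.0 * x[3]
        + 5.0 * x[4]
}

/// Generate `n_rows` samples with `n_features >= 5` uniform features.
/// Returns row-major inputs and the matching targets.
pub fn make_friedman1(
    n_rows: usize,
    n_features: usize,
    seed: u64,
) -> (Vec<Vec<f64>>, Vec<f64>) {
    assert!(n_features >= 5, "Friedman1 needs at least 5 features");
    let mut rng = Lcg::new(seed);
    let x: Vec<Vec<f64>> = (0..n_rows)
        .map(|_| (0..n_features).map(|_| rng.next_f64()).collect())
        .collect();
    let y = x.iter().map(|row| friedman1_target(row)).collect();
    (x, y)
}
```

Features beyond the fifth are pure distractors, which makes the dataset a useful check that the search does not pick up irrelevant variables.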

Milestones

  1. MVP (this scaffold): scalar eval, GP loop, Python & WASM hello world.
  2. Vectorized eval + rayon parallel pop eval; constant fitting.
  3. Simplifier + export; sklearn-compatible estimator.
  4. E-graph integration; advanced search strategies; docs + notebooks.