OPT: Operational Premise Taxonomy for AI Systems

A proposed taxonomy of AI approaches that captures their relation to biological inspiration and yields an OPT-Code framework for specifying AI systems, including hybrids and pipelines.

This repository collects:

  • The LaTeX manuscript defining the Operational Premise Taxonomy (OPT) and the OPT-Code convention.
  • Prompt sets for classifying AI systems into OPT mechanisms using large language models (LLMs).
  • A small Python library and scripts to run an end-to-end Classifier → Evaluator → Adjudicator pipeline.
  • A hand-annotated gold test suite of systems (backprop, GA, A*, rule-based expert systems, PSO, AIS, etc.).
  • Example JSONL/YAML audit logs for storing OPT classifications.

The core idea: classify AI implementations by their operative mechanism (learning, evolution, symbolic reasoning, probabilistic inference, search, control, swarm, or hybrids), while explicitly separating that from execution details (parallelism, pipelines, hardware). For example, a GA that evolves rules which are then pruned symbolically and scored by a Bayesian classifier is written Evo/Sym/Prb, whether it runs on a laptop or a GPU cluster.


1. Repository layout

Operational-Premise-Taxonomy/
├── README.md              # This file
├── LICENSE
├── .gitignore
├── Makefile               # Top-level convenience targets
│
├── paper/                 # LaTeX sources for the OPT paper
│   ├── main.tex           # arXiv/general article format
│   ├── main_ieee.tex      # IEEE two-column wrapper
│   ├── main_acm.tex       # ACM-style wrapper
│   ├── main_kaobook.tex   # Book-style wrapper
│   ├── body_shared.tex    # Shared main content
│   ├── related-work.tex   # Related work section
│   ├── appendix_opt_prompts.tex
│   ├── appendix_prompt_minimal.tex
│   ├── appendix_prompt_maximal.tex
│   ├── appendix_prompt_evaluator.tex
│   ├── figures/           # TikZ/PGFPlots figures
│   │   ├── opt_radar_1.tikz
│   │   ├── opt_radar_2.tikz
│   │   └── opt_eval_pipeline.tikz
│   ├── Makefile           # Build main.pdf, IEEE/ACM variants
│   └── bib/
│       └── references.bib
│
├── prompts/               # Plain-text LLM prompts
│   ├── minimal_classifier_prompt.txt
│   ├── maximal_classifier_prompt.txt
│   ├── evaluator_prompt.txt
│   └── adjudicator_prompt.txt
│
├── opt_eval/              # Python library for OPT classification/evaluation
│   ├── __init__.py
│   ├── opt_prompts.py     # Utility to load prompt text
│   ├── opt_pipeline.py    # Data classes + run_pipeline + parsers
│   ├── model_client.py    # Abstraction over your local/remote LLM endpoint
│   ├── cli.py             # CLI entrypoint for simple use
│   └── tests/
│       ├── __init__.py
│       ├── test_parsers.py
│       ├── test_gold_suite.py
│       └── data/
│           ├── gold_opt.yaml
│           └── gold_opt.jsonl
│
├── data/
│   ├── gold/
│   │   ├── opt_gold.yaml   # Canonical gold test suite
│   │   └── opt_gold.jsonl
│   └── examples/
│       ├── opt_audit_example.jsonl
│       └── opt_audit_example.yaml
│
├── scripts/
│   ├── run_eval_pipeline.py
│   └── export_gold_to_jsonl.py
│
└── docs/
    ├── usage.md
    ├── schema_opt_audit.md
    └── model_notes_local_llm.md

2. Building the paper

The paper lives in paper/ and is structured to support multiple venues (arXiv, IEEE, ACM, book-style).

Prerequisites

  • A reasonably recent TeX Live (or MiKTeX) with:

    • pgfplots (with polar library),
    • newtxtext, newtxmath,
    • booktabs, longtable, framed, fancyvrb, etc.
  • latexmk and make.

Typical build

From the repository root:

cd paper
make          # builds main.pdf by default
# Or explicitly:
make main.pdf

# For an IEEE variant:
make main_ieee.pdf

# For ACM:
make main_acm.pdf

If you run into font or pgfplots compatibility warnings, consult the comments at the top of main.tex and body_shared.tex (the build assumes \pgfplotsset{compat=1.18} and \usepackage{newtxtext,newtxmath}).


3. Python OPT evaluation pipeline

The opt_eval package provides:

  • Data classes for candidate classifications, evaluator results, and adjudications.

  • Parsers for extracting OPT lines and rationales from LLM output.

  • A run_pipeline function that wires together Classifier A, Classifier B, the Evaluator, and the Adjudicator, and returns a structured result suitable for JSONL/YAML logging (see the sketch below).
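
A minimal usage sketch (illustrative only: the actual signature and return type are defined in opt_eval/opt_pipeline.py, and the call_model keyword used here is an assumption):

# Illustrative only; check opt_eval/opt_pipeline.py for the real signature
# and return structure. The keyword name call_model is an assumption.
from opt_eval.opt_pipeline import run_pipeline
from opt_eval.model_client import call_model

description = (
    "This system trains a fully-connected neural network on MNIST using SGD "
    "and cross-entropy loss, and then uses the trained weights for inference only."
)

result = run_pipeline(description, call_model=call_model)
print(result.final)  # the final OPT-Code and rationale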

3.1 Installation (local dev)

Option 1: editable install with pip:

python -m venv .venv
source .venv/bin/activate
pip install -U pip

pip install -e .
# or, if you don't define setup.cfg/pyproject:
pip install pyyaml

Option 2: just use it in-place with PYTHONPATH:

export PYTHONPATH=$PWD

3.2 Configuring a local LLM

You must implement opt_eval/model_client.py to talk to your model(s). A typical pattern:

  • For an OpenAI-compatible HTTP endpoint (local or remote), use the requests library or the openai client.
  • For Ollama or llamafile, call http://localhost:11434 or similar.

model_client.call_model(system_prompt, user_content, model="local-llm") should:

  1. Send system_prompt as the system role (if your API supports it).
  2. Send user_content as the user content.
  3. Return the raw text content of the model's reply.

Once implemented, you can run the pipeline on a simple description.
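
A minimal sketch of such a client, assuming an OpenAI-compatible chat endpoint (for example, Ollama's); the URL, default model name, and error handling are placeholders to adapt:

# model_client.py: illustrative sketch only; adapt the URL, model names, and
# error handling to your deployment.
import requests

API_URL = "http://localhost:11434/v1/chat/completions"  # e.g. Ollama's OpenAI-compatible endpoint

def call_model(system_prompt: str, user_content: str, model: str = "local-llm") -> str:
    """Send a system + user message pair and return the raw reply text."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
        "temperature": 0.0,  # near-deterministic output makes parsing easier
    }
    resp = requests.post(API_URL, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]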


4. Quickstart: Running the evaluation pipeline

Minimal example (from repo root, after configuring model_client.py):

python scripts/run_eval_pipeline.py << 'EOF'
This system trains a fully-connected neural network on MNIST using SGD and
cross-entropy loss, and then uses the trained weights for inference only.
EOF

A typical JSON-like output will include:

  • candidate_a, candidate_b
  • eval_a, eval_b
  • final (final OPT-Code and rationale)
  • adjudication (if performed)

You can adapt run_eval_pipeline.py to write JSONL to data/examples/opt_audit_example.jsonl.
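
A minimal sketch of that adaptation (assuming the pipeline result is a dataclass; check opt_pipeline.py for the actual structure):

# Illustrative sketch: append one pipeline result to the example audit log.
# Assumes the result is a dataclass instance; adapt to whatever run_pipeline
# actually returns.
import json
from dataclasses import asdict

def append_audit(result, path="data/examples/opt_audit_example.jsonl"):
    """Append one pipeline result as a JSONL line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(result)) + "\n")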


5. Gold test suite and benchmarking

The directory data/gold/ contains a small hand-annotated test suite (opt_gold.yaml and opt_gold.jsonl) covering:

  • Backprop MLP on MNIST (Lrn),
  • GA for TSP (Evo),
  • A* gridworld planner (Sch),
  • Rule-based expert system like XCON (Sym),
  • Bayesian network for fault diagnosis (Prb),
  • Deep Q-Network for Atari (Lrn),
  • PID + Kalman filter drone control (Ctl),
  • PSO for hyperparameter tuning (Swm),
  • Immune negative-selection anomaly detection (Evo/Sch+Prb),
  • Three-stage hybrid: GA → rule pruning → Bayesian classifier (Evo/Sym/Prb).

To run tests (after you've wired up model_client.py):

pytest opt_eval/tests

test_gold_suite.py will:

  • Call the classifier prompt(s) on each gold description.
  • Compare predicted OPT roots against the gold OPT-Code.
  • Optionally compute partial-match metrics (Jaccard similarity of root sets, as sketched below) and simple accuracy.
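
A minimal sketch of the partial-match idea, assuming OPT-Code roots separated by '/' or '+' (as in Evo/Sch+Prb); the repository's own parser may differ:

# Illustrative sketch of the partial-match metric: Jaccard similarity between
# the predicted and gold sets of OPT roots. Assumes roots are separated by
# '/' or '+' (e.g. "Evo/Sch+Prb").
import re

def root_set(opt_code):
    """Split an OPT-Code string into its set of mechanism roots."""
    return {tok for tok in re.split(r"[/+]", opt_code) if tok}

def jaccard(predicted, gold):
    p, g = root_set(predicted), root_set(gold)
    return len(p & g) / len(p | g) if (p | g) else 1.0

print(jaccard("Evo/Prb", "Evo/Sym/Prb"))  # -> 0.666..., partial credit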

6. JSONL/YAML audit logs

For large-scale use, we recommend JSONL or YAML for storing evaluations.

  • Example JSONL audit: data/examples/opt_audit_example.jsonl
  • Example YAML audit: data/examples/opt_audit_example.yaml

Each record includes:

  • id, description
  • candidates (A, B)
  • evaluations (verdicts, scores)
  • adjudication
  • final (final OPT-Code)
  • meta (timestamps, model IDs, etc.)

See docs/schema_opt_audit.md for field descriptions.
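
As a quick sanity check, a sketch of reading the example log back (field names follow the list above; docs/schema_opt_audit.md is authoritative):

# Illustrative sketch: read the example JSONL audit log and print each record's
# id and final OPT-Code.
import json

with open("data/examples/opt_audit_example.jsonl", encoding="utf-8") as fh:
    for line in fh:
        record = json.loads(line)
        print(record["id"], record["final"])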


7. Using smaller local LLMs

OPT classification needs:

  • Understanding of code/algorithm descriptions.
  • Solid instruction-following.
  • Ability to respect a fairly structured output format.

Models that are feasible to run locally and are good candidates:

  • LLaMA 3 8B Instruct: good general reasoning and code understanding; works well as Classifier, Evaluator, and Adjudicator if VRAM allows.

  • Mistral 7B Instruct (and compatible fine-tunes like Dolphin, OpenHermes): strong general-purpose local model with solid coding and instruction-following; good as a classifier.

  • Qwen2 7B / 14B Instruct: the 7B is a capable all-rounder; the 14B (if you can run it) is strong for the evaluator/adjudicator roles.

  • Phi-3-mini (3.8B) Instruct: smaller footprint; may work as a classifier on simpler cases. For nuanced hybrid systems (Evo/Sym/Prb, Swm vs Evo, Ctl vs Prb), you may want a larger model as evaluator/adjudicator.

A reasonable starting configuration:

  • Classifier A: llama3-8b-instruct
  • Classifier B: mistral-7b-instruct
  • Evaluator: qwen2-14b-instruct (if available) or llama3-8b-instruct
  • Adjudicator: same as Evaluator

You can also run all roles on the same 7–8B model if resources are constrained; the explicit prompts and the evaluator rubric are designed to catch many misclassifications.
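
One way to express the starting configuration above in code (the tag strings are illustrative and depend on how your model_client names models):

# Illustrative role-to-model mapping for the starting configuration above.
# The tag strings are placeholders; use whatever names your back end expects.
ROLE_MODELS = {
    "classifier_a": "llama3-8b-instruct",
    "classifier_b": "mistral-7b-instruct",
    "evaluator":    "qwen2-14b-instruct",   # or llama3-8b-instruct if 14B is too large
    "adjudicator":  "qwen2-14b-instruct",   # same model as the Evaluator
}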

See docs/model_notes_local_llm.md for more detailed notes on deployment options (Ollama, llamafile, vLLM, etc.) and recommended quantization levels.


8. Citing

Once the OPT paper is on arXiv or accepted somewhere, include a BibTeX entry like:

@article{Elsberry_OPT_2025,
  author  = {Wesley R. Elsberry and N.~Collaborators},
  title   = {Operational Premise Taxonomy (OPT): Mechanism-Level Classification of AI Systems},
  journal = {arXiv preprint},
  year    = {2025},
  eprint  = {XXXX.YYYYY},
  archivePrefix = {arXiv}
}

(Replace with the actual venue and identifier when available.)


9. Contributing

  • Extend the gold test suite (YAML + JSONL) with more systems and hybrids.
  • Add additional prompts (e.g., language-specific variants for Python-only code, RL-specific prompts).
  • Improve the parsing logic or add better metrics (confusion matrices, root-wise F1).
  • Open issues for any misclassifications that recur: they can inform future revisions of prompts and possibly the taxonomy itself.

Pull requests that add well-documented examples, tests, or tooling around OPT are welcome.