# OPT: Operational Premise Taxonomy for AI Systems
A proposed taxonomy of AI approaches that captures their relation to biological inspiration and yields an OPT-Code framework for specifying AI systems, including hybrids and pipelines.
This repository collects:
- The LaTeX manuscript defining the **Operational Premise Taxonomy (OPT)** and the OPT-Code convention.
- Prompt sets for classifying AI systems into OPT mechanisms using large language models (LLMs).
- A small Python library and scripts to run an end-to-end **Classifier → Evaluator → Adjudicator** pipeline.
- A hand-annotated **gold test suite** of systems (backprop, GA, A*, rule-based expert systems, PSO, AIS, etc.).
- Example JSONL/YAML **audit logs** for storing OPT classifications.
The core idea: classify AI implementations by their **operative mechanism** (learning, evolution, symbolic reasoning, probabilistic inference, search, control, swarm, or hybrids), while explicitly separating that from **execution details** (parallelism, pipelines, hardware). For example, a GA → rule pruning → Bayesian classifier pipeline is coded Evo/Sym/Prb regardless of how it is parallelized or deployed.
---
## 1. Repository layout
```text
Operational-Premise-Taxonomy/
├── README.md # This file
├── LICENSE
├── .gitignore
├── Makefile # Top-level convenience targets
├── paper/ # LaTeX sources for the OPT paper
│ ├── main.tex # arXiv/general article format
│ ├── main_ieee.tex # IEEE two-column wrapper
│ ├── main_acm.tex # ACM-style wrapper
│ ├── main_kaobook.tex # Book-style wrapper
│ ├── body_shared.tex # Shared main content
│ ├── related-work.tex # Related work section
│ ├── appendix_opt_prompts.tex
│ ├── appendix_prompt_minimal.tex
│ ├── appendix_prompt_maximal.tex
│ ├── appendix_prompt_evaluator.tex
│ ├── figures/ # TikZ/PGFPlots figures
│ │ ├── opt_radar_1.tikz
│ │ ├── opt_radar_2.tikz
│ │ └── opt_eval_pipeline.tikz
│ ├── Makefile # Build main.pdf, IEEE/ACM variants
│ └── bib/
│ └── references.bib
├── prompts/ # Plain-text LLM prompts
│ ├── minimal_classifier_prompt.txt
│ ├── maximal_classifier_prompt.txt
│ ├── evaluator_prompt.txt
│ └── adjudicator_prompt.txt
├── opt_eval/ # Python library for OPT classification/evaluation
│ ├── __init__.py
│ ├── opt_prompts.py # Utility to load prompt text
│ ├── opt_pipeline.py # Data classes + run_pipeline + parsers
│ ├── model_client.py # Abstraction over your local/remote LLM endpoint
│ ├── cli.py # CLI entrypoint for simple use
│ └── tests/
│ ├── __init__.py
│ ├── test_parsers.py
│ ├── test_gold_suite.py
│ └── data/
│ ├── gold_opt.yaml
│ └── gold_opt.jsonl
├── data/
│ ├── gold/
│ │ ├── opt_gold.yaml # Canonical gold test suite
│ │ └── opt_gold.jsonl
│ └── examples/
│ ├── opt_audit_example.jsonl
│ └── opt_audit_example.yaml
├── scripts/
│ ├── run_eval_pipeline.py
│ └── export_gold_to_jsonl.py
└── docs/
├── usage.md
├── schema_opt_audit.md
└── model_notes_local_llm.md
```
---
## 2. Building the paper
The paper lives in `paper/` and is structured to support multiple venues (arXiv, IEEE, ACM, book-style).
### Prerequisites
* A reasonably recent TeX Live (or MikTeX) with:
* `pgfplots` (with `polar` library),
* `newtxtext`, `newtxmath`,
* `booktabs`, `longtable`, `framed`, `fancyvrb`, etc.
* `latexmk` and `make`.
### Typical build
From the repository root:
```bash
cd paper
make # builds main.pdf by default
# Or explicitly:
make main.pdf
# For an IEEE variant:
make main_ieee.pdf
# For ACM:
make main_acm.pdf
```
If you run into font or pgfplots `compat` warnings, consult the comments at the top of `main.tex` and `body_shared.tex` (we assume `\pgfplotsset{compat=1.18}` and `\usepackage{newtxtext,newtxmath}`).
---
## 3. Python OPT evaluation pipeline
The `opt_eval` package provides:
* Data classes for candidate classifications, evaluator results, and adjudications.
* Parsers for extracting OPT lines and rationales from LLM output.
* A `run_pipeline` function that wires Classifier A, Classifier B, the Evaluator, and the Adjudicator together and returns a structured result suitable for JSONL/YAML logging (a sketch of this flow follows below).
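A minimal sketch of that flow, assuming only the `call_model` contract described in section 3.2; `load_prompt` is a hypothetical helper, and the actual signatures live in `opt_eval/opt_pipeline.py`:
```python
# Illustrative sketch only: the real wiring lives in opt_eval/opt_pipeline.py.
from opt_eval.model_client import call_model  # you implement this (section 3.2)

def load_prompt(name: str) -> str:
    # Hypothetical helper: read e.g. prompts/minimal_classifier_prompt.txt.
    with open(f"prompts/{name}.txt", encoding="utf-8") as f:
        return f.read()

def sketch_pipeline(description: str) -> dict:
    # Two independent candidate classifications of the same description.
    cand_a = call_model(load_prompt("minimal_classifier_prompt"), description)
    cand_b = call_model(load_prompt("maximal_classifier_prompt"), description)
    # The evaluator scores each candidate against the description.
    eval_a = call_model(load_prompt("evaluator_prompt"),
                        f"{description}\n\nCandidate:\n{cand_a}")
    eval_b = call_model(load_prompt("evaluator_prompt"),
                        f"{description}\n\nCandidate:\n{cand_b}")
    # The adjudicator resolves disagreements between the two candidates.
    adjudication = call_model(load_prompt("adjudicator_prompt"),
                              f"{description}\n\nA:\n{cand_a}\n\nB:\n{cand_b}")
    return {"candidate_a": cand_a, "candidate_b": cand_b,
            "eval_a": eval_a, "eval_b": eval_b,
            "adjudication": adjudication}
```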
### 3.1 Installation (local dev)
Option 1: editable install with `pip`:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e .
# or, if you don't define setup.cfg/pyproject:
pip install pyyaml
```
Option 2: just use it in-place with `PYTHONPATH`:
```bash
export PYTHONPATH=$PWD
```
### 3.2 Configuring a local LLM
You must implement `opt_eval/model_client.py` to talk to your model(s). A typical pattern:
* For an OpenAI-compatible HTTP endpoint (local or remote), use `requests` or `openai` client.
* For **Ollama** or **llamafile**, call `http://localhost:11434` or similar.
`model_client.call_model(system_prompt, user_content, model="local-llm")` should:
1. Send `system_prompt` as the system role (if your API supports it).
2. Send `user_content` as the user content.
3. Return the raw text content of the model's reply.
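As one possibility, a minimal `call_model` sketch against a local **Ollama** server, assuming its default `/api/chat` endpoint on `http://localhost:11434`; adapt the URL and payload for your own deployment:
```python
# Minimal call_model sketch for a local Ollama server; swap this out for any
# OpenAI-compatible client if you use a different backend.
import requests

def call_model(system_prompt: str, user_content: str,
               model: str = "local-llm") -> str:
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": model,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_content},
            ],
            "stream": False,  # one complete reply instead of a chunk stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    # Ollama's /api/chat returns {"message": {"role": ..., "content": ...}, ...}
    return resp.json()["message"]["content"]
```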
Once implemented, you can run the pipeline on a simple description.
---
## 4. Quickstart: Running the evaluation pipeline
Minimal example (from repo root, after configuring `model_client.py`):
```bash
python scripts/run_eval_pipeline.py << 'EOF'
This system trains a fully-connected neural network on MNIST using SGD and
cross-entropy loss, and then uses the trained weights for inference only.
EOF
```
A typical JSON-like output will include:
* `candidate_a`, `candidate_b`
* `eval_a`, `eval_b`
* `final` (final OPT-Code and rationale)
* `adjudication` (if performed)
You can adapt `run_eval_pipeline.py` to write JSONL to `data/examples/opt_audit_example.jsonl`.
---
## 5. Gold test suite and benchmarking
The directory `data/gold/` contains a small hand-annotated test suite (`opt_gold.yaml` and `opt_gold.jsonl`); a hypothetical entry shape is sketched after the list. It covers:
* Backprop MLP on MNIST (Lrn),
* GA for TSP (Evo),
* A* gridworld planner (Sch),
* Rule-based expert system like XCON (Sym),
* Bayesian network for fault diagnosis (Prb),
* Deep Q-Network for Atari (Lrn),
* PID + Kalman filter drone control (Ctl),
* PSO for hyperparameter tuning (Swm),
* Immune negative-selection anomaly detection (Evo/Sch+Prb),
* Three-stage hybrid: GA → rule pruning → Bayesian classifier (Evo/Sym/Prb).
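For orientation, a hypothetical loader and entry shape (the canonical schema is whatever `data/gold/opt_gold.yaml` actually defines):
```python
# Hypothetical: loads the gold suite; requires pyyaml (see section 3.1).
import yaml

with open("data/gold/opt_gold.yaml", encoding="utf-8") as f:
    gold = yaml.safe_load(f)

# Each entry is assumed to carry at least an id, a description, and a gold
# OPT-Code, e.g.:
#   - id: ga-tsp
#     description: GA evolves city permutations to minimize TSP tour length.
#     opt_code: Evo
for entry in gold:
    print(entry["id"], entry["opt_code"])
```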
To run tests (after you've wired up `model_client.py`):
```bash
pytest opt_eval/tests
```
`test_gold_suite.py` will:
* Call the classifier prompt(s) on each gold description.
* Compare predicted OPT roots against the gold OPT-Code.
* Optionally compute partial-match metrics (Jaccard similarity of root sets; a sketch follows) and simple accuracy.
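A minimal sketch of that partial-match metric, assuming hybrid OPT-Codes join roots with `/` as in the examples above:
```python
# Jaccard similarity over OPT root sets; assumes "/"-joined codes like
# "Evo/Sym/Prb" (see the gold suite examples above).
def root_set(opt_code: str) -> set[str]:
    return {part.strip() for part in opt_code.split("/") if part.strip()}

def jaccard(predicted: str, gold: str) -> float:
    a, b = root_set(predicted), root_set(gold)
    if not a and not b:
        return 1.0  # two empty codes count as perfect agreement
    return len(a & b) / len(a | b)

assert jaccard("Evo/Sym/Prb", "Evo/Prb") == 2 / 3
assert jaccard("Lrn", "Lrn") == 1.0
```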
---
## 6. JSONL/YAML audit logs
For large-scale use, we recommend JSONL or YAML for storing evaluations.
* Example JSONL audit: `data/examples/opt_audit_example.jsonl`
* Example YAML audit: `data/examples/opt_audit_example.yaml`
Each record includes:
* `id`, `description`
* `candidates` (A, B)
* `evaluations` (verdicts, scores)
* `adjudication`
* `final` (final OPT-Code)
* `meta` (timestamps, model IDs, etc.)
See `docs/schema_opt_audit.md` for field descriptions.
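As a hedged illustration (field values here are invented; the schema doc is authoritative), appending one record to a JSONL log might look like:
```python
# Illustrative only: field names follow the list above; values are made up.
import datetime
import json

record = {
    "id": "example-0001",
    "description": "Backprop MLP on MNIST, SGD + cross-entropy.",
    "candidates": {"a": "Lrn", "b": "Lrn"},
    "evaluations": {"a": {"verdict": "accept", "score": 0.9},
                    "b": {"verdict": "accept", "score": 0.9}},
    "adjudication": None,  # None when no adjudication was performed
    "final": "Lrn",
    "meta": {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "models": ["llama3-8b-instruct", "mistral-7b-instruct"],
    },
}

with open("data/examples/opt_audit_example.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```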
---
## 7. Using smaller local LLMs
OPT classification needs:
* Understanding of code/algorithm descriptions.
* Solid instruction-following.
* Ability to respect a fairly structured output format.
Models that are feasible to run locally and are good candidates:
* **LLaMA 3 8B Instruct**
Good general reasoning and code understanding; works well as Classifier, Evaluator, and Adjudicator if VRAM allows.
* **Mistral 7B Instruct** (and compatible fine-tunes like Dolphin, OpenHermes)
Strong general-purpose local model with solid coding and instruction-following; good as a classifier.
* **Qwen2 7B / 14B Instruct**
7B is a capable all-rounder; 14B (if you can run it) is strong for the evaluator/adjudicator roles.
* **Phi-3-mini (3.8B) Instruct**
Smaller footprint; may work as a classifier on simpler cases. For nuanced hybrid systems (Evo/Sym/Prb, Swm vs Evo, Ctl vs Prb), you may want a larger model as evaluator/adjudicator.
A reasonable starting configuration (sketched in code after this list):
* Classifier A: `llama3-8b-instruct`
* Classifier B: `mistral-7b-instruct`
* Evaluator: `qwen2-14b-instruct` (if available) or `llama3-8b-instruct`
* Adjudicator: same as Evaluator
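One hypothetical way to write that down in code (how roles bind to models depends entirely on your `model_client` implementation):
```python
# Hypothetical role-to-model mapping for the starting configuration above.
ROLE_MODELS = {
    "classifier_a": "llama3-8b-instruct",
    "classifier_b": "mistral-7b-instruct",
    "evaluator": "qwen2-14b-instruct",    # or llama3-8b-instruct if unavailable
    "adjudicator": "qwen2-14b-instruct",  # same as the evaluator
}
```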
You can also run all roles on the same 7–8B model if resources are constrained; the explicit prompts and the evaluator rubric are designed to catch many misclassifications.
See `docs/model_notes_local_llm.md` for more detailed notes on deployment options (Ollama, llamafile, vLLM, etc.) and recommended quantization levels.
---
## 8. Citing
Once the OPT paper is on arXiv or accepted somewhere, include a BibTeX entry like:
```bibtex
@article{Elsberry_OPT_2025,
author = {Wesley R. Elsberry and N.~Collaborators},
title = {Operational Premise Taxonomy (OPT): Mechanism-Level Classification of AI Systems},
journal = {arXiv preprint},
year = {2025},
eprint = {XXXX.YYYYY},
archivePrefix = {arXiv}
}
```
(Replace with the actual venue and identifier when available.)
---
## 9. Contributing
* Extend the gold test suite (YAML + JSONL) with more systems and hybrids.
* Add additional prompts (e.g., language-specific variants for Python-only code, RL-specific prompts).
* Improve the parsing logic or add better metrics (confusion matrices, root-wise F1).
* Open issues for any misclassifications that recur: they can inform future revisions of prompts and possibly the taxonomy itself.
Pull requests that add well-documented examples, tests, or tooling around OPT are welcome.