\section{Appendix: OPT--Code Prompt Specifications}

This appendix collects the prompt formulations used to elicit OPT--Code classifications from large language models and to evaluate those classifications for correctness and consistency.

\subsection{Minimal OPT--Code Classification Prompt}

The minimal prompt is designed for inference-time use and lightweight tagging pipelines. It assumes a basic familiarity with the OPT roots and emphasizes mechanism-based classification over surface labels.

\begin{quote}\small
\input{appendix_prompt_minimal.tex}
\end{quote}

\subsection{Maximal Expert OPT--Code Classification Prompt}

The maximal prompt elaborates all root definitions, clarifies the treatment of parallelism and pipelines, and details rules for composition. It is intended for fine-tuning, high-stakes evaluations, or detailed audit trails.

\begin{quote}\small
\input{appendix_prompt_maximal.tex}
\end{quote}

\subsection{OPT--Code Prompt Evaluator}

The evaluator prompt is a meta-level specification: it assesses whether a given candidate OPT--Code and rationale respect the OPT taxonomy and associated guidelines. This enables automated or semi-automated review of classifications generated by other models or tools.

\begin{quote}\small
\input{appendix_prompt_evaluator.tex}
\end{quote}

\subsection{OPT--Code Prompt Evaluator}

\begin{verbatim}
You are an OPT-Code evaluation assistant. Your job is to check whether a
candidate OPT classification follows the OPT rules and is mechanistically
correct.

Inputs you will be given:
1) System description: a code snippet or project/system description.
2) Candidate OPT-Code line (from another model), of the form:
   OPT=; Rep=<...>; Obj=<...>; Data=<...>; Time=<...>; Human=<...>
3) Candidate rationale: 2–6 sentences explaining the candidate’s choice.

You must evaluate the candidate against the following criteria:

(1) Format compliance:
- Does the candidate produce exactly one OPT= line with the correct fields?
- Are the roots valid (Lrn, Evo, Sym, Prb, Sch, Ctl, Swm)?
- Are "+" and "/" used only between valid roots?

(2) Mechanism correctness:
- Do the chosen roots match the operative mechanism in the system
  description?
- Is there any root that is missing but clearly present?
- Is any root included that is not supported by the description?

(3) Parallelism and pipelines:
- Does the candidate incorrectly treat threads, GPU kernels, async,
  pipelines, or distributed infrastructure as OPT mechanisms (e.g.,
  calling something Swm or Sch only because it is parallel)?
- If so, this is a serious error.

(4) Composition correctness:
- Use "+" only for tightly integrated mechanisms in the same core loop.
- Use "/" only for distinct sequential stages.
- Flag misuse of "+" or "/" if mechanisms are obviously separate or
  obviously integrated.

(5) Attribute plausibility:
- Are Rep, Obj, Data, Time, and Human reasonably consistent with the
  system description?
- They do not need to be unique, but they must be defensible.

Your output must use the following structure:

Verdict:
Score:
Issues:
- Format:
- Mechanism:
- Parallelism/Pipelines:
- Composition:
- Attributes:
Summary: <2–4 sentences giving an overall assessment and key corrections,
if any>.

Guidelines:
- PASS means: no major errors; at most minor debatable choices.
- WEAK_PASS means: generally acceptable, but with at least one non-trivial
  issue that should be corrected before publication.
- FAIL means: at least one serious misunderstanding of the mechanism, or
  clear violation of the parallelism/pipeline rules, or badly wrong roots.
\end{verbatim}
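
Criterion (1) of the evaluator prompt is purely syntactic, so it can plausibly be checked mechanically before a model is asked about criteria (2) to (5). The following Python sketch illustrates one way to do this. It is not part of the OPT--Code specification: the function name \texttt{check\_format}, the issue messages, and the decision to compare field names as a set (ignoring their order) are assumptions made for illustration, and the attribute values in the usage note are hypothetical placeholders.

```python
import re
from typing import List

# Root names and field names as listed in the evaluator prompt.
VALID_ROOTS = {"Lrn", "Evo", "Sym", "Prb", "Sch", "Ctl", "Swm"}
REQUIRED_FIELDS = ["OPT", "Rep", "Obj", "Data", "Time", "Human"]


def check_format(candidate: str) -> List[str]:
    """Return format issues for criterion (1); an empty list means none found."""
    opt_lines = [ln.strip() for ln in candidate.splitlines()
                 if ln.strip().startswith("OPT=")]
    if len(opt_lines) != 1:
        return ["expected exactly one OPT= line, found %d" % len(opt_lines)]

    issues = []
    fields = {}
    for part in opt_lines[0].split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, value = part.partition("=")
        if not sep:
            issues.append("malformed field: %r" % part)
            continue
        fields[key.strip()] = value.strip()

    # Assumption: field order is not checked, only presence.
    missing = [f for f in REQUIRED_FIELDS if f not in fields]
    extra = [k for k in fields if k not in REQUIRED_FIELDS]
    if missing:
        issues.append("missing fields: " + ", ".join(missing))
    if extra:
        issues.append("unexpected fields: " + ", ".join(extra))

    # "+" and "/" must sit between valid roots, so every piece produced by
    # splitting on them has to be a known root (an empty piece is an error).
    for root in re.split(r"[+/]", fields.get("OPT", "")):
        if root.strip() not in VALID_ROOTS:
            issues.append("invalid or missing root: %r" % root.strip())
    return issues
```

Under these assumptions, a line such as \texttt{OPT=Lrn+Evo/Sym; Rep=graph; Obj=loss; Data=logs; Time=batch; Human=none} yields an empty issue list, whereas \texttt{OPT=Lrn+; ...} is flagged because the trailing \texttt{+} is not followed by a root. Such a check covers only format compliance; mechanism, composition, and attribute judgments still require the evaluator prompt itself.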