TriuneCadence/MIGRATION_PLAN.md


Python 3 Migration Plan

Scope

The original system is a cooperative composition pipeline built from three neural subsystems:

  • Bach: a Hopfield-Tank note generator over a 5-position by 8-note grid.
  • Salieri: a back-propagation critic trained against a rule-based classical-sequence supervisor.
  • Beethoven: an ART1 novelty/category network over the note sequence plus one classicality bit.

The immediate goal is a Python 3 package that reproduces the Pascal algorithms and file-driven behavior closely enough to validate compatibility, while replacing the Pascal linked-list memory model with direct numeric data structures.

What Exists Today

Core orchestration

  • THES/ANNCOMP.PP is the integrated driver.
  • The composition loop is effectively:
    1. Generate a candidate note with the Hopfield-Tank network.
    2. Evaluate/train the back-propagation network using the current note window and the rule-based instructor.
    3. Pass the same window plus the classical/not-classical flag into ART1.

Shared state

  • THES/GLOBALS.PP defines:
    • fixed note vocabulary of 8 notes,
    • sequence window length of 5,
    • ART1 dimensions Max_F1_nodes = 41, Max_F2_nodes = 25,
    • Common_Area_, which is the cross-network exchange object.

Hopfield-Tank subsystem

  • THES/ANNCOMP.PP implements Bach and nested HTN.
  • The network operates on a flattened 40-cell representation: 8 notes x 5 positions.
  • It loads a 64 x 64 weight matrix from HTN.DAT, but the active note grid uses the first 40 cells.
  • The update rule uses:
    • per-neuron activation a,
    • output 0.5 * (1 + tanh(a / c)),
    • resistance/capacitance/input/weight/iteration scaling factors from globals.
  • THES/HTNDATA.PP shows how the Hopfield weights were built from SEQUENCE.DAT, plus row/column inhibition and sequence reinforcement.
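The update rule above can be sketched as a simple Euler iteration. This is a minimal illustration, not the port itself: the names `dt`, `tau`, and `gain` are placeholders for the resistance/capacitance/input/iteration scaling constants in GLOBALS.PP, and the default values here are arbitrary.

```python
import numpy as np

def htn_step(a, W, I, dt=0.01, tau=1.0, gain=0.02):
    """One Euler step of the Hopfield-Tank dynamics (sketch).

    a : (40,) activation vector
    W : (40, 40) weight matrix (active subset of the 64x64 file)
    I : (40,) external input
    """
    V = 0.5 * (1.0 + np.tanh(a / gain))   # neuron outputs
    a = a + dt * (-a / tau + W @ V + I)   # leaky integration toward inputs
    return a, V

def winners_per_column(V, n_notes=8, n_positions=5):
    """Pick the max cell in each of the 5 position columns."""
    grid = V.reshape(n_positions, n_notes)
    return grid.argmax(axis=1)
```

The column-winner step corresponds to the "pick max cell in each column" post-processing described for Phase 2.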

Back-propagation subsystem

  • THES/BP_UNIT.PP is a general BP implementation with:
    • input, hidden, and output nodes,
    • weight matrix and momentum,
    • feed-forward,
    • back-propagation,
    • file-based parameter and weight loading.
  • THES/S61.DAT configures Salieri as:
    • 40 input nodes,
    • 20 hidden nodes,
    • 1 output node,
    • learning rate 0.5,
    • momentum 0.5.
  • THES/ANNCOMP.PP converts the current 5-note window into a 40-bit one-hot vector and trains the network online against Classical_instructor.
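The 40-bit encoding can be sketched as follows. The position-major bit layout (bits 8p..8p+7 holding the one-hot for position p) is an assumption to be confirmed against ANNCOMP.PP before it is treated as canonical.

```python
import numpy as np

def encode_window(notes):
    """Flatten a 5-note window (values 0..7) into a 40-bit one-hot vector."""
    notes = np.asarray(notes, dtype=int)
    if notes.shape != (5,) or notes.min() < 0 or notes.max() > 7:
        raise ValueError("expected 5 note indices in 0..7")
    v = np.zeros(40, dtype=np.uint8)
    v[np.arange(5) * 8 + notes] = 1   # one bit per position
    return v

def art_input(notes, is_classical):
    """Append the classicality bit to form the 41-bit ART1 input."""
    bit = 1 if is_classical else 0
    return np.concatenate([encode_window(notes), [bit]]).astype(np.uint8)
```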

Rule-based supervisor

  • THES/CLASINST.PP loads SEQUENCE.DAT.
  • It converts the 5-note sequence to a digit string and returns 1 if the target suffix matches any stored example sequence, else 0.
  • This acts as the teaching signal for the BP network.
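The instructor's decision rule might look like the sketch below. Whether CLASINST.PP matches the whole 5-note window as a suffix of a stored example, or uses some shorter suffix, must be confirmed from the Pascal source; this version assumes the former.

```python
def classical_instructor(window, examples):
    """Rule-based teaching signal (sketch).

    window   : 5 note indices
    examples : iterable of digit strings loaded from SEQUENCE.DAT
    Returns 1 if the window, rendered as a digit string, is a suffix
    of any stored example sequence, else 0.
    """
    target = "".join(str(n) for n in window)
    return 1 if any(ex.endswith(target) for ex in examples) else 0
```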

ART1 subsystem

  • THES/ANNCOMP.PP implements ART1.
  • F1 input is the 40-bit one-hot sequence plus one bit for Is_classical, for a total vector length of 41.
  • F2 supports up to 25 committed categories.
  • The implementation includes a nonstandard compatibility detail: when all categories are saturated and none remain eligible, vigilance is reduced by 1 percent and matching is retried.
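The saturation fallback can be sketched in isolation. The `matcher` callable and the `floor` cutoff are illustrative additions (the floor guards the sketch against looping forever; the Pascal behavior at that extreme should be checked separately).

```python
def match_with_fallback(candidates, matcher, vigilance, floor=1e-6):
    """Thesis-specific saturation fallback (sketch).

    candidates : committed category indices
    matcher    : callable (category, vigilance) -> bool, True if the
                 category passes the vigilance test
    When every category fails and no uncommitted node remains, lower
    vigilance by 1 percent and retry instead of rejecting the input.
    """
    while vigilance > floor:
        for cat in candidates:
            if matcher(cat, vigilance):
                return cat, vigilance
        vigilance *= 0.99   # reduce by 1 percent and retry
    return None, vigilance
```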

Legacy data model problem

  • THES/STRUCT.PP provides generic linked-list vectors and matrices (DVE, HVE) used to work around Turbo Pascal memory constraints.
  • THES/BP_UNIT.PP stores nodes, IO vectors, and weights through those linked structures rather than direct arrays.
  • That representation should not be preserved in Python except where needed for compatibility tests.

Target Data Model

Use explicit typed structures and dense arrays:

  • numpy.ndarray for:
    • Hopfield state vectors and weight matrices,
    • BP activations, deltas, biases, and weights,
    • ART1 F1/F2 activations and top-down/bottom-up LTM weights.
  • dataclasses.dataclass for stable API/state containers.
  • Enum for note identifiers only if it does not complicate file compatibility.

Recommended canonical encodings:

  • NoteSequence: shape (5,), integer values 0..7.
  • SequenceOneHot: shape (40,), binary.
  • ArtInputVector: shape (41,), binary.
  • HopfieldWeights: shape (40, 40) as the normalized active subset of the legacy file.
  • BPWeightsIH and BPWeightsHO as separate layer matrices, or a single legacy-compatible dense square matrix, depending on whether clarity or legacy fidelity is prioritized in a given layer of the codebase.
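Two of the proposed containers might look like this. Field names are proposals only; the authoritative field list for the exchange record is in GLOBALS.PP.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class CommonArea:
    """Python stand-in for the Pascal Common_Area_ exchange record."""
    notes: list             # current 5-note window, values 0..7
    is_classical: int = 0   # Salieri's verdict for the window
    art_category: int = -1  # winning F2 node, -1 before first call

@dataclass
class HopfieldParams:
    """Hopfield-Tank configuration loaded from legacy data."""
    weights: np.ndarray     # (40, 40) active subset of HTN.DAT
    gain: float = 0.02      # placeholder for the Pascal scaling constant
```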

Package Layout

composer_ans/
  __init__.py
  types.py
  encoding.py
  io/
    __init__.py
    legacy_files.py
  hopfield.py
  backprop.py
  art1.py
  classical_rules.py
  pipeline.py
  compatibility.py
tests/
  data/
  test_encoding.py
  test_classical_rules.py
  test_hopfield.py
  test_backprop.py
  test_art1.py
  test_pipeline.py

API Design

Keep the public API small and deterministic.

from composer_ans.pipeline import CompositionContext, CompositionPipeline

ctx = CompositionContext(notes=[0, 0, 0, 0, 0])
pipeline = CompositionPipeline.from_legacy_data("THES")
result = pipeline.step(ctx)

Suggested subsystem APIs:

candidate = hopfield.generate_next_note(notes, params)
is_classical, bp_state = salieri.evaluate_and_train(notes, target=None)
art_result = beethoven.categorize(notes, is_classical)

Where:

  • target=None means "derive target from the classical instructor", matching the Pascal integrated flow.
  • Each call returns structured state useful for debugging and test baselines, not just the final scalar.
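The structured per-step state could be packaged as a small dataclass; the field names below are proposals, not legacy identifiers.

```python
from dataclasses import dataclass

@dataclass
class StepResult:
    """One pipeline step's debuggable state (sketch)."""
    note: int             # Hopfield winner appended to the window
    is_classical: int     # Salieri verdict after online training
    art_category: int     # Beethoven's winning F2 node
    new_category: bool    # whether ART1 committed a new node
```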

Migration Strategy

Phase 1: Preserve semantics, not implementation style

  • Recreate file readers for:
    • SEQUENCE.DAT,
    • S61.DAT,
    • S61.WT,
    • HTN.DAT.
  • Recreate sequence encodings exactly:
    • 5-note rolling window,
    • 40-bit one-hot flattening,
    • ART1 extra classicality bit.
  • Recreate the rule-based instructor exactly before porting the trainable models.

Deliverable:

  • A Python package that can parse legacy files and reproduce the same encoded inputs the Pascal code would produce.

Phase 2: Port Hopfield-Tank

  • Implement the continuous-time iterative update as written.
  • Preserve:
    • noise injection behavior,
    • stop condition using epsilon on alternating time buffers,
    • "pick max cell in each column" post-processing.
  • Isolate random number generation behind an injectable RNG so deterministic tests are possible.
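The RNG injection point can be as simple as an optional parameter; the function name and noise scale here are illustrative.

```python
import numpy as np

def inject_noise(a, rng=None, scale=0.01):
    """Add Gaussian noise behind an injectable RNG (sketch).

    Passing an explicitly seeded numpy Generator makes a Hopfield run
    reproducible in tests; with rng=None a fresh default generator is
    used, matching normal operation.
    """
    if rng is None:
        rng = np.random.default_rng()
    return a + rng.normal(0.0, scale, size=a.shape)
```

In a test, two calls with `np.random.default_rng(42)` produce identical noise, which is what makes fixed-seed comparisons against Pascal baselines possible.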

Deliverable:

  • generate_next_note() producing the same result as Pascal for fixed seeds and known sequences.

Phase 3: Port Salieri back-propagation

  • First implement a legacy-compatible execution mode mirroring the square-node storage and update order.
  • Then wrap it with a clearer façade that exposes standard layer matrices.
  • Preserve:
    • sigmoid behavior,
    • theta updates,
    • momentum handling,
    • online training after every presentation,
    • periodic weight dumping capability.
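One online presentation for the 40-20-1 configuration can be sketched as below. This is the clearer-façade view, not the legacy-compatible mode: theta (bias) updates are omitted for brevity but must be ported for fidelity, and lr=0.5 / mom=0.5 mirror S61.DAT.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_present(x, t, W1, W2, vW1, vW2, lr=0.5, mom=0.5):
    """One online presentation (sketch, biases omitted).

    W1 : (20, 40) input->hidden weights, W2 : (1, 20) hidden->output
    vW1, vW2 : momentum buffers of matching shape
    """
    h = sigmoid(W1 @ x)                    # hidden activations
    y = sigmoid(W2 @ h)                    # output
    d_out = (t - y) * y * (1 - y)          # output delta
    d_hid = (W2.T @ d_out) * h * (1 - h)   # hidden delta
    vW2 = lr * np.outer(d_out, h) + mom * vW2
    vW1 = lr * np.outer(d_hid, x) + mom * vW1
    return y, W1 + vW1, W2 + vW2, vW1, vW2
```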

Deliverable:

  • evaluate_and_train() matching legacy outputs and weight updates for a controlled presentation sequence.

Phase 4: Port Beethoven ART1

  • Port the F1/F2 STM and LTM equations directly.
  • Preserve:
    • 41-bit input vector,
    • eligibility and commitment logic,
    • resonance loop,
    • modified vigilance-reduction behavior on saturation.
  • Keep ART1 state persistent across calls, because the Pascal version learns over the composition session.

Deliverable:

  • categorize() returning winner, new-category flag, vigilance-change flag, and current category count.

Phase 5: Rebuild the integrated pipeline

  • Recreate Common_Area_ as a Python dataclass.
  • Implement a single-step pipeline equivalent to one iteration of the Pascal composition loop.
  • Add an optional batch runner that emits a complete composition and an event log.

Deliverable:

  • End-to-end run over a fixed number of notes using legacy data assets.

Compatibility Plan

Compatibility should be measured in layers:

  • Encoding compatibility:
    • identical one-hot vectors and ART input vectors for the same note windows.
  • File compatibility:
    • legacy .DAT and .WT files load without manual editing.
  • Behavioral compatibility:
    • same classical instructor decisions,
    • same Hopfield winner for fixed seed/input,
    • same BP output progression for replayed presentations,
    • same ART1 category decisions for replayed inputs.
  • Pipeline compatibility:
    • same sequence of generated notes for a fixed random seed, or if exact replication is blocked by legacy RNG differences, same per-step subsystem outputs within defined tolerances.
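A per-step compatibility check against a recorded Pascal baseline might look like this. The golden-record key names are illustrative; the point is that integer decisions must match exactly while float outputs only need to match within tolerance.

```python
def assert_step_compatible(py_step, golden, atol=1e-6):
    """Compare one Python pipeline step against a legacy baseline dict,
    e.g. {"note": 3, "bp_output": 0.8123, "art_category": 7}."""
    assert py_step["note"] == golden["note"]                      # exact
    assert py_step["art_category"] == golden["art_category"]      # exact
    assert abs(py_step["bp_output"] - golden["bp_output"]) <= atol  # tolerance
```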

Known Risks

  • Pascal Single, file layout, and RNG behavior may not map exactly to Python defaults.
  • HTN.DAT is written as a Pascal binary FILE OF ARRAY[1..64,1..64] OF REAL; a dedicated reader may be needed to confirm element size and ordering.
  • The BP code relies on update order within linked structures. A mathematically equivalent refactor may still diverge numerically unless a legacy mode preserves operation order.
  • ART1 has thesis-specific modifications; replacing them with textbook ART1 would break compatibility.
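For the HTN.DAT risk specifically: if the file was written by classic Turbo Pascal, each REAL is a 6-byte Real48, not an IEEE double. The decoder below assumes the standard Real48 layout (byte 0 is the exponent with bias 129, bytes 1..5 the mantissa little-endian with the sign in the top bit of byte 5, exponent 0 meaning 0.0); if the file actually holds IEEE doubles, `struct.unpack('<d', ...)` applies instead and this helper is unnecessary.

```python
def real48_to_float(b):
    """Decode one 6-byte Turbo Pascal REAL (Real48) to a Python float."""
    exponent = b[0]
    if exponent == 0:
        return 0.0  # Real48 encodes zero as exponent byte 0
    sign = -1.0 if b[5] & 0x80 else 1.0
    mantissa = int.from_bytes(
        bytes([b[1], b[2], b[3], b[4], b[5] & 0x7F]), "little"
    )
    return sign * (1.0 + mantissa / 2**39) * 2.0 ** (exponent - 129)
```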
Execution Order

  1. Build legacy readers and encoders.
  2. Port Classical_instructor.
  3. Port Hopfield-Tank and verify with controlled seeds.
  4. Port BP in legacy-compatible mode and replay known presentations.
  5. Port ART1 with persistent state.
  6. Assemble the integrated pipeline.
  7. Add a second, cleaner API layer only after compatibility tests pass.

Immediate Next Step

Implement the non-neural compatibility layer first:

  • legacy file parsers,
  • note/sequence encoders,
  • rule-based classical instructor,
  • golden tests based on the files already in THES.

That gives a stable foundation for porting the three neural subsystems without losing track of what the original program actually did.