Python 3 Migration Plan
Scope
The original system is a cooperative composition pipeline built from three neural subsystems:
- Bach: a Hopfield-Tank note generator over a 5-position by 8-note grid.
- Salieri: a back-propagation critic trained against a rule-based classical-sequence supervisor.
- Beethoven: an ART1 novelty/category network over the note sequence plus one classicality bit.
The immediate goal should be a Python 3 package that reproduces the Pascal algorithms and file-driven behavior closely enough to validate compatibility, while replacing the Pascal linked-list memory model with direct numeric data structures.
What Exists Today
Core orchestration
- THES/ANNCOMP.PP is the integrated driver.
- The composition loop is effectively:
- Generate a candidate note with the Hopfield-Tank network.
- Evaluate/train the back-propagation network using the current note window and the rule-based instructor.
- Pass the same window plus the classical/not-classical flag into ART1.
Shared state
- THES/GLOBALS.PP defines:
- fixed note vocabulary of 8 notes,
- sequence window length of 5,
- ART1 dimensions Max_F1_nodes = 41, Max_F2_nodes = 25,
- Common_Area_, which is the cross-network exchange object.
Hopfield-Tank subsystem
- THES/ANNCOMP.PP implements Bach and nested HTN.
- The network operates on a flattened 40-cell representation: 8 notes x 5 positions.
- It loads a 64 x 64 weight matrix from HTN.DAT, but the active note grid uses the first 40 cells.
- The update rule uses:
- per-neuron activation a,
- output 0.5 * (1 + tanh(a / c)),
- resistance/capacitance/input/weight/iteration scaling factors from globals.
- THES/HTNDATA.PP shows how the Hopfield weights were built from SEQUENCE.DAT, plus row/column inhibition and sequence reinforcement.
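The output function and the active-subset selection described above can be illustrated with a minimal sketch. The function names are hypothetical, and reading "the first 40 cells" as the top-left 40 x 40 block of the 64 x 64 matrix is an assumption that should be checked against ANNCOMP.PP.

```python
import numpy as np

N_NOTES = 8      # note vocabulary size, from GLOBALS.PP
N_POSITIONS = 5  # sequence window length, from GLOBALS.PP


def htn_output(a, c):
    """Neuron output 0.5 * (1 + tanh(a / c)), bounded in (0, 1)."""
    return 0.5 * (1.0 + np.tanh(np.asarray(a, dtype=float) / c))


def active_weights(full_weights):
    """Return the 40 x 40 block assumed to serve the 8 x 5 note grid."""
    n = N_NOTES * N_POSITIONS
    return np.asarray(full_weights)[:n, :n]
```

A zero activation maps to an output of 0.5, and the scaling factor c controls how sharply the output saturates toward 0 or 1.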
Back-propagation subsystem
- THES/BP_UNIT.PP is a general BP implementation with:
- input, hidden, and output nodes,
- weight matrix and momentum,
- feed-forward,
- back-propagation,
- file-based parameter and weight loading.
- THES/S61.DAT configures Salieri as:
- 40 input nodes,
- 20 hidden nodes,
- 1 output node,
- learning rate 0.5,
- momentum 0.5.
- THES/ANNCOMP.PP converts the current 5-note window into a 40-bit one-hot vector and trains the network online against Classical_instructor.
Rule-based supervisor
- THES/CLASINST.PP loads SEQUENCE.DAT.
- It converts the 5-note sequence to a digit string and returns 1 if the target suffix matches any stored example sequence, else 0.
- This acts as the teaching signal for the BP network.
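A rough sketch of the digit-string matching idea follows. The exact matching rule in CLASINST.PP still needs to be confirmed. This sketch assumes that 0 marks an unfilled window position and that a match means some stored example ends with the digits of the filled positions. Both function names are hypothetical.

```python
def to_digit_string(notes):
    """Join note numbers into a digit string, e.g. [1, 3, 5] -> "135"."""
    return "".join(str(n) for n in notes)


def classical_instructor(window, examples):
    """Return 1 if the window's suffix matches a stored example, else 0.

    Assumptions (not confirmed against the Pascal source): 0 marks an
    unfilled position, and a match means some stored example ends with
    the digits of the filled positions.
    """
    suffix = to_digit_string(n for n in window if n != 0)
    if not suffix:
        return 0
    return int(any(example.endswith(suffix) for example in examples))
```

Keeping the examples as plain strings makes it straightforward to compare the Python decisions against the Pascal ones line by line in a golden test.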
ART1 subsystem
- THES/ANNCOMP.PP implements ART1.
- F1 input is the 40-bit one-hot sequence plus one bit for Is_classical, for a total vector length of 41.
- F2 supports up to 25 committed categories.
- The implementation includes a nonstandard compatibility detail: when all categories are saturated and none remain eligible, vigilance is reduced by 1 percent and matching is retried.
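The saturation behavior can be illustrated with a simplified, fast-learning ART1-style search. This is a sketch, not the thesis equations: the choice function, the beta constant, binary templates, and the name art1_search are all assumptions, and "reduced by 1 percent" is read here as multiplying vigilance by 0.99.

```python
import numpy as np

MAX_F2_NODES = 25  # from GLOBALS.PP


def art1_search(pattern, templates, vigilance, beta=0.5, max_f2=MAX_F2_NODES):
    """Return (winner, vigilance, committed_new, vigilance_reduced).

    `templates` is a list of boolean arrays and is modified in place.
    """
    pattern = np.asarray(pattern, dtype=bool)
    norm = pattern.sum()
    if norm == 0:
        raise ValueError("pattern must have at least one active bit")
    reduced = False
    while True:
        overlaps = [np.logical_and(pattern, t).sum() for t in templates]
        # Try committed categories in order of decreasing choice value.
        order = sorted(range(len(templates)),
                       key=lambda j: overlaps[j] / (beta + templates[j].sum()),
                       reverse=True)
        for j in order:
            if overlaps[j] / norm >= vigilance:  # vigilance (match) test
                templates[j] = np.logical_and(pattern, templates[j])
                return j, vigilance, False, reduced
        if len(templates) < max_f2:
            templates.append(pattern.copy())  # commit a new category
            return len(templates) - 1, vigilance, True, reduced
        if max(overlaps) == 0:
            raise RuntimeError("no category overlaps the input")
        # Saturated and nothing eligible: lower vigilance and retry.
        vigilance *= 0.99
        reduced = True
```

Returning the reduced vigilance and both flags keeps the nonstandard step visible to callers, so replay tests can assert exactly when it fired.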
Legacy data model problem
- THES/STRUCT.PP provides generic linked-list vectors and matrices (DVE, HVE) used to work around Turbo Pascal memory constraints.
- THES/BP_UNIT.PP stores nodes, IO vectors, and weights through those linked structures rather than direct arrays.
- That representation should not be preserved in Python except where needed for compatibility tests.
Recommended Python Representation
Use explicit typed structures and dense arrays:
- numpy.ndarray for:
- Hopfield state vectors and weight matrices,
- BP activations, deltas, biases, and weights,
- ART1 F1/F2 activations and top-down/bottom-up LTM weights.
- dataclasses.dataclass for stable API/state containers.
- Enum for note identifiers only if it does not complicate file compatibility.
Recommended canonical encodings:
- NoteSequence: shape (5,), integer values 0..8.
- SequenceOneHot: shape (40,), binary.
- ArtInputVector: shape (41,), binary.
- HopfieldWeights: shape (40, 40) as the normalized active subset of the legacy file.
- BPWeightsIH, BPWeightsHO or one legacy-compatible dense square matrix, depending on whether fidelity or clarity is prioritized in a given layer of the codebase.
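The encodings listed above could be sketched as follows. Two details are assumptions to verify against the Pascal source: that note value 0 means "no note yet" and sets no bit, and that the flattening is position-major (cell index = position * 8 + note - 1). The function names are hypothetical.

```python
import numpy as np

N_NOTES = 8
WINDOW = 5


def one_hot_window(notes):
    """Encode a 5-note window as a flat 40-element binary vector."""
    notes = np.asarray(notes, dtype=int)
    if notes.shape != (WINDOW,):
        raise ValueError("expected exactly 5 notes")
    if notes.min() < 0 or notes.max() > N_NOTES:
        raise ValueError("note values must lie in 0..8")
    vec = np.zeros(WINDOW * N_NOTES, dtype=np.uint8)
    for pos, note in enumerate(notes):
        if note != 0:  # assumption: 0 means "no note yet"
            vec[pos * N_NOTES + (note - 1)] = 1
    return vec


def art_input(notes, is_classical):
    """Append the classicality bit to form the 41-element ART1 input."""
    return np.append(one_hot_window(notes), np.uint8(bool(is_classical)))
```

For example, the window [1, 0, 0, 0, 8] sets cell 0 and cell 39 under these assumptions, and art_input adds the classicality flag as element 40.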
Package Layout
composer_ans/
  __init__.py
  types.py
  encoding.py
  io/
    __init__.py
    legacy_files.py
  hopfield.py
  backprop.py
  art1.py
  classical_rules.py
  pipeline.py
  compatibility.py
tests/
  data/
  test_encoding.py
  test_classical_rules.py
  test_hopfield.py
  test_backprop.py
  test_art1.py
  test_pipeline.py
API Design
Keep the public API small and deterministic.
from composer_ans.pipeline import CompositionContext, CompositionPipeline
ctx = CompositionContext(notes=[0, 0, 0, 0, 0])
pipeline = CompositionPipeline.from_legacy_data("THES")
result = pipeline.step(ctx)
Suggested subsystem APIs:
candidate = hopfield.generate_next_note(notes, params)
is_classical, bp_state = salieri.evaluate_and_train(notes, target=None)
art_result = beethoven.categorize(notes, is_classical)
Where:
- target=None means "derive target from the classical instructor", matching the Pascal integrated flow.
- Each call returns structured state useful for debugging and test baselines, not just the final scalar.
Migration Strategy
Phase 1: Preserve semantics, not implementation style
- Recreate file readers for:
- SEQUENCE.DAT,
- S61.DAT,
- S61.WT,
- HTN.DAT.
- Recreate sequence encodings exactly:
- 5-note rolling window,
- 40-bit one-hot flattening,
- ART1 extra classicality bit.
- Recreate the rule-based instructor exactly before porting the trainable models.
Deliverable:
- A Python package that can parse legacy files and reproduce the same encoded inputs the Pascal code would produce.
Phase 2: Port Hopfield-Tank
- Implement the continuous-time iterative update as written.
- Preserve:
- noise injection behavior,
- stop condition using epsilon on alternating time buffers,
- "pick max cell in each column" post-processing.
- Isolate random number generation behind an injectable RNG so deterministic tests are possible.
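One possible shape for the relaxation loop with an injectable RNG is sketched below. The dynamics use a textbook Hopfield-Tank form integrated with Euler steps, which may differ from the Pascal update. Where the noise is injected, all parameter names and defaults, and the position-major grid layout are assumptions.

```python
import numpy as np


def relax(weights, bias, rng, c=1.0, r=1.0, cap=1.0, dt=0.01,
          noise=0.01, eps=1e-4, max_iter=10_000):
    """Euler-integrate da/dt = (-a / r + W v + bias) / cap until v settles."""
    n = weights.shape[0]
    a = noise * rng.standard_normal(n)  # assumed: noise only at initialization
    # Two alternating time buffers hold the previous and current outputs.
    buffers = [0.5 * (1.0 + np.tanh(a / c)), None]
    for step in range(max_iter):
        cur, nxt = step % 2, (step + 1) % 2
        a = a + dt * (-a / r + weights @ buffers[cur] + bias) / cap
        buffers[nxt] = 0.5 * (1.0 + np.tanh(a / c))
        # Stop when no output moved by more than eps between buffers.
        if np.max(np.abs(buffers[nxt] - buffers[cur])) < eps:
            return buffers[nxt]
    return buffers[max_iter % 2]  # most recently written buffer


def winners_per_column(outputs, n_notes=8, n_positions=5):
    """Pick the strongest note (1-based) in each position column."""
    grid = outputs.reshape(n_positions, n_notes)  # assumed position-major
    return grid.argmax(axis=1) + 1
```

Passing numpy.random.default_rng(seed) as rng makes repeated runs identical, which is what the deterministic tests need. Reproducing the Pascal random stream itself is a separate problem.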
Deliverable:
- generate_next_note() producing the same result as Pascal for fixed seeds and known sequences.
Phase 3: Port Salieri back-propagation
- First implement a legacy-compatible execution mode mirroring the square-node storage and update order.
- Then wrap it with a clearer façade that exposes standard layer matrices.
- Preserve:
- sigmoid behavior,
- theta updates,
- momentum handling,
- online training after every presentation,
- periodic weight dumping capability.
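For orientation, a textbook online back-propagation step with momentum for a 40-20-1 network is sketched below. It is not the legacy-compatible mode: the class name, the weight initialization range, the sign convention for theta (added as a bias), and the update order are all assumptions that the Pascal code may not share.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyBP:
    """40-20-1 network trained online with momentum (textbook form)."""

    def __init__(self, rng, n_in=40, n_hidden=20, n_out=1, lr=0.5, momentum=0.5):
        self.lr, self.momentum = lr, momentum
        self.w_ih = rng.uniform(-0.5, 0.5, (n_hidden, n_in))
        self.w_ho = rng.uniform(-0.5, 0.5, (n_out, n_hidden))
        self.theta_h = np.zeros(n_hidden)
        self.theta_o = np.zeros(n_out)
        self.prev = [np.zeros_like(p) for p in self._params()]

    def _params(self):
        return [self.w_ih, self.w_ho, self.theta_h, self.theta_o]

    def forward(self, x):
        h = sigmoid(self.w_ih @ x + self.theta_h)
        return h, sigmoid(self.w_ho @ h + self.theta_o)

    def train_step(self, x, target):
        """One presentation: feed forward, back-propagate, update in place."""
        x = np.asarray(x, dtype=float)
        h, y = self.forward(x)
        delta_o = (target - y) * y * (1.0 - y)
        delta_h = (self.w_ho.T @ delta_o) * h * (1.0 - h)
        grads = [np.outer(delta_h, x), np.outer(delta_o, h), delta_h, delta_o]
        for i, (param, grad) in enumerate(zip(self._params(), grads)):
            step = self.lr * grad + self.momentum * self.prev[i]
            param += step        # in-place update of the stored array
            self.prev[i] = step  # remembered for the next momentum term
        return float(y[0])
```

The returned value is the output before the update, so a replayed presentation sequence yields an output progression that can be compared against a Pascal trace.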
Deliverable:
- evaluate_and_train() matching legacy outputs and weight updates for a controlled presentation sequence.
Phase 4: Port Beethoven ART1
- Port the F1/F2 STM and LTM equations directly.
- Preserve:
- 41-bit input vector,
- eligibility and commitment logic,
- resonance loop,
- modified vigilance-reduction behavior on saturation.
- Keep ART1 state persistent across calls, because the Pascal version learns over the composition session.
Deliverable:
- categorize() returning winner, new-category flag, vigilance-change flag, and current category count.
Phase 5: Rebuild the integrated pipeline
- Recreate Common_Area_ as a Python dataclass.
- Implement a single-step pipeline equivalent to one iteration of the Pascal composition loop.
- Add an optional batch runner that emits a complete composition and an event log.
Deliverable:
- End-to-end run over a fixed number of notes using legacy data assets.
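A single pipeline step might be structured roughly as below. The CommonArea fields are illustrative stand-ins, not the actual contents of Common_Area_, and the three callables are hypothetical hooks for the subsystems. Appending the candidate and dropping the oldest note is an assumption about how the rolling window advances.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CommonArea:
    """Hypothetical stand-in for Common_Area_; field names are illustrative."""
    notes: List[int] = field(default_factory=lambda: [0] * 5)
    is_classical: Optional[int] = None
    category: Optional[int] = None


def step(area: CommonArea,
         generate: Callable[[List[int]], int],
         critique: Callable[[List[int]], int],
         categorize: Callable[[List[int], int], int]) -> CommonArea:
    """One loop iteration: generate a note, critique the window, categorize it."""
    candidate = generate(area.notes)
    area.notes = area.notes[1:] + [candidate]  # roll the 5-note window
    area.is_classical = critique(area.notes)
    area.category = categorize(area.notes, area.is_classical)
    return area
```

Passing the subsystems in as callables lets the pipeline be tested with stubs before any neural port exists.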
Compatibility Plan
Compatibility should be measured in layers:
- Encoding compatibility:
- identical one-hot vectors and ART input vectors for the same note windows.
- File compatibility:
- legacy .DAT and .WT files load without manual editing.
- Behavioral compatibility:
- same classical instructor decisions,
- same Hopfield winner for fixed seed/input,
- same BP output progression for replayed presentations,
- same ART1 category decisions for replayed inputs.
- Pipeline compatibility:
- same sequence of generated notes for a fixed random seed, or if exact replication is blocked by legacy RNG differences, same per-step subsystem outputs within defined tolerances.
Known Risks
- Pascal Single, file layout, and RNG behavior may not map exactly to Python defaults.
- HTN.DAT is written as a Pascal binary FILE OF ARRAY[1..64,1..64] OF REAL; a dedicated reader may be needed to confirm element size and ordering.
- The BP code relies on update order within linked structures. A mathematically equivalent refactor may still diverge numerically unless a legacy mode preserves operation order.
- ART1 has thesis-specific modifications; replacing them with textbook ART1 would break compatibility.
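As a starting point for the HTN.DAT question, the sketch below infers the element size from the byte count. It handles 6-byte Turbo Pascal Real values as well as 4-byte and 8-byte IEEE floats, since which one applies depends on the compiler that wrote the file. The function names are hypothetical, and little-endian, row-major storage is assumed.

```python
import numpy as np

N = 64  # HTN.DAT holds one 64 x 64 matrix


def real48_to_float(b):
    """Decode one 6-byte Turbo Pascal Real (little-endian) to a float."""
    exponent = b[0]
    if exponent == 0:  # a zero exponent byte encodes 0.0
        return 0.0
    mantissa = int.from_bytes(b[1:6], "little")
    sign = -1.0 if mantissa >> 39 else 1.0  # top bit of the last byte
    fraction = (mantissa & ((1 << 39) - 1)) / float(1 << 39)
    return sign * (1.0 + fraction) * 2.0 ** (exponent - 129)


def read_htn_weights(data):
    """Infer the element size from the byte count; return a (64, 64) array."""
    size, rem = divmod(len(data), N * N)
    if rem != 0 or size not in (4, 6, 8):
        raise ValueError(f"unexpected HTN.DAT length: {len(data)} bytes")
    if size == 6:
        flat = [real48_to_float(data[i:i + 6]) for i in range(0, len(data), 6)]
    else:
        flat = np.frombuffer(data, dtype="<f4" if size == 4 else "<f8")
    return np.asarray(flat, dtype=np.float64).reshape(N, N)  # row-major assumed
```

The function takes raw bytes, for example from pathlib.Path(...).read_bytes(), so it can be exercised in tests without the legacy file present. A file length that matches none of the three sizes raises instead of guessing.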
Recommended Delivery Order
- Build legacy readers and encoders.
- Port Classical_instructor.
- Port Hopfield-Tank and verify with controlled seeds.
- Port BP in legacy-compatible mode and replay known presentations.
- Port ART1 with persistent state.
- Assemble the integrated pipeline.
- Add a second, cleaner API layer only after compatibility tests pass.
Immediate Next Step
Implement the non-neural compatibility layer first:
- legacy file parsers,
- note/sequence encoders,
- rule-based classical instructor,
- golden tests based on the files already in THES.
That gives a stable foundation for porting the three neural subsystems without losing track of what the original program actually did.