# Legacy EcoSpecies Survey
## Scope
This repository is primarily an archive of the legacy EcoSpecies system and its source materials. The current contents are sufficient to begin a structured migration plan and an initial replacement implementation.
## Repository Inventory
- `01-legacy-code-and-data/EcoSpecies_2012_0807_onCurly`: legacy ASP.NET MVC application source.
- `01-legacy-code-and-data/EcoSpeciesSql_new`: SQL Server database creation and lookup-population scripts.
- `01-legacy-code-and-data/InputFiles - TXT`: 92 source species life history text files.
- `01-legacy-code-and-data/OutputFiles - RTF`: 95 generated report outputs.
- `01-legacy-code-and-data/ecospecies-2/species-life-histories`: 95 paired `.txt` and `.sql` files representing a later export snapshot.
- `01-legacy-code-and-data/slh-mod-txt2sql`: Python parsing scripts and hand-edited SQL/text files used to ingest SLH content.
- `01-legacy-code-and-data/TextFilesAboutFLELMR_EcoSpecies`: manuals, contract/report artifacts, information architecture, and historical background.
- `02-docs`: project notes, species outlines, import notes, and spreadsheets.
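
The file counts above differ between the input and output folders (92 versus 95), so an early pairing check may be useful. The following is a minimal sketch, assuming that an input file and its report share the same base name apart from extension and letter case; that naming rule is an assumption, not something confirmed by the archive, and `unmatched_stems` is a hypothetical helper.

```python
from pathlib import PurePath


def unmatched_stems(input_names, output_names):
    """Return (inputs without an output, outputs without an input).

    Assumption: files pair up by base name, ignoring extension and case.
    """
    inputs = {PurePath(name).stem.casefold() for name in input_names}
    outputs = {PurePath(name).stem.casefold() for name in output_names}
    return sorted(inputs - outputs), sorted(outputs - inputs)
```

The two name lists could be gathered with `pathlib.Path.iterdir()` over `InputFiles - TXT` and `OutputFiles - RTF`; any stems reported on either side are candidates for manual review rather than proof of missing data.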
## Legacy Capabilities Confirmed
The legacy ASP.NET MVC application exposes these core workflows:
- Public species list with taxonomic sorting and fielded search.
- Per-species detail pages.
- Heading and subheading navigation for life-history content.
- Report generation in `rtf`, `txt`, and "web only" output modes.
- Public/private visibility controls via `tbl_Slh.PublicView`.
- XML-template-driven report assembly.
Evidence:
- `HomeController.cs` provides the `Home`, `About`, `Glossary`, and `Manual` pages.
- `OrganismController.cs` implements listing, filtering, details, node/subnode views, and CRUD.
- `ReportController.cs` implements report generation and batch export.
## Legacy Architecture Summary
### Application layer
- ASP.NET MVC 3-era application.
- Entity Framework database-first model (`EcoSpecies.edmx` and generated context/models).
- Razor views with jQuery/jQuery UI assets.
### Data layer
- SQL Server database `Eco_Species`.
- 31 SQL scripts in `EcoSpeciesSql_new` for database creation, schema population, lookup tables, admin user creation, and XML template support.
### Content pipeline
- Species Life History text files are semi-structured and heading-driven.
- Legacy Python parsing scripts (`slhparse.py`, `slhparse_2012_0801.py`) contain domain-specific cleanup and tag recognition logic.
- Generated outputs include SQL inserts and RTF/text reports.
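
To illustrate what "heading-driven" parsing could look like in a replacement ingest step, here is a minimal sketch. It assumes a hypothetical convention in which headings carry an outline number such as `1`, `1.2`, or `1.2.3` before the title; the real SLH files and the legacy `slhparse.py` logic may mark headings differently, and `HeadingBlock` and `parse_slh` are illustrative names only.

```python
import re
from dataclasses import dataclass, field

# Hypothetical heading convention: "1 Title", "1.2 Title", "2. Title".
# The actual SLH files may use a different marker.
HEADING_RE = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+(\S.*?)\s*$")


@dataclass
class HeadingBlock:
    number: str
    title: str
    body: list = field(default_factory=list)

    @property
    def level(self):
        # "1" -> level 1, "1.2" -> level 2, and so on.
        return self.number.count(".") + 1


def parse_slh(text):
    """Split text into heading blocks; lines before the first heading are ignored."""
    blocks = []
    current = None
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            current = HeadingBlock(number=match.group(1), title=match.group(2))
            blocks.append(current)
        elif current is not None and line.strip():
            current.body.append(line.strip())
    return blocks
```

A body line that happens to begin with a number would be misread as a heading under this rule, which is the kind of case the domain-specific cleanup in the legacy scripts presumably exists to handle.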
## Important Migration Observations
### What is reusable
- Raw SLH text corpus.
- SQL schema as a source of domain concepts and relationship mapping.
- Parsing logic and tag dictionaries as institutional knowledge.
- Glossary/manual/about content for continuity.
- Existing report outputs for regression comparison.
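
As one way to use the existing report outputs for regression comparison, the sketch below diffs a legacy plain-text report against a newly generated one after collapsing whitespace. It assumes both reports are already available as text (RTF outputs would first need conversion, which is not shown), and `normalize` and `report_diff` are hypothetical helper names.

```python
import difflib


def normalize(text):
    """Collapse whitespace runs and drop blank lines so layout noise is ignored."""
    return [" ".join(line.split()) for line in text.splitlines() if line.strip()]


def report_diff(legacy_text, new_text):
    """Return unified-diff lines; an empty list means no content differences."""
    return list(
        difflib.unified_diff(
            normalize(legacy_text),
            normalize(new_text),
            fromfile="legacy",
            tofile="new",
            lineterm="",
        )
    )
```

An empty result means the two reports match after normalization; any `-` or `+` lines point at content that changed between the legacy and replacement outputs.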
### What should not be copied forward directly
- SQL Server-specific operational assumptions.
- Legacy publish/deploy practices.
- MVC 3 / EF database-first scaffolding.
- Generated binaries and `obj` artifacts.
### Data-model implications
The archive suggests the modern system needs first-class support for:
- species and taxonomic metadata
- one or more source documents per species
- hierarchical sections/headings/subheadings
- citations/references and authoring metadata
- visibility/publication state
- report/export templates
- ecological linkages suitable for graph-style visualization
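
The list above could be sketched as plain data classes. This is only an illustrative shape, not a proposed schema: every class and field name here is hypothetical, and the recursive `Section` is one possible way to avoid a fixed number of heading levels.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    title: str
    body: str = ""
    children: List["Section"] = field(default_factory=list)

    def depth(self):
        # A leaf has depth 1; nesting is unbounded.
        return 1 + max((child.depth() for child in self.children), default=0)


@dataclass
class SourceDocument:
    filename: str
    author: Optional[str] = None
    sections: List[Section] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)


@dataclass
class EcoLink:
    # Directed edge between two species, for graph-style views.
    source: str
    target: str
    kind: str  # hypothetical values, e.g. "prey" or "habitat-associate"


@dataclass
class Species:
    scientific_name: str
    common_name: Optional[str] = None
    is_public: bool = False  # visibility/publication state
    documents: List[SourceDocument] = field(default_factory=list)
    links: List[EcoLink] = field(default_factory=list)
```

Report/export templates are left out of the sketch because the archive's XML template format would need to be reviewed before modeling them.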
## Risks and Gaps
- The source text format is inconsistent and sometimes noisy; ingest must tolerate malformed headings and spacing.
- The legacy system's notes indicate that some outlines used 4-5 heading levels, while the implemented site handled only 3.
- The current repository does not include a clean, already-normalized database dump for direct import.
- Image and asset provenance and usage permissions need review during migration.
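
For the first risk above, one tolerant approach is to normalize a heading line before matching it against a dictionary of known headings. The sketch below assumes the noise is limited to stray whitespace, letter case, and trailing punctuation; the `KNOWN_HEADINGS` entries are placeholders, not the legacy tag dictionary.

```python
import re

# Placeholder entries; the real tag dictionary lives in the legacy parser scripts.
KNOWN_HEADINGS = {"habitat", "diet", "reproduction"}


def normalize_heading(raw):
    """Collapse whitespace, strip trailing punctuation, and case-fold."""
    text = re.sub(r"\s+", " ", raw).strip()
    text = text.rstrip(":.- ").strip()
    return text.casefold()


def classify(raw):
    """Return the canonical heading key, or None if the line is not recognized."""
    key = normalize_heading(raw)
    return key if key in KNOWN_HEADINGS else None
```

Unrecognized lines return `None` instead of raising, so an ingest run can log them for review and continue.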
## Acknowledgements To Preserve
The replacement app should preserve credit to:
- Dr. Peter Rubec for FLELMR-derived source material and species life history content.
- Dr. Diane Blackwood for the original EcoSpecies web application and architecture work.
- Dr. Wesley R. Elsberry for consultation and Python programming support.
- Florida Fish and Wildlife Research Institute and related public-agency context described in the project materials.
## Immediate Migration Recommendation
Use the SLH text corpus as the initial authoritative ingest source, not the legacy MVC app. Treat the SQL schema and parser scripts as reference material for a modern normalized model and for ingest validation.