Alchemical Hands in the Hypnerotomachia Poliphili

Marginalia, Scholarship & Reception

Pipeline Scripts

This section exposes the deterministic pipeline behind the site. Rather than hiding the build logic, it lists the 31 Python scripts that initialize the database, parse filenames, extract references, match folios to images, seed and enrich metadata, validate assumptions, and generate the static pages. The aim is transparency and reproducibility.

The project's architectural stance is deliberately conservative: SQLite as source of truth, Python for transformation, JSON and static HTML for delivery, and as little framework machinery as possible. That simplicity is not an omission; it is part of the project's long-term durability model.
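
The SQLite-to-JSON-to-static-HTML flow described above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the project's code: the `scholars` columns, the sample row, and the function names are assumptions made up for the example.

```python
import html
import json
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical scholars table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets rows be converted to dicts
    conn.execute("CREATE TABLE scholars (slug TEXT PRIMARY KEY, name TEXT NOT NULL)")
    # Placeholder row, not real project data.
    conn.execute(
        "INSERT INTO scholars (slug, name) VALUES (?, ?)",
        ("example-scholar", "Example <Scholar>"),
    )
    conn.commit()
    return conn


def export_rows(conn):
    """Read rows in a fixed order so repeated builds give identical output."""
    cursor = conn.execute("SELECT slug, name FROM scholars ORDER BY slug")
    return [dict(row) for row in cursor.fetchall()]


def render_page(rows):
    """Render a plain HTML list, escaping every value taken from the database."""
    items = "\n".join(
        f'  <li id="{html.escape(r["slug"])}">{html.escape(r["name"])}</li>'
        for r in rows
    )
    return f"<ul>\n{items}\n</ul>\n"


conn = build_demo_db()
rows = export_rows(conn)
data_json = json.dumps(rows, indent=2, sort_keys=True)  # JSON for delivery
page = render_page(rows)                                # static HTML for delivery
```

Sorting the query and the JSON keys is one way to keep the output deterministic: running the build twice against the same database should produce byte-identical files.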

| Script | Name | Description | Lines |
|---|---|---|---|
| add_alchemist_descriptions.py | Add Alchemist Descriptions | Inserts 13 folio-specific scholarly descriptions for the two alchemist annotators from Russell Ch. 6-7. | 352 |
| add_bibliography.py | Add Bibliography | Populates bibliography (58 entries), scholars (29), timeline (39 events) from hardcoded research data. | 327 |
| add_hands.py | Add Annotator Hands | Creates 11 annotator hand profiles and attributes dissertation references to specific hands. | 443 |
| build_essay_data.py | Build Essay Data | Extracts structured evidence from DB and corpus for the Russell and Concordance essays. | 318 |
| build_reading_packets.py | Build Reading Packets | Assembles structured research packets from corpus search for dictionary enrichment. | 158 |
| build_scholar_profiles.py | Build Scholar Profiles (Legacy) | Original scholar page generator from summaries.json. Superseded by build_site.py. | 283 |
| build_signature_map.py | Build Signature Map | Generates the 448-entry signature-to-folio concordance from the Aldine collation formula (a-z, A-G). | 103 |
| build_site.py | Build Site | Unified site generator: exports data.json, builds all HTML pages (scholars, dictionary, marginalia, bibliography, docs, code, about). | 4048 |
| catalog_images.py | Catalog Images | Parses image filenames from BL and Siena collections into the images table with folio/side metadata. | 224 |
| chunk_documents.py | Chunk Documents | Splits markdown files into ~1500-word semantic chunks for RAG/retrieval systems. | 261 |
| corpus_search.py | Corpus Search | Keyword-based search across markdown chunks and documents with provenance tracking. | 220 |
| dictionary_audit.py | Dictionary Audit | Audits dictionary coverage: missing fields, duplicate slugs, orphaned links, weak terms. | 160 |
| enrich_dictionary.py | Enrich Dictionary | Populates dictionary fields from reading packets with source provenance and review status. | 170 |
| export_showcase_data.py | Export Showcase Data (Legacy) | Original data.json exporter for the gallery. Superseded by build_site.py. | 116 |
| extract_references.py | Extract References | Uses PyMuPDF + regex to extract 282 folio/signature references from Russell's PhD thesis PDF. | 176 |
| generate_dictionary_significance.py | Generate Significance | Generates significance_to_hp and significance_to_scholarship prose for all 80+ dictionary terms. | 447 |
| generate_scholar_overviews.py | Generate Scholar Overviews | Generates 2-3 paragraph overview prose for modern scholars and role descriptions for historical figures. | 347 |
| ingest_perplexity.py | Ingest Perplexity Research | Adds 9 bibliography entries and 3 timeline events from HPPERPLEXITY.txt web research. | 217 |
| init_db.py | Initialize Database | Creates SQLite schema (7 core tables) and catalogs PDFs/documents from the filesystem. | 221 |
| link_scholars.py | Link Scholars | Links scholars to bibliography, tags historical figures, matches summaries.json to bibliography entries. | 205 |
| match_refs_to_images.py | Match Refs to Images | SQL join pipeline matching dissertation references to manuscript images via the signature map. | 142 |
| migrate_dictionary_v2.py | Dictionary Schema V2 | Extends dictionary_terms with significance, source tracking, provenance, and confidence columns. | 55 |
| migrate_timeline.py | Timeline Migration | Adds category, medium, location, image_ref, confidence columns to timeline_events table. | 41 |
| migrate_v2.py | Schema Migration V2 | Adds annotations, annotators, doc_folio_refs, dictionary tables, review/provenance columns. Downgrades BL confidence. | 389 |
| pdf_to_markdown.py | PDF to Markdown | Extracts all PDFs to markdown with YAML frontmatter, page markers, and metadata lookup. | 373 |
| seed_copies.py | Seed Copies | Creates hp_copies table and seeds six annotated copies with full metadata from Russell 2014. | 146 |
| seed_dictionary.py | Seed Dictionary | Inserts 37 dictionary terms across 6 categories with 76 bidirectional cross-reference links. | 429 |
| seed_dictionary_v2.py | Seed Dictionary V2 | Seeds 43 HP entity terms: characters, places, architecture, gardens, processions, aesthetics, materials. | 543 |
| seed_dictionary_v3.py | Seed Dictionary V3 | Seeds 14 additional terms: narrative form, built form, aesthetics, alchemy, material culture. | 260 |
| seed_timeline_v2.py | Seed Timeline V2 | Seeds ~30 new timeline events: art, literary influence, scholarly milestones, garden design. | 249 |
| validate.py | Validate & QA | Checks data integrity (duplicate slugs, broken links, confidence distribution) and writes AUDIT_REPORT.md. | 264 |
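
To make one row of the table concrete, here is a minimal sketch of the kind of splitting that chunk_documents.py is described as doing: grouping markdown paragraphs into chunks of roughly 1500 words. The function name `chunk_markdown`, the `target_words` parameter, and the use of blank-line paragraph breaks as the "semantic" boundary are all assumptions; the actual script may split on headings or use other rules.

```python
import re


def chunk_markdown(text, target_words=1500):
    """Group blank-line-separated paragraphs into chunks of about target_words.

    Paragraphs are never split, so a single paragraph longer than the
    target becomes a chunk of its own.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n_words = len(para.split())
        # Close the current chunk if this paragraph would overshoot the target.
        if current and count + n_words > target_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n_words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Usage: five 4-word paragraphs with a 10-word target.
sample = "\n\n".join(" ".join(["word"] * 4) for _ in range(5))
chunks = chunk_markdown(sample, target_words=10)
```

Breaking only at paragraph boundaries keeps each chunk readable on its own, at the cost of chunk sizes that vary around the target rather than matching it exactly.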