Runtime documentation -- design review and update proposal

This document inventories the existing pages under current-system/runtime/ and proposes the updates needed to align the section with the principles adopted in the Runtime Overview (Phase 0 of the email-integration project).

The proposal is intended to be read alongside, and applied after, the principle adoption. It does not propose changes to non-runtime documentation (Source view, Process view, Domain view).


The email-integration project introduces three things that the existing runtime documentation does not cleanly accommodate:

  1. External-service IaC: the infrastructure repository now defines resources that live in Postmark (servers, sending domains), 1Password (account-token items), and GitHub (Actions secrets). The existing IaC docs assume AWS CDK + Helm only.
  2. Operator-facing scripts beyond amm.sh: postmark-foundations and gha-secrets follow the same execution-script pattern as amm.sh (operator-triggered, robust, self-defined, self-documenting), but the pattern is currently documented only as a description of amm.sh itself.
  3. The Corporate instance group: free-kanban-tool is the first Corporate-instance asset the platform owns. The existing Platform Structure documentation describes only the Application Runtime hierarchy.

Adopting the principles in Runtime Overview makes the gaps explicit and provides a target shape; this document plans the cascading updates.


Although the principles motivate refactoring of pre-existing operator scripts (notably amm.sh and deploy-root.sh), the cost of revalidating and rerunning them is high. The constraint adopted for this proposal is:

  • Modifications to pre-existing scripts are kept to a minimum.
  • Every individual change to a pre-existing script is flagged inline in the proposal sections below and reviewed individually before implementation.
  • Larger principle-aligned refactors of pre-existing scripts are deferred to dedicated future projects, not bundled into this work.

New scripts are exempt from this constraint and are expected to follow the principles from the start.


| Page | Today’s scope | Gap vs principles |
| --- | --- | --- |
| runtime/index.md | Single-sentence stub | Updated as part of this pass to carry the six guiding principles. |
| runtime/platform-structure.md | Five-level containment hierarchy (Root → Infrastructure → Partition → Component → Resource) | Implicitly Application-Runtime-only. Does not acknowledge Corporate or OAM as instance groups. |
| runtime/environments.md | Lists Application-Runtime Environments mapped to AWS accounts | Same scoping issue. |
| runtime/iac/index.md | CDK + Helm tooling, repo layout, amm.sh pointer | Lists only AWS / Kubernetes tools. No place for third-party API IaC (Postmark, GitHub, 1Password). The tools/ directory mentioned in the layout is not specified anywhere. |
| runtime/iac/apps.md | Apps deploy Infrastructure + Partition | Apps are also needed for Corporate and Platform-level instances. |
| runtime/iac/constructs.md | Categories: compute / networking / storage / xgress / oam / platform | No category for external-service constructs. The pattern (Configuration / Props / Built) assumes CDK semantics. |
| runtime/iac/stacks.md | CFN exports / imports, AWS-centric | No equivalent pattern documented for “stack-equivalent” code that drives third-party APIs. |
| runtime/iac/orchestration.md | Deep description of amm.sh | Conflates the pattern (operator-driven script with robust / self-defined / self-documenting properties) with one instance of it. |
| runtime/iac/failure-mode-analysis.md | Single per-script analysis pinned to amm.sh | No convention for per-script analyses; will need a sibling for each operator script. |
| runtime/dns-structure.md, runtime/build-and-deployment.md, runtime/development-pipelines.md, runtime/url-naming.md, runtime/network-routing.md, runtime/mtls.md, runtime/static-assets.md | Domain-specific runtime topics | Not affected by this pass; orthogonal to the principles change. |

Beyond the per-page gaps, the following topics have no page at all:

  • External resources as IaC entities: there is no documentation explaining that Postmark accounts, 1Password vaults, or the GitHub organisation are first-class runtime resources represented in this repository.
  • Operator-script pattern as a generalisation: the principles document robustness, self-definition, and self-documentation, but the contract is not formalised separately from amm.sh.
  • tools/ directory: ad-hoc operator utilities (e.g., a thin GitHub-Actions-secret CLI) that are intentionally outside the instance-group CLIs are not catalogued anywhere.
  • Drift detection: principle 4 introduces it; no page covers the convention or catalogues current drift tests.
  • Instance modelling: the instances/ directory and the platform/ vs instances/ distinction are not explained in their own right.

Updates are described as deltas. Each is small and focused; none rewrites a whole page. No item below modifies a pre-existing operator script, except the path-only change to deploy-root.sh flagged in §4.9. The proposal explicitly flags any code-side knock-on under the affected page.

4.1 platform-structure.md

  • Re-frame the document title and opening sentence: “Application Runtime Platform Structure” (not the only platform structure, but the one this page covers).
  • Add a final section “Other instance groups” briefly describing the existence of Root / OAM / Corporate, with cross-links to their (forthcoming) pages once authored.

4.2 environments.md

  • Same re-framing: this page covers Application-Runtime environments only.
  • Add a leading note that Corporate-instance “environments” exist (e.g., the production Free Kanban Tool) but follow a different shape and are documented separately.

4.3 iac/index.md

  • Expand the Tools table to include the SDK paths used for third-party-API IaC (Postmark API client, GitHub Octokit, 1Password SDK). Today only AWS-side tools are listed.
  • Replace the infrastructure/ directory map with one that:
    • Removes the speculative tools/ line in its current form and replaces it with a properly described tools/ directory for ad-hoc operator utilities (see §5.4).
    • Adds an entry for the operator-script convention (see §5.3).
    • Adds an entry for instances/ formally distinguished from apps/ (today they are explained side-by-side without naming the relationship).
  • Update the Code Organisation section to reference the three resource categories (AWS / external reference / third-party API).

4.4 iac/apps.md

  • Generalise from “Infrastructure + Partition” to “one or more top-level resources for a specific instance.” The pattern remains; the scope broadens.
  • Add a paragraph noting that some instances (Corporate, Root) deploy zero AWS-CDK stacks but still have an App-equivalent: a TypeScript entry point that fixes the instance’s configuration and invokes the relevant Stacks (which may, in turn, drive third-party APIs rather than CFN).

4.5 iac/constructs.md

  • Add a category to the organisation table: external, for constructs that wrap a third-party API call surface (e.g., Postmark server / domain operations). Document that such constructs follow the same Configuration / Props / Built shape but Built may include API-side identifiers rather than CDK resource handles.
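
As an illustration of the Configuration / Props / Built shape applied to an external construct, here is a minimal sketch. Every name in it (PostmarkServerConfiguration, PostmarkApi, buildPostmarkServer, and so on) is hypothetical, and PostmarkApi is a stand-in interface, not the real Postmark client. It is kept synchronous for brevity; a real client would be asynchronous.

```typescript
// Hypothetical sketch: none of these names exist in the repository.

/** Configuration: values fixed by the instance declaration. */
interface PostmarkServerConfiguration {
  readonly serverName: string;
  readonly sendingDomain: string;
}

/** Stand-in for a third-party API client (an assumption, not the real SDK). */
interface PostmarkApi {
  ensureServer(name: string): number;
  ensureDomain(domain: string): number;
}

/** Props: collaborators supplied by the enclosing stack. */
interface PostmarkServerProps {
  readonly api: PostmarkApi;
}

/** Built: API-side identifiers rather than CDK resource handles. */
interface PostmarkServerBuilt {
  readonly serverId: number;
  readonly domainId: number;
}

function buildPostmarkServer(
  configuration: PostmarkServerConfiguration,
  props: PostmarkServerProps,
): PostmarkServerBuilt {
  const serverId = props.api.ensureServer(configuration.serverName);
  const domainId = props.api.ensureDomain(configuration.sendingDomain);
  return { serverId, domainId };
}

// Usage with an in-memory fake, so the sketch runs without network access.
// The fake derives ids from string lengths purely to have something deterministic.
const fakePostmarkApi: PostmarkApi = {
  ensureServer: (name) => name.length,
  ensureDomain: (domain) => domain.length + 1000,
};
const builtServer = buildPostmarkServer(
  { serverName: "example-server", sendingDomain: "mail.example.com" },
  { api: fakePostmarkApi },
);
```

Passing the API client through Props keeps the construct testable against a fake, which is one possible way to keep third-party constructs as easy to validate as CDK ones.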

4.6 iac/stacks.md

  • Add a section “Stacks for third-party services”: describes the analogous pattern when there is no CFN stack to publish to. The exportDefinition / exportValues machinery is replaced by an equivalent declared-output mechanism (e.g., a typed result object written to 1Password or to a file artifact). Concrete shape to be authored alongside the first migrated example.
  • Document the cross-instance reference convention (see resolution in §5.2): direct TypeScript import for compile-time constants; importValues(forElement) pattern for values resolved when resources are created.
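
The declared-output mechanism for third-party stacks has no concrete shape yet. Purely as a sketch of one possibility, a typed result object could be serialised to a JSON artifact and validated when read back. The PostmarkStackOutputs type and both function names are assumptions made for illustration, and writing the text to a file or to 1Password is left out.

```typescript
// Hypothetical shape for the declared outputs of a third-party "stack".
interface PostmarkStackOutputs {
  readonly serverId: number;
  readonly sendingDomain: string;
}

/** Producer side: turn the typed result into artifact text. */
function serializeOutputs(outputs: PostmarkStackOutputs): string {
  return JSON.stringify(outputs, null, 2);
}

/** Consumer side: parse artifact text and reject anything off-contract. */
function parseOutputs(text: string): PostmarkStackOutputs {
  const raw: unknown = JSON.parse(text);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("outputs must be a JSON object");
  }
  const candidate = raw as Record<string, unknown>;
  const serverId = candidate["serverId"];
  const sendingDomain = candidate["sendingDomain"];
  if (typeof serverId !== "number" || typeof sendingDomain !== "string") {
    throw new Error("outputs do not match the declared contract");
  }
  return { serverId, sendingDomain };
}
```

Validating on the consumer side plays a role similar to a failed CFN import: a consumer finds out at read time that the producer's contract changed.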

4.7 iac/orchestration.md

  • Rename the page to “amm.sh Deployment Orchestrator” to scope it explicitly. Doc-only; no amm.sh change.
  • Move the script principles (operator-driven, robust, self-defined, self-documenting) into a new sibling page (§5.3) and have this page reference them. Doc-only.

4.8 iac/failure-mode-analysis.md

  • Convert this to a folder (iac/failure-mode-analyses/) with one file per analysed script: amm.md (the existing content), postmark-foundations.md (forthcoming), etc.
  • Add an index.md for the folder describing the analysis template.

4.9 apps/rootConfiguration/ directory rename (resolved — §6 was Q7)

Rename src/main/cdk/apps/rootConfiguration/ to src/main/cdk/apps/Root/ to align with the instance-group naming used elsewhere (apps/Al1x/, future apps/Corporate/) and with the new instances/Root/ directory.

Pre-existing-script flag: deploy-root.sh references the renamed path and will require a corresponding minimal change. This is acceptable under §2 because the change is mechanical (path-only) and easily revalidated. No other behavioural change to the script.


Listed in priority order.

5.1 External resources

How external resources (Postmark accounts, 1Password vaults, GitHub organisation, etc.) are represented in this repository. Covers:

  • The distinction between referenced (Arda does not create) and API-managed (Arda creates / reconciles via API).
  • Where references live: src/main/cdk/platform/<service>-configuration.ts.
  • How references propagate into instances and into operator scripts.
  • Worked examples: Postmark accounts, 1Password vaults, GitHub organisation.
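
The referenced / API-managed distinction could be modelled as a discriminated union in a platform/ configuration file. The sketch below is an assumption about shape only: the type names, field names, and placeholder identifiers are invented for illustration.

```typescript
// Hypothetical sketch of content for a platform/<service>-configuration.ts file.

type ExternalService = "postmark" | "1password" | "github";

/** A resource that exists outside the repository and is only pointed at. */
interface ReferencedResource {
  readonly kind: "referenced";
  readonly service: ExternalService;
  readonly externalId: string;
}

/** A resource the repository creates or reconciles through the service API. */
interface ApiManagedResource {
  readonly kind: "api-managed";
  readonly service: ExternalService;
  readonly desired: Readonly<Record<string, string>>;
}

type ExternalResource = ReferencedResource | ApiManagedResource;

/** Narrowing on `kind` forces every consumer to handle both categories. */
function describeResource(resource: ExternalResource): string {
  switch (resource.kind) {
    case "referenced":
      return `${resource.service}: reference to ${resource.externalId} (never created here)`;
    case "api-managed":
      return `${resource.service}: reconciled via API (${Object.keys(resource.desired).length} declared attributes)`;
  }
}

// Placeholder values; real identifiers would come from the services themselves.
const exampleAccount: ReferencedResource = {
  kind: "referenced",
  service: "postmark",
  externalId: "account-placeholder",
};
const exampleServer: ApiManagedResource = {
  kind: "api-managed",
  service: "postmark",
  desired: { name: "example-server" },
};
```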

5.2 Instance modelling

How the instances/ directory is structured.

  • One subdirectory per instance group (Alpha001/, Alpha002/, SandboxKyle002/, Corporate/, Root/, OAM/).
  • Within each, one TypeScript file per top-level resource the instance owns.
  • The platform/ vs instances/ distinction: platform/ declares decoupled configuration that the repository does not own (external “magic” values, pre-defined patterns); instances/ declares concrete runtime objects under the repository’s responsibility.
  • Naming conventions for instance files and exported instance objects.
  • Cross-instance reference convention (resolved — §6 was Q6):
    • Compile-time TypeScript constants are referenced via direct import. Example: an instances/Corporate/free-kanban-tool.ts file that consumes a Root-owned zone identifier defined as a TS constant uses import { ARDAMAILS_ZONE_NAME } from '../Root/dns'.
    • Values resolved at resource-creation time (CFN export ARNs, runtime-generated ids) are consumed via the existing importValues(forElement) pattern documented in iac/stacks.md. The producing instance publishes the value via exportDefinition / exportValues; the consumer imports it.
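
The two reference styles can be contrasted in one sketch. It is collapsed into a single file so it stands alone; in the repository the two halves would be separate files joined by the import shown in the comment. ExportLookup is a stand-in for importValues(forElement), whose real signature is not reproduced here, and the export name and the interface are invented for illustration.

```typescript
// Hypothetical sketch. The two halves would live in separate files:
// instances/Root/dns.ts and instances/Corporate/free-kanban-tool.ts.

// --- instances/Root/dns.ts ---
// Compile-time constant, known when the code is written.
const ARDAMAILS_ZONE_NAME = "ardamails.com";

// --- instances/Corporate/free-kanban-tool.ts ---
// In the real layout the constant would arrive via:
//   import { ARDAMAILS_ZONE_NAME } from '../Root/dns';

/** Stand-in for importValues(forElement); an assumption, not the real API. */
type ExportLookup = (exportName: string) => string;

interface FreeKanbanToolDeclaration {
  /** Fixed at compile time through a direct import. */
  readonly zoneName: string;
  /** Only known once Root has been deployed, so resolution is deferred. */
  readonly resolveZoneId: (lookup: ExportLookup) => string;
}

const freeKanbanTool: FreeKanbanToolDeclaration = {
  zoneName: ARDAMAILS_ZONE_NAME,
  // "RootArdamailsZoneId" is an invented export name for illustration.
  resolveZoneId: (lookup) => lookup("RootArdamailsZoneId"),
};
```

The compile-time constant fails at build time if Root renames it, while the deferred value can only fail when resources are created. That difference is the practical reason for keeping the two styles distinct.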

5.3 Operator scripts

Generalised contract for operator-facing scripts.

  • The three properties (robust, self-defined, self-documenting) restated with concrete examples.
  • Required script anatomy: prerequisite check, configuration sourcing, dry-run mode, structured output, idempotent reconcile, post-run summary.
  • Granularity: one operator script per class of resource, matching the pattern already established by amm.sh (Application-Runtime resources) and deploy-root.sh (Root-instance resources). A future Corporate CLI follows the same shape: one entry point per class of resource owned by Corporate.
  • tools/ directory (see §5.4) holds ad-hoc operator utilities that fall outside the instance-driven model. The catalogue distinguishes these from the principal operator scripts.
  • Catalogue of principal operator scripts: amm.sh (existing), deploy-root.sh (existing), postmark-foundations CLI (Phase 0). Future scripts append to this list.
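
The required anatomy can be sketched against in-memory state. This is not taken from amm.sh or deploy-root.sh: the function names, the RunSummary shape, and the idea of modelling live state as a Map are all assumptions chosen to keep the example runnable.

```typescript
// Hypothetical sketch of the script anatomy; all names are invented.

interface ScriptOptions {
  readonly dryRun: boolean;
}

interface DesiredItem {
  readonly name: string;
  readonly value: string;
}

interface RunSummary {
  readonly dryRun: boolean;
  readonly created: string[];
  readonly updated: string[];
  readonly unchanged: string[];
}

/** Prerequisite check: returns the names of missing settings. */
function checkPrerequisites(
  env: Readonly<Record<string, string | undefined>>,
  required: readonly string[],
): string[] {
  return required.filter((key) => !env[key]);
}

/** Idempotent reconcile: only touches items whose value differs. */
function reconcile(
  desired: readonly DesiredItem[],
  actual: Map<string, string>,
  options: ScriptOptions,
): RunSummary {
  const summary: RunSummary = {
    dryRun: options.dryRun,
    created: [],
    updated: [],
    unchanged: [],
  };
  for (const item of desired) {
    const current = actual.get(item.name);
    if (current === item.value) {
      summary.unchanged.push(item.name);
      continue;
    }
    (current === undefined ? summary.created : summary.updated).push(item.name);
    // Dry-run mode reports the change but leaves live state alone.
    if (!options.dryRun) {
      actual.set(item.name, item.value);
    }
  }
  return summary;
}

/** Structured output / post-run summary as machine-readable JSON. */
function formatSummary(summary: RunSummary): string {
  return JSON.stringify(summary);
}
```

Running reconcile twice with the same desired list should report everything as unchanged on the second pass, which is a cheap way to test the idempotency property.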

5.4 tools/ directory

Convention for the tools/ directory: a flat collection of small operator utilities that operators invoke at their discretion, not driven by an instance declaration.

  • Inclusion criteria: a tool belongs in tools/ when its inputs are operator-supplied at invocation (e.g., target repository, secret name, value source), and there is no single declarative source for those inputs in instances/ or platform/.
  • Examples expected in scope:
    • tools/gha-secret.ts — shallow CLI to set a GitHub Actions secret in a given repository, with the value sourced from a 1Password reference. Operator supplies --repo, --name, --op-ref at invocation. Used during the transition to fully-declarative repo-secret management.
    • Candidates from the current scripts/lib/ that may belong here — to be reviewed during the scripts re-organisation. Some lib/ content is library code (stays where it is or moves to src/main/cdk/utils/); some is operator-facing tooling (moves to tools/).
  • Each tool is independently testable and self-contained; tools are not allowed to depend on instance declarations (if they would, they should become subcommands of the relevant instance-group CLI instead).
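
To make the inclusion criterion concrete, the following sketch covers only the argument handling of a tool like tools/gha-secret.ts. The --repo, --name, and --op-ref flags come from the description above; the parser itself, the GhaSecretArgs type, and the example values are assumptions. Resolving the 1Password reference and calling GitHub are deliberately left out.

```typescript
// Hypothetical sketch of argument handling for tools/gha-secret.ts.

interface GhaSecretArgs {
  readonly repo: string;
  readonly name: string;
  readonly opRef: string;
}

/** Parses "--flag value" pairs; every input is operator-supplied at invocation. */
function parseGhaSecretArgs(argv: readonly string[]): GhaSecretArgs {
  const values = new Map<string, string>();
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (
      flag === undefined ||
      !flag.startsWith("--") ||
      value === undefined ||
      value.startsWith("--")
    ) {
      throw new Error(`expected "--flag value" pairs, got: ${argv.slice(i, i + 2).join(" ")}`);
    }
    values.set(flag.slice(2), value);
  }
  const required = (key: string): string => {
    const found = values.get(key);
    if (found === undefined) {
      throw new Error(`missing required flag --${key}`);
    }
    return found;
  };
  return {
    repo: required("repo"),
    name: required("name"),
    opRef: required("op-ref"),
  };
}

// Example invocation arguments; all values are placeholders.
const exampleArgs = parseGhaSecretArgs([
  "--repo", "example-org/example-repo",
  "--name", "EXAMPLE_TOKEN",
  "--op-ref", "op://example-vault/example-item/token",
]);
```

Nothing in the parser reads from instances/ or platform/, which is the property that would keep such a tool in tools/ rather than under an instance-group CLI.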

5.5 Drift detection

The convention for periodic drift / integration tests.

  • What “drift” means in each resource category.
  • Where the tests live (per-script CI workflow, scheduled trigger).
  • How failures are surfaced (auto-opened GitHub issue with a fixed label set).
  • Catalogue: postmark-foundations:integration (Phase 0). Future drift tests append.
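
One possible reading of drift is "a declared attribute differs from what the service reports". Under that assumption, a drift test could reduce to a comparison plus an issue draft, as sketched below. The label set, the IssueDraft shape, and the function names are invented; actually opening the GitHub issue is out of scope for the sketch.

```typescript
// Hypothetical sketch of a drift check; all names and labels are invented.

type Attributes = Readonly<Record<string, string>>;

interface DriftFinding {
  readonly attribute: string;
  readonly declared: string | undefined;
  readonly observed: string | undefined;
}

/** Compares declared attributes with what the service API reports. */
function detectDrift(declared: Attributes, observed: Attributes): DriftFinding[] {
  const keys = Object.keys(declared)
    .concat(Object.keys(observed).filter((key) => !(key in declared)))
    .sort();
  const findings: DriftFinding[] = [];
  for (const attribute of keys) {
    if (declared[attribute] !== observed[attribute]) {
      findings.push({
        attribute,
        declared: declared[attribute],
        observed: observed[attribute],
      });
    }
  }
  return findings;
}

interface IssueDraft {
  readonly title: string;
  readonly labels: readonly string[];
  readonly body: string;
}

// A fixed label set, as the convention suggests; these two labels are placeholders.
const DRIFT_LABELS: readonly string[] = ["drift", "runtime"];

/** Returns null when there is nothing to report, so no issue is opened. */
function toIssueDraft(testName: string, findings: readonly DriftFinding[]): IssueDraft | null {
  if (findings.length === 0) {
    return null;
  }
  const lines = findings.map(
    (f) => `- ${f.attribute}: declared=${f.declared ?? "(absent)"}, observed=${f.observed ?? "(absent)"}`,
  );
  return {
    title: `Drift detected by ${testName}`,
    labels: DRIFT_LABELS,
    body: lines.join("\n"),
  };
}
```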

5.6 instance-groups.md (sibling of platform-structure.md)

Top-level taxonomy of instance groups.

  • Application Runtimes (with a pointer to platform-structure.md for the deep structure).
  • Platform-level (Root, OAM) — short descriptions and current state.
  • Corporate — short description and current state, with free-kanban-tool as the worked example.
  • Naming and lifecycle conventions per group.

5.7 iac/root.md (priority — no longer deferred)

Promoted from “deferred” because Corporate-instance declarations need to make cross-instance references into Root for resources that Root owns (notably DNS zones). At minimum, Root needs a declaration of the ardamails.com zone (and any sub-zones it owns) so that Corporate’s free-kanban-tool declaration can reference the zone by symbol rather than by string id.

Initial scope: enumerate the DNS zones currently in platformRoot, their CDK / CFN stack of origin, and the export contract by which other instances consume them. Subsequent passes can add the rest of Root’s holdings.

5.8 (Deferred) iac/corporate.md, iac/oam.md

Per-instance-group detail pages. Authored once each group has at least one concrete element documented beyond the worked example here.

iac/corporate.md becomes immediately useful as the home for the Free Kanban Tool description; iac/oam.md remains deferred until OAM has content.


| Step | Scope | Owner |
| --- | --- | --- |
| 1. Adopt principles | runtime/index.md update (done in this pass). | Done. |
| 2. Light-touch updates to existing pages | §4.1 to §4.4. Re-frame scope, add cross-links. No structural rewrites. | Email-integration Phase 0 follow-up. |
| 3. Author new pages, priority 5.1 to 5.5 | External resources, instance modelling, operator scripts, tools, drift detection. | Email-integration Phase 0 follow-up. |
| 4. apps/rootConfiguration/ rename | §4.9. Mechanical path rename plus the corresponding minimal change to deploy-root.sh. | Email-integration Phase 0 follow-up; flagged for individual review. |
| 5. Author instance-groups.md and iac/root.md | §5.6, §5.7. Root’s minimal declaration is required to enable Corporate cross-instance references. | Email-integration Phase 0 follow-up. |
| 6. Refactor iac/failure-mode-analysis.md into a folder | §4.8. | Triggered by the second per-script analysis (i.e., when postmark-foundations gets one). |
| 7. Per-instance-group pages | §5.8. | As content becomes available. |

Steps 2 and 3 can be a single PR.


(Q6 and Q7 from the prior version were resolved and folded into §5.2 and §4.9 respectively.)

  1. Terminology for non-AWS “stacks”: do we keep the word Stack for code that orchestrates third-party APIs without CFN, or introduce a parallel term (e.g., Choreography, Reconciler)? Reusing Stack preserves the construct → stack → app hierarchy mentally; introducing a new term avoids implying CFN semantics that do not apply.
  2. apps/ directory layout: today apps/Al1x/ groups Application-Runtime apps. Do Corporate apps go in apps/Corporate/, or is the App entry point co-located in instances/Corporate/? The latter keeps everything for an instance group in one folder; the former preserves a uniform top-level apps/ listing.
  3. Versioning of external references: when an external service materially changes (e.g., Postmark plan change, 1Password SDK major bump), how is that reflected — as a code change in platform/, with what review path? Worth a short policy section, possibly in external-resources.md.
  4. Where Helm fits in the hierarchy: Helm releases are arguably “third-party-API IaC” too, but the existing docs treat Helm as a parallel tool. Decide whether to fold Helm under the same Constructs / Stacks / Apps lens or document it separately.
  5. DNS zone ownership — Root vs partition: Phase 1 currently assigns per-partition mail sub-zones (prod.ardamails.com, dev.ardamails.com, …) to partition instances. The principle that Root owns cross-cutting platform resources suggests these sub-zones may belong to Root instead, with partitions consuming references. Resolution deferred — the current Phase 1 design is honoured — but flagged so the next iteration can reconsider.

  • Code reorganisation in the infrastructure repository beyond the apps/rootConfiguration/ → apps/Root/ rename in §4.9 — this proposal describes the target shape of the documentation. Other code moves are tracked separately under the email-integration scripts re-organisation effort.
  • Helm and Kubernetes-pipeline documentation changes.
  • Source-view and Process-view documentation — principles 1 (repo-as-reference) and 5 (instance groups) have implications there too, but those are pursued in a separate review.
  • Principle-aligned refactoring of pre-existing operator scripts (amm.sh, deploy-root.sh) beyond the minimal path change required by §4.9. These refactors are deferred to dedicated future projects to limit revalidation cost; see §2.

9. Learning note — how the project would have been split under these principles

Recorded in this section for now; scheduled to be lifted to a top-level learnings.md file in the project directory at project close so subsequent projects can find it without browsing this review.

If the principles in Runtime Overview had been in place when this project was scoped, the email-integration work would have been split as:

| Phase | Scope | Output |
| --- | --- | --- |
| Phase-00 | Capture baseline Postmark resources as platform/ declarations: account references, vault references, plan attributes. Implement the supporting Constructs / Stacks for the Postmark API surface. | Code in platform/, constructs/external/postmark-*, stacks/external/postmark-*. No live execution. |
| Phase-01 | Author the Root instance: declare and (where missing) deploy the DNS zones owned by Root. Update deploy-root.sh to source values from the Root instance declarations. | instances/Root/dns.ts populated; apps/Root/... deploys it. |
| Phase-1a | Create the free-kanban-tool instance within Corporate: declaration in instances/Corporate/, App entry point under apps/Corporate/, dedicated CLI script. Cross-instance reference into Root for the sending-domain zone. | New CLI; Free Kanban Tool live. |
| Phase-1b | Add per-partition mail resources to Application Runtime instances. Update apps/Al1x/... and amm.sh accordingly. | Partition mail capability live. |

Effect: Phase 0 (current) becomes a “build the code” phase whose execution is intentionally deferred until Root is properly set up (effectively Phase-1 of the current plan). The current Phase 1 spec corresponds to the merge of Phase-01 + Phase-1b above.

The split that actually happened conflates Phase-00 and Phase-1a (Free Kanban Tool live execution depends on Root’s prod.ardamails.com zone, which lives in the current Phase 1 spec). Recognising this mismatch motivated the separation of the CDK code completion from the live execution in Phase 0’s current state.