Phase 4 — Run 3: Operator Cascade (dev → stage and demo → prod)

Branch / PR: jmpicnic/email-integration-phase-4-run-3 (infra), base main (auto-retargets when PR #462 / Run-2 merges).
Group(s): G-B + G-C + G-D operator-execution surface for all four active partitions (dev, stage, demo, prod). Code for all four already lands with Run-2 (PR #462) per DQ-R1-026.
Tasks: T-O1 (per-partition pre-flight, four invocations), T-O3 (Pre-Deploy CLI runs, four invocations), T-O4 (Postmark Compliance reply after dev verifies), T-O5/T-O6/T-O7 (deploys, four invocations), T-D1 (per-partition V-check status updates in verification.md), T-D8 (infra CHANGELOG.md entry: a single bullet describing the deployed-system state across all four partitions).
Working directory: /Users/jmp/code/arda/projects/email-integration-worktrees/phase-4/infrastructure-run-3 (infra) and /Users/jmp/code/arda/projects/email-integration-worktrees/phase-4/documentation (docs).
AWS impact: Resource-touching in Alpha002 (dev + stage *.ardamails.com zones, secrets, IAM roles) and Alpha001 (demo + prod *.ardamails.com zones, secrets, IAM roles). Production deploy (prod) lands inside this run and requires explicit operator confirmation in the execution log.
Postmark accounts touched: PostmarkNonProd (dev, stage), PostmarkProd (demo, prod).
Personas: User as operator for all T-O* tasks; devops-engineer only if a code fix surfaces during execution.

This run consolidates the original Run-3 (stage), Run-4 (demo), and Run-5 (prod) into a single operator-cascade run per DQ-R1-026. The dev partition deploy (originally a post-merge action of Run-2) moves into the cascade as its first entry.

  • PR #462 (Run-2) merged to Arda-cards/infrastructure main. All four active partitions’ mail blocks are present in platforms.ts; PartitionEmailStack, the Pre-Deploy CLI, and the amm.sh partition-mail step are on main.
  • 1Password vault state populated for all four partitions (verified via op item get in pre-flight per partition):
    • op://Arda-DevOAM/Postmark/credential: PostmarkNonProd account-level token
    • op://Arda-StageOAM/Postmark/credential: PostmarkNonProd account-level token
    • op://Arda-DemoOAM/Postmark/credential: PostmarkProd account-level token
    • op://Arda-ProdOAM/Postmark/credential: PostmarkProd account-level token
  • AWS SSO sessions authorisable by the operator: Alpha002-Admin (for dev + stage) and Admin-Alpha1 (for demo + prod). Sessions expire; the operator re-runs aws sso login --profile <name> before each partition’s deploy as needed. amm.sh invocations must include --profile <name> explicitly — the script’s auto-derivation (Admin-${infrastructure}) does not match either Phase-4 profile name. See the operator runbook for the per-partition command shape.
  • DMARC reporting mailbox dmarc-reports@arda.cards exists in Arda’s Google Workspace (provisioned per DQ-R1-015 prior to Phase 3 deploy; still required for Phase 4 partition DMARC records).
  • Phase-4 documentation worktree available at phase-4/documentation so the execution log entries can be written as each partition is deployed.
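
The per-partition bindings in the prerequisites above (infrastructure, SSO profile, Postmark account) can be collected in one lookup table. The following is a minimal sketch rather than project tooling: the names PartitionBinding, BINDINGS, auto_derived_profile, and amm_command are hypothetical, and placing --profile after the positional arguments is an assumption, since the operator runbook defines the actual command shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionBinding:
    infrastructure: str    # AWS infrastructure the partition lives in
    sso_profile: str       # AWS SSO profile the operator must pass explicitly
    postmark_account: str  # Postmark account bound to the partition


# Values copied from the prerequisites; the table structure is illustrative.
BINDINGS = {
    "dev": PartitionBinding("Alpha002", "Alpha002-Admin", "PostmarkNonProd"),
    "stage": PartitionBinding("Alpha002", "Alpha002-Admin", "PostmarkNonProd"),
    "demo": PartitionBinding("Alpha001", "Admin-Alpha1", "PostmarkProd"),
    "prod": PartitionBinding("Alpha001", "Admin-Alpha1", "PostmarkProd"),
}


def auto_derived_profile(infrastructure: str) -> str:
    """Mirror of the Admin-${infrastructure} derivation described for amm.sh."""
    return f"Admin-{infrastructure}"


def amm_command(partition: str) -> list[str]:
    """Build an amm.sh argument list with the profile named explicitly.

    Assumption: --profile follows the two positional arguments.
    """
    binding = BINDINGS[partition]
    return [
        "./amm.sh",
        binding.infrastructure,
        partition,
        "--profile",
        binding.sso_profile,
    ]


if __name__ == "__main__":
    for name, binding in BINDINGS.items():
        derived = auto_derived_profile(binding.infrastructure)
        print(name, " ".join(amm_command(name)), f"(auto-derived would be {derived})")
```

Comparing auto_derived_profile with the table shows why the explicit flag is needed: the derivation yields Admin-Alpha002 and Admin-Alpha001, neither of which is a Phase-4 profile name.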

Cascade order (per DQ-R1-021 partial-order)

dev → T-O4 Postmark Compliance reply → { stage || demo } → prod
  1. dev (Alpha002 / PostmarkNonProd): first partition deployed; informs the subsequent partition deploys. Failure here blocks all downstream partitions.
  2. T-O4: after dev is verified end-to-end, the operator replies to Postmark Compliance ticket #11236089 with the verified-domain evidence (dev.ardamails.com’s DKIM CNAME + Return-Path verified state). Wait for Postmark’s response, or follow the “more-evidence-needed” path, before attempting Sender Signatures for stage, the other partition on PostmarkNonProd.
  3. stage (Alpha002 / PostmarkNonProd): independent of demo; the two may be deployed in either order or concurrently.
  4. demo (Alpha001 / PostmarkProd): first Alpha001 partition; the operator switches the AWS SSO profile to Admin-Alpha1. The PostmarkProd account is already approved (per K-10), so no T-O4-equivalent is required.
  5. prod (Alpha001 / PostmarkProd): production deploy. Both stage and demo must be verified before opening prod’s cascade entry. The operator must record explicit confirmation in the execution log that the production deploy is proceeding intentionally.
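
The partial order above can be written down as a small dependency graph. This is an illustrative sketch only: DEPENDS_ON, cascade_waves, and ready_steps are hypothetical names, and the graph follows the one-line diagram literally, so demo is placed after T-O4 even though it needs no Postmark approval of its own.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must be complete before it starts.
DEPENDS_ON = {
    "dev": set(),
    "T-O4": {"dev"},
    "stage": {"T-O4"},
    "demo": {"T-O4"},
    "prod": {"stage", "demo"},
}


def cascade_waves() -> list[list[str]]:
    """Group steps into waves; steps in one wave may run in any order."""
    sorter = TopologicalSorter(DEPENDS_ON)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def ready_steps(completed: set[str]) -> list[str]:
    """Steps not yet done whose prerequisites are all complete."""
    return sorted(
        step
        for step, deps in DEPENDS_ON.items()
        if step not in completed and deps <= completed
    )


if __name__ == "__main__":
    for number, wave in enumerate(cascade_waves(), start=1):
        print(number, " || ".join(wave))
```

Calling ready_steps with the set of verified steps gives the operator the entries that may be opened next; prod appears only once both stage and demo are in the completed set.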
| Task | Description | Output | Persona |
| --- | --- | --- | --- |
| T-O1-dev | Pre-flight checks for dev (1P token resolvable, AWS profile authed, target Postmark account reachable) | Execution-log § dev / Pre-flight | user |
| T-O5-dev | ./amm.sh Alpha002 dev after PR #462 merge | Execution-log § dev / Deploy; populated cdk.context.json entry for partitionMail:Alpha002:dev | user |
| T-O3-dev (post-deploy verify) | dig checks for the four record types; Postmark Console verification; CFN exports populated | Execution-log § dev / Verification; V-check rows in verification.md populated | user |
| T-O4 | Operator replies to Postmark Compliance ticket #11236089 with dev.ardamails.com verified-domain evidence | Email artefact captured in execution-log § T-O4; arda-nonprod approval status recorded | user |
| T-O1-stage, T-O5-stage, T-O3-stage (verify) | Same shape for stage (Alpha002 / PostmarkNonProd) | Execution-log § stage | user |
| T-O1-demo, T-O6-demo, T-O3-demo (verify) | Same shape for demo (Alpha001 / PostmarkProd) | Execution-log § demo | user |
| T-O1-prod, T-O7-prod, T-O3-prod (verify) | Same shape for prod, with explicit production-deploy confirmation prior to running amm.sh | Execution-log § prod | user |
| T-D1 (per partition) | Update per-partition V-check status rows in verification.md as each partition verifies | verification.md (docs worktree) | user |
| T-D8 | Single CHANGELOG.md entry on the infra PR: under ### Added, one bullet describing the deployed-system state across all four partitions | CHANGELOG.md (infra worktree) | user |
| Contingent: code fix | If a partition surfaces a code-level issue (e.g., a PostmarkProd-account quirk first hit on demo), the fix lands in the same Run-3 infra PR | TS code files | devops-engineer |

The execution-log entry for each partition is the primary artefact; the validate-exit script captures the mechanical exit checks per partition; verification.md captures the formal V-check sign-off.

validate-exit.sh (in this directory) accepts an optional <partition> argument and --post-merge flag. With no argument it runs only the pre-merge code checks against the worktree (build / lint / test / synth of all four {infra}-{partition}-Email stacks). With <partition> --post-merge it adds the post-deploy AWS + Postmark verification checks for that partition. The operator runs it once with no argument before opening the PR, then once per partition with <partition> --post-merge after each amm.sh invocation.
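
The invocation pattern just described (one run with no argument, then one run per partition with --post-merge) can be spelled out as a short plan. This is a sketch under stated assumptions: validate_exit_invocations is a hypothetical helper, the script is assumed to be invoked as ./validate-exit.sh from its own directory, and the listed partition order is only one valid choice because stage and demo may be swapped.

```python
import shlex

PARTITIONS = ("dev", "stage", "demo", "prod")


def validate_exit_invocations(partitions=PARTITIONS, script="./validate-exit.sh"):
    """Return the argument lists the operator runs over the whole cascade."""
    # Pre-merge code checks: run once, with no argument, before opening the PR.
    plan = [[script]]
    # Post-deploy checks: run once per partition after that partition's amm.sh.
    plan += [[script, partition, "--post-merge"] for partition in partitions]
    return plan


if __name__ == "__main__":
    for argv in validate_exit_invocations():
        print(shlex.join(argv))
```

The result is five invocations in total: one pre-merge run and four post-merge runs.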

Mechanical (verified by validate-exit.sh):

  • npm run build && npm run lint && npm test exit 0 in the infra worktree.
  • cdk synth --app apps/Al1x/partition produces the four {infra}-{partition}-Email stacks (Alpha002-dev-Email, Alpha002-stage-Email, Alpha001-demo-Email, Alpha001-prod-Email); templates valid.
  • For each of the four partitions after amm.sh runs: CFN stack {infra}-{partition}-Email in CREATE_COMPLETE or UPDATE_COMPLETE; {partition}.ardamails.com NS-delegated; SPF + DMARC TXT records resolve via public DNS; EmailEncryptionKey and EmailPostmarkAccountToken SM secrets exist.
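
A few of the mechanical checks above reduce to simple predicates. The sketch below is not validate-exit.sh itself, whose contents are not shown here; the helper names are hypothetical, and it assumes TXT values arrive as quoted strings in the way `dig +short TXT` prints them. The `_dmarc.` owner name is the standard DMARC location; where the SPF record sits inside each zone is not stated, so no SPF owner name is assumed.

```python
INFRA_OF = {"dev": "Alpha002", "stage": "Alpha002", "demo": "Alpha001", "prod": "Alpha001"}

# Stack statuses the exit criteria accept after amm.sh has run.
OK_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def stack_name(partition: str) -> str:
    """Expand the {infra}-{partition}-Email pattern for one partition."""
    return f"{INFRA_OF[partition]}-{partition}-Email"


def stack_is_healthy(status: str) -> bool:
    return status in OK_STATUSES


def dmarc_name(partition: str) -> str:
    """Owner name of the DMARC TXT record for the partition's mail zone."""
    return f"_dmarc.{partition}.ardamails.com"


def has_record(txt_values, version_tag: str) -> bool:
    """True if any TXT value starts with the given tag (v=spf1 or v=DMARC1).

    The comparison is case-insensitive, which is lenient for a sketch.
    """
    tag = version_tag.lower()
    return any(
        value.strip().strip('"').lower().startswith(tag) for value in txt_values
    )


if __name__ == "__main__":
    for partition in INFRA_OF:
        print(stack_name(partition), dmarc_name(partition))
```

In practice the status string would come from the CloudFormation stack description and the TXT values from a public resolver; both are left to the caller so the predicates stay side-effect free.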

Operator sign-off (captured in execution log):

  • For each partition: Postmark Console shows {partition}.ardamails.com Sender Signature with DKIM + Return-Path verified on the partition’s bound Postmark account.
  • T-O4 outcome recorded (arda-nonprod approval received OR more-evidence-needed path documented).
  • cdk.context.json in the PR diff contains a partitionMail:<infra>:<partition> block for each of the four partitions.
  • Production-deploy confirmation block in the execution log signed off explicitly before T-O7-prod runs.
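
The cdk.context.json sign-off item above can be checked before the PR is opened. This is a hedged sketch: missing_partition_mail_keys is a hypothetical helper, and it assumes each partitionMail:<infra>:<partition> entry is a top-level key of the JSON document.

```python
import json
from pathlib import Path

EXPECTED = {"dev": "Alpha002", "stage": "Alpha002", "demo": "Alpha001", "prod": "Alpha001"}


def missing_partition_mail_keys(context: dict) -> list[str]:
    """List the partitionMail keys absent from a parsed cdk.context.json."""
    expected = [
        f"partitionMail:{infra}:{partition}" for partition, infra in EXPECTED.items()
    ]
    return [key for key in expected if key not in context]


def check_file(path: str = "cdk.context.json") -> list[str]:
    """Load the context file from the infra worktree and report missing keys."""
    return missing_partition_mail_keys(json.loads(Path(path).read_text()))
```

An empty list from check_file means all four partitions have a context block in the worktree; any returned key names the partition whose deploy has not yet populated the file.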

PR-level:

  • Single infra PR opened from jmpicnic/email-integration-phase-4-run-3; base main; CHANGELOG entry present (single bullet under ### Added); CI green; reviewer approval; merged.

If a partition’s deploy fails mid-cascade:

  1. The operator captures the failure in the execution log under the partition’s section (CFN events, dig output, Postmark API response — whatever is diagnostic).
  2. If the cause is a code issue, a fix lands in the Run-3 infra PR; the operator re-runs amm.sh for the failed partition. The cascade resumes from that partition.
  3. If the cause is operator-environmental (expired SSO session, missing 1P entry, Postmark account quirk), the operator addresses it and re-runs that partition’s amm.sh; no PR change.
  4. Successfully-deployed prior partitions are not rolled back. Per-partition isolation (DQ-R1-021) holds at the resource level: each partition’s stack, secrets, zones, and IAM roles are independent.
  5. If a partition’s failure is non-recoverable in the current cycle (e.g., extended Postmark Compliance back-and-forth blocks stage), the operator may choose to close Run-3 with the partition unverified (PR captures only the verified partitions) and open a follow-up run for the unverified ones. Document this path explicitly in the execution log.
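
The failure-handling steps above amount to a three-way decision on the cause. The sketch below only restates that decision in code; Cause and next_actions are hypothetical names, and the action strings are paraphrases for illustration, not commands.

```python
from enum import Enum


class Cause(Enum):
    CODE = "code issue"
    ENVIRONMENT = "operator-environmental"
    NON_RECOVERABLE = "non-recoverable in the current cycle"


def next_actions(partition: str, cause: Cause) -> list[str]:
    """Return the ordered follow-up actions for a failed partition deploy."""
    # Step 1 applies to every failure: capture diagnostics first.
    actions = [f"capture failure diagnostics in execution log § {partition}"]
    if cause is Cause.CODE:
        actions += [
            "land the fix in the Run-3 infra PR",
            f"re-run amm.sh for {partition}",
        ]
    elif cause is Cause.ENVIRONMENT:
        actions += [
            "resolve the environment issue (no PR change)",
            f"re-run amm.sh for {partition}",
        ]
    else:
        actions += [
            f"close Run-3 with {partition} unverified",
            f"open a follow-up run for {partition}",
            "document this path in the execution log",
        ]
    # Previously deployed partitions are never rolled back in any branch.
    return actions
```

No branch touches partitions that already deployed successfully, which reflects the per-partition isolation described in step 4.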

Single working directory. The existing phase-4/infrastructure-run-3 worktree (created on jmpicnic/email-integration-phase-4-run-3 branch, originally based on Run-2’s branch) is the cascade worktree. After PR #462 merges, the worktree’s branch auto-retargets to main; rebase the branch onto main before opening the cascade PR to make the diff against main clean.

The documentation worktree at phase-4/documentation accumulates the execution-log entries and verification.md V-check sign-off rows as each partition lands.


Copyright: (c) Arda Systems 2025-2026, All rights reserved