
Phase 4 — Run Choreography

Phase 4 is decomposed into seven runs, sequenced as a 7-PR rollout. Each run is launched independently via the launch-team skill; the user is the choreographer between runs (operator gates, Postmark Compliance reply, per-partition cdk diff review). The decomposition rationale is in evaluation.md; the per-run plans are in runs/.

| Run | Branch / PR | Scope | Working dir | AWS impact |
| --- | --- | --- | --- | --- |
| run-1-workspace-refactors | phase-4-G-A | G-A: construct generalisation, byte-identity guard, accessor, reserved-words extension, helper extraction | phase-4/infrastructure (+ docs verification entry) | Synth-only + Root read-only diff |
| run-2-dev-rollout | phase-4-G-B-C-D-dev | G-B+C+D for dev; includes G-POSTMARK-5 arda-nonprod unlock | phase-4/infrastructure | Resource-touching (Alpha002) |
| run-3-stage-rollout | phase-4-G-B-C-D-stage | G-B+C+D for stage | phase-4/infrastructure | Resource-touching (Alpha002) |
| run-4-demo-rollout | phase-4-G-B-C-D-demo | G-B+C+D for demo | phase-4/infrastructure | Resource-touching (Alpha001) |
| run-5-prod-rollout | phase-4-G-B-C-D-prod | G-B+C+D for prod (production deploy) | phase-4/infrastructure | Resource-touching (Alpha001) |
| run-6-drift-workflow | phase-4-G-E | G-E: runtime-platform-drift.yml + driver + shared utility extraction | phase-4/infrastructure | None |
| run-7-documentation | phase-4-G-F | G-F: current-system retrofit, rotation runbook, secret-delivery-pattern.md content fill, docs CHANGELOG | phase-4/documentation | None |

Per the project-decomposition skill, this document captures the cross-run choreography: the sequencing graph, the operator-gate handoffs, and the artifact dependencies that the per-run project-plan.md files alone cannot express.

2. Setup phase — harness prompt minimisation


Before launching run-1, perform the following one-time setup to minimise harness permission prompts triggered by bash command-shape variants during validation. Each is a sunk cost paid once; the savings compound across the seven runs and the per-partition validate-exit.sh invocations.

2.1 .claude/settings.local.json allowlist patches


Add the following patterns to the project-level settings (preferred) or to the user’s ~/.claude/settings.json. Group them under a clear comment so future maintainers see why they were added.

{
  "permissions": {
    "allow": [
      "Bash(bash */validate-exit.sh*)",
      "Bash(dig *)",
      "Bash(aws cloudformation describe-stacks*)",
      "Bash(aws cloudformation get-template*)",
      "Bash(aws cloudformation list-exports*)",
      "Bash(aws secretsmanager describe-secret*)",
      "Bash(aws iam get-role*)",
      "Bash(aws sts get-caller-identity*)",
      "Bash(gh pr view *)",
      "Bash(gh pr checks *)",
      "Bash(gh run view *)",
      "Bash(gh workflow run *)",
      "Bash(git -C * *)",
      "Bash(npm --prefix * *)",
      "Bash(make -C * *)"
    ]
  }
}
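To see which command shapes the patterns above would cover, the following sketch approximates the matching with bash globs. This is an assumption made for illustration: the harness's real matcher may treat wildcards differently, and `is_allowed` and `ALLOW_PATTERNS` are hypothetical names, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch only: approximates allowlist matching with bash glob semantics.
# The harness's actual matching rules may differ from this.

# A subset of the patterns from the settings file, without the Bash(...) wrapper.
ALLOW_PATTERNS=(
  'bash */validate-exit.sh*'
  'dig *'
  'aws sts get-caller-identity*'
  'git -C * *'
)

# is_allowed <command-string>: succeeds if any pattern glob-matches the command.
is_allowed() {
  local cmd="$1" pattern
  for pattern in "${ALLOW_PATTERNS[@]}"; do
    # The unquoted right-hand side makes [[ == ]] treat it as a glob.
    if [[ "$cmd" == $pattern ]]; then
      return 0
    fi
  done
  return 1
}

is_allowed 'git -C /abs/path status' && echo "covered by the allowlist"
is_allowed 'cd /abs/path && git status' || echo "not covered: would prompt"
```

Under these assumed semantics, the `cd <path> && git ...` shape falls outside every pattern, which is why uniform command shapes matter for keeping prompts down.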

Verify before launching run-1 by running bash plan/runs/run-1-workspace-refactors/validate-exit.sh and confirming zero ad-hoc permission prompts during execution.

Every validate-exit.sh follows the same shape to keep harness-visible bash invocations uniform:

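#!/usr/bin/env bash
set -euo pipefail
PASS=0
FAIL=0
# Placeholder: replace <n> with the number of checks in this script.
TOTAL=<n>

# check <description> <command> <expected-substring>
# Passes when the command succeeds and its output contains the substring.
check() {
  local desc="$1" cmd="$2" expected="$3" result
  if result=$(eval "$cmd" 2>&1); then
    if [[ "$result" == *"$expected"* ]]; then
      # Plain assignment instead of ((PASS++)): the post-increment returns
      # status 1 when the counter is 0, which aborts the script under set -e.
      echo "PASS: $desc"; PASS=$((PASS + 1))
    else
      echo "FAIL: $desc (expected '$expected', got '$result')"; FAIL=$((FAIL + 1))
    fi
  else
    echo "FAIL: $desc (command failed: $result)"; FAIL=$((FAIL + 1))
  fi
}

# Entry / Exit checks below ...

if [[ $FAIL -eq 0 ]]; then
  echo "ALL CHECKS PASSED"
else
  echo "SOME CHECKS FAILED"
  exit 1
fi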

The agent invokes the script with one Bash tool call; the script internally runs all dig / aws / gh checks without each one being a separately permission-gated harness call.
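As a usage illustration, here is a minimal, self-contained sketch of how individual checks might be declared with a `check` helper of this shape. The two checks are stand-ins chosen so the example runs anywhere; they are not checks from any real run plan. The counters use plain arithmetic assignment so the script stays safe under `set -e`.

```shell
#!/usr/bin/env bash
# Illustrative only: the checks below are stand-ins, not real exit criteria.
set -euo pipefail

PASS=0
FAIL=0

# check <description> <command> <expected-substring>
check() {
  local desc="$1" cmd="$2" expected="$3" result
  if result=$(eval "$cmd" 2>&1) && [[ "$result" == *"$expected"* ]]; then
    echo "PASS: $desc"; PASS=$((PASS + 1))
  else
    echo "FAIL: $desc (expected '$expected', got '$result')"; FAIL=$((FAIL + 1))
  fi
}

# Each check is one line: description, command string, expected substring.
check "bash is on PATH"  "command -v bash" "bash"
check "echo round-trips" "echo hello"      "hello"

echo "passed=$PASS failed=$FAIL"
```

A real script would replace the two stand-in lines with its dig, aws, and gh probes, all of which then run inside the single harness-visible invocation.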

The following standards apply across all run plans, validate-exit scripts, and operator runbooks. They are enforced both by the new ESLint rules landed in PR #454 (no-cd-in-shell, no-aws-profile-prefix) and by reviewer convention.

  • git -C <absolute-path> <subcommand> — never cd <path> && git ....
  • npm --prefix <absolute-path> run <script> — never cd <path> && npm ....
  • make -C <absolute-path> <target> — never cd <path> && make ....
  • aws --profile <name> <command> — never AWS_PROFILE=<name> aws ....
  • Absolute worktree paths inside script bodies; positional args for partition / infrastructure / profile.

Setup is complete when all of the following hold:

  • .claude/settings.local.json (or equivalent) contains the patterns above.
  • bash plan/runs/run-1-workspace-refactors/validate-exit.sh --dry-run (if supported) completes without prompts.
  • No cd <path> form appears in any validate-exit.sh (verifiable by grep -rn '^cd ' plan/runs/).

Once these criteria hold, run-1 may launch.
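The `grep -rn '^cd '` check only catches `cd` at the very start of a line. A slightly broader scan for the two forbidden shapes (`cd <path> && ...` and `AWS_PROFILE=<name> aws ...`) could look like the sketch below. The function name and regular expressions are assumptions made for this example; they are not the ESLint rules from PR #454.

```shell
#!/usr/bin/env bash
# Sketch: scan shell scripts under a directory for forbidden command shapes.
# The regexes are an approximation written for this example.

# find_convention_violations <dir>: prints offending lines as file:line:text.
# Follows grep semantics: returns 0 when violations exist, 1 when clean.
find_convention_violations() {
  local dir="$1"
  grep -rnE \
    -e '(^|[;&|[:space:]])cd[[:space:]]+[^[:space:]]+[[:space:]]*&&' \
    -e '(^|[[:space:]])AWS_PROFILE=[^[:space:]]+[[:space:]]+aws([[:space:]]|$)' \
    --include='*.sh' "$dir"
}
```

One way to use it is as a gate, for example `if find_convention_violations plan/runs; then exit 1; fi`, so that any printed line blocks the launch.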

The 7-run dependency graph mirrors analysis.md § 13.1 lifted from group level to run level. Hard edges block the downstream run until the upstream run is merged and exit criteria pass; soft edges allow parallel authoring but verification still serialises.

The diagram below shows the run dependencies. run-1 is a hard prerequisite for run-2 (the construct generalisation + tools/lib/ helpers are imported by run-2’s code) and a soft prerequisite for run-6 (the drift driver also imports from tools/lib/). run-2 through run-5 form a strict sequential chain per DQ-R1-021 (dev → stage → demo → prod). run-6 also has a soft dependency on run-2: drift probes need at least one partition live, so run-6 can start any time after dev is deployed. run-7 (documentation) lands last, after every per-partition rollout merges, so the docs reflect what was built.

[Figure: PlantUML diagram of the run dependency graph]
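The sequencing rules can also be written down as a small lookup, sketched below. The function names are hypothetical, and two modelling choices are this example's reading of the text rather than stated rules: run-6 is gated only on run-2 being merged, and run-7 is gated on runs 2 through 6.

```shell
#!/usr/bin/env bash
# Sketch: launch prerequisites per run, as read from the description above.

# launch_prereqs <run>: prints the runs that must be merged before <run> launches.
launch_prereqs() {
  case "$1" in
    run-1) echo "" ;;
    run-2) echo "run-1" ;;
    run-3) echo "run-2" ;;
    run-4) echo "run-3" ;;
    run-5) echo "run-4" ;;
    run-6) echo "run-2" ;;                              # assumption: dev live is enough
    run-7) echo "run-2 run-3 run-4 run-5 run-6" ;;      # assumption: docs land last
    *) return 2 ;;                                      # unknown run name
  esac
}

# can_launch <run> [merged-run ...]: succeeds if every prerequisite is merged.
can_launch() {
  local run="$1"; shift
  local merged=" $* " prereqs prereq
  prereqs=$(launch_prereqs "$run") || return 2
  for prereq in $prereqs; do
    [[ "$merged" == *" $prereq "* ]] || return 1
  done
  return 0
}

can_launch run-6 run-1 run-2 && echo "run-6 may launch"
can_launch run-3 run-1 || echo "run-3 is blocked on run-2"
```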

These are the human-in-the-loop steps the user performs to advance from one run to the next. Each gate is the natural pause point that drove the decomposition recommendation in evaluation.md.

| Between | Operator action | Required artefact |
| --- | --- | --- |
| run-1 → run-2 | Run T-O2 (Root no-drift verification) against deployed RootConfiguration. Confirm empty cdk diff. Record in verification sign-off table. | Empty cdk diff output captured |
| run-2 → run-3 | Review cdk diff for dev in PR description; approve before merge. After deploy, confirm dig checks pass, Postmark Console shows verified Sender Signature. Send T-O4 reply to Postmark Compliance ticket #11236089 with arda-nonprod verified-domain evidence. Wait for response (or capture “more evidence needed” path). | Postmark response captured |
| run-3 → run-4 | Review cdk diff for stage; approve before merge. Confirm dig + Postmark verification. Sign off. | Sign-off table row populated |
| run-4 → run-5 | Same as run-3 → run-4 for demo. Note Alpha001 profile switch (Admin-Alpha1). | Sign-off table row populated |
| run-5 → (end) | Review cdk diff for prod with extra care (production deploy). Approve. Confirm dig + Postmark verification. Sign off. | Sign-off table row populated |
| run-6 → run-7 | Manually trigger runtime-platform-drift.yml via workflow_dispatch. Confirm no spurious issue opened. Sign off T-O8. | First-run workflow log captured |
| run-7 → completion | make pr-checks green on the documentation PR; technical-writer review findings addressed. | Docs PR merged |

run-6 can be opened in parallel with run-3 / run-4 provided run-2 (dev) is live (drift probes need at least one partition’s state).
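For the gates that require an empty cdk diff, the captured output can be checked mechanically before it goes into the sign-off table. The sketch below rests on two assumptions about CDK CLI output that should be confirmed against the version in use: an unchanged stack is reported with the phrase "There were no differences", and changed resources appear on lines starting with `[+]`, `[-]`, or `[~]`. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a captured `cdk diff` log shows no changes.
# Assumes the output conventions described in the lead-in; verify them
# against the CDK CLI version in use before relying on this.

# diff_is_empty <captured-log>: succeeds only if the log reports no
# differences and contains no resource-change lines.
diff_is_empty() {
  local log="$1"
  grep -q 'There were no differences' "$log" || return 1
  if grep -qE '^\[[-+~]\]' "$log"; then
    return 1
  fi
  return 0
}
```

When capturing the log, redirecting both stdout and stderr to the file is the safer choice, since the report may not be written to stdout.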

| Producer | Artefact | Consumer(s) | Form |
| --- | --- | --- | --- |
| run-1 | Generalised AllowCreatingNSRecordsRole construct (renamed) + postmarkCredentialOpReference accessor + tools/lib/* helpers + reserved-words list entries | run-2 (CDK code imports + tools script) | TypeScript imports + module exports |
| run-2 | PartitionEmailStack class file + apps/Al1x/partition.ts instantiation + register-partition-mail-signature.ts entry script + amm.sh partition-mail step | run-3, run-4, run-5 (code carries over; runs 3-5 only add instance config) | Source files on main |
| run-2 | dev.ardamails.com zone + NS-delegation + SPF + DMARC + DKIM + Return-Path records + both SM secrets + both IAM roles | Drift workflow (run-6) probes; Phase 5b consumes via CFN exports | Live AWS resources |
| run-2 | Postmark Sender Signature on PostmarkNonProd for dev.ardamails.com | T-O4 Postmark Compliance ticket #11236089 reply | Postmark API state |
| run-3 / 4 / 5 | Per-partition equivalents of run-2’s AWS + Postmark artefacts | Drift workflow probes; Phase 5b consumes | Live AWS + Postmark state |
| run-6 | runtime-platform-drift.yml workflow + driver + extracted tools/lib/drift/ helpers | Scheduled drift checks; future runtime-platform drift checks beyond email; corporate-drift regression-tested with extracted helpers | Workflow + module exports |
| run-7 | Filled-in secret-delivery-pattern.md + per-partition mail pages in current-system/runtime/ + Postmark-service multi-Signature updates + encryption-key rotation runbook + docs CHANGELOG entry | Future maintainers; Phase 5b authors; operators | Markdown pages |

Each run’s failure mode and recovery path:

| Run | Failure mode | Recovery |
| --- | --- | --- |
| run-1 | T-I2 byte-identity test fails on PR → cannot merge | Diagnose construct change → re-author T-I1 → re-run test. No deployed AWS state to roll back. |
| run-1 | T-O2 post-merge Root drift detected | Stop. Investigate Root drift cause (Phase 4 construct change OR external drift unrelated). Resolve before any partition deploy. |
| run-2 | Phase A (register-partition-mail-signature.ts) fails on dev | Idempotent re-run after fixing root cause; no partial AWS state. |
| run-2 | Phase B (cdk deploy) fails on dev | CFN rolls back the stack. Investigate; re-run amm.sh. |
| run-2 | T-O4 Postmark Compliance reply doesn’t unlock arda-nonprod | Operator follows Postmark Support’s direction; may need to provision additional Signatures (run-3 ahead of schedule). Update REQ-OPS-004 documented assumption. |
| run-3 / 4 / 5 | Per-partition deploy fails | Same as run-2: CFN rollback; investigate; re-run for that partition. Prior partitions are unaffected (the per-partition isolation that drove the decomposition). |
| run-6 | Drift workflow fails on first scheduled run | Inspect logs; tune probe thresholds; re-run via workflow_dispatch. No AWS state change. |
| run-7 | make pr-checks fails | Fix the offending page; re-run locally. No AWS state change. |

Phase 4 is complete when all of the following hold:

  • Runs 1 through 7 each have their validate-exit.sh exit 0.
  • All four active partition mail sub-zones are live and delegated (REQ-PART-001..006 satisfied per ../design/verification.md V-PART-001..005).
  • Each partition’s Postmark Sender Signature is registered and verified (REQ-PART-007..010, V-PART-007..010).
  • Per-partition encryption-key SM secret exists with RemovalPolicy.RETAIN (REQ-PART-014, V-PART-014).
  • Per-partition Postmark account-token SM secret exists, populated via δ.1 (REQ-PART-011, V-PART-011).
  • All six -API- CFN exports exist per partition (REQ-PART-002, 012, 015, 018, 020 + zone-name).
  • Both per-partition IAM roles exist (REQ-PART-017, 019; V-PART-017, 019).
  • arda-nonprod Postmark account approval received OR Postmark reply requesting more evidence captured (REQ-OPS-004).
  • runtime-platform-drift.yml has completed at least one successful scheduled run (REQ-CI-001, V-CI-001).
  • Root account’s RootDnsStack produces byte-identical CFN post-Phase-4 (REQ-IAC-002, V-IAC-002).
  • Documentation deliverables landed (REQ-DOC-001..004, V-DOC-001..004).
  • Operator sign-off table in ../design/verification.md fully populated.
  • All seven PRs merged to main on their respective repositories.
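The first criterion can be checked in one pass. The sketch below runs every run's validate-exit.sh and reports failures; it covers only that criterion, and the directory layout (`<runs-dir>/run-*/validate-exit.sh`) is inferred from the run-1 path used earlier in this document. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run each run's validate-exit.sh and summarise the results.
# Covers only the "validate-exit.sh exits 0" completion criterion.

# run_all_exit_checks <runs-dir>: returns 0 only if at least one script
# was found and every script exited 0.
run_all_exit_checks() {
  local runs_dir="$1" script failed=0 found=0
  for script in "$runs_dir"/run-*/validate-exit.sh; do
    [[ -f "$script" ]] || continue   # an unmatched glob stays literal; skip it
    found=$((found + 1))
    if bash "$script" >/dev/null 2>&1; then
      echo "ok:   $script"
    else
      echo "FAIL: $script"
      failed=$((failed + 1))
    fi
  done
  echo "checked=$found failed=$failed"
  [[ $found -gt 0 && $failed -eq 0 ]]
}
```

Calling `run_all_exit_checks plan/runs` from the worktree root would then give a single pass/fail signal; the remaining criteria still need operator confirmation.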

Orchestration metadata for the team-lead spawning agents per run. Skills below are consulted on demand, not bulk-loaded.

Skill names below correspond to workspace skill directories under workspace/instructions/claude/skills/<name>/. Some have public documentation pages on the Arda docs site; most are agent-only and resolved via the resolve-doc-page.sh helper at skill-load time.

| Skill | Loaded by | Used in |
| --- | --- | --- |
| cdk-infrastructure | devops-engineer | All CDK construct / stack / app work (runs 1–6) |
| typescript-coding | devops-engineer | tools/ scripts and tools/lib/ helpers (runs 1, 2, 6) |
| unit-tests-infra | devops-engineer | CDK Template-matcher test surface (runs 1, 2, 6) |
| path-conventions | All personas | Cross-system doc links and relative paths |
| document-writing | technical-writer | New current-system/ pages and runbook (run 7) |
| pr-steward | All personas | Landing each PR (runs 1–7) |
| project-decomposition | Team-lead / user | Authoring this plan/ tree (already applied) |

The Team Lead spawns the CI Observer at project start. The observer collects observations through runs 1–7 — repeated errors, slow iterations, friction patterns, deviations from the plan — and produces continuous-improvement-proposal.md at the project root at close. The proposal feeds the improvement-analyzer skill (run by the user, post-Phase-4) which decides which proposals become skill updates, agent updates, or template changes.

The CI Observer runs in the background for the full Phase 4 lifecycle. Each run’s validate-exit.sh is a natural data point: when a check fails, the observer notes the cause (script bug? convention violation? agent confusion?) and tags it.

Once Phase 4 completion criteria (§ 7) hold, perform the lifecycle wrap-up:

Author the following files under 4-runtime-platform-updates/implementation/ before retiring the project directory:

| File | Description |
| --- | --- |
| learnings.md | Durable insights from Phase 4 implementation: patterns, surprises, codebase lessons that should outlive the project. |
| suggestions.md | Forward-looking improvements for Phase 5a / 5b and beyond: work that surfaced as worth doing but is out of Phase 4 scope. |
| phase-a-deploy.md | Run-1 outcomes; Root no-drift verification result. |
| phase-b-deploy-dev.md | Run-2 outcomes; cdk diff summary; sign-off record. |
| phase-b-deploy-stage.md | Run-3 outcomes. |
| phase-b-deploy-demo.md | Run-4 outcomes. |
| phase-b-deploy-prod.md | Run-5 outcomes (production deploy, extra detail). |
| phase-c-deploy.md | Run-6 outcomes; first scheduled drift run results. |
| phase-d-deploy.md | Run-7 outcomes; documentation review findings. |
| continuous-improvement-proposal.md (project root) | Output of the CI Observer + improvement-analyzer consolidating structural improvements identified throughout Phase 4. |

Per the project lifecycle convention, move the 4-runtime-platform-updates/ directory:

git -C /Users/jmp/code/arda/documentation mv \
src/content/docs/roadmap/in-progress/email-integration/4-runtime-platform-updates \
src/content/docs/roadmap/completed/email-integration/4-runtime-platform-updates

Update any inbound references in roadmap/completed/email-integration/ and verify make pr-checks still passes.

Remove the Phase 4 worktrees and local branches once all PRs are merged:

git -C /Users/jmp/code/arda/documentation worktree remove \
/Users/jmp/code/arda/projects/email-integration-worktrees/phase-4/documentation
git -C /Users/jmp/code/arda/infrastructure worktree remove \
/Users/jmp/code/arda/projects/email-integration-worktrees/phase-4/infrastructure
git -C /Users/jmp/code/arda/documentation branch -d jmpicnic/email-integration-phase-4
git -C /Users/jmp/code/arda/infrastructure branch -d jmpicnic/email-integration-phase-4
rmdir /Users/jmp/code/arda/projects/email-integration-worktrees/phase-4

The phase-5a/* and phase-5b/* worktrees stay in place — they continue with their own phase work.

Verify that ../../5b-email-module/pre-existing-decisions.md’s references to Phase 4 resolve correctly post-move. The cross-links to DQ-R1-019, DQ-R1-020, DQ-R1-023, and the per-partition -API- exports must point at the new roadmap/completed/ location.
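A quick way to surface links that still point at the pre-move location is to search the docs tree for the old path. The sketch below is a hypothetical helper, not an existing script. It only catches links that spell out the in-progress path; relative links (for example those starting with `../`) need a separate check such as the site's own link validation.

```shell
#!/usr/bin/env bash
# Sketch: list Markdown files that still reference the pre-move location.
# Relative links are NOT caught by this search.

OLD_PATH='roadmap/in-progress/email-integration/4-runtime-platform-updates'

# find_stale_refs <docs-root>: prints file:line:text for each stale link.
# Follows grep semantics: returns 0 when stale references exist, 1 when none.
find_stale_refs() {
  local root="$1"
  grep -rnF --include='*.md' --include='*.mdx' "$OLD_PATH" "$root"
}
```

Running `find_stale_refs src/content/docs` from the documentation repository root after the move should print nothing once every inbound reference has been updated.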


Copyright: (c) Arda Systems 2025-2026, All rights reserved