Phase 4 — Run Choreography
1. Overview
Phase 4 is decomposed into seven runs, sequenced as a 7-PR rollout. Each run is launched independently via the launch-team skill; the user is the choreographer between runs (operator gates, Postmark Compliance reply, per-partition cdk diff review). The decomposition rationale is in evaluation.md; the per-run plans are in runs/.

| Run | Branch / PR | Scope | Working dir | AWS impact |
|---|---|---|---|---|
| `run-1-workspace-refactors` | `phase-4-G-A` | G-A: construct generalisation, byte-identity guard, accessor, reserved-words extension, helper extraction | phase-4/infrastructure (+ docs verification entry) | Synth-only + Root read-only diff |
| `run-2-dev-rollout` | `phase-4-G-B-C-D-dev` | G-B+C+D for dev; includes G-POSTMARK-5 arda-nonprod unlock | phase-4/infrastructure | Resource-touching (Alpha002) |
| `run-3-stage-rollout` | `phase-4-G-B-C-D-stage` | G-B+C+D for stage | phase-4/infrastructure | Resource-touching (Alpha002) |
| `run-4-demo-rollout` | `phase-4-G-B-C-D-demo` | G-B+C+D for demo | phase-4/infrastructure | Resource-touching (Alpha001) |
| `run-5-prod-rollout` | `phase-4-G-B-C-D-prod` | G-B+C+D for prod (production deploy) | phase-4/infrastructure | Resource-touching (Alpha001) |
| `run-6-drift-workflow` | `phase-4-G-E` | G-E: runtime-platform-drift.yml + driver + shared utility extraction | phase-4/infrastructure | None |
| `run-7-documentation` | `phase-4-G-F` | G-F: current-system retrofit, rotation runbook, secret-delivery-pattern.md content fill, docs CHANGELOG | phase-4/documentation | None |

Per the project-decomposition skill, this document captures the cross-run choreography: the sequencing graph, the operator-gate handoffs, and the artifact dependencies that the per-run project-plan.md files alone cannot express.
2. Setup phase — harness prompt minimisation
Before launching run-1, perform the following one-time setup to minimise harness permission prompts triggered by bash command-shape variants during validation. Each is a sunk cost paid once; the savings compound across the seven runs and the per-partition validate-exit.sh invocations.
2.1 .claude/settings.local.json allowlist patches
Add the following patterns to the project-level settings (preferred) or to the user’s ~/.claude/settings.json. Group them under a clear comment so future maintainers see why they were added.

```json
{
  "permissions": {
    "allow": [
      "Bash(bash */validate-exit.sh*)",
      "Bash(dig *)",
      "Bash(aws cloudformation describe-stacks*)",
      "Bash(aws cloudformation get-template*)",
      "Bash(aws cloudformation list-exports*)",
      "Bash(aws secretsmanager describe-secret*)",
      "Bash(aws iam get-role*)",
      "Bash(aws sts get-caller-identity*)",
      "Bash(gh pr view *)",
      "Bash(gh pr checks *)",
      "Bash(gh run view *)",
      "Bash(gh workflow run *)",
      "Bash(git -C * *)",
      "Bash(npm --prefix * *)",
      "Bash(make -C * *)"
    ]
  }
}
```

Verify before launching run-1 by running `bash plan/runs/run-1-workspace-refactors/validate-exit.sh` and confirming zero ad-hoc permission prompts during execution.

2.2 Wrapper script conventions
Every validate-exit.sh follows the same shape to keep harness-visible bash invocations uniform:

```bash
#!/usr/bin/env bash
set -euo pipefail

PASS=0
FAIL=0
TOTAL=<n>  # placeholder: number of checks in this script

# check <description> <command> <expected-substring>
# Counters use VAR=$((VAR + 1)) rather than ((VAR++)): the latter returns
# status 1 when VAR is 0, which aborts the script under `set -e`.
check() {
  local desc="$1" cmd="$2" expected="$3" result
  if result=$(eval "$cmd" 2>&1); then
    if [[ "$result" == *"$expected"* ]]; then
      echo "PASS: $desc"; PASS=$((PASS + 1))
    else
      echo "FAIL: $desc (expected '$expected', got '$result')"; FAIL=$((FAIL + 1))
    fi
  else
    echo "FAIL: $desc (command failed: $result)"; FAIL=$((FAIL + 1))
  fi
}

# Entry / Exit checks below ...

[[ $FAIL -eq 0 ]] && echo "ALL CHECKS PASSED" || { echo "SOME CHECKS FAILED"; exit 1; }
```

The agent invokes the script with one Bash tool call; the script internally runs all dig / aws / gh checks without each one being a separately permission-gated harness call.

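As a hedged illustration of how the "Entry / Exit checks" section might be filled in, the sketch below repeats the helper and adds two sample checks. The `printf` and `false` commands are stand-ins chosen so the example runs offline, and the SPF value is made up; a real script would call `dig`, `aws`, or `gh` in their place.

```shell
#!/usr/bin/env bash
set -euo pipefail

PASS=0
FAIL=0

# check <description> <command> <expected-substring>
check() {
  local desc="$1" cmd="$2" expected="$3" result
  if result=$(eval "$cmd" 2>&1); then
    if [[ "$result" == *"$expected"* ]]; then
      echo "PASS: $desc"; PASS=$((PASS + 1))
    else
      echo "FAIL: $desc (expected '$expected', got '$result')"; FAIL=$((FAIL + 1))
    fi
  else
    echo "FAIL: $desc (command failed: $result)"; FAIL=$((FAIL + 1))
  fi
}

# Stand-in for a DNS lookup; the record value is illustrative only.
check "SPF record present" "printf 'v=spf1 include:example.invalid ~all'" "v=spf1"

# A command that exits non-zero is counted as a failure, not a crash.
check "failing command is reported" "false" "anything"

echo "passed=$PASS failed=$FAIL"
```

Because each `check` call runs inside the script, the harness sees a single `bash .../validate-exit.sh` invocation regardless of how many checks are added.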
2.3 Command-shape standards
The following standards apply across all run plans, validate-exit scripts, and operator runbooks. They are enforced both by the new ESLint rules landed in PR #454 (no-cd-in-shell, no-aws-profile-prefix) and by reviewer convention.

- `git -C <absolute-path> <subcommand>`, never `cd <path> && git ...`.
- `npm --prefix <absolute-path> run <script>`, never `cd <path> && npm ...`.
- `make -C <absolute-path> <target>`, never `cd <path> && make ...`.
- `aws --profile <name> <command>`, never `AWS_PROFILE=<name> aws ...`.
- Absolute worktree paths inside script bodies; positional args for partition / infrastructure / profile.

2.4 Setup exit criteria
- `.claude/settings.local.json` (or equivalent) contains the patterns above.
- `bash plan/runs/run-1-workspace-refactors/validate-exit.sh --dry-run` (if supported) completes without prompts.
- No `cd <path>` form appears in any `validate-exit.sh` (verifiable by `grep -rn '^cd ' plan/runs/`).

Once these criteria hold, run-1 may launch.
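The first and third criteria can be checked mechanically. The sketch below is one possible way to do so, not part of the Phase 4 tooling: the function names are hypothetical, the pattern list is only a subset of the § 2.1 allowlist, and the `cd` check also matches indented `cd` lines, which the simpler `grep '^cd '` form would miss.

```shell
#!/usr/bin/env bash
set -u

# Subset of the § 2.1 allowlist patterns; extend to the full list as needed.
REQUIRED_PATTERNS=(
  'Bash(bash */validate-exit.sh*)'
  'Bash(git -C * *)'
  'Bash(npm --prefix * *)'
  'Bash(make -C * *)'
)

# settings_has_patterns <settings-file>
# Succeeds only if every required pattern appears verbatim in the file.
settings_has_patterns() {
  local file="$1" pattern missing=0
  for pattern in "${REQUIRED_PATTERNS[@]}"; do
    if ! grep -qF -- "$pattern" "$file"; then
      echo "missing pattern: $pattern"
      missing=1
    fi
  done
  return "$missing"
}

# no_cd_in_scripts <runs-root>
# Succeeds only if no validate-exit.sh under <runs-root> has a line that
# starts with `cd` (leading whitespace allowed). Offending lines are printed.
no_cd_in_scripts() {
  local root="$1"
  [[ -d "$root" ]] || { echo "not a directory: $root" >&2; return 2; }
  ! grep -rnE --include='validate-exit.sh' '^[[:space:]]*cd[[:space:]]' "$root"
}
```

A typical invocation from the repository root would be `settings_has_patterns .claude/settings.local.json && no_cd_in_scripts plan/runs`.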
3. Run-sequence DAG
The 7-run dependency graph mirrors analysis.md § 13.1 lifted from group level to run level. Hard edges block the downstream run until the upstream run is merged and exit criteria pass; soft edges allow parallel authoring but verification still serialises.
The run dependencies are as follows. run-1 is a hard prerequisite for run-2 (run-2’s code imports the construct generalisation and the tools/lib/ helpers) and a soft prerequisite for run-6 (the drift driver also imports from tools/lib/). run-2 through run-5 form a strict sequential chain per DQ-R1-021 (dev → stage → demo → prod). run-6 has a soft dependency on run-2: it can proceed any time after run-2 is live, because drift probes need at least one partition live. run-7 (documentation) lands last, after every per-partition rollout merges, so the docs reflect what was built.
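The edges described above can be written down and ordered mechanically. The sketch below is one way to do that with the POSIX `tsort` utility. The function names are hypothetical, hard and soft edges are treated alike for ordering purposes, and the run-6 → run-7 edge is taken from the operator-gate table in § 4 rather than from this paragraph.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Edges as "upstream downstream" pairs, one per line.
run_edges() {
  cat <<'EOF'
run-1 run-2
run-2 run-3
run-3 run-4
run-4 run-5
run-1 run-6
run-2 run-6
run-5 run-7
run-6 run-7
EOF
}

# Prints one valid merge order; tsort reports an error if the graph has a cycle.
merge_order() {
  run_edges | tsort
}

merge_order
```

Several orders are valid because run-6 may land anywhere between run-2 and run-7; `tsort` simply prints one of them.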
4. Operator gates between runs
These are the human-in-the-loop steps the user performs to advance from one run to the next. Each gate is the natural pause point that drove the decomposition recommendation in evaluation.md.

| Between | Operator action | Required artefact |
|---|---|---|
| run-1 → run-2 | Run T-O2 (Root no-drift verification) against deployed RootConfiguration. Confirm empty cdk diff. Record in verification sign-off table. | Empty cdk diff output captured |
| run-2 → run-3 | Review cdk diff for dev in PR description; approve before merge. After deploy, confirm dig checks pass, Postmark Console shows verified Sender Signature. Send T-O4 reply to Postmark Compliance ticket #11236089 with arda-nonprod verified-domain evidence. Wait for response (or capture “more evidence needed” path). | Postmark response captured |
| run-3 → run-4 | Review cdk diff for stage; approve before merge. Confirm dig + Postmark verification. Sign off. | Sign-off table row populated |
| run-4 → run-5 | Same as run-3 → run-4 for demo. Note Alpha001 profile switch (Admin-Alpha1). | Sign-off table row populated |
| run-5 → (end) | Review cdk diff for prod with extra care (production deploy). Approve. Confirm dig + Postmark verification. Sign off. | Sign-off table row populated |
| run-6 → run-7 | Manually trigger runtime-platform-drift.yml via workflow_dispatch. Confirm no spurious issue opened. Sign off T-O8. | First-run workflow log captured |
| run-7 → completion | make pr-checks green on the documentation PR; technical-writer review findings addressed. | Docs PR merged |
run-6 can be opened in parallel with run-3 / run-4 provided run-2 (dev) is live (drift probes need at least one partition’s state).
5. Artifact dependencies
| Producer | Artefact | Consumer(s) | Form |
|---|---|---|---|
| run-1 | Generalised AllowCreatingNSRecordsRole construct (renamed) + postmarkCredentialOpReference accessor + tools/lib/* helpers + reserved-words list entries | run-2 (CDK code imports + tools script) | TypeScript imports + module exports |
| run-2 | PartitionEmailStack class file + apps/Al1x/partition.ts instantiation + register-partition-mail-signature.ts entry script + amm.sh partition-mail step | run-3, run-4, run-5 (code carries over; runs 3-5 only add instance config) | Source files on main |
| run-2 | dev.ardamails.com zone + NS-delegation + SPF + DMARC + DKIM + Return-Path records + both SM secrets + both IAM roles | Drift workflow (run-6) probes; Phase 5b consumes via CFN exports | Live AWS resources |
| run-2 | Postmark Sender Signature on PostmarkNonProd for dev.ardamails.com | T-O4 Postmark Compliance ticket #11236089 reply | Postmark API state |
| run-3 / 4 / 5 | Per-partition equivalents of run-2’s AWS + Postmark artefacts | Drift workflow probes; Phase 5b consumes | Live AWS + Postmark state |
| run-6 | runtime-platform-drift.yml workflow + driver + extracted tools/lib/drift/ helpers | Scheduled drift checks; future runtime-platform drift checks beyond email; corporate-drift regression-tested with extracted helpers | Workflow + module exports |
| run-7 | Filled-in secret-delivery-pattern.md + per-partition mail pages in current-system/runtime/ + Postmark-service multi-Signature updates + encryption-key rotation runbook + docs CHANGELOG entry | Future maintainers; Phase 5b authors; operators | Markdown pages |
6. Rollback semantics
Each run’s failure mode and recovery path:

| Run | Failure mode | Recovery |
|---|---|---|
| run-1 | T-I2 byte-identity test fails on PR → cannot merge | Diagnose construct change → re-author T-I1 → re-run test. No deployed AWS state to roll back. |
| run-1 | T-O2 post-merge Root drift detected | Stop. Investigate Root drift cause (Phase 4 construct change OR external drift unrelated). Resolve before any partition deploy. |
| run-2 | Phase A (register-partition-mail-signature.ts) fails on dev | Idempotent re-run after fixing root cause; no partial AWS state. |
| run-2 | Phase B (cdk deploy) fails on dev | CFN rolls back the stack. Investigate; re-run amm.sh. |
| run-2 | T-O4 Postmark Compliance reply doesn’t unlock arda-nonprod | Operator follows Postmark Support’s direction; may need to provision additional Signatures (run-3 ahead of schedule). Update REQ-OPS-004 documented assumption. |
| run-3 / 4 / 5 | Per-partition deploy fails | Same as run-2: CFN rollback; investigate; re-run for that partition. Prior partitions are unaffected (this per-partition isolation is what drove the decomposition). |
| run-6 | Drift workflow fails on first scheduled run | Inspect logs; tune probe thresholds; re-run via workflow_dispatch. No AWS state change. |
| run-7 | make pr-checks fails | Fix the offending page; re-run locally. No AWS state change. |
7. Phase 4 completion criteria
Phase 4 is complete when all of the following hold:

- Runs 1 through 7 each have their `validate-exit.sh` exit 0.
- All four active partition mail sub-zones live and delegated (REQ-PART-001..006 satisfied per `../design/verification.md` V-PART-001..005).
- Each partition’s Postmark Sender Signature registered and verified (REQ-PART-007..010, V-PART-007..010).
- Per-partition encryption-key SM secret exists with `RemovalPolicy.RETAIN` (REQ-PART-014, V-PART-014).
- Per-partition Postmark account-token SM secret exists, populated via δ.1 (REQ-PART-011, V-PART-011).
- All six `-API-` CFN exports per partition (REQ-PART-002, 012, 015, 018, 020 + zone-name).
- Both per-partition IAM roles exist (REQ-PART-017, 019; V-PART-017, 019).
- `arda-nonprod` Postmark account approval received OR Postmark reply requesting more evidence captured (REQ-OPS-004).
- `runtime-platform-drift.yml` has completed at least one successful scheduled run (REQ-CI-001, V-CI-001).
- Root account’s `RootDnsStack` produces byte-identical CFN post-Phase-4 (REQ-IAC-002, V-IAC-002).
- Documentation deliverables landed (REQ-DOC-001..004, V-DOC-001..004).
- Operator sign-off table in `../design/verification.md` fully populated.
- All seven PRs merged to `main` on their respective repositories.

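The first criterion lends itself to a single sweep over the run directories. The sketch below is a hypothetical helper, not an existing script: it assumes each run keeps its `validate-exit.sh` under `<runs-root>/<run-name>/`, with run names taken from the table in § 1, and it covers only that first criterion.

```shell
#!/usr/bin/env bash
set -u

# Run directory names as listed in the § 1 table.
RUNS=(
  run-1-workspace-refactors
  run-2-dev-rollout
  run-3-stage-rollout
  run-4-demo-rollout
  run-5-prod-rollout
  run-6-drift-workflow
  run-7-documentation
)

# check_all_runs <runs-root>
# Runs each validate-exit.sh, prints one status line per run, and returns
# non-zero if any script is missing or exits non-zero.
check_all_runs() {
  local root="$1" failed=0 run script
  for run in "${RUNS[@]}"; do
    script="$root/$run/validate-exit.sh"
    if [[ ! -f "$script" ]]; then
      echo "MISSING: $run"
      failed=1
    elif bash "$script" >/dev/null 2>&1; then
      echo "PASS: $run"
    else
      echo "FAIL: $run"
      failed=1
    fi
  done
  return "$failed"
}
```

From the plan root this would be called as `check_all_runs plan/runs`; the remaining criteria still need the operator sign-off table and live-resource checks.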
8. Agent skills consulted
Orchestration metadata for the team-lead spawning agents per run. Skills below are consulted on demand, not bulk-loaded.

Skill names below correspond to workspace skill directories under workspace/instructions/claude/skills/<name>/. Some have public documentation pages on the Arda docs site; most are agent-only and resolved via the resolve-doc-page.sh helper at skill-load time.
| Skill | Loaded by | Used in |
|---|---|---|
| `cdk-infrastructure` | devops-engineer | All CDK construct / stack / app work (runs 1–6) |
| `typescript-coding` | devops-engineer | tools/ scripts and tools/lib/ helpers (runs 1, 2, 6) |
| `unit-tests-infra` | devops-engineer | CDK Template-matcher test surface (runs 1, 2, 6) |
| `path-conventions` | All personas | Cross-system doc links and relative paths |
| `document-writing` | technical-writer | New current-system/ pages and runbook (run 7) |
| `pr-steward` | All personas | Landing each PR (runs 1–7) |
| `project-decomposition` | Team-lead / user | Authoring this plan/ tree (already applied) |

9. Continuous-improvement observer
The Team Lead spawns the CI Observer at project start. The observer collects observations through runs 1–7 (repeated errors, slow iterations, friction patterns, deviations from the plan) and produces continuous-improvement-proposal.md at the project root at close. The proposal feeds the improvement-analyzer skill (run by the user, post-Phase-4), which decides which proposals become skill updates, agent updates, or template changes.

The CI Observer runs in the background for the full Phase 4 lifecycle. Each run’s validate-exit.sh is a natural data point: when a check fails, the observer notes the cause (script bug? convention violation? agent confusion?) and tags it.
10. Project closure (post-run-7)
Once Phase 4 completion criteria (§ 7) hold, perform the lifecycle wrap-up:
10.1 Implementation byproducts
Author the following files under 4-runtime-platform-updates/implementation/ before retiring the project directory:

| File | Description |
|---|---|
| `learnings.md` | Durable insights from Phase 4 implementation: patterns, surprises, codebase lessons that should outlive the project. |
| `suggestions.md` | Forward-looking improvements for Phase 5a / 5b and beyond: work that surfaced as worth doing but is out of Phase 4 scope. |
| `phase-a-deploy.md` | Run-1 outcomes; Root no-drift verification result. |
| `phase-b-deploy-dev.md` | Run-2 outcomes; cdk diff summary; sign-off record. |
| `phase-b-deploy-stage.md` | Run-3 outcomes. |
| `phase-b-deploy-demo.md` | Run-4 outcomes. |
| `phase-b-deploy-prod.md` | Run-5 outcomes (production deploy, extra detail). |
| `phase-c-deploy.md` | Run-6 outcomes; first scheduled drift run results. |
| `phase-d-deploy.md` | Run-7 outcomes; documentation review findings. |
| `continuous-improvement-proposal.md` (project root) | Output of the CI Observer + improvement-analyzer consolidating structural improvements identified throughout Phase 4. |

10.2 Move project to roadmap/completed/
Per the project lifecycle convention, move the 4-runtime-platform-updates/ directory:

```bash
git -C /Users/jmp/code/arda/documentation mv \
  src/content/docs/roadmap/in-progress/email-integration/4-runtime-platform-updates \
  src/content/docs/roadmap/completed/email-integration/4-runtime-platform-updates
```

Update any inbound references in roadmap/completed/email-integration/ and verify `make pr-checks` still passes.

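One hedged way to find inbound references that still point at the old location is a fixed-string search for the old path fragment. The helper name below is hypothetical, and the search only catches links that spell out the full `roadmap/in-progress/...` fragment; relative links such as `../4-runtime-platform-updates/` would still need a manual pass or the site's own link checker.

```shell
#!/usr/bin/env bash
set -u

# Path fragment that should no longer appear once the move is complete.
OLD_PATH='roadmap/in-progress/email-integration/4-runtime-platform-updates'

# find_stale_refs <docs-root>
# Prints file:line for every Markdown/MDX line that still mentions OLD_PATH.
# Succeeds (exit 0) only when no such line exists.
find_stale_refs() {
  local root="$1"
  [[ -d "$root" ]] || { echo "not a directory: $root" >&2; return 2; }
  ! grep -rnF --include='*.md' --include='*.mdx' -- "$OLD_PATH" "$root"
}
```

For example, `find_stale_refs /Users/jmp/code/arda/documentation/src/content/docs` would list any pages that need updating before `make pr-checks` is re-run.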
10.3 Worktree cleanup
Remove the Phase 4 worktrees and local branches once all PRs are merged:

```bash
git -C /Users/jmp/code/arda/documentation worktree remove \
  /Users/jmp/code/arda/projects/email-integration-worktrees/phase-4/documentation
git -C /Users/jmp/code/arda/infrastructure worktree remove \
  /Users/jmp/code/arda/projects/email-integration-worktrees/phase-4/infrastructure
git -C /Users/jmp/code/arda/documentation branch -d jmpicnic/email-integration-phase-4
git -C /Users/jmp/code/arda/infrastructure branch -d jmpicnic/email-integration-phase-4
rmdir /Users/jmp/code/arda/projects/email-integration-worktrees/phase-4
```

The phase-5a/* and phase-5b/* worktrees stay in place; they continue with their own phase work.

10.4 Cross-link from Phase 5b
Verify that ../../5b-email-module/pre-existing-decisions.md’s references to Phase 4 resolve correctly post-move. The cross-links to DQ-R1-019, DQ-R1-020, DQ-R1-023, and the per-partition -API- exports must point at the new roadmap/completed/ location.
11. References
- `evaluation.md`: decomposition assessment + recommendation.
- `runs/run-1-workspace-refactors/project-plan.md` through `runs/run-7-documentation/project-plan.md`: per-run plans.
- `../design/specification.md`: Phase 4 task contract.
- `../design/analysis.md`: capability decomposition + group-level DAG.
- `../design/verification.md`: operator sign-off table.
- `../../decision-log.md`: DQ-R1-021 partition order, DQ-R1-022 operator surface.
- `process/craft/analysis-and-design/project-decomposition.md`: canonical decomposition skill.

Copyright: (c) Arda Systems 2025-2026, All rights reserved