Goal: Email Integration Phase 4 -- Runtime Platform Updates
Bring per-partition mail capability online for the Application Runtime instance group, so that future tenant provisioning (Phase 5b) can attach a sending sub-domain, a Postmark server token, and the encryption material needed to store and use that token securely. Phase 4 is the platform-level prerequisite; no tenant traffic flows yet.
This phase is the Application-Runtime-side counterpart of Phase 3 (which established the same surface for the Corporate instance group). It reuses the constructs, conventions, and operator surfaces landed by phases 1-3 and adds the partition-specific instances of those abstractions across the four active partitions (prod, demo, dev, stage). The kyle partition is suspended at the time of Phase 4 (per DQ-R1-021) and is not included in this phase’s rollout; it can be re-introduced later by replaying the partition-deploy procedure if the partition resumes operation.
Tickets
- PDEV-433 -- Phase 4 -- Runtime Platform Updates: This phase’s primary tracker. Status: In Progress (started 2026-05-11 with the `goal.md` draft).
- PDEV-201 -- Notifications / E-mail Pipeline: Parent umbrella for the whole email-integration project. PDEV-433 is one of its sub-issues.
- PDEV-455 -- Phase 3 -- Pending Closure Items: Phase 3 close-out work running in parallel with the start of Phase 4. Indirectly relevant: the first non-prod Sender Signature provisioned by Phase 4 also unblocks Postmark Compliance approval for the `arda-nonprod` account (P3-2 / P3-3 progress).
Repositories
| Repository | Role | Planned Changes |
|---|---|---|
| infrastructure | Primary | Per-partition mail sub-zones ({partition}.ardamails.com), per-partition partition-email stack, partition-aware Postmark credential resolution, per-partition encryption-key secret, DNS-provisioning IAM role per partition, NS-delegation writes upstream into the ardamails.com zone. A new runtime-platform-drift workflow runs in parallel with corporate-drift, which is not renamed (DQ-R1-018). |
| documentation | Primary | Phase 4 planning artifacts (this directory), DQ-R1-017 decision-log entry, phases.md patches, runtime-platform documentation reconciliation, completion byproducts at phase end. |
| common-module | Out of scope for Phase 4 | Library additions consumed by the email module are Phase 5a’s responsibility. |
| operations | Out of scope for Phase 4 | The backend ShopAccess/Email module is Phase 5b. |
Success Criteria
Phase 4 is complete when all of the following are true:
- All four active partition mail sub-zones are live and delegated. `dig NS <partition>.ardamails.com @8.8.8.8` returns the partition zone’s nameservers via the root `ardamails.com` zone delegation for every active partition (`prod`, `demo`, `dev`, `stage`). The `kyle` partition is excluded per DQ-R1-021 (suspended).
- Each partition can resolve its Postmark account-level credential at deploy time. `postmarkCredentialOpReference(partition)` (in `platform/postmark-service.ts`) returns the partition’s `op://Arda-{Env}OAM/Postmark/credential` reference; `amm.sh` reads it via `op read` at deploy time and passes the value to `cdk deploy` as a `NoEcho` CFN parameter (CDK has no 1Password dependency). The reference resolves to the correct production-vs-non-production Postmark account token per the partition-to-account mapping.
- Per-partition encryption-key secret exists and is delivered to pods via ESO. Each partition’s Secrets Manager carries the email-token encryption key; ESO projects it into a Kubernetes Secret in the partition cluster. Phase 5b’s email module will read from it, but Phase 4 only verifies that the secret and its ESO projection are operational.
- Per-partition DNS-provisioning and SM-fallback IAM roles exist. Each partition has two per-purpose IAM roles, declared in the `partition-email` stack and assumed by the operations component via STS from its IRSA-bound pod role (mirroring the `ImageUploadPreSigningRole` pattern in `image-asset-bucket.ts`): (a) the DNS-provisioning role is created by reusing the existing `AllowCreatingNSRecordsRole` construct (Phase 2; generic Route53 record-set CRUD despite the name), generalized to accept a configurable trust principal and scoped to `route53:ChangeResourceRecordSets` / `route53:ListResourceRecordSets` / `route53:ListHostedZonesByName` on the partition’s mail sub-zone only; (b) `{fqn}-EmailEncryptionKeyFallbackRole`, a fresh role scoped to `secretsmanager:GetSecretValue` on the partition’s encryption-key SM secret (for the Phase 5b `TokenCipher` SDK-fallback path per DQ-R1-019). Both roles trust an account principal with an `ArnLike` condition on `{fqn}-*`, so any partition role, including the pod role, can chain in (DQ-R1-020). Hard constraint: the existing Root-account instantiation of `AllowCreatingNSRecordsRole` in `root-dns-stack.ts` must remain byte-identical post-generalization, guarded by a CDK `Template`-equality unit test and a post-deploy Root-no-drift verification.
- A first sending domain on `arda-nonprod` is verified. A partition (typically `dev`) registers a Sender Signature on the `arda-nonprod` Postmark account. SPF / DKIM / Return-Path are all verified. This satisfies the Postmark Compliance prerequisite for `arda-nonprod` account approval and exercises the partition-shaped pattern end-to-end.
- `runtime-platform-drift` workflow covers the partition surfaces. A new scheduled workflow, running in parallel with the existing `corporate-drift`, asserts Postmark + DNS + cross-seam state for every active partition Signature, opening a labelled GitHub issue on any failure. Shared logic between the two workflows is factored into reusable shell scripts or GitHub Actions composite actions (DQ-R1-018). `corporate-drift` is not renamed.
- DQ-R1-017 recorded in the decision log. The “per-partition Sender Signature granularity” decision is pinned: one Signature per partition sub-zone, leaves inherit DKIM, per-tenant Signatures deferred to Phase 5b. Includes the rationale developed in Phase 3’s `suggestions.md` S-1.
- `phases.md` patched. The Phase 4 section reflects the actual landed scope, exit criteria are self-contained (no Phase 5b gating from Phase 4 deliverables), and dependencies on phases 2 and 3 are explicit.
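The deploy-time credential criterion can be illustrated with a minimal sketch of the accessor. The function name, the `op://Arda-{Env}OAM/Postmark/credential` shape, and the partition-to-account mapping come from this document; the `VAULT_ENV` values are placeholders invented for illustration, since the document does not list the `{Env}` segment for each partition. The real accessor in `platform/postmark-service.ts` may differ.

```typescript
type Partition = "prod" | "demo" | "dev" | "stage";
type PostmarkAccount = "PostmarkProd" | "PostmarkNonProd";

// Partition-to-account assignment as stated in the Constraints section.
const POSTMARK_ACCOUNT: Record<Partition, PostmarkAccount> = {
  prod: "PostmarkProd",
  demo: "PostmarkProd",
  dev: "PostmarkNonProd",
  stage: "PostmarkNonProd",
};

// ASSUMPTION: placeholder {Env} values; the actual vault names are not given here.
const VAULT_ENV: Record<Partition, string> = {
  prod: "Prod",
  demo: "Demo",
  dev: "Dev",
  stage: "Stage",
};

// Returns only the 1Password reference. The secret itself is read by amm.sh
// (via `op read`) and never passes through CDK code.
function postmarkCredentialOpReference(partition: Partition): string {
  return `op://Arda-${VAULT_ENV[partition]}OAM/Postmark/credential`;
}

console.log(postmarkCredentialOpReference("dev"), POSTMARK_ACCOUNT["dev"]);
```

Keeping the accessor a pure string function is what lets CDK stay free of any 1Password dependency: the only consumer that resolves the reference is the deploy script.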
Context
What exists today (post-Phase-3)
- Root zone `ardamails.com` (Phase 2): owned by Root, deployed in the `platformRoot` account, exported as `arda-ardamails-zone`. NS-delegations are added by child stacks via `WriteNSRecordsToUpstreamDns` using `AllowCreatingNSRecordsRole` (cross-account or same-account).
- `AllowCreatingNSRecordsRole` (Phase 2): the cross-account assume-role target in `platformRoot`’s account that lets child stacks write NS records into `ardamails.com`. Currently scoped to the entire `ardamails.com` zone; per-child scoping is a future hardening (out of scope here).
- Corporate Instance Group (Phase 3): the `arda.ardamails.com` zone, SPF + DMARC at its apex, the Free Kanban Tool sending domain, the `FreeKanbanTool` Postmark server, and the `corporate-cli` operator CLI. `PostmarkProd` Sender Signature `arda.ardamails.com` is verified; DKIM is anchored at the parent per DQ-R1-009.
- Typed source-of-truth for sender-domain placement: `sendingDomainPlacement()` in `platform/constructs/postmark/sending-domain.ts` returns the four FQDNs (`fromDomain`, `postmarkDomainName`, `dkimHostName`, `returnPathHostName`) consumed identically by the CLI, the CDK construct, and the drift check. Phase 4 reuses the function with partition-shaped inputs.
- `DnsEmailRecords` construct: takes absolute `dkimRecordFqdn` and `returnPathRecordFqdn` and emits the two records in a hosted zone. Shape-neutral; the placement decision lives in the caller. Reusable for partitions without change.
- `PostmarkSendingDomain` thin-wrapper and `PostmarkServer` thin-wrapper: stack-level wrappers that validate context values from `cdk.context.json`. Reusable.
- `corporate-drift.ts` cross-seam pattern: probes Postmark `Name` / `DKIMPendingHost` / `DKIMHost` / `ReturnPathDomain` against the placement function plus live DNS. Generalizable to a partition-aware `mail-drift`.
What Phase 4 introduces
The partition-shaped instance of all of the above, multiplied across the four active partitions (`prod`, `demo`, `dev`, `stage`; `kyle` deferred per DQ-R1-021), plus the partition-aware deploy-time credential plumbing for Postmark account tokens, which today is hard-pinned to `Arda-SystemsOAM`.
In scope
- Per-partition mail sub-zones. A new `r53.PublicHostedZone` for each of `prod.ardamails.com`, `demo.ardamails.com`, `dev.ardamails.com`, `stage.ardamails.com`. Each is created in the hosting Infrastructure’s AWS account: `prod` and `demo` zones in the Alpha001 account; `dev` and `stage` zones in the Alpha002 account. Per-partition DKIM independence is preserved by having a distinct hosted zone per partition, even when two zones share an AWS account. (`kyle.ardamails.com` is deferred; kyle is suspended per DQ-R1-021.)
- NS-delegation entries in the root `ardamails.com` zone for each partition sub-zone. Same cross-account `WriteNSRecordsToUpstreamDns` mechanism as Phase 3.
- Reserved-name list update: the partition names (`prod`, `demo`, `dev`, `stage`, plus `kyle` reserved-but-unused) are reserved at the `ardamails.com` level (already partially done in Phase 3 for `arda`; this extends it). `kyle` stays reserved even though its sub-zone is deferred, so it cannot be appropriated as a tenant slug while the partition is suspended.
- `partition-email` stack under `stacks/purpose/`. Owns the partition’s mail sub-zone, its SPF + DMARC records, the per-partition encryption-key Secrets Manager secret, the DNS-provisioning IAM role, and the Postmark Sender Signature wiring (via `PostmarkSendingDomain`).
- Partition Postmark Sender Signatures. One per partition, anchored at `{partition}.ardamails.com`. Production partitions on the `PostmarkProd` account; non-production partitions on `PostmarkNonProd`. Per DQ-R1-017 (proposed), each partition has its own DKIM key (independent reputation per environment); future per-tenant Signatures are deferred to Phase 5b.
- Partition-aware Postmark credential accessor. `platform/postmark-service.ts` exposes a `postmarkCredentialOpReference(partition)` function returning the `op://Arda-{Env}OAM/Postmark/credential` reference for the partition’s environment. The function is consumed by `amm.sh` (via `op read`); CDK has no 1Password dependency. Each partition’s `Arda-{Env}OAM` vault carries the `Postmark` item with the relevant account token. (This is the explicit Phase 4 prerequisite already documented in `phases.md`.)
- Per-partition DNS-provisioning role, created by reusing the existing `AllowCreatingNSRecordsRole` construct (Phase 2; `constructs/oam/allow-creating-ns-records-role.ts`) generalized to accept a configurable trust principal. The construct’s permissions are already generic Route53 record-set CRUD; only the trust-policy hard-coding to Lambda needs to be parameterized. Instantiated per partition, with `allowedParentHostedZoneIds` scoped to that single partition’s mail sub-zone (DQ-R1-020 least-privilege). Trust policy for the Phase-4 use case: account principal + `ArnLike` on `{fqn}-*` so the partition pod role can chain in via STS (mirroring `ImageUploadPreSigningRole` in `image-asset-bucket.ts`). Permissions: `route53:ChangeResourceRecordSets`, `route53:ListResourceRecordSets`, `route53:ListHostedZonesByName` on the partition’s mail sub-zone only. (DQ-R1-020.) `route53:GetChange` is intentionally omitted: it requires `arn:aws:route53:::change/*` resource scope (account-wide, not zone-scoped) and the Email module does not wait on Route53 propagation; Postmark verification is API-driven. Hard constraint: the existing Root-account instantiation in `root-dns-stack.ts` must remain byte-identical at the CloudFormation level post-generalization, guarded by a CDK `Template`-equality unit test and a post-deploy Root-no-drift verification.
- Per-partition `EmailEncryptionKeyFallbackRole`. Fresh purpose-specific role declared in `partition-email`. Same trust-policy shape as above. Permission: `secretsmanager:GetSecretValue` on `${encryptionKeySecret.secretArn}*` (full SM-secret ARN; the trailing wildcard tolerates the SM-appended random 6-character suffix; SM versions are selected at call time via `VersionId` / `VersionStage`, not encoded in the resource ARN). Consumed by the Phase 5b `TokenCipher` SDK-fallback path for envelopes older than `AWSPREVIOUS` (DQ-R1-019). (DQ-R1-020.)
- `apps/Al1x/partition.ts` updates to instantiate the new `partition-email` stack per partition.
- Parallel `runtime-platform-drift` workflow. A new `.github/workflows/runtime-platform-drift.yml` plus driver `tools/runtime-platform-drift.ts` (or shell), running in parallel with the existing `corporate-drift`. Asserts Postmark + DNS + cross-seam state for every active partition Signature. Shared logic between the two workflows is factored into reusable shell scripts / GitHub Actions composite actions so subsequent runtime-platform drift checks (unrelated to email) can plug into the same workflow without it becoming mail-centric (DQ-R1-018). `corporate-drift` is not renamed.
- Operator surfaces integrated into `amm.sh`. Per DQ-R1-022, Phase 4 work is treated as part of the product runtime platform deployment and is invoked through `amm.sh` (and its rules: idempotency, security, pre-flight checks, partition selection). New per-partition Sender Signature provisioning, Postmark server creation, and `cdk.context.json` writes are added as steps in `amm.sh` (or its callees), not as a standalone `partition-mail-cli`. Reusable sub-scripts / utilities (bash or TypeScript) are extracted from `corporate-cli` so both `amm.sh`’s partition path and `corporate-cli` can share logic; this includes refactoring Phase 3 deliverables as needed to keep each script’s complexity bounded.
- `arda-nonprod` Postmark Sender Signature. First partition Signature on the non-prod account; satisfies Postmark Compliance’s open ticket 11236089 for `arda-nonprod`. Recommended first partition: `dev`.
- Documentation. This directory’s planning artifacts (`analysis.md`, `requirements.md`, `specification.md`, `verification.md`, `exports.md`, `email-server-key-encryption.md`, `quality-review-1.md`, `plan/evaluation.md`, `plan/choreography.md`, per-run `plan/runs/run-N-<name>/`); reconciliation pages in `current-system/runtime/` reflecting the new partition mail surface; DQ-R1-017 decision-log entry.
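The trust-policy and permission shape described for the DNS-provisioning role can be sketched as plain IAM policy documents. This is an illustration under stated assumptions, not the construct’s actual output: the `aws:PrincipalArn` condition key, the role-ARN pattern, and the sample account id, `fqn`, and hosted-zone id are all hypothetical choices made here to make the “account principal + `ArnLike` on `{fqn}-*`” wording concrete.

```typescript
// Trust policy: the account root is the principal, narrowed by an ArnLike
// condition so only roles named `${fqn}-*` can assume the purpose-specific role.
function assumeRoleTrustPolicy(accountId: string, fqn: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { AWS: `arn:aws:iam::${accountId}:root` },
        Action: "sts:AssumeRole",
        // ASSUMPTION: the condition key used by the real construct is not stated.
        Condition: {
          ArnLike: { "aws:PrincipalArn": `arn:aws:iam::${accountId}:role/${fqn}-*` },
        },
      },
    ],
  };
}

// Permissions: the three Route53 actions listed in this document, scoped to one
// partition mail sub-zone. route53:GetChange is deliberately absent.
// Resource-level support for each action should be checked against the IAM reference.
function dnsProvisioningPermissions(hostedZoneId: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: [
          "route53:ChangeResourceRecordSets",
          "route53:ListResourceRecordSets",
          "route53:ListHostedZonesByName",
        ],
        Resource: `arn:aws:route53:::hostedzone/${hostedZoneId}`,
      },
    ],
  };
}

// Hypothetical sample values for illustration only.
console.log(JSON.stringify(assumeRoleTrustPolicy("123456789012", "Alpha002-dev"), null, 2));
console.log(JSON.stringify(dnsProvisioningPermissions("Z0EXAMPLE"), null, 2));
```

The same trust-policy function would serve the `EmailEncryptionKeyFallbackRole`, since the document gives both roles the same trust shape and varies only the permission statement.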
Out of scope
- Per-tenant Sender Signatures. Each partition’s Sender Signature anchors DKIM for that partition. Tenant leaves under it (`{tenant}.{partition}.ardamails.com`) inherit by default; per-tenant Signatures (for stricter isolation) are deferred to Phase 5b where tenant traffic exists.
- Backend email module. The `ShopAccess/Email` module that provisions tenants, sends mail, and handles webhooks is Phase 5b.
- Common-module library additions. Helpers consumed by the email module are Phase 5a.
- Tightening `AllowCreatingNSRecordsRole.allowedParentHostedZoneIds`. Phase-3-flagged future hardening; not in Phase 4.
- Inbound email / Reply-To routing. Per the project goal, inbound is excluded.
Constraints
- Each partition deploys independently. No cross-partition build coupling at deploy time. A failure in one partition does not block another. Partitions sharing an Infrastructure account (e.g., `prod` + `demo` in Alpha001) deploy as separate CDK stacks targeting that account; the stacks are independent at the CFN level.
- NS-delegation atomicity (DQ-R1-006): each partition’s stack writes its own NS record into the root zone via `WriteNSRecordsToUpstreamDns`. If the cross-account assume-role fails post-zone-creation, the partition zone exists in its Infrastructure’s account but is not delegated; re-deploying the stack invokes the CR again on stack update. The same recovery semantics as Phase 3.
- The typed source-of-truth pattern is preserved. All FQDN composition flows through `sendingDomainPlacement()` (and partition-aware extensions); no inline composition at consumer sites. This is the structural lesson from Phase 3 (DQ-R1-009 divergence, `learnings.md` L-1).
- Cross-seam assertions are in place for every Sender Signature. Each partition’s Postmark `Name` / `DKIMPendingHost` / `ReturnPathDomain` is asserted against the placement function and against live DNS by the drift workflow.
- Partition Postmark account assignment is fixed. `prod`, `demo` on `PostmarkProd`; `dev`, `stage`, `kyle` on `PostmarkNonProd`. No mixing; no per-partition rebinding.
- CFN stack names are immutable. The `partition-email` stacks have stable CDK ids that map to CFN stack names; renaming would force delete-and-recreate (and lose the deployed mail sub-zone). Use CF stack refactoring if a rename is ever required.
- Tenant slug reservations preserved. Partition names continue to be reserved at the `ardamails.com` level so future tenants cannot appropriate them as slugs.
- Layered-architecture rule enforced. ESLint-checked, per Phase 3’s `eslint.config.mjs` settings. New `partition-email` stack obeys the `script → instances → apps → stacks → constructs → platform` dependency direction.
- CFN exports follow the `-API-` convention. Every Phase 4 export (partition mail zone ID/name, IAM role ARNs, SM secret ARNs) is consumed by the operations Helm chart (non-CDK), so all exports use the `${publishingPrefix}-API-<Key>` form per `cdk-infrastructure.md` § Export naming. The closest precedent is `infrastructure/src/main/cdk/stacks/purpose/image-storage.ts`, where every export uses `-API-` because the consumers are the operations component’s `presigned-url` flow and Helm chart values. Phase 3’s `Corporate-I-MailZoneId` used `-I-` because its consumer was a sibling CDK stack (`FreeKanbanToolMailDns`); Phase 4 has no equivalent CDK consumer. The `-I-` marker continues to appear on AWS resource names (e.g., `{fqn}-I-EmailEncryptionKey`, `{fqn}-I-EmailPostmarkAccountToken`); that marker indicates intra-partition AWS scope per the `partitionSecrets.cfn.yaml` resource-naming convention and is independent of the CFN export-name marker.
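The two naming markers in the last constraint are easy to confuse, so a small sketch may help keep them apart. The export keys below are the ones listed in the Deliverables table; the helper names and the sample `publishingPrefix` / `fqn` values are hypothetical and exist only for this illustration.

```typescript
// Export keys named in the Deliverables table of this document.
type Phase4ExportKey =
  | "PartitionMailZoneId"
  | "PartitionMailZoneName"
  | "EmailPostmarkAccountTokenArn"
  | "EmailEncryptionKeyArn"
  | "EmailDnsProvisioningRoleArn"
  | "EmailEncryptionKeyFallbackRoleArn";

// CFN export name: `-API-` because the consumer (the operations Helm chart) is not CDK.
function apiExportName(publishingPrefix: string, key: Phase4ExportKey): string {
  return `${publishingPrefix}-API-${key}`;
}

// AWS resource name: `-I-` marks intra-partition scope and is unrelated to exports.
function intraPartitionResourceName(fqn: string, name: string): string {
  return `${fqn}-I-${name}`;
}

// ASSUMPTION: "Alpha002-dev" is a made-up prefix / fqn for the example.
console.log(apiExportName("Alpha002-dev", "EmailEncryptionKeyArn"));
console.log(intraPartitionResourceName("Alpha002-dev", "EmailEncryptionKey"));
```

Under these assumptions the same secret carries both markers: `-I-` in its resource name and `-API-` in the export that publishes its ARN.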
Deliverables
| # | Deliverable | Location |
|---|---|---|
| 1 | Phase 4 planning artifacts | roadmap/in-progress/email-integration/4-runtime-platform-updates/ (this directory): analysis.md, requirements.md, specification.md, verification.md, exports.md, email-server-key-encryption.md, quality-review-1.md, plan/evaluation.md, plan/choreography.md, and per-run plan/runs/run-N-<name>/{project-plan.md,validate-exit.sh} (runs 1–7). |
| 2 | DQ-R1-017 decision-log entry | roadmap/in-progress/email-integration/decision-log.md |
| 3 | Partition mail sub-zones (4×) | infrastructure/src/main/cdk/stacks/purpose/partition-email.ts; instances at infrastructure/src/main/cdk/instances/Alpha001/{prod,demo}.ts and Alpha002/{dev,stage}.ts (kyle deferred per DQ-R1-021). Exports: ${publishingPrefix}-API-PartitionMailZoneId, ${publishingPrefix}-API-PartitionMailZoneName. |
| 4 | Partition-aware Postmark credential accessor | infrastructure/src/main/cdk/platform/postmark-service.ts — exposes postmarkCredentialOpReference(partition): string returning the op://Arda-{Env}OAM/Postmark/credential reference; consumed by amm.sh (not by CDK directly). amm.sh reads the value via op read and passes it to cdk deploy as a NoEcho CFN parameter; the partition-email stack declares the parameter and populates the per-partition Postmark account-token SM secret (resource name {fqn}-I-EmailPostmarkAccountToken) via SecretValue.cfnParameter(). The SM secret ARN is exported as ${publishingPrefix}-API-EmailPostmarkAccountTokenArn for the operations Helm chart’s ESO ExternalSecret. Mirrors the established partitionSecrets.cfn.yaml pattern (B2 / δ.1). |
| 5 | Per-partition encryption-key Secrets Manager secret | declared in partition-email.ts (resource name {fqn}-I-EmailEncryptionKey); projected via ESO into the partition cluster (two ExternalSecret mounts — AWSCURRENT + AWSPREVIOUS). The SM secret ARN is exported as ${publishingPrefix}-API-EmailEncryptionKeyArn for the Helm chart’s ESO definitions. |
| 6 | Per-partition DNS-records role (via generalized AllowCreatingNSRecordsRole) + per-partition EmailEncryptionKeyFallbackRole | DNS-records role: reuse the existing AllowCreatingNSRecordsRole construct (Phase 2; constructs/oam/), generalized to parameterize its trust principal. Instantiated per partition-email.ts with the pod-STS trust principal and allowedParentHostedZoneIds scope to the partition’s mail sub-zone. Fallback role: fresh declaration in partition-email.ts. Both STS-chained from the partition pod role (mirrors ImageUploadPreSigningRole). Construct change carries a CDK Template-equality unit test ensuring the Root-account instantiation is byte-identical post-generalization. (DQ-R1-020.) Exports: ${publishingPrefix}-API-EmailDnsProvisioningRoleArn, ${publishingPrefix}-API-EmailEncryptionKeyFallbackRoleArn — both consumed by the operations Helm chart for STS-AssumeRole credential providers (per image-storage.ts precedent). |
| 6a | Construct generalization regression guard | CDK Template.fromStack() snapshot test in root-dns-stack.test.ts (or allow-creating-ns-records-role.test.ts) asserting that the synthesized template for the Root-account instantiation is byte-identical before and after the trust-principal parameterization. Fails closed: if the generalization changes Root output, the test fails before the PR can merge. |
| 6b | Root-account no-drift verification | Operator step in the rollout sequence: synthesize RootDnsStack post-generalization, diff against the deployed CFN template in the Root account. Expected diff is empty. Captured as V-PART-NNN in verification.md. Runs before any partition-mail deploy. |
| 7 | runtime-platform-drift workflow (parallel) | infrastructure/.github/workflows/runtime-platform-drift.yml + driver under infrastructure/tools/. Shares reusable scripts / composite actions with corporate-drift; corporate-drift is not renamed (DQ-R1-018) |
| 8 | amm.sh-integrated partition-mail steps | Phase 4 operator work lives inside amm.sh (or its callees), following the same idempotency / security / check rules. Reusable bash + TypeScript utilities extracted from corporate-cli are shared between amm.sh and corporate-cli (DQ-R1-022); includes refactoring Phase 3 deliverables as needed |
| 9 | arda-nonprod Postmark Sender Signature | First partition (e.g., dev.ardamails.com) registered on PostmarkNonProd; SPF/DKIM/Return-Path verified |
| 10 | phases.md patches | Phase 4 section finalised; exit criteria + dependency-arrow corrections |
| 11 | Documentation reconciliation | new pages in current-system/runtime/ for partition mail; updates to current-system/oam/postmark-service/ covering the multi-account / multi-Signature operator surface |
| 11b | Secret-delivery pattern documentation | new file current-system/oam/security/secret-delivery-pattern.md documenting the canonical op → amm.sh → CFN NoEcho parameter → SM secret → consumer flow (with partitionSecrets.cfn.yaml and the Phase 4 Postmark token as worked examples); cross-linked from secrets-vault.md. Covers GHA ::add-mask:: hygiene and rotation flow. |
| 12 | Phase 4 completion byproducts | implementation/learnings.md, implementation/suggestions.md, implementation/phase-{a,b}-deploy.md per partition, completion log |
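
The cross-seam assertion that deliverable 7 runs per partition can be sketched as a pure comparison between the placement function’s output and the state Postmark reports. This is a simplified illustration: the field-to-field pairing, the FQDN normalization, and the sample host names are assumptions made here, and the live-DNS half of the check is left out.

```typescript
// Shape returned by sendingDomainPlacement(), per the Context section.
interface Placement {
  fromDomain: string;
  postmarkDomainName: string;
  dkimHostName: string;
  returnPathHostName: string;
}

// Subset of the Postmark domain fields named in this document.
interface PostmarkDomainState {
  Name: string;
  DKIMHost: string;
  DKIMPendingHost: string;
  ReturnPathDomain: string;
}

// Compare FQDNs case-insensitively and ignore a trailing root dot.
function normalizeFqdn(name: string): string {
  return name.trim().toLowerCase().replace(/\.$/, "");
}

// Returns one message per disagreement; an empty array means the seams agree.
function crossSeamMismatches(expected: Placement, actual: PostmarkDomainState): string[] {
  const mismatches: string[] = [];
  const check = (label: string, want: string, got: string): void => {
    if (normalizeFqdn(want) !== normalizeFqdn(got)) {
      mismatches.push(`${label}: expected ${want}, got ${got}`);
    }
  };
  check("Name", expected.postmarkDomainName, actual.Name);
  // ASSUMPTION: use the verified host when present, otherwise the pending one.
  check("DKIM host", expected.dkimHostName, actual.DKIMHost || actual.DKIMPendingHost);
  check("ReturnPathDomain", expected.returnPathHostName, actual.ReturnPathDomain);
  return mismatches;
}
```

A drift driver could call `crossSeamMismatches` once per active partition and open the labelled issue whenever the returned array is non-empty.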
Open design questions (proposed for Round R1-Phase4 of the decision log)
These are the load-bearing questions for Phase 4 that the planning artifacts should pin before implementation begins. The Guidance column captures the operator’s input on each question; rows marked FILL-IN await further input, rows marked Decided carry the resolution that the forthcoming decision-log entries will formalize.
| # | Decision ID | Topic | Working assumption | Guidance |
|---|---|---|---|---|
| 1 | DQ-R1-017 | Postmark Sender Signature granularity per partition | One Signature per partition sub-zone; leaves inherit DKIM; per-tenant Signatures deferred to Phase 5b. See suggestions.md S-1 for rationale. | Decided. Working assumption confirmed: one Postmark Sender Signature per partition sub-zone ({partition}.ardamails.com); leaves under each inherit DKIM via the partition’s signing key. Per-tenant Signatures remain deferred to Phase 5b. |
| 2 | DQ-R1-018 | corporate-drift rename and scope | Rename to mail-drift; single workflow asserts Corporate + all partition Signatures + cross-seam agreement for each. Alternative: keep corporate-drift and add a parallel partition-mail-drift. | Decided. Keep corporate-drift unchanged. Add a new runtime-platform-drift workflow in parallel, sharing logic via reusable shell scripts or GitHub Actions composite actions. Rationale: future runtime-platform drift checks unrelated to email will plug into this workflow without mail-centric naming. (Supersedes the mail-drift rename suggested in suggestions.md S-5.) |
| 3 | DQ-R1-019 | Per-partition encryption-key derivation and storage | Each partition has its own randomly-generated AES-256 key in Secrets Manager, projected into the partition cluster via ESO. No KMS-CMK-wrapped envelope encryption (per DQ-012 which decided the at-rest encryption approach). Open: key generation, rotation, and key-version handling. | Decided. Full design in design/email-server-key-encryption.md. Three sub-decisions: (1) Single SM secret per partition named {fqn}-I-EmailEncryptionKey (the -I- marker matches the convention as practiced for intra-partition resources, including ESO-consumed ones); passwordLength: 64; RemovalPolicy.RETAIN. No version suffix in the resource name; versioning is delegated to AWS Secrets Manager’s native per-secret versioning. (2) Two-axis envelope a{N}.k{SM-VERSION-ID}:<base64-payload>: a{N} is the algorithm version (rare; code-indexed; bumps require a release); k{...} is the AWS SM versionId (frequent; runtime-indexed). Algorithm and material have profoundly different lifecycles; separating them avoids churning the code-side dispatch table on every routine rotation. (3) Hot-swap rotation via SM-native versioning: rotation = aws secretsmanager update-secret (creates new SM versionId; auto-promotes to AWSCURRENT; demotes prior to AWSPREVIOUS). Helm in operations declares two ExternalSecrets (AWSCURRENT + AWSPREVIOUS); pod’s TokenCipher holds both derived keys. Migration is lazy + coroutine mop-up: first non-up-to-date read synchronously re-encrypts its own row, then launches a per-pod coroutine that mops up the rest. Rare reads of rows older than AWSPREVIOUS fall back to a direct AWS SM SDK fetch via the EmailEncryptionKeyFallbackRole (assumed via STS from the operations pod’s IRSA-bound pod role; the pod role itself does not carry secretsmanager:GetSecretValue, per DQ-R1-020). Phase 4 ships only the initial SM secret; the dispatch / migration machinery lands in Phase 5b. Future automated rotation via AWS SM Rotation Lambdas plugs in natively. |
| 4 | DQ-R1-020 | DNS-provisioning + SM-fallback IAM roles | The role’s trust policy authorizes only the operations component’s pod identity (IRSA). Permissions: Route53 Change*ResourceRecordSets on the partition’s mail sub-zone; secretsmanager:GetSecretValue on the partition’s encryption-key secret. Open: extend the existing pod role or follow the established pattern of fresh purpose-specific roles assumed via STS. | Decided. Full design in decision-log.md DQ-R1-020. Phase 4 provisions two per-purpose roles per partition: (a) the DNS-records role is created by reusing the existing AllowCreatingNSRecordsRole construct (Phase 2; generic Route53 record-set CRUD despite the name), generalized to accept a configurable trust principal — with the hard constraint that the existing Root-account instantiation must remain byte-identical post-generalization (guarded by a CDK Template-equality unit test and a Root no-drift verification); (b) EmailEncryptionKeyFallbackRole is a fresh declaration in partition-email.ts. Both roles use trust policy = account principal + ArnLike on {fqn}-*. The operations pod federates into {fqn}-EksPodRole via IRSA (existing), then performs sts:AssumeRole into the purpose-specific role at the call site (DQ-204 STS chain). Mirrors ImageUploadPreSigningRole in image-asset-bucket.ts. The pod role is not extended — consistent with every other partition workload. |
| 5 | DQ-R1-021 | Order of partition rollout | dev first (also satisfies arda-nonprod Postmark approval), then kyle, stage, demo, prod. Matches the convention noted in phases.md Phase 5b. | Decided. Order: dev → stage → demo → prod. kyle is excluded from Phase 4 (partition suspended/on-hold; the kyle.ardamails.com sub-zone is not provisioned). dev first still satisfies the arda-nonprod Postmark approval prerequisite. Supersedes the order in phases.md Phase 5b (which lists kyle in wave 1); phases.md should be patched to reflect the new convention. |
| 6 | DQ-R1-022 | Operator CLI shape | Generalize corporate-cli to take a partition argument (or asset+partition pair), rather than create a parallel partition-mail-cli. Open: whether the generalization lands as a Phase 4 refactor or whether a parallel CLI ships and corporate-cli is retired later. | Decided. Phase 4 partition-mail provisioning is part of the product runtime platform deployment and lives inside amm.sh (or its callees), following its rules: idempotency, security, pre-flight checks. Not a standalone partition-mail-cli. Extract reusable sub-scripts / utilities (bash or TypeScript) from corporate-cli so both amm.sh and corporate-cli share logic; refactor Phase 3 deliverables as needed to keep each script’s complexity bounded. |
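
The two-axis envelope from DQ-R1-019 (`a{N}.k{SM-VERSION-ID}:<base64-payload>`) and its three-tier key lookup can be sketched as a small codec. This is a hypothetical illustration, not the Phase 5a `TokenCipher`: the type and function names are invented here, the character set accepted for the version id is an assumption, and no encryption is performed.

```typescript
interface Envelope {
  algorithmVersion: number; // a{N}: code-indexed, bumps require a release
  keyVersionId: string;     // k{...}: AWS Secrets Manager versionId of the key material
  payloadBase64: string;    // base64-encoded ciphertext
}

// ASSUMPTION: version ids contain only letters, digits, and hyphens (UUID-like).
const ENVELOPE_PATTERN = /^a(\d+)\.k([0-9A-Za-z-]+):([A-Za-z0-9+/]+={0,2})$/;

function formatEnvelope(e: Envelope): string {
  return `a${e.algorithmVersion}.k${e.keyVersionId}:${e.payloadBase64}`;
}

function parseEnvelope(text: string): Envelope {
  const match = ENVELOPE_PATTERN.exec(text);
  if (match === null) {
    throw new Error("malformed envelope");
  }
  const [, algorithm = "", keyVersionId = "", payloadBase64 = ""] = match;
  return { algorithmVersion: Number(algorithm), keyVersionId, payloadBase64 };
}

type KeySource = "AWSCURRENT" | "AWSPREVIOUS" | "SDK_FALLBACK";

// Picks where the key for an envelope comes from: one of the two ESO-mounted
// versions, or the rare direct Secrets Manager fetch through the fallback role.
function selectKeySource(
  e: Envelope,
  mounted: { current: string; previous?: string },
): KeySource {
  if (e.keyVersionId === mounted.current) return "AWSCURRENT";
  if (e.keyVersionId === mounted.previous) return "AWSPREVIOUS";
  return "SDK_FALLBACK";
}
```

Because the key version travels in the envelope, a routine rotation changes only which ids are mounted; the algorithm dispatch keyed on `algorithmVersion` stays untouched, which is the separation the decision argues for.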
Pre-design follow-ups (resolved)
The Open Design Questions table above lists the load-bearing Round R1-Phase4 decisions (DQ-R1-017..022). Below are the smaller pre-design follow-ups closed during planning. Each is “pick the default and move on” rather than a load-bearing decision; none warrants a DQ-R1-NNN entry.
| ID | Item | Resolution |
|---|---|---|
| B1 | Phase 5a TokenCipher location | Ships in common-module as a general-purpose utility usable by any component needing an encrypted-field primitive — not Email-specific. Phase 5a’s goal.md will absorb the TokenCipher (HKDF + AES-256-GCM + two-axis envelope codec) into its deliverables. |
| B2 | postmarkCredentialOpReference(partition) shape | Adopt the established partitionSecrets.cfn.yaml pattern (option δ.1). amm.sh reads the token via op read and passes it to cdk deploy as a NoEcho parameter; the partition-email stack declares the parameter and creates the SM secret with SecretValue.cfnParameter(). The accessor in platform/postmark-service.ts collapses to a single function returning the op:// reference for amm.sh to consume; CDK has no 1Password dependency. Same SM secret serves the CR Lambda (Sender Signature registration) and the runtime ESO mount. (See deliverable row 4 + 11b.) |
| B3 | amm.sh extraction scope from corporate-cli | Minimal cut line: extract only the utilities amm.sh’s partition-mail steps actually need; backfill on demand. Avoids smuggling a Phase 3 refactor inside the Phase 4 PR. |
| B4 | kyle partition sub-zone reservation mechanics | Extend the same registry mechanism Phase 3 used to reserve arda at the ardamails.com level. No new mechanism. |
| B5 | Cross-partition deploy gating in CI | Operator-enforced (via amm.sh); not CI-enforced. No tools/cdk-runner.js matrix change. Document the recommended order (dev → stage → demo → prod) in verification.md. |
| C1 | CDK stack name / CFN logical id for the new stack | ${infrastructure}-${partition}-Email. Mirrors the existing ${infrastructure}-${partition}-Secrets / ${infrastructure}-${partition}-Amplify naming used by amm.sh. CFN stack names are immutable; locked at first deploy. |
| C2 | DMARC reporting mailbox per partition | Reuse dmarc-reports@arda.cards (the Corporate mailbox from DQ-R1-015) for all partitions. Per-partition mailboxes are unnecessary — DMARC report content already identifies the source domain. |
| C3 | runtime-platform-drift cron + label conventions | Daily cron (match corporate-drift). Failure-issue labels: drift + runtime-platform. Workflow shape parallels corporate-drift from day one. |
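
Rows B5 and C1 together fix the stack naming and the recommended rollout order, which can be sketched as follows. The `${infrastructure}-${partition}-Email` pattern, the partition-to-account placement, and the `dev → stage → demo → prod` order come from this document; treating `Alpha001` / `Alpha002` as the literal `${infrastructure}` value is an assumption, as are the identifier names.

```typescript
type Partition = "prod" | "demo" | "dev" | "stage";

// Hosting Infrastructure per partition, per the In scope section.
// ASSUMPTION: the account label doubles as the ${infrastructure} name segment.
const INFRASTRUCTURE_OF: Record<Partition, string> = {
  prod: "Alpha001",
  demo: "Alpha001",
  dev: "Alpha002",
  stage: "Alpha002",
};

// Operator-enforced rollout order (DQ-R1-021); kyle is excluded from Phase 4.
const ROLLOUT_ORDER: readonly Partition[] = ["dev", "stage", "demo", "prod"];

// CFN stack names are immutable once deployed, so the pattern is fixed up front.
function emailStackName(partition: Partition): string {
  return `${INFRASTRUCTURE_OF[partition]}-${partition}-Email`;
}

for (const partition of ROLLOUT_ORDER) {
  console.log(emailStackName(partition));
}
```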
Reference Documents
- Project goal: overarching project intent and constraints.
- Phases plan: the Phase 4 section is the canonical source for deliverables, dependencies, and exit criteria.
- Architecture overview: Application Runtime / Infrastructure / Partition terminology; mail topology.
- Decision log: DQ-001..013 and DQ-R1-001..016 establish the constraints Phase 4 must honor; new entries DQ-R1-017..022 (Round R1-Phase4) will formalize the decisions captured in the Open design questions table above.
- Phase 3 implementation, `dqr1009-divergence.md`: the structural lesson that shapes Phase 4’s testing posture.
- Phase 3 implementation, `learnings.md`: six durable insights from Phase 3, all relevant to Phase 4 design.
- Phase 3 implementation, `suggestions.md`: forward-looking items for Phase 4 (S-1 through S-5 are directly load-bearing).
- CDK Infrastructure reference: repo-local construct conventions, cross-stack reference patterns, custom-resource fan-out, IaC layered-architecture rule.
- Postmark Service operator overview: Postmark account model, Sender Signatures, server tokens.
- Workspace-local session-handoff snapshot at Phase 4 start: `projects/email-integration-worktrees/project-checkpoint-20260511.md` (outside the docs tree; not a Markdown link target).
Copyright: (c) Arda Systems 2025-2026, All rights reserved