
Phase 5b -- Pre-Existing Decisions

Phase 5b implements the ShopAccess/Email module inside the operations component: per-tenant Postmark server provisioning, sending APIs, bounce / complaint webhook handling, suppression-list maintenance, and the TokenCipher / EmailConfigurationService infrastructure. It consumes the per-partition resources that Phase 4 lays down and the cross-cutting helpers that Phase 5a contributes.

This file enumerates every prior decision Phase 5b must honor, organised by source. When Phase 5b’s goal.md and design artifacts are written, they reference these decisions rather than re-deriving the underlying constraints. This is a reference document, not a planning artifact.

From ../goal.md:

  • A new tenant is configured with a dedicated sending domain; DNS records are requested, published, and verified automatically.
  • Email sending is gated by DKIM + Return-Path verification of the tenant’s sending domain.
  • Tenant email configuration can be locked, retried, or removed through the platform’s standard API surface — no ad-hoc operator action.
  • A consumer requests an email send; the request produces a delivery attempt observable end-to-end.
  • Delivery, bounce, and spam-complaint events from Postmark are received, authenticated, and reflected in the recorded job status.
  • Transactional-only. The EmailSender L2 interface accepts transactional sends exclusively (1-to-1, user-action-triggered, non-promotional). Every Postmark server provisioned for Application Runtime tenants uses the default Transactional Message Stream. Bulk / broadcast / marketing sends are out of scope for the Phase 5b interface; any future broadcast use case requires a separate Postmark Broadcast Message Stream and a distinct L2 API. Constraint recorded in cross-cutting-design § 7a and the Postmark operator runbook; it derives from Postmark policy and is reinforced as a deliverability prerequisite in Postmark’s account-approval correspondence.
  • The Free Kanban Tool (Phase 3 deliverable) sends transactional email from freekanban.arda.ardamails.com. Phase 5b’s email module does not consume the Free Kanban Tool’s flow; the Free Kanban Tool has its own runtime resolution path. The Phase 5b module is for Application Runtime tenants.
  • Periodic drift detection asserts external resources match the declared state. Phase 5b adds module-side drift to the existing corporate-drift (Corporate) + runtime-platform-drift (Phase 4) workflows: e.g., asserting that every persisted email_configuration row’s keyVersionId is reachable in the operations component pod’s key registry.
  • Aggregate operational management uses the Postmark Console (no Arda-side OAM surface for the email service).
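The module-side drift assertion mentioned above (every persisted keyVersionId must be reachable in the pod's key registry) could be sketched roughly as follows. The row and finding types are hypothetical stand-ins; this document does not show the real email_configuration schema or the registry interface, so the registry is modelled here as a plain set of version IDs.

```kotlin
// Hypothetical shapes: the real row and key-registry types are not shown in this document.
data class EmailConfigurationRow(val tenantSlug: String, val keyVersionId: String)

data class DriftFinding(val tenantSlug: String, val keyVersionId: String)

/** Returns one finding per row whose keyVersionId is absent from the pod's key registry. */
fun findUnreachableKeyVersions(
    rows: List<EmailConfigurationRow>,
    reachableKeyVersions: Set<String>,
): List<DriftFinding> =
    rows.filter { it.keyVersionId !in reachableKeyVersions }
        .map { DriftFinding(it.tenantSlug, it.keyVersionId) }
```

An empty result means no drift on this axis; a non-empty result would be reported by whichever drift workflow invokes the check.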

From ../architecture-overview.md:

  • The application layer has four sub-layers (L1 / L2 / L3 / L4 per DQ-201).
  • L1 — Protocol proxies: stateless wrappers around external HTTP / SDK APIs. One per external surface: postmarkAccountProxy, postmarkServerProxy, route53ZoneProxy. One credential strategy per proxy. runCatching-bodied methods returning Result<T>.
  • L2 — Capability composers: stateless. Choreograph L1 calls into capability operations (provision, decommission, verifyDns, sendOne). Map external errors to capability errors. No DB access.
  • L3 — Application services: stateful. Hold DB access via DataAuthority. Hold the encryption key (DQ-202 / DQ-203). Encrypt before persist; decrypt on demand. Spawn bounded DNS-verification polling rounds via per-pod activePolling map (DQ-207).
  • L4 — HTTP endpoints: REST entry points: email-configuration, email-job, postmark-events (webhook).
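To make the L1 / L2 split concrete, here is a minimal sketch of a runCatching-bodied L1 proxy and a stateless L2 composer that maps transport errors to a capability error. All type and method names ending in Sketch, the PostmarkServerCreator interface, and ProvisionError are assumptions made for illustration; they are not the module's actual API.

```kotlin
// Stand-in for the external HTTP / SDK surface (hypothetical; may throw on failure).
fun interface PostmarkServerCreator {
    fun createServer(name: String): String // returns the new server's token
}

// L1: stateless protocol proxy; each method is runCatching-bodied and returns Result<T>.
class PostmarkAccountProxySketch(private val client: PostmarkServerCreator) {
    fun createServer(name: String): Result<String> = runCatching { client.createServer(name) }
}

// Capability-level error, decoupled from transport-level exceptions.
sealed class ProvisionError(message: String) : Exception(message) {
    class Upstream(cause: Throwable) : ProvisionError("Postmark call failed: ${cause.message}")
}

// L2: stateless composer; maps external errors to capability errors and has no DB access.
class EmailProvisionerSketch(private val account: PostmarkAccountProxySketch) {
    fun provision(tenantSlug: String): Result<String> =
        account.createServer("tenant-$tenantSlug").fold(
            onSuccess = { Result.success(it) },
            onFailure = { Result.failure(ProvisionError.Upstream(it)) },
        )
}
```

In this shape an L3 service would call provision, encrypt the returned token, and persist it; neither class above holds state or touches the database.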
| Decision | Subject | Phase 5b-relevant constraint |
| --- | --- | --- |
| DQ-001 | Tenant sending-domain shape {tenant}.{partition}.{mail-root-domain} uniformly | The Email module must produce / consume FQDNs in this shape; no special-casing for prod vs non-prod |
| DQ-002 | Multi-config strategy: {conf-slug}.{tenant}.{partition}.{mail-root-domain} sub-subdomain (v2+) | v1 provisions at the tenant level only; schema includes nullable config_slug for future extension |
| DQ-003 | Tenant slug source from provisioning request | The provisioning API contract must accept tenantEId / tenantName / tenantSlug |
| DQ-004 | Reply-To not user-editable | Send dialog and BFF route honor this; system-resolved from procurement contact or user email |
| DQ-005 | Email order send paths: copy-paste + system both coexist | Send service supports both paths |
| DQ-006 | CS alerting via ESP OOTB only in v1 | No Arda-built alerting in Phase 5b |
| DQ-007 | Document generation owned by calling feature | Email module accepts Blob/URL; does not generate documents |
| DQ-008 | Single-step send dialog | UI-level (not Phase 5b directly) |
| DQ-009 | Mail root domain ardamails.com | All FQDN composition uses this constant |
| DQ-010 | Per-partition zone placement | Partition zone is the mail sub-zone; tenant records land within it |
| DQ-011 | Bearer-token webhook auth via Postmark modern Webhooks API | postmark-events route validates ARDA_API_KEY in Authorization: Bearer … |
| DQ-012 | Per-tenant server token encrypted in DB (application-level) | EmailConfigurationService encrypts before persist; decrypts on demand. See DQ-R1-019 for the encryption-key design. |
| DQ-013 | IAM role stays in RootDnsStack | Not extracted; informs Phase 4’s IAM-role design but not Phase 5b directly |
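As an illustration of DQ-001, DQ-002, and DQ-009 together, the sketch below composes a sending-domain FQDN from its labels. The function name and the label-validation rule (lowercase letters, digits, and hyphens, 1 to 63 characters) are assumptions; the source only fixes the label order and the root domain.

```kotlin
// DQ-009: single constant used for all FQDN composition.
val MAIL_ROOT_DOMAIN = "ardamails.com"

// Assumed label rule: lowercase letters, digits, inner hyphens, 1..63 characters.
private val dnsLabel = Regex("[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")

/**
 * Composes {tenant}.{partition}.{mail-root-domain} (DQ-001), optionally prefixed with
 * {conf-slug} (DQ-002, v2+). The same shape applies to every partition, prod included.
 */
fun sendingDomain(tenantSlug: String, partition: String, configSlug: String? = null): String {
    val labels = listOfNotNull(configSlug, tenantSlug, partition)
    labels.forEach { require(dnsLabel.matches(it)) { "invalid DNS label: $it" } }
    return (labels + MAIL_ROOT_DOMAIN).joinToString(".")
}
```

For example, sendingDomain("acme", "prod") yields acme.prod.ardamails.com, and passing a config slug prepends one more label without changing the rest of the shape.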

The DQ-2xx decisions below pin the module’s internal structure. Phase 5b must implement against them:

| Decision | Subject | Phase 5b implementation note |
| --- | --- | --- |
| DQ-201 | L1/L2/L3/L4 sub-layers | Module package structure matches this layering; no upward dependencies |
| DQ-202 | AES-256-GCM versioned envelope | Updated by DQ-R1-019 to a two-axis envelope a{N}.k{SM-VERSION-ID} |
| DQ-203 | HKDF derivation, info = "arda.email.serverToken.a{N}" | Phase 5b implements EnvelopeAlgorithm.deriveKey per a{N} |
| DQ-204 | STS role chain (15-min session) for outbound AWS | route53ZoneProxy (L1) uses STS-chained credentials |
| DQ-205 | Persist-first lifecycle | email_configuration rows are written before external provisioning calls |
| DQ-206 | Encryption key held by L3 service only | TokenCipher injected at L3; never reaches L2 or L1 |
| DQ-207 | Per-pod bounded DNS-verification polling | L3 holds the activePolling map; bounded retries |
| DQ-208 | Async-tx boundaries owned by L3 | L1 / L2 are stateless; L3 owns the transaction boundary |
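DQ-202 and DQ-203 can be illustrated with a small JDK-only sketch: HKDF-SHA256 (RFC 5869) derives a 32-byte key using the info string from DQ-203, and AES-256-GCM encrypts the server token with it. Several details are assumptions rather than the recorded design: the all-zero HKDF salt, SHA-256 as the HKDF hash, the 12-byte random nonce prepended to the ciphertext, and the function names. The authoritative design is the one referenced by DQ-R1-019.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.Mac
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

private fun hmacSha256(key: ByteArray, data: ByteArray): ByteArray =
    Mac.getInstance("HmacSHA256").apply { init(SecretKeySpec(key, "HmacSHA256")) }.doFinal(data)

/** HKDF-SHA256 with a single output block, which is exactly one 32-byte AES-256 key. */
fun deriveKey(masterKey: ByteArray, algorithmVersion: Int): ByteArray {
    val info = "arda.email.serverToken.a$algorithmVersion".toByteArray(Charsets.UTF_8)
    val prk = hmacSha256(ByteArray(32), masterKey) // extract step; all-zero salt is an assumption
    return hmacSha256(prk, info + byteArrayOf(1))  // expand step, T(1)
}

/** Encrypts with AES-256-GCM; output layout (assumed) is nonce || ciphertext || tag. */
fun encryptToken(key: ByteArray, plaintext: String): ByteArray {
    val nonce = ByteArray(12).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(key, "AES"), GCMParameterSpec(128, nonce))
    return nonce + cipher.doFinal(plaintext.toByteArray(Charsets.UTF_8))
}

/** Reverses encryptToken; throws if the key is wrong or the data was tampered with. */
fun decryptToken(key: ByteArray, blob: ByteArray): String {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(key, "AES"), GCMParameterSpec(128, blob, 0, 12))
    return String(cipher.doFinal(blob, 12, blob.size - 12), Charsets.UTF_8)
}
```

Because the info string carries a{N}, each algorithm version derives a different key from the same master secret, which is what lets the envelope version select the derivation.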

Round R1-Phase1 decisions (DQ-R1-001..007)

| Decision | Subject | Phase 5b note |
| --- | --- | --- |
| DQ-R1-001 | external-resources-drift.yml filename | Phase 5b’s module-side drift extends the same naming convention if it ships drift checks |
| DQ-R1-002 | tools/drift-check.ts location | Same convention applies if Phase 5b ships a module-side drift driver |
| DQ-R1-003..005 | Operator runbook conventions | Phase 5b’s runbook follows the same Markdown sign-off format |
| DQ-R1-006 | Cross-zone NS-delegation locus | Phase 5b’s tenant DNS provisioning writes upstream into the partition zone via the existing WriteNSRecordsToUpstreamDns pattern |
| DQ-R1-007 | Arda-CorporateOAM vault separation for Free Kanban Tool token | Phase 5b’s per-tenant token storage is in DB (DQ-012), not 1Password; DQ-R1-007 is informational only |

| Decision | Subject | Phase 5b note |
| --- | --- | --- |
| DQ-R1-008 | cdk import for the existing ardamails.com zone | Already adopted; informational |

Round R1-Phase3 decisions (DQ-R1-009..016)

| Decision | Subject | Phase 5b note |
| --- | --- | --- |
| DQ-R1-009 | Postmark domain verification at the parent (Corporate scope) | Phase 5b’s per-tenant verification follows the per-tenant pattern (each tenant is its own Sender Signature); the parent-verification pattern applies to Phase 4 partition Signatures, not to per-tenant ones |
| DQ-R1-010 | NS-delegation write through WriteNSRecordsToUpstreamDns even when same-account | Phase 5b’s per-tenant DNS writes use the same pattern |
| DQ-R1-011 | dns-zone.ts construct rename | Phase 5b can consume the renamed construct |
| DQ-R1-012 | corporate-drift.yml scope | Phase 5b does not modify the Corporate drift; Phase 5b’s module-side drift is separate (see Phase 4 runtime-platform-drift) |
| DQ-R1-013 | Phase A failure ordering for Postmark server token | Phase 5b’s per-tenant server creation follows the same in-memory buffer + retry pattern adapted to L1/L2/L3 |
| DQ-R1-014 | cdk.context.json commit policy | Phase 5b does not write to cdk.context.json; CDK-side concern only |
| DQ-R1-015 | DMARC mailbox dmarc-reports@arda.cards | DMARC reports for the partition mail tree route here; informational |
| DQ-R1-016 | Reserved-name registry at arda.ardamails.com | The per-tenant slug-validation logic in Phase 5b must reject reserved names |
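The slug-validation requirement from DQ-R1-016 might look like the sketch below. The registry contents are not listed in this document, so the reserved set is passed in as a parameter; the type names and the normalisation step (trim plus lowercase before comparison) are assumptions.

```kotlin
sealed interface SlugDecision {
    object Accepted : SlugDecision
    data class Rejected(val reason: String) : SlugDecision
}

/** Rejects empty slugs and any slug that collides with the reserved-name registry. */
fun checkTenantSlug(rawSlug: String, reservedNames: Set<String>): SlugDecision {
    // Normalise first so that "WWW" and " www " cannot bypass the registry check.
    val slug = rawSlug.trim().lowercase()
    return when {
        slug.isEmpty() -> SlugDecision.Rejected("empty slug")
        slug in reservedNames -> SlugDecision.Rejected("'$slug' is a reserved name")
        else -> SlugDecision.Accepted
    }
}
```

Returning a decision value instead of throwing keeps the check usable both in the provisioning API's request validation and in any operator tooling that wants to report the reason.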

Round R1-Phase4 decisions (DQ-R1-017..022) and the Phase 5b open question (DQ-R1-023)

| Decision | Subject | Phase 5b implementation note |
| --- | --- | --- |
| DQ-R1-017 | One Postmark Sender Signature per partition; leaves inherit DKIM | Phase 5b’s per-tenant sub-domains inherit DKIM via the partition’s signing key by default. Whether per-tenant Signatures are introduced is the separate open question DQ-R1-023 (below). |
| DQ-R1-018 | Parallel runtime-platform-drift workflow | Phase 5b’s module-side drift extends runtime-platform-drift, sharing the reusable scripts established in Phase 4 |
| DQ-R1-019 | Per-partition email server-token encryption key (two-axis envelope, hot-swap, lazy + coroutine migration, SDK fallback) | Major Phase 5b deliverable. TokenCipher implements the dispatch model; EmailConfigurationService (L3) drives the lazy migration coroutine; Helm chart declares two ExternalSecret resources for AWSCURRENT and AWSPREVIOUS (both referencing the SM ARN from ${publishingPrefix}-API-EmailEncryptionKeyArn); the operations pod STS-chains into EmailEncryptionKeyFallbackRole (not the pod’s own IRSA role directly) for SDK cache-miss fetches. Full design in ../4-runtime-platform-updates/design/email-server-key-encryption.md. |
| DQ-R1-020 | DNS-provisioning + SM-fallback IAM roles | Resolved. Two STS-chained per-purpose roles per partition; DNS-records role via reuse of the generalized AllowCreatingNSRecordsRole construct (Root no-drift guarded). Phase 5b consumes the role ARNs via the ${publishingPrefix}-API-EmailDnsProvisioningRoleArn and ${publishingPrefix}-API-EmailEncryptionKeyFallbackRoleArn CFN exports, threaded into Helm values by amm.sh (Phase 4 specifies the exact wiring). |
| DQ-R1-021 | Partition rollout order dev → stage → demo → prod; kyle excluded | Phase 5b deploys to one partition at a time in this order |
| DQ-R1-022 | Operator CLI integrated into amm.sh, no standalone partition-mail-cli | Phase 5b’s tenant-provisioning admin tooling extends amm.sh (or its callees) rather than adding a sibling CLI. Reusable utilities are shared with corporate-cli (extracted by Phase 4). |
| DQ-R1-023 | Per-tenant Postmark Sender Signature introduction | Open; to be confirmed at Phase 5b planning. Four options (α status quo / β per-tenant from v1 / γ hybrid opt-in / δ remediation-only). If Phase 5b adopts β / γ / δ, the runtime tenant-onboarding flow registers per-tenant Sender Signatures via the Postmark Account API and writes per-tenant DKIM TXT + Return-Path CNAME records into the partition mail sub-zone using the EmailDnsProvisioningRole (provisioned by Phase 4 specifically for this purpose). If α, the role is unused in v1 but available for later introduction with no re-work. Decision should be informed by pilot-phase tenant volume + bounce/spam data, Postmark guidance at the time, and any tenant contractual reputation-isolation requirements. Full text in ../decision-log.md#dq-r1-023. |
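The two-axis envelope tag a{N}.k{SM-VERSION-ID} from DQ-R1-019 lends itself to a small parse-and-dispatch helper. The sketch below is a guess at one possible shape: the EnvelopeHeader type, the character set allowed in the version ID, and the migration rule (re-encrypt whenever either axis differs from the current one) are assumptions, not the design recorded in email-server-key-encryption.md.

```kotlin
/** Envelope tag with two axes: algorithm version a{N} and Secrets Manager key version k{ID}. */
data class EnvelopeHeader(val algorithmVersion: Int, val keyVersionId: String) {
    override fun toString(): String = "a$algorithmVersion.k$keyVersionId"

    companion object {
        // Assumed ID alphabet: letters, digits, and hyphens (UUID-like version IDs).
        private val pattern = Regex("a(\\d+)\\.k([A-Za-z0-9-]+)")

        /** Returns null when the text is not a well-formed a{N}.k{ID} tag. */
        fun parse(text: String): EnvelopeHeader? =
            pattern.matchEntire(text)?.destructured?.let { (n, id) ->
                n.toIntOrNull()?.let { EnvelopeHeader(it, id) }
            }
    }
}

/** Looks up key material among the versions currently mounted in the pod (e.g. current + previous). */
fun resolveKey(header: EnvelopeHeader, mountedKeys: Map<String, ByteArray>): ByteArray? =
    mountedKeys[header.keyVersionId]

/** Assumed lazy-migration rule: a row is stale when either axis lags the current values. */
fun needsMigration(header: EnvelopeHeader, currentKeyVersionId: String, currentAlgorithm: Int): Boolean =
    header.keyVersionId != currentKeyVersionId || header.algorithmVersion != currentAlgorithm
```

A null from resolveKey corresponds to the cache-miss case in which the pod would fall back to fetching the version through the SDK.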

Phase 5b deploys on top of Phase 4’s per-partition surface. Each item below is a Phase 4 output Phase 5b reads / mounts / invokes:

| Phase 4 deliverable | Phase 5b consumption |
| --- | --- |
| Per-partition mail sub-zone ({partition}.ardamails.com) | Per-tenant sub-domain records are created within this zone by the L1 route53ZoneProxy |
| WriteNSRecordsToUpstreamDns construct | Per-tenant DNS writes use this pattern (DQ-R1-010) |
| Per-partition Postmark account-token SM secret ({fqn}-I-EmailPostmarkAccountToken) | Read via the existing extras.email.postmarkAccountToken HOCON path |
| Per-partition encryption-key SM secret ({fqn}-I-EmailEncryptionKey) | Read via two ExternalSecret mounts (AWSCURRENT + AWSPREVIOUS); fed into TokenCipher |
| Per-partition DNS-provisioning IAM role | Pod IRSA assumes this role for tenant DNS writes |
| Partition-aware Postmark credential accessor in platform/postmark-service.ts | Deploy-time tooling resolves the credential reference per partition |
| Per-partition Sender Signature at {partition}.ardamails.com | Per-tenant leaves inherit DKIM from this Signature |

From ../cross-cutting-design.md:

  • Threat model: Phase 5b owns DB-exposure defense via DQ-012 + DQ-R1-019 (application-level encryption). Pod / process compromise is platform-level (not Phase 5b’s scope).
  • Webhook auth: postmark-events route validates Authorization: Bearer ARDA_API_KEY in-component (no API-Gateway authorizer).
  • Outbound auth:
    • Postmark Account API uses X-Postmark-Account-Token from per-partition SM secret via ESO.
    • Postmark Server API (per-tenant) uses the per-tenant decrypted token, passed in-memory through L2 / L1; never logged.
    • Route53 uses STS-chained credentials (15-min sessions) per DQ-204.
  • Secret-handling rules apply to Phase 5b code:
    • Plaintext server tokens never logged.
    • Encryption key never logged.
    • Field-by-field log helpers exclude serverTokenEncrypted and any plaintext.
  • Drift detection: Phase 5b extends the partition-side runtime-platform-drift workflow with module-level assertions (e.g., every persisted keyVersionId is reachable; orphan Postmark servers vs DB rows).
  • operations is a component (the Kotlin/Ktor runtime deployable, single Helm release per partition).
  • ShopAccess/Email is a module (a functional element inside operations, communicating with other modules via Kotlin DI service interfaces).
  • The IRSA role, the pod’s identity, the Helm release, the ExternalSecrets — all are properties of the component.
  • The TokenCipher, EmailConfigurationService, EmailJobService, EmailSender, the L1 / L2 / L3 / L4 layering — all are properties of the module.

Phase 5b’s docs should preserve this distinction.
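On the module side, the in-component webhook check from the cross-cutting design (Authorization: Bearer ARDA_API_KEY on the postmark-events route) could be sketched as a pure function that the L4 route calls before processing a payload. The function name is hypothetical, and the constant-time comparison is a defensive choice assumed here, not something the source mandates.

```kotlin
import java.security.MessageDigest

/** Returns true only when the header is exactly "Bearer <expectedKey>" and the key is non-empty. */
fun isAuthorizedWebhook(authorizationHeader: String?, expectedKey: String): Boolean {
    val prefix = "Bearer "
    if (expectedKey.isEmpty()) return false // never authorize against an unset key
    if (authorizationHeader == null || !authorizationHeader.startsWith(prefix)) return false
    val presented = authorizationHeader.removePrefix(prefix).toByteArray(Charsets.UTF_8)
    // MessageDigest.isEqual avoids an early exit on the first mismatching byte.
    return MessageDigest.isEqual(presented, expectedKey.toByteArray(Charsets.UTF_8))
}
```

Keeping the check free of framework types makes it easy to unit-test; the route handler would respond with 401 when it returns false and, in line with the secret-handling rules, would not log the presented header.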

Other artifacts to consult during Phase 5b planning


Copyright: (c) Arda Systems 2025-2026, All rights reserved