Email Integration -- Architectural Scenarios
Functional-level sequence diagrams for the key use cases of the ShopAccess/Email module. Participants represent functional components as defined in functional.md.
Scenario 1: Provision Tenant Email Configuration
A client system (CS script or future admin UI) provisions a new tenant email configuration with a unique tenant slug and config slug. The L3 application service runs pre-flight checks inside a DB transaction before any external mutation, then orchestrates the L2 capability composer to create external resources in a specific order (Postmark first, Route53 second), then persists the captured IDs and triggers the bounded DNS-verification polling round.
Decisions reflected in this scenario:
- DQ-201: two-level server module (L1 protocol proxies + L2 capability composers).
- DQ-202 / DQ-203: AES-256-GCM with versioned envelope; key derived via HKDF.
- DQ-204: STS auto-chained at module startup; the L1 Route53 proxy makes calls without per-call AssumeRole.
- DQ-205: persist-first lifecycle (PROVISIONING entry state); pre-flight checks; structured Failure(PartialProgress) on partial failure.
- DQ-205.k: all Postmark mutations before any Route53 mutation.
- DQ-205.m: Route53 record writes use UPSERT.
- DQ-206: slug resolution.
Pre-conditions:
- Tenant exists in the system with a valid tenantEId.
- Postmark account token, encryption key, Route53 role ARN, and zone ID available via HOCON (delivered by ESO at startup).
- Partition Route53 zone exists (infrastructure prerequisite).
Post-conditions on success:
- Postmark server, sending domain, and webhook created (in that order).
- DKIM TXT, Return-Path CNAME, and DMARC TXT records UPSERTed in the partition Route53 zone.
- email_configuration row persisted with encrypted server token, all external IDs, status PENDING_VERIFICATION.
- Client receives the configuration immediately; DNS verification proceeds asynchronously via Scenario 1b.1.
Post-conditions on partial failure (any step in section “Run External Mutations” fails):
- Row persisted with status PROVISIONING_FAILED, partial external IDs captured, diagnostic message describing the failure point.
- Operator triages via DELETE (best-effort decommission, see Scenario 4 / DQ-205.d).
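The ordering and partial-failure rules above can be sketched as follows. This is a minimal illustration, not the module's implementation: `ProvisionOutcome`, `run_external_mutations`, and the step names are hypothetical, and each step is modeled as a plain callable that returns an external ID.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# (resource name, mutation returning an external ID); hypothetical shape.
Step = Tuple[str, Callable[[], str]]


@dataclass
class ProvisionOutcome:
    status: str  # "PENDING_VERIFICATION" or "PROVISIONING_FAILED"
    external_ids: Dict[str, str] = field(default_factory=dict)
    diagnostic: str = ""


def run_external_mutations(postmark_steps: List[Step],
                           route53_steps: List[Step]) -> ProvisionOutcome:
    captured: Dict[str, str] = {}
    # DQ-205.k: every Postmark mutation runs before the first Route53 mutation.
    for name, mutate in [*postmark_steps, *route53_steps]:
        try:
            captured[name] = mutate()
        except Exception as exc:
            # Stop at the first failure and keep the IDs captured so far.
            return ProvisionOutcome("PROVISIONING_FAILED", captured,
                                    f"failed at {name}: {exc}")
    return ProvisionOutcome("PENDING_VERIFICATION", captured)


calls: List[str] = []


def _ok(name: str) -> Callable[[], str]:
    def step() -> str:
        calls.append(name)
        return f"{name}-id"
    return step


def _fail() -> str:
    raise RuntimeError("webhook rejected")


# The webhook step fails, so no Route53 step is ever attempted.
outcome = run_external_mutations(
    postmark_steps=[("server", _ok("server")), ("domain", _ok("domain")),
                    ("webhook", _fail)],
    route53_steps=[("dkim", _ok("dkim"))],
)
print(outcome.status, sorted(outcome.external_ids))
```

The captured IDs on the failed outcome are what a later DELETE would use for best-effort cleanup.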
Scenario 1b: Async DNS Verification (trigger-driven)
DNS verification is trigger-driven rather than continuously polled. Three triggers feed a single shared primitive inside EmailConfigurationService: a bounded polling round of up to 5 attempts × 60 seconds. Verification success transitions the row to UNLOCKED; round exhaustion leaves it in PENDING_VERIFICATION until the next trigger fires. There is no automatic transition to VERIFICATION_FAILED in v1. See DQ-207.
The bounded polling round itself is identical across triggers; the three sub-scenarios below differ only in how the round is initiated. Scenario 1b.1 shows the full round; Scenarios 1b.2 and 1b.3 show only the trigger and reference 1b.1 for the polling block.
Scenario 1b.1: DNS Verification triggered by provisioning success
Provisioning’s tail (Scenario 1, “Trigger bounded DNS verification”) kicks off a fire-and-forget bounded polling round on the pod that handled the provision request.
Pre-conditions:
- Row was just transitioned to PENDING_VERIFICATION (from PROVISIONING).
- verification_started_at set to now().
- postmarkDomainId populated.
Post-conditions on success:
- Row transitions to UNLOCKED; idempotent UPDATE guarded by WHERE status = 'PENDING_VERIFICATION'.
Post-conditions on round exhaustion:
- Row stays in PENDING_VERIFICATION. Pod-local activePolling entry removed. Recovery awaits the next trigger (Scenario 1b.2 or 1b.3).
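A bounded polling round with pod-local deduplication might look like the sketch below. It is illustrative only: `VerificationPoller`, `check_dns`, and `mark_unlocked` are hypothetical names, the round runs synchronously here (the scenario describes it as fire-and-forget), and sleeping between attempts rather than after the last one is an assumption about how "5 attempts × 60 seconds" is applied.

```python
import threading
import time
from typing import Callable, List, Set

MAX_ATTEMPTS = 5
INTERVAL_SECONDS = 60.0


class VerificationPoller:
    def __init__(self, check_dns: Callable[[str], bool],
                 mark_unlocked: Callable[[str], None],
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._check_dns = check_dns          # hypothetical: asks Postmark if DNS verified
        self._mark_unlocked = mark_unlocked  # hypothetical: guarded UPDATE to UNLOCKED
        self._sleep = sleep
        self._active: Set[str] = set()       # stands in for the pod-local activePolling
        self._guard = threading.Lock()

    def run_round(self, config_id: str) -> str:
        with self._guard:
            if config_id in self._active:
                return "DEDUPLICATED"        # a round is already running on this pod
            self._active.add(config_id)
        try:
            for attempt in range(1, MAX_ATTEMPTS + 1):
                if self._check_dns(config_id):
                    self._mark_unlocked(config_id)
                    return "UNLOCKED"
                if attempt < MAX_ATTEMPTS:
                    self._sleep(INTERVAL_SECONDS)
            return "EXHAUSTED"               # row stays in PENDING_VERIFICATION
        finally:
            with self._guard:
                self._active.discard(config_id)


# Demo with a fake clock: DNS verifies on the third attempt.
sleeps: List[float] = []
unlocked: List[str] = []
answers = iter([False, False, True])
poller = VerificationPoller(lambda _id: next(answers), unlocked.append, sleeps.append)
result = poller.run_round("cfg-1")
print(result, sleeps)
```

Injecting `sleep` keeps the round testable without waiting for real time to pass.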
Scenario 1b.2: DNS Verification triggered by manual /retry-verification
CS or an operator hits the retry endpoint to kick off a fresh bounded round, typically in response to an operator-alert page or to recover from a previous round’s exhaustion.
Pre-conditions:
- Row in PENDING_VERIFICATION or VERIFICATION_FAILED state.
Post-conditions:
- verification_started_at refreshed; if from VERIFICATION_FAILED, status transitions to PENDING_VERIFICATION.
- A bounded polling round is kicked off (deduplicated via activePolling).
- Endpoint returns 200; further state transitions happen asynchronously per Scenario 1b.1.
Scenario 1b.3: DNS Verification triggered by send-time precondition fail
A send attempt against a PENDING_VERIFICATION row fails fast (the send is not delayed by a synchronous verify), but kicks off a fresh bounded polling round as a fire-and-forget side effect so the next send attempt is more likely to succeed. See DQ-207.b.
Pre-conditions:
- An EmailJob create / send call landed.
- Looked-up EmailConfiguration is in PENDING_VERIFICATION (not UNLOCKED).
Post-conditions:
- The current send attempt fails fast with PreconditionFailed.
- A bounded polling round is kicked off (deduplicated via activePolling).
- EmailJob row transitions to FAILED with diagnostic per Scenario 2.
Scenario 2: Send Email
A client system submits an email job with addressing, subject, body, and optional attachments. EmailJobService (L3) resolves the tenant’s email configuration via EmailConfigurationService.getUnlockedConfiguration() and hands the decrypted server token to EmailSender (L2), which calls postmarkServerProxy.sendEmail (L1) and reports back. If the configuration is in PENDING_VERIFICATION, the precondition check fails fast and also fires off a bounded DNS-verification polling round (Scenario 1b.3) so the next send attempt is more likely to succeed.
Decisions reflected in this scenario:
- DQ-201: two-level server module; sending uses EmailSender (L2) and postmarkServerProxy (L1).
- DQ-202 / DQ-203: server token decryption.
- DQ-207.b: send-time precondition-fail kicks off Scenario 1b.3.
Pre-conditions:
- Tenant has a provisioned EmailConfiguration (any status; behavior branches per status below).
- Client provides: To, Cc (optional), Reply-To, Subject, Body (HTML), attachments (optional, as Blob or URL).
- Client provides: emailConfigurationId, a UUID resolved at runtime through EmailConfigurationService’s interface, not via DB FK (cross-Universe; see information-model.md § 7.1).
Post-conditions:
- On UNLOCKED config: EmailJob persisted as NEW, then transitioned to QUEUED on Postmark acceptance. MessageID stored for webhook correlation. Client receives the record.
- On non-UNLOCKED config: EmailJob persisted as FAILED with diagnostic. If config is PENDING_VERIFICATION, a bounded polling round is kicked off (Scenario 1b.3).
Transaction boundaries. Each EmailJob write (persistJob(NEW), the final transition to QUEUED / FAILED) is its own transaction in EmailJobUniverse. The intervening getUnlockedConfiguration call is a separate transaction in EmailConfigurationUniverse — the two services do not share a transaction. External HTTP calls to Postmark sit between transactions. See functional-design.md § 5 for the binding service-isolation rule.
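The status branching described in this scenario can be sketched as a small function. This is an illustration under assumptions, not the service code: `send_email`, `send_via_postmark`, and `trigger_verification` are hypothetical names, persistence is reduced to mutating an in-memory `EmailJob`, and treating a Postmark call that raises as a FAILED job is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class EmailJob:
    status: str
    message_id: Optional[str] = None
    diagnostic: str = ""


def send_email(config_status: str,
               send_via_postmark: Callable[[], str],
               trigger_verification: Callable[[], None]) -> EmailJob:
    job = EmailJob("NEW")  # first EmailJob write (its own transaction)
    if config_status != "UNLOCKED":
        if config_status == "PENDING_VERIFICATION":
            trigger_verification()  # fire-and-forget kick-off (Scenario 1b.3)
        job.status = "FAILED"
        job.diagnostic = f"PreconditionFailed: config status={config_status}"
        return job
    try:
        # External HTTP call; sits between the two EmailJob transactions.
        job.message_id = send_via_postmark()
        job.status = "QUEUED"
    except Exception as exc:
        job.status = "FAILED"
        job.diagnostic = f"send failed: {exc}"
    return job


triggers: List[str] = []
queued = send_email("UNLOCKED", lambda: "msg-123", lambda: triggers.append("x"))
pending = send_email("PENDING_VERIFICATION", lambda: "unused", lambda: triggers.append("kick"))
locked = send_email("LOCKED", lambda: "unused", lambda: triggers.append("never"))
print(queued.status, pending.status, locked.status, triggers)
```

Only the PENDING_VERIFICATION branch triggers a polling round; other non-UNLOCKED statuses fail without side effects.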
Scenario 3: Receive Message Event via Webhook
Postmark sends a delivery status event (Delivery, Bounce, or SpamComplaint) to the Arda webhook endpoint. The endpoint authenticates the request, correlates the event to an existing EmailJob via MessageID, and updates the job’s status.
Pre-conditions:
- Webhook configured on the Postmark server with Bearer token authentication (see DQ-011)
- An EmailJob exists in QUEUED or SENT status with a matching MessageID
Post-conditions:
- EmailJob status updated to SENT, DELIVERED, BOUNCED, or COMPLAINED based on the event type
- Diagnostic information stored for adverse events (bounce reason, complaint type)
Event Type Mapping
| Postmark RecordType | Source Status | Target Status | Diagnostic Fields |
|---|---|---|---|
| Delivery | QUEUED / SENT | DELIVERED | DeliveredAt, Recipient |
| Bounce | QUEUED / SENT | BOUNCED | Type (HardBounce, SoftBounce, …), Description, BouncedAt |
| SpamComplaint | DELIVERED | COMPLAINED | Type, Recipient |
Note: Postmark may send a Delivery event before the internal status has transitioned from QUEUED to SENT. The service handles this by accepting valid forward transitions regardless of intermediate states (QUEUED to DELIVERED is valid if the SENT event was missed or arrived out of order).
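The mapping table translates directly into a lookup keyed by RecordType. The sketch below is hypothetical (`TRANSITIONS` and `apply_event` are illustrative names), and returning `None` for an event that does not match an allowed source status is an assumption; the scenario does not say how such events are handled.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# RecordType -> (allowed source statuses, target status), taken from the table.
TRANSITIONS: Dict[str, Tuple[FrozenSet[str], str]] = {
    "Delivery": (frozenset({"QUEUED", "SENT"}), "DELIVERED"),
    "Bounce": (frozenset({"QUEUED", "SENT"}), "BOUNCED"),
    "SpamComplaint": (frozenset({"DELIVERED"}), "COMPLAINED"),
}


def apply_event(current_status: str, record_type: str) -> Optional[str]:
    """Return the new job status, or None if the event does not apply."""
    rule = TRANSITIONS.get(record_type)
    if rule is None:
        return None
    sources, target = rule
    return target if current_status in sources else None


# QUEUED -> DELIVERED is accepted even though SENT was never observed.
print(apply_event("QUEUED", "Delivery"))
print(apply_event("QUEUED", "SpamComplaint"))
```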
Scenario 4: Tenant Decommission
DELETE /email-configuration/<configId> removes a tenant configuration. The L3 service runs best-effort decommission of the external resources via the L2 capability composer, then deletes the DB row unconditionally, and returns an aggregated success/failure result. The deletion order at L2 is the inverse of provisioning’s mutation order: Route53 records first, then Postmark resources. See DQ-205.d and DQ-205.k.
Decisions reflected in this scenario:
- DQ-205.d: best-effort decommission; row deleted unconditionally; aggregated result returned.
- DQ-205.k: Route53 deletes precede Postmark deletes (inverse of provisioning’s order to avoid leaving DNS records pointing at a deleted Postmark domain).
- DQ-201.d: structured DecommissionResult carrying per-resource success/failure.
Pre-conditions:
- Row exists in any state except PROVISIONING (in-flight provisioning is not directly DELETE-able; operator must wait or manually triage stuck rows per DQ-205.f).
- Captured external IDs may be partial (especially for PROVISIONING_FAILED rows).
Post-conditions:
- DB row deleted.
- Best-effort attempts made to delete each external resource we have an ID for.
- Caller receives aggregated DecommissionResult listing which deletions succeeded and which failed (so any leftovers can be cleaned up manually).
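Best-effort decommission with an aggregated result could be sketched as below. The field layout of `DecommissionResult`, the resource names, and the `deleters` map are assumptions for illustration; the source fixes only that Route53 deletes precede Postmark deletes, so the order within each group here (the exact reverse of provisioning) is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Route53 records first, then Postmark resources.
DELETE_ORDER = ["dmarc_record", "return_path_record", "dkim_record",
                "webhook", "domain", "server"]


@dataclass
class DecommissionResult:
    succeeded: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)  # resource -> error text


def decommission(external_ids: Dict[str, Optional[str]],
                 deleters: Dict[str, Callable[[str], None]],
                 delete_row: Callable[[], None]) -> DecommissionResult:
    result = DecommissionResult()
    for name in DELETE_ORDER:
        resource_id = external_ids.get(name)
        if not resource_id:
            continue  # never created (e.g. a PROVISIONING_FAILED row)
        try:
            deleters[name](resource_id)
            result.succeeded.append(name)
        except Exception as exc:
            result.failed[name] = str(exc)  # record and keep going
    delete_row()  # the DB row is deleted unconditionally
    return result


order: List[str] = []
row_deleted: List[bool] = []


def _delete_domain(_id: str) -> None:
    order.append("domain")
    raise RuntimeError("404 from Postmark")


result = decommission(
    external_ids={"server": "s-1", "domain": "d-1"},  # partial IDs
    deleters={"domain": _delete_domain, "server": lambda _id: order.append("server")},
    delete_row=lambda: row_deleted.append(True),
)
print(result.succeeded, result.failed)
```

A failure on one resource does not stop the remaining deletions, which is what lets the caller see every leftover in a single response.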
Scenario 5: EmailJob lifecycle edges — Cancel and Resend (narrative)
Send (Scenario 2) is the happy-path. The two non-trivial lifecycle edges are below. They are documented in narrative form because the architectural shape is the same as Scenario 2 with different state checks; downstream design (BFF, SPA UX, integration tests) can reference this as the source of truth for behavior.
5.a — Cancel an EmailJob
Endpoint: PUT /v1/shop-access/email/email-job/<jobId>/cancel.
Allowed only when the job is in status NEW. NEW means the job has been persisted but the L1 send call has not yet been issued. Once the job is QUEUED or beyond, Postmark already owns the message and we cannot recall it.
Behavior:
- EmailJobService.cancelJob(jobId) reads the job row.
- If status NEW: UPDATE status='CANCELLED' with idempotency guard WHERE job_id = ? AND status = 'NEW'. Returns the updated EmailJob.
- Any other status: returns Result.failure(PreconditionFailed("status=<status>")) → HTTP 409.
No external systems are touched. No L2 / L1 calls. The cancellation is a pure DB transition.
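The guarded UPDATE is what makes the cancel transition safe under concurrent requests: the row changes only if it is still NEW at write time. A minimal sketch using SQLite follows; the `email_job` table name and its two-column schema are assumptions made for the demo, and the real store is not specified here.

```python
import sqlite3


def cancel_job(conn: sqlite3.Connection, job_id: str) -> bool:
    """Return True if the job was cancelled, False if it was not in NEW."""
    cur = conn.execute(
        "UPDATE email_job SET status = 'CANCELLED' "
        "WHERE job_id = ? AND status = 'NEW'",
        (job_id,),
    )
    conn.commit()
    # rowcount is 1 only when the guard matched; 0 means the status had moved on.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_job (job_id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany("INSERT INTO email_job VALUES (?, ?)",
                 [("job-1", "NEW"), ("job-2", "QUEUED")])

first = cancel_job(conn, "job-1")   # NEW -> CANCELLED
second = cancel_job(conn, "job-1")  # already CANCELLED, guard does not match
third = cancel_job(conn, "job-2")   # QUEUED, Postmark already owns the message
print(first, second, third)
```

A `False` return corresponds to the PreconditionFailed / HTTP 409 branch.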
5.b — Resend a previously sent EmailJob
Endpoint: PUT /v1/shop-access/email/email-job/<jobId>/resend.
Allowed when the job is in status BOUNCED or FAILED. Creates a new EmailJob row referencing the original via originalJobId. The original row is left intact (audit trail).
Behavior:
- EmailJobService.resendJob(jobId, overrides?) reads the original job.
- If original status is BOUNCED or FAILED:
  - Construct a new job spec from the original’s content + caller-supplied overrides (typically to / cc).
  - Run the same flow as Scenario 2 (Create EmailJob → Resolve Configuration → Compose and Send) for the new job.
  - The new job has originalJobId set to the original’s id; its lifecycle proceeds independently.
- Otherwise: Result.failure(PreconditionFailed) → HTTP 409.
The configuration check at send-time is identical to Scenario 2: a non-UNLOCKED config will fail fast and (if PENDING_VERIFICATION) trigger Scenario 1b.3.
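Constructing the new job spec from the original plus overrides might look like this. The sketch is hypothetical: `JobSpec` and `build_resend` are illustrative names, the field set is reduced to what the scenario mentions, and generating the new id with `uuid4` is an assumption.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional, Tuple

RESENDABLE = {"BOUNCED", "FAILED"}


class PreconditionFailed(Exception):
    """Would map to HTTP 409 at the endpoint."""


@dataclass(frozen=True)
class JobSpec:
    job_id: str
    to: Tuple[str, ...]
    cc: Tuple[str, ...]
    subject: str
    body: str
    status: str
    original_job_id: Optional[str] = None


def build_resend(original: JobSpec,
                 to: Optional[Tuple[str, ...]] = None,
                 cc: Optional[Tuple[str, ...]] = None) -> JobSpec:
    if original.status not in RESENDABLE:
        raise PreconditionFailed(f"status={original.status}")
    # The original is frozen and left intact; replace() returns a copy.
    return replace(
        original,
        job_id=str(uuid.uuid4()),
        to=original.to if to is None else to,
        cc=original.cc if cc is None else cc,
        status="NEW",
        original_job_id=original.job_id,
    )


bounced = JobSpec("job-1", ("old@example.com",), (), "Invoice", "<p>Hi</p>", "BOUNCED")
resent = build_resend(bounced, to=("new@example.com",))
print(resent.original_job_id, resent.to, bounced.status)
```

The returned spec would then go through the Scenario 2 flow as an independent job.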
Scenario 6: EmailConfiguration admin operations — Lock and Unlock (narrative)
Endpoints:
- PUT /v1/shop-access/email/email-configuration/<configId>/lock
- PUT /v1/shop-access/email/email-configuration/<configId>/unlock
These are pure DB transitions used by CS / admin tooling to disable or re-enable email sending for a tenant configuration. No external systems are touched.
Behavior:
- lock(configId): allowed only from UNLOCKED. UPDATE status='LOCKED' with guard WHERE config_id = ? AND status = 'UNLOCKED'. Other statuses return 409.
- unlock(configId): allowed only from LOCKED. Symmetric.
Send-time interaction: EmailConfigurationService.getUnlockedConfiguration() checks status == UNLOCKED. A LOCKED configuration causes send-time precondition failure exactly like other non-UNLOCKED statuses (per Scenario 2’s else-branch). No DNS-verification kick-off happens for LOCKED (different from PENDING_VERIFICATION per Scenario 1b.3).
Race-condition note for downstream design: a lock that lands between L3’s status check and the actual Postmark send (i.e., during Scenario 2’s “Compose and Send” block) does not abort the in-flight send — the send proceeds with the already-decrypted token. At v1 concurrency this race window is small (sub-second) and the correctness impact is bounded (one email might be sent after a lock was issued). If this becomes a concern in v2, options include SELECT FOR UPDATE on the config row during send, or a generation-counter check at the L3 boundary. v1 accepts the small race as a known trade-off.
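To make the generation-counter option concrete, here is one possible shape for it. Everything in this sketch is an assumption about a v2 design that does not exist yet: the `generation` field, `ConfigSnapshot`, and `send_with_generation_check` are hypothetical. Note that a re-check narrows the race window rather than closing it, since a lock can still land between the second read and the send.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ConfigSnapshot:
    status: str
    generation: int  # hypothetical counter bumped on every lock/unlock


def send_with_generation_check(read_config: Callable[[], ConfigSnapshot],
                               prepare: Callable[[], None],
                               send: Callable[[], None]) -> str:
    before = read_config()
    if before.status != "UNLOCKED":
        return "PRECONDITION_FAILED"
    prepare()  # decrypt token, compose message
    after = read_config()
    if after.generation != before.generation:
        return "ABORTED"  # a lock/unlock landed while preparing
    send()
    return "SENT"


state = {"config": ConfigSnapshot("UNLOCKED", 7)}
sent: List[str] = []


def _lock_during_prepare() -> None:
    # Simulates an admin lock arriving mid-send.
    state["config"] = ConfigSnapshot("LOCKED", 8)


outcome_raced = send_with_generation_check(
    lambda: state["config"], _lock_during_prepare, lambda: sent.append("mail"))
print(outcome_raced, sent)
```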
Copyright: © Arda Systems 2025-2026, All rights reserved