Email Integration -- Exploration & Handoff
Status: On hold. Resumable via the name “Arda Email Option”.
Source: Exploration session, April 2026. This document consolidates the session for continuation via Claude Code CLI with access to Arda repositories.
Audience: Downstream agents (design, implementation, test, documentation) and human reviewers resuming the work.
Working mode assumption: Heavy agent-assisted development (Claude Code + claude.ai) on a small team.
Table of Contents
- Purpose & Scope of This Document
- Glossary
- Requirements, Prioritized
- Constraints and Non-Goals
- Architectures Explored
- Architecture 1: ESP Integration (Postmark) — “Arda Email Option”
- Architecture 2: Server-Side Tenant Mail Integration
- Architecture 3: Browser-Direct Tenant Mail Integration
- Architecture 4: Embedded iPaaS (Platform-Brokered Tenant Mail)
- Evaluation Criteria and Cross-Architecture Comparison
- Recommendation and Sequencing
- Domain Structure (Working Assumption C)
- Email Authentication Primer (DKIM/SPF/DMARC)
- Phase 1–4 Outline (Provisional, Needs Rewrite)
- Decision Log
- Open Questions by Owner and Urgency
- Dependencies on Other Arda Workstreams
- Related Arda Context and Pointers
- Evidence Sources and Currency
- Next Actions When Resuming
- Appendix: Agent-Assisted Development Implications
1. Purpose & Scope of This Document
This document is the resumption artifact for the Arda Email Option exploration. It consolidates the conversation that produced the held decision so that downstream work (architecture decision records, design documents, implementation plans, threat models) can continue with full context.
It is not itself an ADR or design document. It is the input to those artifacts, which should be produced in Arda’s standard templates under the documentation repository.
Scope of the exploration covered:
- Email sending from the Arda platform to business affiliates (primarily suppliers) on behalf of tenants
- Platform-originated email (system notifications, announcements)
- Eventual inbound reply capture (v2+)
- Multi-tenant isolation, deliverability, operational model, and tenant administration
Out of scope (excluded from the exploration):
- Bulk / marketing email capability
- Internal user-to-user email within Arda
- Chat / SMS / other channels
- Deep integration with any specific vertical’s regulatory regime
2. Glossary
| Term | Meaning |
|---|---|
| ESP | Email Service Provider (Postmark, SES, Mailgun, etc.) — transactional email infrastructure |
| Tenant | An Arda Cards customer organization (manufacturer) |
| Business Affiliate | A supplier, customer, or other external party a tenant communicates with (domain term from Arda’s information model) |
| Function | Arda’s naming axis for service type: io (API), app (UI), assets, and proposed mail |
| Infra | An AWS account, per Arda convention — roughly “where the zone/service lives” |
| Partition | Logical environment: dev, stage, prod, demo, or special-purpose |
| Connector | An adapter implementing Arda’s EmailService interface for a given provider (Postmark, Gmail API, Graph, SMTP) |
| Outbox pattern | Transactional outbox — domain writes + send intent persisted atomically; async dispatcher delivers |
| Postmark Server | Postmark’s isolation unit per tenant/environment — its own sending env, streams, suppressions, analytics |
| Sandbox server (Postmark) | A server in Postmark that accepts sends but drops to blackhole; counts toward volume |
| Sub-addressing | user+tag@domain convention for address-level routing tokens |
| DKIM/SPF/DMARC | See §13 Email Authentication Primer |
| Browser-direct | Architecture where the browser (not Arda BE) sends via Gmail API or Microsoft Graph using the logged-in user’s session |
| iPaaS / embedded iPaaS | Integration-Platform-as-a-Service, e.g., Nango, Merge, Paragon — brokers OAuth and normalizes provider APIs |
| Working assumption C | The chosen tenant FQDN shape (see §12) |
3. Requirements, Prioritized
V1 (must-have for first tenant email sent)
Functional
- Outbound transactional email from Arda to supplier/business-affiliate recipients
- First target flow: tenant sends a Purchase Order with PDF attachment to a supplier
- Platform-originated email to tenants for system operations and announcements
- Tenant-authored email to business affiliates, triggered interactively by authenticated users
- Attachments up to 10 MB
- Reply-To points to the user’s external mailbox (no inbound capture in v1)
Non-functional
- Tenant reputation isolation — one tenant’s sending behavior cannot materially affect another’s deliverability
- Tenant spoofing prevention — tenant A’s users cannot send as tenant B
- Per-tenant DKIM, SPF, DMARC posture
- Target scale: 100–150 tenants, tens to hundreds of emails/day/tenant (peak), single-recipient or single-plus-few-CC
- Centralized administration by Arda Customer Service
- CS operates via the ESP’s own console; no Arda-built admin UI in v1
- Tenant provisioning automated via Arda-side scripts invoked by CS (not a UI, but not manual)
- Transactional outbox pattern in Arda BE
- Dedicated Arda BE webhook endpoint (not Lambda) for provider events
- Sandbox/dev/stage/demo environments distinct from production
- Local development possible with zero provider dependency
Observability & operations
- Per-tenant bounce and complaint metrics
- DKIM/SPF/DMARC posture health checks per tenant
- Alerting on bounce rate > 5% or complaint rate > 0.1% per tenant per 24h
- CS daily runbook supportable by Postmark console + Arda dashboards
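The alerting thresholds above reduce to a rate check over a 24-hour window. A minimal sketch follows, written in Python for brevity (the Arda BE itself is Kotlin). `TenantStats` and `breached_alerts` are hypothetical names, and the counts are assumed to come from normalized provider events.

```python
from dataclasses import dataclass

BOUNCE_RATE_LIMIT = 0.05      # alert above 5% bounces per tenant per 24h
COMPLAINT_RATE_LIMIT = 0.001  # alert above 0.1% complaints per tenant per 24h


@dataclass(frozen=True)
class TenantStats:
    """Counts for one tenant over a 24-hour window (hypothetical shape)."""
    sent: int
    bounced: int
    complained: int


def breached_alerts(stats: TenantStats) -> list[str]:
    """Return the names of the thresholds this tenant exceeded in the window."""
    if stats.sent == 0:
        return []  # no traffic, so no rate to evaluate
    alerts = []
    if stats.bounced / stats.sent > BOUNCE_RATE_LIMIT:
        alerts.append("bounce_rate")
    if stats.complained / stats.sent > COMPLAINT_RATE_LIMIT:
        alerts.append("complaint_rate")
    return alerts
```

At tens of emails per day, a single bounce can exceed 5%, so a minimum-volume floor before alerting may be worth considering; that floor is not part of the stated requirement.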
V2 (fast follow)
- Platform-captured procurement inbox: inbound parse of replies to tenant-platform addresses, threaded back to originating entity (PO, card, etc.)
- Tenant-facing self-service administration UI (templates, suppression management, send history)
- Arda-side admin UI for CS (superset of tenant view)
- Richer template system: per-tenant overrides, versioning, authoring workflow
- Optional HubSpot engagement sync (sent / delivered / bounced as activities on Contact/Company)
- “Bring your own mail” — tenants can opt to send via their own Gmail / Microsoft 365 mailbox (browser-direct with fallback)
V2+ (on the horizon)
- EU data residency / multi-partition support
- Multi-provider resilience (active/passive ESP failover)
- Inbound parsing and procurement inbox for non-ESP connectors (Gmail, Graph)
- On-prem Exchange / generic SMTP connector support, if demand materializes
- BIMI (branded sender logos) once DMARC `p=reject` is established and stable
- Possible reimplementation to unify with or absorb functionality; 18-month reconsideration window accepted
4. Constraints and Non-Goals
Constraints (do not violate)
- Arda runs on AWS with strong CDK-based IaC discipline. Anything inconsistent with that is wrong.
- AWS account structure: root account + per-infra accounts. Email infrastructure must fit this model.
- No tenant-owned sending domains in v1. Tenants send via Arda-managed infrastructure.
- No bulk sending, ever — this is transactional only, and the architecture reinforces that.
- No tenant IT involvement in v1 onboarding. Tenants should not need to touch DNS, OAuth consent, or mail-provider admin consoles.
- Tenant isolation is a hard requirement, not a preference. Spoofing (tenant A sending as tenant B) must be prevented at the technical layer, not only at the application layer.
- No EU residency commitment in v1. Revisit in 12–18 months.
Non-goals (deliberately excluded)
- Marketing automation, drip campaigns, broadcast/newsletter sending
- Arda-hosted user mailboxes
- Replacing tenants’ existing business email systems
- Acting as a general-purpose email platform for non-transactional use cases
- Building deep email client rendering test matrices in v1 (accept Postmark’s built-in previews)
- Real-time delivery SLAs tighter than what the ESP provides
Deferred (not rejected; parked)
- HubSpot CRM integration — useful later, not load-bearing for v1
- Tenant self-administration UI — v2+
- Procurement inbox — v2+
- EU residency — revisit 2027+
5. Architectures Explored
Four architectures were explored in depth:
1. ESP Integration: Arda is the sender identity; provider (Postmark) handles MTA, reputation, deliverability. Per-tenant isolation via provider abstractions (Postmark “servers”) and DNS subdomains.
2. Server-side tenant mail integration: Arda acts as a proxy for tenants’ existing mail infrastructure (Gmail, Microsoft 365, Exchange, SMTP). Credentials held server-side in Arda’s vault.
3. Browser-direct tenant mail integration: Arda’s web app obtains user OAuth tokens in the browser; sends originate directly from the browser to the provider API (Gmail / Graph). Arda never holds long-lived send credentials.
4. Embedded iPaaS: Third-party platform (Nango, Merge, Paragon) brokers OAuth and normalizes Gmail / Graph APIs. Tactical engineering-cost reducer applicable to (2) or (3).
Also considered and rejected:
- `mailto:` link fallback: cannot attach the PDF PO; disqualifying
- Arda-hosted MTA (Postfix/similar): explicitly rejected per stated preference to not build from scratch
- HubSpot as the ESP: wrong category; costly and inferior for transactional email
6. Architecture 1: ESP Integration (Postmark) — “Arda Email Option”
Summary
Arda sends on behalf of tenants via Postmark. Each tenant has its own Postmark server (isolation unit) and a dedicated subdomain under Arda’s mail domain, with per-tenant DKIM, SPF, and DMARC published at the tenant subdomain. Reply-To points to the sending user’s external mailbox in v1.
Provider candidates considered
| Provider | Fit | Decision |
|---|---|---|
| Postmark | Transactional-focused; “server per tenant” maps cleanly; immediate production access; strong deliverability reputation; good inbound primitives for v2 | Selected for v1 |
| Amazon SES | AWS-native; CDK constructs exist; cheapest at scale; but: sandbox-exit delay (days–weeks), no multi-tenant abstraction (flat identities), no out-of-box CS UI, inbound via S3/Lambda is more plumbing | Not v1; reconsider for v2+ migration |
| Resend | Excellent DX (React Email); weak multi-tenant primitives; team-level rate limit shared across tenants; newer company, shorter track record | Track for 12-month review |
| Mailgun | Mature; multi-domain; no compelling advantage over Postmark at Arda’s shape | Not selected |
| SendGrid | Marketing-feature bloat; inconsistent deliverability reputation; multi-tenant (subuser) model clunkier than Postmark’s | Excluded |
| HubSpot Transactional | Wrong category; expensive; marketing-flavored; inferior API | Excluded |
Key design decisions
- Two Postmark accounts: production + non-production. Credentials cannot cross environments. Cost overhead negligible (~$36/mo base for Platform plan × 2); blast radius benefit decisive.
- Postmark Platform plan: unlimited servers, users, message streams, sending domains per account.
- Server-per-tenant in production; each is an isolated sending environment.
- Sandbox server fixtures for dev/stage; live server for demo (because stakeholders need real delivery).
- Dummy/local dev: no Postmark dependency; a local stub implements the `EmailService` interface, synthesizing fake delivery events.
- Webhook receiver: dedicated Arda BE endpoint (Kotlin), not Lambda, so it sits near the email domain logic and uses Arda’s existing deploy/observability patterns.
- Transactional outbox: domain writes + send intent persisted atomically; async dispatcher pulls from outbox; provider message ID + status stored against each row.
- Provider-agnostic interface (`EmailService`) from day one, so Postmark is one connector among potential others. Preserves v2+ optionality (browser-direct, SES migration) without refactor debt.
- Template source of truth: Arda’s repo/DB, not Postmark’s hosted templates. Postmark templates (if used) are rendering targets, not canonical storage. Revisit in v2+ given agent-assisted development makes richer template systems cheaper.
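To make the provider-agnostic interface and the local stub concrete, here is a minimal sketch. It is written in Python for brevity; the real interface would be Kotlin in the Arda BE. Only the names `EmailService` and `OutboundMessage` appear in this document. The fields, `SendResult`, and `StubEmailService` are illustrative assumptions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class OutboundMessage:
    tenant_id: str
    from_addr: str
    reply_to: str
    to_addr: str
    subject: str
    body: str


@dataclass(frozen=True)
class SendResult:
    provider: str
    provider_message_id: str


class EmailService(Protocol):
    """Provider-agnostic send interface; each connector implements it."""

    def send(self, message: OutboundMessage) -> SendResult: ...


@dataclass
class StubEmailService:
    """Local-dev connector: records sends and synthesizes delivery events."""
    sent: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def send(self, message: OutboundMessage) -> SendResult:
        message_id = str(uuid.uuid4())
        self.sent.append(message)
        # Fake the provider's delivery event so the full loop runs locally.
        self.events.append({"message_id": message_id, "status": "delivered"})
        return SendResult(provider="stub", provider_message_id=message_id)
```

A Postmark connector would implement the same `send` method, so callers depend only on `EmailService` and a later SES or browser-direct connector can be added without touching them.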
Strengths
- Fastest path to v1 GTM: ~3–4 weeks elapsed engineering with agent assistance, ~4–6 weeks otherwise
- Immediate production access — no external review gates
- CS operates from Postmark console on day one; zero Arda UI build for email ops
- Strong deliverability out of the box; Postmark has vendor accountability
- Inbound parsing ready when procurement inbox becomes v2 scope
- Per-tenant reputation isolation via subdomain + DKIM
Weaknesses
- ESP dependency; cost scales with volume (roughly $100/mo @ 100k emails)
- Tenants send as `*@<tenant>.mail.arda.cards`, not as their own domain; some tenants may want the latter
- No Sent-folder presence in tenant users’ mailboxes
- 18-month reconsideration window accepted for possible migration (SES) or replacement
Cost estimate (v1, 100–150 tenants, ~100k emails/month)
- ~$100/mo Postmark production (Platform + overage)
- ~$20–30/mo Postmark non-production
- Negligible AWS costs (Route53, Secrets Manager, EventBridge)
- Total: ~$125–150/mo for email infrastructure
7. Architecture 2: Server-Side Tenant Mail Integration
Summary
Arda holds tenant-granted credentials (OAuth refresh tokens, service account keys, or basic auth) in a server-side vault and sends through the tenant’s mail infrastructure (Gmail, Microsoft 365, Exchange, SMTP).
Provider variability
| Axis | Gmail Workspace | Microsoft 365 | On-prem Exchange | Generic SMTP |
|---|---|---|---|---|
| Send API | Gmail API (REST) or SMTP relay | Microsoft Graph (REST) or SMTP AUTH | EWS (deprecated) or SMTP AUTH | SMTP submission only |
| Auth | OAuth2, service account + DWD | OAuth2, app + admin consent, certificate-based | Basic / NTLM / Kerberos / hybrid OAuth | Username/password, app password |
| Sent folder | Auto via API | Auto via API | Auto via EWS; manual IMAP APPEND for SMTP | Manual IMAP APPEND |
| Quotas | 2000 recipients/day/user | 10,000/day/user | Admin-configured | Host-defined |
| Event delivery | Pub/Sub (Gmail watch) or poll | Graph change notifications | EWS notifications or poll | DSN bounce parsing only |
Strengths
- Tenant domain reputation: deliverability inherited
- Sent-folder presence in users’ mailboxes
- No Arda-side DKIM/SPF/DMARC work per tenant
- Compliance/archive/DLP inheritance
Weaknesses
- Credential blast radius: a breach of Arda’s vault gives an attacker send-as-anyone on all connected tenants’ mail systems. Materially worse than an ESP breach.
- Heterogeneity cost: two to four connector implementations needed for meaningful coverage
- External review gates: Google CASA security review (4–12 weeks calendar) for `gmail.send` scope; Microsoft app review
- Tenant IT coordination per onboarding: admin consent, OAuth flow, Conditional Access considerations
- Federated observability — no single dashboard across tenants
- Tenant’s mail outage = Arda can’t send
- Rate limits fragment per user/tenant
Conclusion
Not v1 material. Slower, more complex, higher credential risk, and poorer match to stated priorities (GTM speed, low CS burden, simple tenant onboarding). Preserved as a v2+ option via the connector abstraction established in v1.
Browser-direct (§8) dominates this architecture on security properties and is generally preferred when the tenant-mail path is adopted.
8. Architecture 3: Browser-Direct Tenant Mail Integration
Summary
Arda’s web app acquires a user’s OAuth token in the browser (Google Identity Services or MSAL.js) and sends messages directly to Gmail API or Microsoft Graph from JavaScript. Arda’s BE holds only the send intent and reconciliation state; long-lived credentials never reach the server (except optionally a refresh token for reconciliation).
Scope: interactive, user-initiated sends only. Gmail + Microsoft 365 coverage (roughly 70–95% of Arda’s target market). On-prem Exchange and generic SMTP not supported via this path.
Strengths
- Deliverability inherited from tenant domain reputation
- Sent folder presence — significant UX win, especially for sales/ops users
- No Arda credential vault for send credentials
- Credential blast radius narrower (XSS on one user ≠ send-as-all-tenants)
- V2 procurement inbox inexpensive — same OAuth grants inbox read; Gmail watch / Graph notifications provide inbound
- No DKIM/SPF/DMARC Arda-side work
- No ESP dependency on the user-interactive path
Weaknesses
- External review gate: Google CASA (4–12 weeks); calendar-bound. Critical path to GTM.
- No automated / scheduled sends — reminders, escalations, system-triggered notifications need an ESP fallback (hybrid model)
- “Send status unknown” as a first-class state — browser may not report back; reconciliation via Sent-folder polling requires retained refresh token server-side
- Cross-device state — send initiated in desktop must synchronize to mobile Arda session
- Per-user OAuth consent UX — friction at first send per user
- XSS becomes more consequential — strict CSP, SRI, memory-only tokens mandatory
- CS tooling — no unified dashboard; CS troubleshoots via Google Workspace admin + Microsoft 365 admin consoles
- Audit gap — Arda logs intent; provider logs delivery; join requires browser-reported provider message ID (trust dimension)
- Reviews and integration effort push elapsed GTM to ~10–14 weeks vs. ~4–6 for ESP option
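The “send status unknown” state can be illustrated with a small classifier over the send-intent record. This is a sketch in Python under stated assumptions: the state names, the `classify` function, and the five-minute grace period are all hypothetical and were not decided in the session.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

# Assumed grace period before a silent browser is treated as "unknown".
REPORT_TIMEOUT = timedelta(minutes=5)


class SendStatus(Enum):
    PENDING = "pending"  # intent recorded; browser has not reported yet
    SENT = "sent"        # browser reported a provider message ID
    UNKNOWN = "unknown"  # browser never reported; needs reconciliation


def classify(intent_at: datetime,
             provider_message_id: Optional[str],
             now: datetime) -> SendStatus:
    """Derive the status of one send intent from what the BE has seen."""
    if provider_message_id is not None:
        return SendStatus.SENT
    if now - intent_at > REPORT_TIMEOUT:
        return SendStatus.UNKNOWN
    return SendStatus.PENDING
```

Rows in `UNKNOWN` are the ones a Sent-folder reconciliation job would pick up, which is why that path needs a retained refresh token.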
Conclusion
Compelling v2+ product differentiator for tenants who want “send from our domain / Sent folder.” Do not attempt in v1. Hybrid model (user-interactive via browser-direct, automated via ESP fallback) is the likely production shape.
Design recommendation: initiate Google CASA verification during v1 so the calendar gate is cleared when v2 development starts.
9. Architecture 4: Embedded iPaaS (Platform-Brokered Tenant Mail)
Summary
A third-party platform (Nango, Merge, Paragon) handles OAuth flows, credential vault, token refresh, and provides a normalized API across Gmail / Graph. Arda uses the platform’s SDK rather than integrating providers directly.
Candidates
| Platform | Fit | Notes |
|---|---|---|
| Nango | OAuth-focused, 400+ connectors, developer-first, open-source core + hosted | Closest fit for Arda’s needs |
| Merge.dev | Unified API across categories; email category thinner | Category mismatch |
| Paragon | Visual workflow builder + embedded; good for business-logic-in-canvas | Heavier, more opinionated |
| Workato Embedded | Enterprise-heavy, expensive | Wrong scale |
Strengths
- ~50% engineering reduction vs. rolling browser-direct from scratch
- OAuth flows, token management, vault, refresh, re-auth detection handled
- Multi-provider normalization saves connector duplication
- Amortizes across future Arda integrations (CRM, ERP, shipping) if those are on the roadmap
Weaknesses
- Does not shorten external review calendar: Google CASA still on critical path; Arda typically still owns the OAuth app
- Platform dependency — outage, pricing change, acquisition, pivot risk
- Abstraction leakage — normalized APIs lose provider-specific features; pass-through needed for edge cases
- Credential custody still Arda’s contractual problem even if technically held by platform
- Cost — $500–1500/mo at 100–150 tenants × 1–5 connections each; rounding error vs. engineering but not free
Conclusion
Tactical engineering-cost reducer for browser-direct (v2+), not a GTM accelerator for v1. Evaluate when:
- Browser-direct is the chosen path
- Additional third-party integrations (ERP, accounting, shipping) make the platform a strategic investment beyond email alone
If email is the only third-party integration for the foreseeable future, rolling OAuth directly when the time comes may be cleaner.
10. Evaluation Criteria and Cross-Architecture Comparison
Criteria used (in priority order per stated constraints)
- Time to v1 GTM: weeks to first production tenant email
- Ops burden on centralized CS — can one team operate 100–150 tenants?
- Per-tenant isolation primitives — both reputation and spoofing
- AWS/CDK fit — alignment with existing IaC discipline
- Deliverability track record — vendor reputation, default posture
- Inbound maturity — matters in v2 (procurement inbox)
- Migration portability — 18-month reconsideration is accepted but should be cheap
- Engineering cost — team is 4 people; expensive builds are disproportionately costly
- Cost at scale — material but not dominant at projected volumes
Comparison matrix
| Criterion | ESP (Postmark) | Server-tenant-SMTP | Browser-direct | iPaaS-brokered |
|---|---|---|---|---|
| V1 GTM (weeks) | 3–6 | 12–18 | 10–14 | 8–12 |
| CS ops burden | Low (one console) | High (provider per tenant) | High (provider per tenant) | Medium |
| Isolation — reputation | Per-tenant subdomain + DKIM | Tenant-owned | Tenant-owned | Tenant-owned |
| Isolation — spoofing | DNS + IAM + app logic | OAuth scope + app logic | OAuth scope + app logic | OAuth scope + app logic |
| AWS/CDK fit | Medium (DNS only) | Medium | Medium | Low (external platform) |
| Deliverability | Strong | Strong (inherited) | Strong (inherited) | Strong (inherited) |
| Inbound maturity | Good | Good | Excellent (native) | Good |
| Migration cost (18mo) | Medium | High | High | High |
| Engineering cost | Low | Very high | High | Medium |
| Cost at scale ($/mo) | ~$125–150 | ~$50 platform | ~$0 platform | $500–1500 |
| External review gate | None | Google CASA (4–12wk) | Google CASA (4–12wk) | Google CASA (4–12wk) |
Summary by architecture
- ESP (Postmark): optimizes for GTM speed and ops simplicity at the cost of branded sender and ESP dependency
- Server-tenant-SMTP: high credential risk, slow, and dominated by browser-direct on security — de-prioritized
- Browser-direct: best security properties and best v2 inbox story; external-review-gated; requires active user sessions
- iPaaS-brokered: engineering reducer for browser-direct, not a GTM accelerator for v1
11. Recommendation and Sequencing
V1
- Implement the ESP (Postmark) option as previously scoped (“Arda Email Option”)
- Build the `EmailService` connector abstraction from day one; Postmark is one connector
- Initiate Google app verification (CASA) in parallel, before v2 starts
- No browser-direct, no tenant-mail, no iPaaS in v1
V2
- Add browser-direct as an opt-in tenant upgrade, for tenants wanting to send from their own domain / Sent folder
- Use Nango (or similar embedded iPaaS) for the OAuth layer if other integrations (ERP, accounting, shipping) share the investment; roll OAuth directly if email remains the only third-party integration (consistent with the §9 conclusion)
- Hybrid sending model: interactive sends via browser-direct, automated sends via ESP fallback
- Procurement inbox: implement on both paths, with Postmark inbound parse for ESP tenants and Gmail watch / Graph notifications for connected tenants
- Tenant self-service UI; CS retains console fallback
V2+
- EU residency / multi-partition
- Multi-provider resilience
- On-prem Exchange / generic SMTP if demand materializes
- Possible migration from Postmark to SES (now feasible with connector abstraction) if cost, AWS-native posture, or other factors justify
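The hybrid sending model amounts to a per-send routing decision. A minimal sketch in Python follows; `Trigger`, `choose_connector`, the `tenant_uses_own_mail` flag, and the connector names are hypothetical.

```python
from enum import Enum


class Trigger(Enum):
    INTERACTIVE = "interactive"  # an authenticated user clicked send
    AUTOMATED = "automated"      # reminders, escalations, system notifications


def choose_connector(trigger: Trigger, tenant_uses_own_mail: bool) -> str:
    """Pick a connector name for one send under the hybrid model."""
    if trigger is Trigger.INTERACTIVE and tenant_uses_own_mail:
        return "browser-direct"
    # Automated sends have no browser session, and tenants that have not
    # opted in stay on the v1 ESP path.
    return "postmark"
```

Because the ESP remains the default branch, the v1 Postmark work is reused rather than replaced when browser-direct arrives.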
Why this sequencing
- Ships fastest (ESP v1 is ~3–4 weeks with agent assistance)
- Preserves optionality — the connector abstraction means v2 additions are opt-in per tenant, not a platform rewrite
- Kills the Google review calendar gate before it becomes v2-critical
- Doesn’t throw away v1 work — Postmark remains the automation/fallback path in v2
- Aligns with stated acceptance of 18-month reimplementation risk
12. Domain Structure (Working Assumption C)
- Prod tenants: `<tenant>.mail.arda.cards`
- Non-prod tenants: `<tenant>.<partition>.mail.arda.cards` where partition ∈ {`dev`, `stage`, `demo`}
- Example prod: `procurement@acme.mail.arda.cards`
- Example stage: `procurement@acme.stage.mail.arda.cards`
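The FQDN shape above can be expressed as a small pure function. This is a sketch in Python; `sending_domain` and `from_address` are hypothetical names, and special-purpose partitions are deliberately rejected because their shape is not defined in this document.

```python
MAIL_PARENT = "mail.arda.cards"
NON_PROD_PARTITIONS = frozenset({"dev", "stage", "demo"})


def sending_domain(tenant_slug: str, partition: str) -> str:
    """Return the tenant sending domain for a partition (working assumption C)."""
    if partition == "prod":
        return f"{tenant_slug}.{MAIL_PARENT}"  # 4 labels
    if partition in NON_PROD_PARTITIONS:
        return f"{tenant_slug}.{partition}.{MAIL_PARENT}"  # 5 labels
    raise ValueError(f"unknown partition: {partition!r}")


def from_address(tenant_slug: str, partition: str,
                 local_part: str = "procurement") -> str:
    """Build the visible From address for a tenant."""
    return f"{local_part}@{sending_domain(tenant_slug, partition)}"
```

Keeping this derivation in one place also supports the reversibility option: the result is stored per tenant as data rather than recomputed in templates.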
Zone hosting
Per Arda convention (function + infra records in respective accounts):

    GoDaddy: arda.cards apex
      NS mail.arda.cards → Route53 in prod infra AWS account

    Prod infra AWS account:
      Zone: mail.arda.cards
        - TXT @, _dmarc, MX dmarc (function-parent records)
        - NS dev.mail.arda.cards → dev infra account
        - NS stage.mail.arda.cards → stage infra account
        - NS demo.mail.arda.cards → demo infra account
        - Tenant records at <tenant>.mail.arda.cards (runtime, per tenant)

    Dev / stage / demo infra accounts:
      Zone: <partition>.mail.arda.cards
        - Fixture tenant records at <tenant>.<partition>.mail.arda.cards (runtime)

Deliberate deviation from existing Arda pattern
The function-parent zone (mail.arda.cards) lives in the prod infra account, not the root account, because prod tenant records write into it. This optimizes supplier-facing FQDN length (4 labels) over perfect pattern symmetry. Confirmed acceptable per “user ergonomics over pattern consistency” direction.
IAM and separation
- CDK owns: hosted zones, parent records, NS delegations, reserved subdomains (`mail`, `dmarc`, `postmaster`, `abuse`, `api`, `www`, `admin`)
- Runtime provisioning service owns: tenant records via AWS SDK, scoped IAM policy that can only write `*.<partition>.mail.arda.cards` excluding reserved names
- IAM role per env, each scoped to a single zone ARN; prod provisioning cannot touch non-prod zones and vice versa
- External state: tenant → records mapping in Arda DB (`tenant_email_config`)
- Tenant slug reserved-word list enforced at tenant creation and re-validated at email provisioning
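The reserved-word check on tenant slugs might look like the following Python sketch. The reserved list is taken from the bullets above. Treating the partition names as reserved too is an assumption made here, because they are delegated subdomains of the same parent zone. `validate_tenant_slug` is a hypothetical name.

```python
import re

RESERVED = frozenset({"mail", "dmarc", "postmaster", "abuse", "api", "www", "admin"})
# Assumption: partition names must not be usable as prod tenant slugs,
# since dev/stage/demo are NS-delegated under mail.arda.cards.
PARTITIONS = frozenset({"dev", "stage", "demo"})

# One lowercase DNS label: 1-63 chars, alphanumeric at both ends.
LABEL_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def validate_tenant_slug(slug: str) -> str:
    """Return the slug if it is usable as a tenant subdomain label."""
    if not LABEL_RE.fullmatch(slug):
        raise ValueError(f"not a valid DNS label: {slug!r}")
    if slug in RESERVED or slug in PARTITIONS:
        raise ValueError(f"reserved name: {slug!r}")
    return slug
```

Running the same function at tenant creation and again at email provisioning gives the re-validation the last bullet asks for.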
Reversibility option: separate root mail domain
Held as reversibility. If mail.arda.cards needs to be replaced with (e.g.) arda-mail.com later:
- Tenant FQDNs are data in Arda DB (`sending_domain` column), not hardcoded
- Per-tenant DKIM selectors are provider-tagged (`pm2026._domainkey...`), not parent-tagged
- Templates reference sender via variable substitution
- Migration cost: DKIM reputation warming per tenant + supplier address book updates; feasible but not cheap
Triggers for migration: deliverability incident on app domain, regulatory separation, EU partition with distinct brand.
13. Email Authentication Primer (DKIM/SPF/DMARC)
Reference for downstream agents who need the background.
- SPF: TXT record listing authorized sending IPs for the envelope-sender domain. Binary `(domain, IP)` authorization. Breaks on forwarding. Weak alone.
- DKIM: Per-message asymmetric signature; public key at `<selector>._domainkey.<domain>`. Proves origin + integrity. Survives forwarding. Primary tenant-isolation primitive: each tenant gets its own DKIM key per subdomain.
- DMARC: Policy at `_dmarc.<domain>`; requires SPF or DKIM to align with visible `From:` domain. Policies: `p=none` (monitor), `p=quarantine`, `p=reject`. Reports to `rua=mailto:...`.
- Return-Path / Envelope Sender: bounce address, different from visible From. ESPs default to their own; custom Return-Path aligns SPF with From.
- MX: inbound routing. Not needed in v1 per Reply-To-external decision.
- Domain reputation: per-domain, cumulative. Per-tenant subdomain = per-tenant reputation. Main tenant-isolation mechanism.
- BIMI: branded logo display; requires `p=reject` + VMC certificate (~$1500/yr). V2+ if ever.
Arda-specific decisions
- `arda.cards` is the app brand, never a sending domain
- `mail.arda.cards` is the email function parent, separate from app
- Each tenant subdomain has own DKIM selector, own DMARC policy
- Parent `mail.arda.cards` DMARC: `p=reject; sp=reject` from day one
- Tenant-level DMARC: ramp `p=none` (2–4 weeks) → `p=quarantine` → `p=reject`, gated on clean metrics
- DKIM selector naming: provider-tagged (`pm2026._domainkey...`) to support parallel providers during migration
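As an illustration of the record names and values these decisions imply, the Python sketch below builds a DMARC TXT record and a DKIM selector name. `dmarc_record` and `dkim_record_name` are hypothetical helpers, and the report address in the usage note is a placeholder.

```python
from typing import Optional

POLICIES = frozenset({"none", "quarantine", "reject"})


def dmarc_record(domain: str, policy: str,
                 subdomain_policy: Optional[str] = None,
                 rua: Optional[str] = None) -> tuple[str, str]:
    """Return (record name, TXT value) for a DMARC policy on `domain`."""
    if policy not in POLICIES:
        raise ValueError(f"invalid DMARC policy: {policy!r}")
    tags = ["v=DMARC1", f"p={policy}"]  # v= and p= must come first
    if subdomain_policy is not None:
        if subdomain_policy not in POLICIES:
            raise ValueError(f"invalid subdomain policy: {subdomain_policy!r}")
        tags.append(f"sp={subdomain_policy}")
    if rua is not None:
        tags.append(f"rua=mailto:{rua}")
    return f"_dmarc.{domain}", "; ".join(tags)


def dkim_record_name(selector: str, domain: str) -> str:
    """Return the DNS name where a DKIM public key is published."""
    return f"{selector}._domainkey.{domain}"
```

For example, `dmarc_record("mail.arda.cards", "reject", "reject")` yields the parent policy described above, and `dmarc_record("acme.mail.arda.cards", "none", rua="reports@example.com")` is the starting point of a tenant ramp.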
14. Phase 1–4 Outline (Provisional, Needs Rewrite)
Status: This outline was produced mid-session before the corrected axes (partition vs. infra) and final domain structure were confirmed. It is provisional and needs rewrite as a formal implementation plan document. Structure is sound; specific identifiers need updating.
Phase 1 — Service setup & integration (one-time, ~1–2 weeks)
- Register `mail.arda.cards` zone in prod infra AWS account; delegate from GoDaddy
- Create partition zones (`dev`, `stage`, `demo`) in respective infra accounts
- Publish parent SPF, DMARC, MX dmarc records via CDK
- Create two Postmark accounts (prod + non-prod), Platform plan
- Store account-level API tokens in Secrets Manager under env-scoped paths
- Build `EmailService` connector abstraction + Postmark adapter
- Build webhook receiver in Arda BE (Kotlin), normalize events, publish to EventBridge or internal bus
- Persist: `tenant_email_config` and `email_send_log` tables
- Build transactional outbox schema and dispatcher
- Build local stub adapter for dev/CI
- CDK stacks: `EmailStack` per env with zone, secrets placeholders, EventBridge bus, webhook handler
- Observability: per-tenant metrics, alarms, dashboards
- Exit criteria: integration test creates throwaway Postmark sandbox server, publishes DNS, sends, receives webhook, normalizes event, tears down
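The transactional outbox item can be sketched end to end with an in-memory SQLite database. This is illustrative Python, not the Arda schema: the table names `purchase_order` and `email_outbox`, their columns, and the function names are assumptions.

```python
import sqlite3
from typing import Callable


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE purchase_order (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE email_outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            po_id TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending',
            provider_message_id TEXT
        );
    """)
    return db


def record_po_send(db: sqlite3.Connection, po_id: str) -> None:
    """Persist the domain write and the send intent in one transaction."""
    with db:  # commits both inserts, or rolls both back on error
        db.execute(
            "INSERT INTO purchase_order (id, status) VALUES (?, 'sent_to_supplier')",
            (po_id,),
        )
        db.execute("INSERT INTO email_outbox (po_id) VALUES (?)", (po_id,))


def dispatch_pending(db: sqlite3.Connection, send: Callable[[str], str]) -> int:
    """Deliver pending rows; `send` returns the provider message ID."""
    rows = db.execute(
        "SELECT id, po_id FROM email_outbox WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for row_id, po_id in rows:
        message_id = send(po_id)  # an exception here leaves the row pending
        with db:
            db.execute(
                "UPDATE email_outbox SET status = 'sent', provider_message_id = ? "
                "WHERE id = ?",
                (message_id, row_id),
            )
    return len(rows)
```

A crash between `send` and the status update would cause a re-send on the next pass, so this shape is at-least-once; deduplicating on the outbox row ID at the connector is one way to tighten that.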
Phase 2 — First tenant provisioning (automatable, <30 min per run)
- Input: tenant ID + slug (validated against reserved words)
- Compute sending domain from partition + parent
- Create Postmark server (account token)
- Create sending domain in Postmark; receive DKIM CNAME, Return-Path CNAME
- Publish DNS records in Route53 via SDK (idempotent)
- Publish tenant DMARC at `p=none`
- Poll Postmark verification
- Persist tenant config; store per-tenant server token
- Smoke test: send to `seed@mail.arda.cards`, assert DKIM/SPF/DMARC pass
- Register DMARC ramp schedule (→ `quarantine` at T+14, → `reject` at T+30)
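The DMARC ramp schedule registered in the last step can be computed from the provisioning date. A minimal Python sketch using the T+14 and T+30 offsets above; `ramp_schedule` and `policy_due` are hypothetical names.

```python
from datetime import date, timedelta

# (policy, days after provisioning), per the T+14 / T+30 outline.
RAMP_OFFSETS = (("none", 0), ("quarantine", 14), ("reject", 30))


def ramp_schedule(provisioned_on: date) -> list[tuple[date, str]]:
    """Return the earliest date each DMARC policy may take effect."""
    return [(provisioned_on + timedelta(days=days), policy)
            for policy, days in RAMP_OFFSETS]


def policy_due(provisioned_on: date, today: date) -> str:
    """Return the strictest policy whose scheduled date has arrived."""
    due = "none"
    for when, policy in ramp_schedule(provisioned_on):
        if today >= when:
            due = policy
    return due
```

The schedule only says when a step becomes eligible; advancing is still gated on clean bounce and complaint metrics, so a caller would combine `policy_due` with a health check before publishing the stricter record.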
Phase 3 — First email sent
- Pre-reqs: tenant active, PO PDF pipeline functional, supplier contact resolved, user authorized
- Arda app trigger: user clicks “Send PO”
- Authorization check: user on tenant, supplier on tenant
- Render PDF + email body (Arda-stored templates)
- Compose `OutboundMessage`: From=`procurement@<tenant>.mail.arda.cards`, Reply-To=user’s external email, To=supplier
- Send via `EmailService` → Postmark adapter
- Persist in `email_send_log`
- Webhook lifecycle: Delivery → `delivered`, Bounce/Complaint → update + CS alert
- Verification: CS confirms in Postmark console; supplier receives with DKIM pass
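The webhook lifecycle step maps provider events onto Arda-side statuses. The Python sketch below assumes the Postmark payload carries `RecordType` and `MessageID` fields with the values shown; that reflects a reading of Postmark's webhook format and should be verified against the current documentation. `NormalizedEvent` and `normalize_postmark_event` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed Postmark RecordType values mapped to Arda-side statuses.
STATUS_BY_RECORD_TYPE = {
    "Delivery": "delivered",
    "Bounce": "bounced",
    "SpamComplaint": "complained",
}
CS_ALERT_STATUSES = frozenset({"bounced", "complained"})


@dataclass(frozen=True)
class NormalizedEvent:
    provider_message_id: str
    status: str
    alert_cs: bool


def normalize_postmark_event(payload: dict) -> Optional[NormalizedEvent]:
    """Translate one webhook payload; return None for events not tracked."""
    status = STATUS_BY_RECORD_TYPE.get(payload.get("RecordType"))
    message_id = payload.get("MessageID")
    if status is None or not message_id:
        return None  # untracked event type or malformed payload
    return NormalizedEvent(message_id, status, status in CS_ALERT_STATUSES)
```

Keeping this mapping in the connector means `email_send_log` only ever sees Arda statuses, which is what lets another provider be added later without changing downstream consumers.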
Phase 4 — CS daily runbook (production only)
- Morning: aggregate dashboard check, red threshold review
- Triage: bounce review, complaint handling, tenant inquiries
- Weekly: DMARC report review, ramp advancement, suppression audit
- Incident response: deliverability degradation, provider outage, suspected spoofing
- CS never-touches: direct DNS edits, server deletion, DMARC policy outside ramp
Rewrite needed: formalize as implementation-plan document using Arda’s template; reconcile with working assumption C for domain structure; add concrete Kotlin interface signatures and CDK construct sketches.
15. Decision Log
| # | Decision | Status | Notes |
|---|---|---|---|
| 1 | V1 ESP provider: Postmark | Decided | Alternatives (SES, Resend, Mailgun, SendGrid) compared; see §6 |
| 2 | Two Postmark accounts (prod + non-prod) | Decided | Blast radius isolation decisive over $18/mo cost |
| 3 | Postmark Platform plan | Decided | Unlimited servers per account; required for 100–150 tenants |
| 4 | Server-per-tenant in prod | Decided | Maps cleanly to tenant boundary; auto-isolation |
| 5 | Sandbox servers in dev/stage; live server in demo | Decided | Demo needs real delivery for stakeholder visibility |
| 6 | Local dev: stub adapter, no Postmark | Decided | Zero external dependency; exercises full event loop |
| 7 | Webhook receiver: dedicated BE endpoint (Kotlin), not Lambda | Decided | Co-locates with email domain logic |
| 8 | Transactional outbox pattern | Decided | Durable intent queue; simplifies provider swap and outage handling |
| 9 | V1 Reply-To: user’s external mailbox | Decided | No inbound parsing in v1; no MX on tenant subdomains in v1 |
| 10 | V2 Reply-To: platform-captured | Decided | Procurement inbox requires MX + parse |
| 11 | Separate mail brand vs. subdomain of app | Decided (subdomain) | mail.arda.cards chosen over separate domain; reversibility retained |
| 12 | Tenant FQDN shape: working assumption C | Working assumption | 4-label prod (<tenant>.mail.arda.cards), 5-label non-prod; see §12 |
| 13 | mail.arda.cards zone hosted in prod infra account | Working assumption | Deliberate deviation from pattern for prod FQDN ergonomics |
| 14 | CDK owns zones + parent records; runtime service owns tenant records | Decided | IAM scoping enforces the boundary; no per-tenant CDK stacks |
| 15 | CS admin surface v1: Postmark console (no Arda UI) | Decided | V2+ brings Arda-built admin |
| 16 | Tenant provisioning in v1: CS-invoked Arda scripts against ESP API | Decided | Not manual, not UI; same endpoint backs v2 admin |
| 17 | EmailService connector abstraction from day one | Decided | Preserves v2+ browser-direct and v2+ SES migration cheaply |
| 18 | Template source of truth: Arda (not Postmark hosted templates) | Decided | Provider templates are rendering targets; revisit in v2+ |
| 19 | HubSpot integration: deferred to v2+ | Decided | Not load-bearing for v1; CRM-side value when supplier features mature |
| 20 | EU residency: defer to 2027+ | Decided | 12–18 month revisit window |
| 21 | Browser-direct tenant mail: v2+ opt-in, not v1 | Decided | External review calendar + scope |
| 22 | iPaaS for OAuth brokerage: v2+ tactical decision | Decided | Not a v1 GTM accelerator |
| 23 | Initiate Google CASA verification during v1 | Recommendation | So calendar gate is cleared for v2 |
| 24 | Provider-agnostic interface covers Postmark, Gmail, Graph, SMTP | Decided | Single interface across all future connectors |
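Decisions 6, 17, and 24 together imply one connector interface with a stub implementation for local dev and CI. The sketch below shows one possible shape, in TypeScript for illustration only; the actual interface would be Kotlin and follow Arda's `Result`-based conventions. All names (`EmailConnector`, `StubConnector`, `SendResult`, the `@bounce.test` trigger) are hypothetical, and the calls are synchronous here only to keep the example short. Unlike the stub described in decision 6, this one only records sends and does not synthesize delivery events.

```typescript
// Result-style return value (names assumed, not Arda's): no exceptions for expected failures.
type SendResult =
  | { ok: true; providerMessageId: string }
  | { ok: false; error: string; retryable: boolean };

interface EmailEnvelope {
  from: string;
  to: string;
  replyTo: string;
  subject: string;
  body: string;
}

// Single interface intended to cover Postmark, Gmail, Graph, and SMTP connectors.
interface EmailConnector {
  readonly providerName: string;
  send(tenantId: string, message: EmailEnvelope): SendResult;
}

// Stub adapter for local dev/CI: no network, records sends, can simulate a failure.
class StubConnector implements EmailConnector {
  readonly providerName = "stub";
  readonly sent: { tenantId: string; message: EmailEnvelope }[] = [];
  private counter = 0;

  send(tenantId: string, message: EmailEnvelope): SendResult {
    if (message.to.endsWith("@bounce.test")) {
      return { ok: false, error: "simulated hard bounce", retryable: false };
    }
    this.counter += 1;
    this.sent.push({ tenantId, message });
    return { ok: true, providerMessageId: `stub-${this.counter}` };
  }
}

// EmailService depends only on the interface, so swapping providers does not touch callers.
class EmailService {
  private readonly connector: EmailConnector;

  constructor(connector: EmailConnector) {
    this.connector = connector;
  }

  sendFor(tenantId: string, message: EmailEnvelope): SendResult {
    if (!message.to.includes("@")) {
      return { ok: false, error: "invalid recipient", retryable: false };
    }
    return this.connector.send(tenantId, message);
  }
}
```

Swapping in a Postmark, Gmail, Graph, or SMTP connector would then mean adding another `EmailConnector` implementation, with callers of `EmailService` unchanged.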
16. Open Questions by Owner and Urgency
Blocking v1 implementation start
| Q | Owner | Urgency |
|---|---|---|
| Confirm no MX on tenant subdomains in v1 (Reply-To external) | CTO | Before Phase 1 |
| Confirm `mail.arda.cards` as parent (vs. separate brand) | CTO | Before DNS setup |
| Confirm working assumption C over full-canonical FQDN | CTO + DNS convention owner | Before DNS setup |
Shapeable during v1 implementation
| Q | Owner | Urgency |
|---|---|---|
| Parent domain final choice (`mail.arda.cards` confirmed; decision to revisit?) | CTO | Before first tenant provision |
| Webhook endpoint exposure: public + signature-verified vs. IP allowlist to Postmark CIDRs | Engineering | Before go-live |
| Template promotion mechanics: code-gated vs. DB-stored | Engineering | Before template system build |
| DMARC aggregate report parser: dmarcian free tier vs. self-built vs. paid | Engineering + CS | Before first 2 weeks post-launch |
| HubSpot event logging scope, if any in v1 | Product + CTO | Before v2 planning |
Deferred to v2 planning
| Q | Owner | Urgency |
|---|---|---|
| Postmark templates vs. Arda-side template store | Engineering | V2 scoping |
| Tenant self-service UI scope and surface | Product + Engineering | V2 planning |
| Procurement inbox correlation model (plain inbox vs. sub-addressed) | Product + Engineering | V2 design |
| HubSpot engagement granularity and direction of data flow | Product | V2 planning |
| Browser-direct target scope (Gmail + M365; Exchange?) | Product + CTO | V2 scoping |
| iPaaS vs. roll-own OAuth decision | CTO | V2 scoping |
Deferred to v2+ / future
| Q | Owner | Urgency |
|---|---|---|
| EU residency strategy and partition rollout | CTO | 2027+ |
| SES migration trigger criteria | CTO | 18-month review |
| BIMI adoption criteria | Product | Post-DMARC-reject stability |
| On-prem Exchange / generic SMTP connector demand | Product | Demand-driven |
17. Dependencies on Other Arda Workstreams
These intersect with email and downstream agents will hit them:
- PDF generation pipeline — PO PDFs are the first real payload. GrapesJS + Puppeteer path was explored separately; whatever v1 pipeline exists must be functional before Phase 3. Open thread from earlier work.
- Tenant admin / provisioning services — email provisioning is one facet of a broader tenant lifecycle; the Arda admin API surface should accommodate it consistently.
- Frontend component architecture — the send UI for POs, reply drafts (v2), template editing (v2+) all consume from the Arda component library; ensure form primitives, rich-text editor, attachment handler components exist or are scoped.
- CDK IaC conventions — the `EmailStack` must fit Arda’s multi-account CDK pattern; cross-account zone delegation is non-trivial and should follow any existing precedent.
- Kotlin BE conventions — `Result<T>` error handling, railway-oriented patterns per Arda’s documentation; connector abstraction should follow.
- Secrets Manager conventions — per-env paths, rotation policies; align with any existing Arda secrets pattern.
- EventBridge / internal event bus usage — email events should consume and produce in the same shape as other domain events.
- Observability stack — CloudWatch, any internal dashboards Arda uses; don’t fork observability for email.
- MCP server authentication architecture (open Arda thread) — if CS tooling becomes MCP-accessible, authentication of CS operators against the email admin API matters.
- Claude Code / Claude Projects setup — agent-assisted development workflow affects how design docs and implementation plans are consumed.
18. Related Arda Context and Pointers
Documentation
- Arda documentation site: https://arda-cards.github.io/documentation (updated regularly)
- Source: https://github.com/Arda-cards/documentation
- Relevant sections for this work: `current-system` (architecture, runtime, OAM), `domain` (information model, business affiliates), `technology`, `decisions`, `process`
- Note: No per-page digest exists yet; agents should fetch sections on-demand. A `DIGEST.md` or `llms.txt` at the site root is a standing improvement
Templates (about/templates/ in docs repo)
For this work, the following templates are relevant inputs:
- `architecture-decision-record.md` — for each decided item in §15 that warrants a standalone ADR
- `design-document.md` — for the email service architecture
- `threat-model.md` — for tenant isolation, credential handling, spoofing boundaries
- `implementation-plan.md` — for the rewritten Phase 1–4
- `runbook.md` — for the CS daily runbook
- `feature-requirements.md` / `new-service.md` — if the email service is a new BE service
Repository layout
- Arda runs multi-repo: 1 IaC, 4 BE, 1 doc, 3 FE + tooling/actions
- Email work likely touches: IaC repo (CDK `EmailStack`), a BE service (new or existing), the doc repo (ADRs/design docs), and minimally the main FE repo (send UI for POs)
- CDK conventions: `<partition>.<infra>.<function>.arda.cards` for existing services; see §12 for email-specific adaptation
Stack reminders
- Backend: Kotlin, Result-based error handling, Railway-Oriented Programming
- Frontend: React, Redux, Radix, ShadCN, Tailwind, AG Grid, Next.js (BFF)
- IaC: CDK (TypeScript)
- Infra: AWS, multi-account (root + per-infra accounts)
19. Evidence Sources and Currency
Web-searched facts (verify on resumption if >3 months old)
- Postmark pricing and Platform plan features — searched April 2026; re-verify at https://postmarkapp.com/pricing
- Postmark sandbox primitives — searched April 2026; current per Postmark docs
- Resend pricing and rate limits — searched April 2026
- AWS SES CDK constructs (`EmailIdentity`, `ConfigurationSet`, `DkimIdentity`) — searched April 2026
- Google CASA / app verification timelines — industry estimates, not fetched directly; re-verify via Google’s current verification docs
- Microsoft Graph + MSAL.js browser flows — industry standard, re-verify at Microsoft Identity docs
Architectural judgment (no single source)
- Multi-tenant reputation isolation strategies
- Credential blast-radius reasoning
- Browser-direct vs. server-side trade-offs
- Phased sequencing of v1 vs. v2+
Arda-internal context (per session memory)
- Team size, AWS/CDK posture, FE/BE stack, multi-repo layout, tenant target volume, EU timeline
- Documentation site structure
- Naming convention `<partition>.<infra>.<function>.arda.cards`
Recommendation: On resumption, re-fetch vendor pricing/feature pages and verify no breaking changes to Postmark API or AWS SES CDK constructs since April 2026.
20. Next Actions When Resuming
Ordered. Each action is agent-actionable or explicitly human-gated.
- Re-verify vendor currency — fetch Postmark pricing and Platform plan features; confirm no breaking API changes. Agent-actionable.
- Resolve the three v1-blocking open questions (§16, top section). Human (CTO).
- Produce ADR #1: ESP provider selection (Postmark) using Arda’s `architecture-decision-record.md` template. Summarize alternatives (SES, Resend, Mailgun) with pros/cons. Agent-actionable.
- Produce ADR #2: Tenant email domain strategy — working assumption C, reversibility retained. Agent-actionable.
- Produce ADR #3: Two-Postmark-account separation for prod/non-prod. Agent-actionable.
- Produce ADR #4: `EmailService` connector abstraction — interface shape, rationale, relationship to future connectors. Agent-actionable.
- Produce design document: email service architecture — full shape, modules, data model, event flow, outbox, webhook handler. Use `design-document.md` template. Agent-actionable with human review.
- Produce threat model: email service tenant isolation — spoofing, credential leakage, DMARC bypass scenarios, DNS hijack, webhook replay. Use `threat-model.md` template. Agent-actionable with human review.
- Produce implementation plan: Phase 1–4 rewrite — with corrected axes, concrete CDK sketches, Kotlin interface signatures, CI/test harness. Use `implementation-plan.md` template. Agent-actionable.
- Produce runbook: CS daily email operations using `runbook.md` template. Agent-actionable.
- Initiate Google CASA verification for `gmail.send` scope in parallel with v1 build. Human (CTO + DevOps).
- Begin Phase 1 implementation — CDK zones, Postmark account setup, connector abstraction skeleton. Agent-assisted, with human architectural review at milestones.
21. Appendix: Agent-Assisted Development Implications
The team is operating with heavy agent assistance (Claude Code + claude.ai). This changes several things about how this work should be approached downstream.
What gets cheaper
- Implementation effort — rough estimate 50% reduction vs. unassisted baseline
- Test coverage — integration test scaffolding is much cheaper to generate; aim for wider coverage than historical norm
- Idempotent, robust provisioning scripts — edge cases become affordable
- ADR / design doc production — these are the human-guided artifacts that feed agents
- Runbooks-as-executable — many CS procedures become Claude Code commands or internal tools
What doesn’t change
- External calendar gates (DNS propagation, Postmark verification, Google CASA, DMARC ramp windows)
- Human decision-making on architecture and trade-offs
- Cross-team alignment and stakeholder sign-offs
- Security review of credential handling patterns
- CS training and pilot-tenant coordination
Promotion from v2 → v1 enabled by agent leverage
Because internal quality is cheaper, the following were promoted from v2 to v1 scope:
- Richer provisioning automation (idempotency, drift detection, health checks, decommissioning)
- Observability and alerting from day one (not just defaults)
- Executable CS runbooks
- Fuller integration test matrix including webhook handlers and DMARC alignment checks
- Template versioning system (revisit whether fully included)
Scope-creep discipline
Agent leverage creates pressure to expand v1 scope because “it’s cheap now.” Hold the user-facing scope firm:
- One tenant, PO send to supplier, CS in Postmark console, no tenant admin UI
- Internal quality promotion is safe; product-scope promotion needs explicit revalidation
- If a v2 product feature starts feeling cheap, that’s a signal to revalidate v2 roadmap shape, not to slip it into v1
Failure mode shift
Without agents: typical failure is underbuilt (thin tests, fragile happy-path). With agents: typical failure is overbuilt and misaligned (code that looks thorough but solves the wrong problem, premature abstractions).
Mitigation: architectural decisions, interface design, and test scenarios are human-led and captured in ADRs before agent implementation begins. Review burden shifts from “is this done?” to “is this solving the actual problem?”
Documentation-as-context dependency
Agent output quality is directly gated on the quality and currency of Arda’s documentation. Documentation investment pays back more than it historically did. Specifically for this work:
- `current-system` multi-tenancy model must be accurate for agents to ground tenant isolation correctly
- `technology` Kotlin conventions must be accurate for connector abstraction to match Arda idioms
- `domain` information model must be accurate for agents to correctly model `BusinessAffiliate` and `PurchaseOrder` as send targets
Revised timeline estimate with agent assistance
- First production send: ~3–4 weeks elapsed, ~2 weeks focused engineering
- First 50 tenants onboarded: ~6–8 weeks total
- Original baseline without agent assistance: 4–6 weeks first send, 8–10 weeks first 50 tenants
End of document.
Copyright: © Arda Systems 2025-2026, All rights reserved