
AWS CDK Infrastructure

AWS Cloud Development Kit (CDK) is the infrastructure-as-code framework used throughout the Arda-cards/infrastructure repository. Every Arda cloud resource — Route53 zones, IAM roles, CloudFront distributions, Lambda functions, RDS clusters — is defined in TypeScript CDK and synthesized to CloudFormation before deployment.

CDK version: aws-cdk-lib 2.237.0, aws-cdk CLI ^2.1106.1 (pinned in package.json). Language: TypeScript only. All constructs, stacks, and apps are .ts files compiled with tsc before synthesis or deploy. Node.js runtime: >=20.

This page covers how CDK is used in this codebase: primitives, repo-local conventions, cross-stack patterns, custom resources, removal policies, cdk import, testing, and common pitfalls. For why the codebase is structured into its layers and what each layer is responsible for, see the IaC Functional Design.


Every CDK construct in this codebase follows a three-interface pattern: Configuration, Props, and Built. The full mechanics — including field-type rules, the ArdaConstruct<P> base class, and naming tables — live in the infrastructure repository’s knowledge-base/cdk-construct-patterns.md. The short version:

// Configuration: design-time parameters the consumer decides
export interface Configuration {
  readonly locator: purpose.Locator;
  readonly name: string;
}

// Props: Configuration + runtime dependencies injected by the stack
export interface Props extends Configuration {
  readonly bucketClientRoleArn: string;
  readonly loggingBucket: s3.Bucket;
}

// Built: what the construct exposes after construction
export interface Built {
  readonly bucket: s3.Bucket;
  readonly preSigningRole: iam.Role;
}

The construct class extends Construct and populates this.built in its constructor. A static validateProps() method returns Error[]; the constructor calls it before super() and throws misc.MultiError if any errors accumulate:

static validateProps(props: Props): Error[] {
  const errors: Error[] = [];
  // Field name matches the Props interface above (bucketClientRoleArn).
  if (props.bucketClientRoleArn.length === 0) {
    errors.push(new Error("bucketClientRoleArn must not be empty"));
  }
  return errors;
}

constructor(scope: Construct, id: string, props: Props) {
  // Validate before super() so every problem surfaces in a single throw.
  const errors = MyConstruct.validateProps(props);
  if (errors.length > 0) {
    throw new misc.MultiError("Errors with configuration", errors);
  }
  super(scope, id);
  // ... build resources, populate this.built
}

The same validateProps / MultiError pattern applies at the Stack layer (RootDnsStack.validateProps in stacks/root/root-dns-stack.ts is the canonical example).
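The accumulate-then-throw behaviour can be tried outside CDK. The following is a minimal sketch, assuming a `MultiError` that simply carries an array of errors; the real `misc.MultiError` may differ, and `BucketProps` and `assertValid` are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for misc.MultiError: one Error that carries many.
class MultiError extends Error {
  readonly errors: Error[];

  constructor(message: string, errors: Error[]) {
    super(`${message}: ${errors.map((e) => e.message).join("; ")}`);
    this.name = "MultiError";
    this.errors = errors;
  }
}

// Hypothetical Props, reduced to two string fields for illustration.
interface BucketProps {
  readonly name: string;
  readonly bucketClientRoleArn: string;
}

// Collect every problem instead of stopping at the first one.
function validateProps(props: BucketProps): Error[] {
  const errors: Error[] = [];
  if (props.name.length === 0) {
    errors.push(new Error("name must not be empty"));
  }
  if (props.bucketClientRoleArn.length === 0) {
    errors.push(new Error("bucketClientRoleArn must not be empty"));
  }
  return errors;
}

// Mirrors the constructor guard: throw once, with all errors attached.
function assertValid(props: BucketProps): void {
  const errors = validateProps(props);
  if (errors.length > 0) {
    throw new MultiError("Errors with configuration", errors);
  }
}
```

Returning `Error[]` rather than throwing inside `validateProps` keeps the validator testable on its own and lets a caller report all misconfigurations in one pass.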


Stacks extend cdk.Stack. The constructor signature in this codebase takes a prefix positional argument in addition to the CDK id and props:

constructor(scope: Construct, prefix: string, id: string, props: Props & cdk.StackProps) {
  super(scope, id, props);
  // ...
}

The second positional argument to cdk.Stack (the construct id, called id here) determines the CloudFormation stack name unless the stackName prop overrides it. Changing it deletes and recreates the stack, losing all its resources.

The RootDnsStack constructor call in apps/Root/r53-zones.ts demonstrates the preservation discipline:

// The literal string "RootConfiguration" is the deployed CFN stack name.
// It MUST NOT change — changing the id recreates the stack and deletes all
// hosted zones inside it. If a rename is required, use CloudFormation stack
// refactoring, not a CDK id change.
const stack = new RootDnsStack(app, "ROOT", "RootConfiguration", { ... });

The unit tests in stacks/root/root-dns-stack.test.ts lock this:

// V-IAC-003: CFN-name preservation
const stack = new RootDnsStack(app, "ROOT", "RootConfiguration", { ... });

Any change to the third argument is a test failure.


Stacks that publish outputs follow a typed-key convention defined in stacks/types.ts. Each exporting stack declares:

  1. An ExportKeys string-union type.
  2. An exportDefinition factory (see forms below).
  3. A publish() method that materializes cdk.CfnOutput resources.
  4. An importValues() function that reconstructs the values at deploy time via Fn.importValue.

Two coexisting forms produce the same consumer surface (importValues() and publish()):

Legacy form (used in stacks/root/root-dns-stack.ts): a flat ExportDefinition interface typed as Record<ExportKeys, StackIODefinition>, a standalone importValues() function, and a publish() method on the stack class:

// stacks/root/root-dns-stack.ts
export type ExportKeys =
  | "appZone" | "ioZone" | "authZone" | "assetsZone"
  | "ardamailsZone" | "allowCreateNsRecordRole";

const exportDefinition = {
  ardamailsZone: {
    exportName: "arda-ardamails-zone",
    description: "The Hosted Zone Id for the ardamails.com mail-root zone",
  },
  // ...
};

export function importValues(): ExportValues {
  return stackTypes.readImports(exportDefinition); // wraps Fn.importValue
}

Newer form (used in stacks/purpose/image-storage.ts, partition-authn.ts, purpose-storage.ts, and stacks/infrastructure/eks-stack.ts): a class that extends stackTypes.ExportDefinitions<ExportKeys> and a factory exportDefinition(publishingPrefix: string). The class encapsulates the prefix-aware key transformation and the publish / readImports machinery:

// stacks/purpose/image-storage.ts (abbreviated)
export type ExportKeys = "ImageAssetBucketArn_I" | "ImageAssetBucketArn_API";

class ImageStorageExports extends stackTypes.ExportDefinitions<ExportKeys> {
  constructor(publishingPrefix: string) {
    super(publishingPrefix, {
      ImageAssetBucketArn_I: { description: "Image bucket ARN (internal)" },
      ImageAssetBucketArn_API: { description: "Image bucket ARN (public)" },
    });
  }
}

export function exportDefinition(publishingPrefix: string): ImageStorageExports {
  return new ImageStorageExports(publishingPrefix);
}

The newer form is the correct choice for all new stacks. Both forms are supported; do not migrate working legacy stacks just for the sake of form uniformity.
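To make the typed-key idea concrete, here is a standalone sketch of a prefix-aware definitions class. `SketchExportDefinitions` and `ExportSpec` are hypothetical, and the key-to-name transformation is an assumption inferred from the export names quoted on this page; the real `stackTypes.ExportDefinitions` may work differently.

```typescript
// Hypothetical stand-in for the per-key definition type.
interface ExportSpec {
  readonly description: string;
}

// Hypothetical, simplified analogue of stackTypes.ExportDefinitions<K>.
class SketchExportDefinitions<K extends string> {
  private readonly publishingPrefix: string;
  private readonly specs: Record<K, ExportSpec>;

  constructor(publishingPrefix: string, specs: Record<K, ExportSpec>) {
    this.publishingPrefix = publishingPrefix;
    this.specs = specs;
  }

  // Assumed transformation: "<name>_API" -> "<prefix>-API-<Name>" and
  // "<name>_I" -> "<prefix>-I-<Name>", with the first letter upper-cased.
  exportName(key: K): string {
    const match = /^(.+)_(API|I)$/.exec(key);
    if (match === null) {
      throw new Error(`key ${key} carries no visibility suffix`);
    }
    const base = match[1] ?? "";
    const marker = match[2] ?? "";
    const name = base.charAt(0).toUpperCase() + base.slice(1);
    return `${this.publishingPrefix}-${marker}-${name}`;
  }

  keys(): K[] {
    return Object.keys(this.specs) as K[];
  }
}

type ImageKeys = "ImageAssetBucketArn_I" | "ImageAssetBucketArn_API";

// Omitting either key below is a compile-time error, because the second
// argument is typed as Record<ImageKeys, ExportSpec>.
const imageExports = new SketchExportDefinitions<ImageKeys>("Alpha001-demo", {
  ImageAssetBucketArn_I: { description: "Image bucket ARN (internal)" },
  ImageAssetBucketArn_API: { description: "Image bucket ARN (public)" },
});
```

The string-union key type is what gives callers autocompletion and exhaustiveness checking; the prefix only enters when a name is derived.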

Callers read the published values with importValues() — they never copy-paste the export-name strings. The ROOT_EXPORT_KEYS constant in instances/Root/dns.ts enumerates the legacy keys for navigation and test assertions:

// instances/Root/dns.ts
export const ROOT_EXPORT_KEYS = [
  "appZone", "ioZone", "authZone", "assetsZone",
  "ardamailsZone", "allowCreateNsRecordRole",
] as const;

Export names carry a visibility marker that encodes who may safely consume them:

  • ${publishingPrefix}-API-<Key>: public; readable by non-CDK consumers (raw CloudFormation templates, scripts, other AWS accounts). Key suffix conventionally ends …_API.
  • ${publishingPrefix}-I-<Key>: internal/protected; intended for CDK-to-CDK consumption only. Key suffix conventionally ends …_I.

publish() enforces the alignment via regex: a key ending …_API must map to an export name containing -API-, and a key ending …_I must map to one containing -I-. A mismatch throws at synthesis time. Without a marker the value is treated as fully private and should not be imported cross-stack.

Real examples drawn from the codebase:

Alpha001-demo-API-ImageAssetBucketArn # from image-storage.ts; safe for cross-account
Alpha001-demo-I-NlbTargetGroup80Arn # from purpose-ingress.ts; CDK-to-CDK only
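The alignment rule can be expressed as a small check. This is a sketch only: `checkVisibilityAlignment` is a hypothetical name, and the regular expressions are assumptions derived from the rule as described, not copied from `publish()`.

```typescript
// Hypothetical re-creation of the key / export-name alignment rule.
function checkVisibilityAlignment(key: string, exportName: string): void {
  if (/_API$/.test(key) && !/-API-/.test(exportName)) {
    throw new Error(`public key ${key} needs an export name containing -API-`);
  }
  if (/_I$/.test(key) && !/-I-/.test(exportName)) {
    throw new Error(`internal key ${key} needs an export name containing -I-`);
  }
}

// Aligned pair: passes silently.
checkVisibilityAlignment(
  "ImageAssetBucketArn_API",
  "Alpha001-demo-API-ImageAssetBucketArn",
);
```

Failing at synthesis time rather than at deploy time means a mislabelled export never reaches CloudFormation.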

Stacks that depend on another stack’s Built properties must be instantiated after them. Use addDependency() to enforce CloudFormation ordering when the dependency is not captured in a Built reference:

const myStack = new MyStack(app, `${prefix}-MyStack`, { ... });
myStack.addDependency(importedStack);
myStack.publish(); // call publish() explicitly after construction

publish() is never called inside the stack constructor — it is called from the App wiring after the entire stack graph is assembled. This discipline applies uniformly across all Apps: apps/Al1x/partition.ts for partition stacks, apps/Corporate/index.ts for the Corporate App, and any future App wired the same way. The App is the only place that calls publish().

publish() wraps every exported value with an internal marker (CFN_IO_MARKER) and a separator (CFN_IO_SEPARATOR) so that empty values and list values round-trip cleanly through CloudFormation’s string-only export channel. readImports() strips the markers on the way back. If you see strings like ##|::|<value> in a cdk synth output or a CloudFormation console, those are the markers — the framework hides them from consumers; no author-level handling is required.
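The round-trip can be illustrated on plain strings. This is a sketch under stated assumptions: the page shows `##|::|<value>` but does not say where the marker ends and the separator begins, so the split below is a guess, and `wrapForExport` / `unwrapImport` are hypothetical names. In a real stack the imported values are deploy-time tokens, so the framework cannot use ordinary string methods the way this sketch does.

```typescript
// Assumed values; the actual constants in the framework may differ.
const CFN_IO_MARKER = "##";
const CFN_IO_SEPARATOR = "|::|";

// Every element is prefixed with the separator, so an empty list still
// produces a non-empty export value (the bare marker).
function wrapForExport(values: string[]): string {
  return CFN_IO_MARKER + values.map((v) => CFN_IO_SEPARATOR + v).join("");
}

function unwrapImport(raw: string): string[] {
  if (raw.indexOf(CFN_IO_MARKER) !== 0) {
    throw new Error("value was not published through the typed API");
  }
  const body = raw.slice(CFN_IO_MARKER.length);
  // The first split piece is always "", because each element is preceded
  // by a separator; drop it to recover the original list.
  return body.split(CFN_IO_SEPARATOR).slice(1);
}

console.log(wrapForExport(["Z059300336Y7ZG0WVQOF6"])); // ##|::|Z059300336Y7ZG0WVQOF6
```

A value that itself contains the separator would not survive this encoding; the sketch does not attempt to escape it.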

The Corporate App (landed in Phase 3 of the email-integration project) demonstrates the newer form end-to-end:

// stacks/corporate/corporate-mail-dns.ts (post-refactor)
export type ExportKeys = "mailZoneId_I";

class CorporateMailDnsExports extends stackTypes.ExportDefinitions<ExportKeys> {
  constructor(publishingPrefix: string) {
    super(publishingPrefix, {
      mailZoneId_I: { description: "Route53 zone ID for arda.ardamails.com (internal)" },
    });
  }
}

export function exportDefinition(publishingPrefix: string): CorporateMailDnsExports {
  return new CorporateMailDnsExports(publishingPrefix);
}

At runtime publish() emits two CloudFormation Outputs per key, both visible in the cdk deploy Outputs list:

| Export name | Value | Role |
| --- | --- | --- |
| Corporate-I-MailZoneId | the clean zone ID (e.g. Z059300336Y7ZG0WVQOF6) | CDK-to-CDK consumable; what readImports() resolves |
| Corporate-MailZoneId | the same zone ID, wrapped in the CFN_IO_MARKER (e.g. ##\|::\|Z059300336Y7ZG0WVQOF6) | guarded witness: visible to humans inspecting the stack, but the marker-wrapped value is intentionally unusable as a raw Fn::ImportValue consumer; this is how the framework discourages bypassing the typed API |

A consumer that calls Fn::ImportValue on "Corporate-MailZoneId" directly receives the marker-wrapped string and will fail downstream, which is by design. The -I- form is the only intended import path; readImports() strips the marker on its way back when consuming via the typed API.

The App wires it:

// apps/Corporate/index.ts
const mailDns = new CorporateMailDnsStack(app, "Corporate", "CorporateMailDns", { ... });
mailDns.addDependency(rootImportStack);
mailDns.publish(); // called here, never inside the constructor

Consumers call CorporateMailDns.importValues() to get the typed value; nobody hardcodes the string Corporate-I-MailZoneId.

To consume a stack’s exports from a different CDK application, use ImportingStack (apps/Al1x/util.ts). It reconstructs exported infrastructure resources from CloudFormation via Fn.importValue. Adding new exports to one app requires a matching import entry in ImportingStack.


Custom resources allow CDK to invoke arbitrary logic during CloudFormation lifecycle events (Create, Update, Delete). This codebase uses the aws-cdk-lib/custom-resources Provider framework.

The WriteNSRecordsToUpstreamDns construct (constructs/xgress/write-ns-records-to-upstream-dns.ts) is the canonical example: it creates an NS delegation record in a Route53 zone in a different AWS account by assuming a cross-account IAM role from a Lambda.

The three moving parts are always:

import * as path from "path";
import * as cdk from "aws-cdk-lib";
import * as custom_resources from "aws-cdk-lib/custom-resources";
import * as lambda_nodejs from "aws-cdk-lib/aws-lambda-nodejs";

// 1. The Lambda function: handles the Create/Update/Delete events
const onEventHandler = new lambda_nodejs.NodejsFunction(this, "Handler", {
  entry: path.join(__dirname, "../inline-lambdas/my-handler.ts"),
  handler: "handler",
  role: lambdaExecutionRole,
  timeout: cdk.Duration.minutes(3),
  // Use logGroup (not logRetention); see pitfalls section
});

// 2. The Provider: connects the Lambda to the Custom Resource framework
const provider = new custom_resources.Provider(this, "Provider", {
  onEventHandler,
});

// 3. The Custom Resource: the actual CloudFormation resource
new cdk.CustomResource(this, "Resource", {
  serviceToken: provider.serviceToken,
  resourceType: "Custom::MyResourceType", // must start with "Custom::"
  properties: {
    // values passed to the Lambda's event.ResourceProperties
    targetAccountId: props.targetAccountId,
    parentZoneName: props.hostingZoneName,
  },
});

The AllowCreatingNSRecordsRole IAM construct (constructs/oam/allow-creating-ns-records-role.ts) is the trust-policy counterpart — it defines the IAM role in the target account that the Custom Resource Lambda assumes.

Every use of custom_resources.Provider produces more Lambdas than it appears to at first glance. Understanding the fan-out is important when reviewing PRs that add new Custom Resources, because Lambda count affects cold-start latency budgets, concurrency limits, and per-function CloudWatch log group proliferation.

Per-instance fan-out: two Lambdas per Provider


Each new custom_resources.Provider(this, ...) call creates two Lambda functions:

  1. The user-supplied handler — the function you pass as onEventHandler. It executes your Create / Update / Delete logic.
  2. framework-onEvent — a Lambda injected by the CDK Provider framework itself. It wraps your handler, manages the async polling loop, and posts the CloudFormation signal back to the pre-signed S3 URL.

These two Lambdas are not deduplicated across Provider instances. Each Provider instance gets its own pair. So N constructs that each instantiate one Provider → 2N Lambdas from that source alone.

Stack-singleton: one LogRetention Lambda per stack


Any CDK construct that uses the logRetention: property causes CDK to synthesize a stack-level singleton Custom Resource of type Custom::LogRetention. Regardless of how many individual constructs in the same stack use logRetention:, CDK emits exactly one:

  • One LogRetentionFunction Lambda
  • One IAM execution role
  • One set of IAM policies

This singleton appears once in the CloudFormation template under the form LogRetention<hash>, where <hash> is CDK’s well-known LogRetention construct UUID with dashes stripped — for example, the CorporateMailDns stack emits LogRetentionaae0aa3c5b4d4f87b02d85b201efdd8a (the literal logical ID observed in the Phase 3 deploy log). It is a per-stack singleton, not a per-account or per-region singleton — two stacks that each have a logRetention: usage will each carry their own LogRetention Lambda, and each will share the same <hash> (it is the UUID of the construct class, not of a particular stack instance).
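The logical ID follows mechanically from the UUID. A minimal sketch, where the dashed UUID is reconstructed from the logical ID quoted above and `logRetentionLogicalId` is a hypothetical helper name:

```typescript
// Dashed form reconstructed from the logical ID observed in the deploy log.
const LOG_RETENTION_UUID = "aae0aa3c-5b4d-4f87-b02d-85b201efdd8a";

// "LogRetention" followed by the UUID with its dashes removed.
function logRetentionLogicalId(uuid: string): string {
  return "LogRetention" + uuid.replace(/-/g, "");
}

console.log(logRetentionLogicalId(LOG_RETENTION_UUID));
// LogRetentionaae0aa3c5b4d4f87b02d85b201efdd8a
```

Because the input is a class-level constant, searching a synthesized template for this one logical ID is a quick way to tell whether a stack still carries the singleton.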

The logRetention: property is deprecated (see § 9). Newer constructs in this codebase pass an explicit logGroup: instead, which avoids the hidden Custom Resource entirely. Several existing constructs — including WriteNSRecordsToUpstreamDns — still use the deprecated form, which is the source of the @deprecated warnings emitted during cdk synth.

Worked example: NS-delegation Lambda counts


WriteNSRecordsToUpstreamDns (constructs/xgress/write-ns-records-to-upstream-dns.ts) is used in two different stack contexts with markedly different instance counts:

Corporate stack (CorporateMailDns): one instance of WriteNSRecordsToUpstreamDns — the single NS delegation from the root ardamails.com zone to arda.ardamails.com.

| Source | Count |
| --- | --- |
| WriteNSRecordsToUpstreamDns handler + framework-onEvent | 2 |
| Stack-singleton LogRetention (from logRetention: in WriteNSRecordsToUpstreamDns) | 1 |
| Total | 3 |

Partition Ingress stack (per partition): four instances of WriteNSRecordsToUpstreamDns (one per zone family: io, app, auth, assets) plus four instances of a Route53 cleanup Custom Resource (used for clean stack delete):

| Source | Count |
| --- | --- |
| 4 × WriteNSRecordsToUpstreamDns handler + framework-onEvent | 8 |
| 4 × Route53 cleanup CR handler + framework-onEvent | 8 |
| Stack-singleton LogRetention | 1 |
| Total | 17 |
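The arithmetic behind both tables reduces to one formula: two Lambdas per Provider instance, plus one if the stack uses logRetention: anywhere. A small sketch, with `CustomResourceUsage` and `lambdaCount` as hypothetical names:

```typescript
interface CustomResourceUsage {
  readonly providerInstances: number; // Provider constructs in the stack
  readonly usesLogRetention: boolean; // any logRetention: usage in the stack
}

function lambdaCount(usage: CustomResourceUsage): number {
  const perProvider = 2; // user-supplied handler + framework-onEvent
  const singleton = usage.usesLogRetention ? 1 : 0; // LogRetention Lambda
  return usage.providerInstances * perProvider + singleton;
}

// Corporate stack: one NS-delegation Provider.
const corporate = lambdaCount({ providerInstances: 1, usesLogRetention: true });
// Partition Ingress stack: four NS-delegation + four cleanup Providers.
const partitionIngress = lambdaCount({ providerInstances: 8, usesLogRetention: true });

console.log(corporate, partitionIngress); // 3 17
```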

Reviewer checklist when a PR adds a Custom::* resource


Before approving a PR that introduces a new Custom Resource, ask:

  • How many CR instances does the construct create? Multiply by 2 to get the Lambda contribution from that construct.
  • Does the stack already have a logRetention: usage elsewhere? If not, the first usage in the stack adds +1 for the singleton. If yes, the singleton already exists and this is no additional cost.
  • Can a native CloudFormation resource replace the CR? Native CFN-supported resources contribute zero Lambdas and zero provider framework overhead.

Prefer native CloudFormation resources; they are zero-Lambda and have no provider framework overhead. Reach for a Custom Resource only when no native resource suffices:

  • Cross-account writes that cannot be expressed as IAM-policy-controlled resource operations (e.g., writing an NS record into a Route53 zone owned by a different account).
  • Post-deploy API calls that must be idempotent across Create / Update / Delete lifecycle events (e.g., registering a Postmark sending domain or triggering a DKIM verification).
  • Deploy-time data fetching that requires AWS API calls whose results must be available to downstream resources in the same stack (e.g., resolving an ACM certificate ARN by domain name when cdk.context.json lookup is not appropriate).

If the use-case fits any of these categories, document in the construct’s class-level JSDoc comment which category applies and why a native resource was not sufficient.

For broader context on where Custom Resources fit within the layered IaC architecture, see the IaC Functional Design.


cdk.RemovalPolicy.RETAIN is applied to any resource that must not be deleted when the CloudFormation stack is destroyed or the CDK id of the resource changes.

The ardamails.com hosted zone is the live example:

const ardamailsZone = new r53.PublicHostedZone(this, "ArdamailsZone", {
  zoneName: rootDns.ROOT_ZONE_NAMES.ardamails,
  comment: "HostedZone created by Route53 Registrar",
});
ardamailsZone.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN);

Apply RETAIN whenever:

  • The resource was adopted via cdk import (it pre-existed; CDK does not own its lifecycle).
  • The resource is a production DNS zone — accidental deletion causes an outage that is hard to reverse.
  • The resource holds data that would be lost on re-creation (e.g., a Secrets Manager secret).

Decision log precedent: DQ-R1-008.
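One way to guard the rule is to scan a synthesized template for protected resource types that lack a Retain policy. The sketch below works on plain template JSON; `findUnretained` is a hypothetical helper and is not part of this codebase.

```typescript
// Minimal shape of the parts of a CloudFormation template this check reads.
interface TemplateResource {
  Type: string;
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
}

interface SynthesizedTemplate {
  Resources: Record<string, TemplateResource>;
}

// Returns the logical IDs of resources of the given types that are not
// retained on both stack deletion and resource replacement.
function findUnretained(template: SynthesizedTemplate, protectedTypes: string[]): string[] {
  const offenders: string[] = [];
  for (const logicalId of Object.keys(template.Resources)) {
    const resource = template.Resources[logicalId];
    if (resource === undefined || protectedTypes.indexOf(resource.Type) === -1) {
      continue;
    }
    if (resource.DeletionPolicy !== "Retain" || resource.UpdateReplacePolicy !== "Retain") {
      offenders.push(logicalId);
    }
  }
  return offenders;
}
```

Calling it with `["AWS::Route53::HostedZone"]` against a stack's template JSON would flag any hosted zone that was left on the default policy.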


6. cdk import — adopting existing AWS resources


cdk import brings an existing AWS resource under CDK management without deleting and recreating it. The choreography used in this codebase (established in Phase 2 for the ardamails.com zone) is a two-phase deploy:

Phase A — IMPORT change-set only:

  1. Match the CDK resource properties to the live AWS resource exactly. For a Route53 zone auto-created by Route53 Domains, this means:
    const ardamailsZone = new r53.PublicHostedZone(this, "ArdamailsZone", {
      zoneName: "ardamails.com",
      comment: "HostedZone created by Route53 Registrar", // AWS default; must match
    });
    ardamailsZone.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN);
  2. Generate a stripped template — the deployed state plus the new resource only, with no new Outputs and no other modifications. CloudFormation forbids Outputs changes in an IMPORT change-set.
  3. Execute the CFN IMPORT change-set. Verify cdk diff shows zero differences.

Phase B — Normal deploy:

  1. Run a normal cdk deploy to add Outputs, reconcile CDKMetadata, and restore any Outputs section.

Lock the imported resource’s exact property block with a unit test using Template.hasResource(...) before pushing (see § 8).

Full rationale: DQ-R1-008 in the decision log.
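The "stripped template" of Phase A can be thought of as the deployed template plus exactly one resource taken from the newly synthesized one. The following sketch shows that merge on plain template JSON; `strippedImportTemplate` and the type names are hypothetical, and the codebase may produce the stripped template by other means.

```typescript
interface ImportResource {
  Type: string;
  Properties?: Record<string, unknown>;
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
}

interface ImportTemplate {
  Resources: Record<string, ImportResource>;
  Outputs?: Record<string, unknown>;
}

// Deployed state plus the one resource being imported. Outputs are taken
// from the deployed template untouched, since an IMPORT change-set may not
// modify them.
function strippedImportTemplate(
  deployed: ImportTemplate,
  synthesized: ImportTemplate,
  logicalId: string,
): ImportTemplate {
  const added = synthesized.Resources[logicalId];
  if (added === undefined) {
    throw new Error(`${logicalId} is not in the synthesized template`);
  }
  if (deployed.Resources[logicalId] !== undefined) {
    throw new Error(`${logicalId} is already deployed; nothing to import`);
  }
  return {
    ...deployed,
    Resources: { ...deployed.Resources, [logicalId]: added },
  };
}
```

Everything else that differs between the two templates (new Outputs, metadata changes) is deliberately left for the Phase B deploy.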


Tests use aws-cdk-lib/assertions. The standard pattern is:

import * as cdk from "aws-cdk-lib";
import { Match, Template } from "aws-cdk-lib/assertions";

function buildStack(): { stack: RootDnsStack; template: Template } {
  const app = new cdk.App();
  const stack = new RootDnsStack(app, "ROOT", "RootConfiguration", {
    env: { account: platforms.ROOT.id, region: platforms.ROOT.region },
  });
  stack.publish();
  return { stack, template: Template.fromStack(stack) };
}

const { template } = buildStack();

// Assert a resource exists with specific properties
template.hasResourceProperties("AWS::Route53::HostedZone", {
  Name: "ardamails.com.",
});

// Assert a resource exists with full block (including DeletionPolicy)
template.hasResource("AWS::Route53::HostedZone", {
  Properties: {
    Name: "ardamails.com.",
    HostedZoneConfig: { Comment: "HostedZone created by Route53 Registrar" },
  },
  DeletionPolicy: "Retain",
  UpdateReplacePolicy: "Retain",
});

// Assert exact count
template.resourceCountIs("AWS::Route53::HostedZone", 5);

// Assert an Output exists
const outputs = template.findOutputs("*", {
  Export: { Name: "arda-ardamails-zone" },
});
expect(Object.keys(outputs)).toHaveLength(1);

// Partial match (Match.objectLike is the default; use Match.exact for strict)
template.hasResourceProperties("AWS::IAM::Role", {
  AssumeRolePolicyDocument: Match.objectLike({
    Statement: Match.arrayWith([
      Match.objectLike({ Effect: "Allow", Principal: { Service: "lambda.amazonaws.com" } }),
    ]),
  }),
});

Jest config note: Jest config lives in jest.config.js (not .ts) because moduleResolution: NodeNext in tsconfig.json causes .ts jest configs to fail. Path aliases (arda/* → src/main/cdk/*) are declared in moduleNameMapper. eslint-plugin-jest is not installed; do not reference jest/* rules in ESLint config.


Synthesis produces CloudFormation templates without touching AWS. Deploy synthesizes and then submits the change-set. Diff compares the synthesized template to what is currently deployed.

# Synthesize (no AWS credentials required)
npm run synth:plroot
# or for partition apps:
npm run synth:named
# Deploy (requires AWS SSO login first)
aws sso login --profile Admin-Alpha1 # Root account
npm run deploy:plroot
# Diff against live stack
npx cdk diff --app '...' --profile Admin-Alpha1

Profile flag: always use --profile <name>, not the AWS_PROFILE= env var prefix (workspace memory rule). Admin-Alpha1 targets the Root/Alpha001 account (demo/prod); Alpha002-Admin targets Alpha002 (dev/stage).

The Root app is not in the CI matrix (tools/cdk-runner.js is data-driven over partition apps). The convenience scripts synth:plroot / deploy:plroot are operator-only.

CDK caches context lookups (DNS zone IDs, SSM parameter values, etc.) in cdk.context.json. This file is committed to source control in this repository so that cdk synth is deterministic on a fresh checkout without requiring live AWS credentials (e.g., in CI).

What belongs in cdk.context.json: public values only — hosted zone IDs, AMI IDs, cached lookup results. Never commit secrets or account-scoped tokens.

Phase 3 adds Postmark server context keys (public values: server IDs, DKIM selectors) produced by the corporate-cli.ts Phase A run. These keys are committed following the same policy.

Decision log: DQ-R1-014.


logRetention is deprecated — use logGroup


aws-cdk-lib.aws_lambda.FunctionOptions#logRetention is deprecated. It creates a hidden custom resource that manages log group retention, which conflicts when you also define a LogGroup construct explicitly. Use the logGroup property instead:

// Deprecated: avoid
new lambda.Function(this, "Fn", {
  logRetention: logs.RetentionDays.ONE_MONTH, // ⚠ deprecated
  // ...
});

// Correct pattern (used in constructs/compute/lambda-function.ts)
const logGroup = new logs.LogGroup(this, "FnLog", {
  logGroupName: `/aws/lambda/${fqn}`,
  retention: logs.RetentionDays.TWO_WEEKS,
  removalPolicy: cdk.RemovalPolicy.DESTROY,
});
new lambda.Function(this, "Fn", {
  logGroup,
  // ...
});

The same applies to NodejsFunction and lambda.Function — pass logGroup, not logRetention.

app.node.tryGetContext(key) returns an untyped value (any in the CDK typings), so treat the result as unknown. Do not pass the CDK App object into a construct just to call tryGetContext inside it; constructs must not know they live in a CDK App. Resolve the context value at the stack layer, validate and coerce it, then pass the typed result to the construct:

// In the Stack constructor (correct)
const rawServerId = this.node.tryGetContext(postmarkServer.contextKey("FreeKanbanTool"));
const server = new postmarkServer.PostmarkServer({ serverId: rawServerId, ... });
// PostmarkServer validates and coerces rawServerId to number internally

// Anti-pattern: construct takes the App to call tryGetContext itself
// → constructs/xgress/bad-example.ts (hypothetical)
constructor(app: cdk.App, ...) {
  const id = app.node.tryGetContext("my-key"); // ❌ construct knows about App
}

This separation is formalized in DQ-R1-013 (failure ordering) and the F-013 quality-review finding. The PostmarkServer thin-wrapper in platform/constructs/postmark/server.ts accepts serverId: unknown exactly so that call-sites can pass this.node.tryGetContext(...) without the construct needing to know how the value was obtained.
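The coercion step itself is ordinary TypeScript narrowing. Below is a minimal sketch of what accepting `serverId: unknown` and returning a number could look like; `coerceServerId` and its acceptance rules (positive integers, or strings that parse to one) are assumptions for illustration, not the actual PostmarkServer logic.

```typescript
// Hypothetical coercion in the spirit of PostmarkServer's serverId handling.
function coerceServerId(raw: unknown): number {
  // Context values passed with `-c key=value` on the command line arrive as
  // strings, so numeric strings are converted first.
  const value = typeof raw === "string" && raw.trim() !== "" ? Number(raw) : raw;
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    throw new Error(`serverId must be a positive integer, got ${JSON.stringify(raw)}`);
  }
  return value;
}

console.log(coerceServerId("12345")); // 12345
```

Keeping the parameter typed as `unknown` forces the validation to happen exactly once, at the boundary, and everything downstream works with a plain `number`.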

Changing the CFN stack id deletes and recreates


CDK uses the second positional argument to new cdk.Stack(...) as the CloudFormation stack name. Any rename causes CloudFormation to delete the current stack (and all its resources) and create a new one. If a rename is unavoidable, use a CloudFormation stack rename/refactor operation rather than editing the CDK id. The workspace memory rule feedback_cf_stack_names.md is the authoritative statement.

Never use cd <path> in scripts or tooling commands that touch this codebase. Use git -C <path> and make -C <path> to keep the shell working directory stable and avoid agent permission prompts. This applies to any Makefile helper, tools/*.ts operator scripts, and agent bash calls alike.


  • IaC Functional Design — the layered script → instances → apps → stacks → constructs architecture, layer responsibilities, the dependency-direction rule, and the anti-patterns it rules out.
  • Infrastructure Architecture Patterns — index of infrastructure architecture pages in this documentation site.
  • infrastructure/knowledge-base/cdk-construct-patterns.md — repo-local mechanics for the Configuration / Props / Built pattern, ExportKeys, publish(), naming tables, partition composition, and domain conventions.
  • Decision log entries in email-integration/decision-log.md: DQ-R1-008 (cdk import choreography), DQ-R1-013 (context value vs. source), DQ-R1-014 (cdk.context.json commit policy).
  • cdk-infrastructure skill — agent-loaded skill that pulls these conventions into a working session when CDK code is being authored or reviewed.