Phase 2 -- Implementation Learnings
Substantive insights from Phase 2 implementation that future phases (and any future hosted-zone work) should benefit from. Each learning ties back to a concrete moment in the implementation and an artifact that captures it.
L-1: A live AWS Route53 zone can pre-date its CDK declaration
When cdk diff was run against the deployed RootConfiguration stack for Phase 2 validation (Gate 3 of T-V tasks), it reported the expected additive-only result: a new AWS::Route53::HostedZone resource for ardamails.com. and a new Outputs.ardaArdamailsZone. The diff is CFN-template-relative, not Route53-state-relative. Because the deployed stack didn’t contain an ArdamailsZone resource, CFN had no way to surface that the zone already existed in Route53.
The pre-existing zone (Z0721066239FWCD47EJDX, two records: apex NS + SOA, four AWS-default nameservers) had been auto-created by Route53 Domains when Miguel originally registered ardamails.com through the AWS registrar service. The deploy-as-coded path would have created a second hosted zone for the same DNS name with a different NS set; the registrar would still point at the original four nameservers; the new CFN-managed zone would have been orphaned at the DNS layer and would not have served any queries.
The discovery came from a single offhand challenge at a STOP gate: “I suspect there is a disconnect. Use AWS CLI to explore the R53 zones available in the root account.” Without that question, the deploy would have silently shipped a duplicate zone.
Take-away: before deploying any CDK construct for a “new” hosted zone whose name is human-meaningful (especially one tied to a domain registered through Route53 Domains, an external registrar with delegation to AWS, or any prior manual setup), run aws route53 list-hosted-zones-by-name --dns-name <name> to confirm Route53 doesn’t already hold one. The listing starts at the requested name and can continue with later zones, so compare the returned Name values for an exact match. The check costs a single command and prevents a broken delegation chain. Recorded as DQ-R1-008 and applied to all future zone-creation work in this project.
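The pre-deploy check can be scripted against the JSON that the CLI prints. The following is a minimal sketch, not project code: the function names and the second zone in the sample listing are hypothetical, and only the HostedZones, Id and Name fields of the response are assumed.

```typescript
// Fields read from the JSON printed by `aws route53 list-hosted-zones-by-name`.
interface HostedZoneSummary {
  Id: string; // e.g. "/hostedzone/Z0721066239FWCD47EJDX"
  Name: string; // fully qualified, with a trailing dot
}

interface ListHostedZonesByNameOutput {
  HostedZones: HostedZoneSummary[];
}

// Normalise to the lower-case, trailing-dot form that Route53 returns.
function toFqdn(dnsName: string): string {
  const lower = dnsName.trim().toLowerCase();
  return lower.slice(-1) === "." ? lower : lower + ".";
}

// Zones whose name equals dnsName exactly; an empty result means "absent".
function findExistingZones(
  output: ListHostedZonesByNameOutput,
  dnsName: string,
): HostedZoneSummary[] {
  const wanted = toFqdn(dnsName);
  return output.HostedZones.filter((zone) => toFqdn(zone.Name) === wanted);
}

// Illustrative response: the second zone is hypothetical.
const listing: ListHostedZonesByNameOutput = {
  HostedZones: [
    { Id: "/hostedzone/Z0721066239FWCD47EJDX", Name: "ardamails.com." },
    { Id: "/hostedzone/ZHYPOTHETICAL000", Name: "example.org." },
  ],
};

const existingZones = findExistingZones(listing, "ardamails.com");
// Only an empty result clears the "expected absent before" gate.
const safeToCreate = existingZones.length === 0;
```

With the sample listing, existingZones holds the registrar-created zone and safeToCreate is false, which is the signal to stop and choose between importing and creating.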
L-2: CFN IMPORT change-sets reject Output additions and any unrelated resource modifications
Once DQ-R1-008 Option A (cdk import) was chosen, the first attempt to build an IMPORT change-set used the full synthesized template (cdk synth) directly. CFN rejected it twice in succession, each time with a different error:
- Attempt 1 (full synth template): "As part of the import operation, you cannot modify or add the following: [Outputs]"
  Cause: the synthesized template added Outputs.ardaArdamailsZone alongside the ArdamailsZone resource. IMPORT change-sets only accept resource imports.
- Attempt 2 (full synth template minus the Output): "You have modified resources [CDKMetadata] in your template that are not being imported."
  Cause: even after stripping the Output, the synth template’s CDKMetadata content differed from the deployed stack’s CDKMetadata (CDK’s metadata block carries asset hashes and stack composition fingerprints; both had drifted in the new branch).
The working pattern (Attempt 3) builds the import template from the deployed-state template, adding only the new ArdamailsZone resource:
    # --query TemplateBody unwraps the template from the get-template response envelope
    aws cloudformation get-template --stack-name RootConfiguration --template-stage Original \
      --query TemplateBody > scratch/deployed.json

    jq '.Resources.ArdamailsZone1DCDDC15 = $newRes' \
      --argjson newRes "$(jq '.Resources.ArdamailsZone1DCDDC15' scratch/synth.json)" \
      scratch/deployed.json > scratch/importv2.json

    aws cloudformation create-change-set \
      --stack-name RootConfiguration --change-set-name import-ardamails-v2 \
      --change-set-type IMPORT \
      --resources-to-import "ResourceType=AWS::Route53::HostedZone,LogicalResourceId=ArdamailsZone1DCDDC15,ResourceIdentifier={Id=Z0721066239FWCD47EJDX}" \
      --template-body file://scratch/importv2.json

CFN reported: Action: Import, Replacement: null, Scope: [], which is a true read-only import.
The full deploy (cdk deploy) was then a separate, normal operation, on its own template. CFN handled the Output addition and CDKMetadata reconciliation in UPDATE mode where modifications are allowed.
Take-away: CFN IMPORT change-sets are strictly about adopting existing AWS resources into a stack with zero other template changes. The tactical pattern is a two-phase deploy: (1) IMPORT change-set with a hand-built deployed-state + new-resource-only template, (2) normal cdk deploy that converges to the synth template. The two-phase choreography is documented in DQ-R1-008 and was rehearsed once during Phase 2 implementation.
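The first phase of that choreography can be sketched in TypeScript as a pure transformation over the two templates. This is an illustrative sketch under stated assumptions rather than the tooling used in Phase 2: the function names are hypothetical, the sample templates are abbreviated stand-ins, and the comparison uses serialized JSON, so it is sensitive to key order.

```typescript
// Minimal template shape: only Resources is inspected by key.
interface CfnTemplate {
  Resources: Record<string, unknown>;
  [section: string]: unknown;
}

// Deployed template plus exactly one resource copied from the synth template.
function buildImportTemplate(
  deployed: CfnTemplate,
  synth: CfnTemplate,
  logicalId: string,
): CfnTemplate {
  if (logicalId in deployed.Resources) {
    throw new Error(`${logicalId} is already in the deployed template`);
  }
  const resource = synth.Resources[logicalId];
  if (resource === undefined) {
    throw new Error(`${logicalId} is missing from the synth template`);
  }
  return { ...deployed, Resources: { ...deployed.Resources, [logicalId]: resource } };
}

// Everything that differs from the deployed template apart from logicalId.
function unrelatedChanges(
  deployed: CfnTemplate,
  candidate: CfnTemplate,
  logicalId: string,
): string[] {
  const same = (a: unknown, b: unknown): boolean =>
    JSON.stringify(a) === JSON.stringify(b);
  const union = (a: string[], b: string[]): string[] =>
    a.concat(b.filter((key) => a.indexOf(key) === -1));

  const sections = union(Object.keys(deployed), Object.keys(candidate))
    .filter((section) => section !== "Resources");
  const ids = union(Object.keys(deployed.Resources), Object.keys(candidate.Resources))
    .filter((id) => id !== logicalId);

  return sections
    .filter((section) => !same(deployed[section], candidate[section]))
    .concat(
      ids
        .filter((id) => !same(deployed.Resources[id], candidate.Resources[id]))
        .map((id) => `Resources.${id}`),
    );
}

// Abbreviated, illustrative templates (the Analytics values are placeholders).
const importTargetId = "ArdamailsZone1DCDDC15";
const deployedTemplate: CfnTemplate = {
  Resources: {
    CDKMetadata: { Type: "AWS::CDK::Metadata", Properties: { Analytics: "deployed" } },
  },
};
const synthTemplate: CfnTemplate = {
  Resources: {
    CDKMetadata: { Type: "AWS::CDK::Metadata", Properties: { Analytics: "branch" } },
    ArdamailsZone1DCDDC15: {
      Type: "AWS::Route53::HostedZone",
      Properties: { Name: "ardamails.com." },
    },
  },
  Outputs: { ardaArdamailsZone: { Value: { Ref: "ArdamailsZone1DCDDC15" } } },
};

const fullSynthProblems = unrelatedChanges(deployedTemplate, synthTemplate, importTargetId);
const importTemplate = buildImportTemplate(deployedTemplate, synthTemplate, importTargetId);
const importProblems = unrelatedChanges(deployedTemplate, importTemplate, importTargetId);
```

For these samples, fullSynthProblems lists Outputs and Resources.CDKMetadata, the same two areas CFN rejected in Attempts 1 and 2, while importProblems is empty for the hand-built template.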
L-3: Match the imported zone’s properties exactly, then lock them with a strict-equality test
The first IMPORT attempt also surfaced a requirement specific to a read-only import of AWS::Route53::HostedZone: the HostedZoneConfig.Comment field on the import target must match the live zone’s comment. If the synth template declares no comment but the live zone’s comment is "HostedZone created by Route53 Registrar" (the AWS-default comment for registrar-created zones), CFN reports Scope: [Properties] rather than Scope: [], meaning the import would write the property change, which conflicts with the read-only intent.
The fix:
- Add comment: "HostedZone created by Route53 Registrar" to the CDK code so the synthesized resource’s HostedZoneConfig.Comment matches the live zone byte-for-byte.
- Add applyRemovalPolicy(cdk.RemovalPolicy.RETAIN) so a future cdk destroy of RootConfiguration cannot delete the production zone. RemovalPolicy.RETAIN translates to DeletionPolicy: "Retain" and UpdateReplacePolicy: "Retain" in the CFN template.
- Lock both with a Jest Template.hasResource('AWS::Route53::HostedZone', { Properties: { Name, HostedZoneConfig: { Comment } }, DeletionPolicy: "Retain", UpdateReplacePolicy: "Retain" }) strict-equality assertion in root-dns-stack.test.ts. Future drift in the CDK code that would change the property block fails at unit-test time, before any CFN operation is attempted. Because hasResource matches object-like (partially) by default, the expected block needs Match.objectEquals for the assertion to also reject added properties.
Take-away: when adopting a live AWS resource into CFN, mirroring the live properties exactly is part of the contract. CDK code that is “agnostic about properties because they’re optional” is unsafe in this context. Bake the live values into the code; lock them with a unit test that an unaware editor would have to consciously break.
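The same contract can be checked before any change-set is created by comparing the live zone with the synthesized resource. The sketch below is illustrative only: the function and interface names are hypothetical, the live-zone shape is the HostedZone member printed by aws route53 get-hosted-zone, and an absent comment is assumed to be equivalent to an empty string.

```typescript
// Fields read from the HostedZone member of `aws route53 get-hosted-zone` output.
interface LiveZone {
  Name: string;
  Config?: { Comment?: string };
}

// Fields read from the synthesized AWS::Route53::HostedZone resource.
interface SynthZoneResource {
  Properties: { Name: string; HostedZoneConfig?: { Comment?: string } };
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
}

// Returns one message per difference; an empty list means the import can be read-only.
function importMismatches(live: LiveZone, synth: SynthZoneResource): string[] {
  const problems: string[] = [];
  if (synth.Properties.Name !== live.Name) {
    problems.push(`Name: live "${live.Name}" vs synth "${synth.Properties.Name}"`);
  }
  // Assumption: an absent comment is treated as the empty string on both sides.
  const liveComment = live.Config?.Comment ?? "";
  const synthComment = synth.Properties.HostedZoneConfig?.Comment ?? "";
  if (synthComment !== liveComment) {
    problems.push(`Comment: live "${liveComment}" vs synth "${synthComment}"`);
  }
  if (synth.DeletionPolicy !== "Retain") {
    problems.push("DeletionPolicy is not Retain");
  }
  if (synth.UpdateReplacePolicy !== "Retain") {
    problems.push("UpdateReplacePolicy is not Retain");
  }
  return problems;
}

const liveZone: LiveZone = {
  Name: "ardamails.com.",
  Config: { Comment: "HostedZone created by Route53 Registrar" },
};

// Before the fix: no comment and no retention policies declared.
const beforeFix = importMismatches(liveZone, { Properties: { Name: "ardamails.com." } });

// After the fix: comment mirrored and both policies set to Retain.
const afterFix = importMismatches(liveZone, {
  Properties: {
    Name: "ardamails.com.",
    HostedZoneConfig: { Comment: "HostedZone created by Route53 Registrar" },
  },
  DeletionPolicy: "Retain",
  UpdateReplacePolicy: "Retain",
});
```

Here beforeFix reports the comment and both policies, and afterFix is empty. The unit test remains the durable lock; a check like this only shortens the loop while the import template is being staged.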
L-4: cdk diff is CFN-relative, not state-relative
A side-effect of L-1: cdk diff <stack> compares the locally synthesized template to the deployed stack’s template. It does not compare against the AWS account’s actual resource graph. An additive diff for ArdamailsZone (“we will create a new zone”) is therefore reassuring only when the CFN template’s view of the world is complete, which it is not when a resource exists in AWS but not in the stack.
For Phase 2 this meant the standard “diff before deploy” gate cleared cleanly even though the deploy would have caused a duplication. The verification regime in verification.md already required aws route53 list-hosted-zones-by-name --dns-name ardamails.com (V-ROOT-002) to succeed after the deploy; the missing pre-condition was the same lookup before the deploy.
Take-away: for new zones (or any AWS resource type that can be created outside CFN — IAM roles, Lambda functions, KMS keys, hosted zones), the verification gate checklist needs both an “expected present after” and an “expected absent before” step. The Phase-2 verification regime has been updated to make this an explicit pre-deploy gate for any hosted-zone-creation phase (currently Phase 3 and Phase 4).
L-5: CFN logical IDs are mangled deterministically — import requires the mangled form
CDK’s CFN-rendering pass mangles the construct path (Stack/ArdamailsZone/Resource) into a deterministic logical ID (ArdamailsZone1DCDDC15). The hash suffix is part of the API contract: changing the construct path changes the suffix, which would break a subsequent IMPORT because CFN matches by exact logical ID.
The --resources-to-import argument of the IMPORT change-set therefore needs the mangled form as its LogicalResourceId (ResourceIdentifier carries the hosted zone ID):

    ResourceType=AWS::Route53::HostedZone,LogicalResourceId=ArdamailsZone1DCDDC15,ResourceIdentifier={Id=Z0721066239FWCD47EJDX}

Reading it from the synth template (jq '.Resources | keys[] | select(startswith("ArdamailsZone"))' scratch/synth.json) was the safest way to get it right.
Take-away: don’t hand-write CFN logical IDs in import operations. Read them from the synth output. If the construct path ever moves (e.g., a stack split), the mangled ID changes and the IMPORT pattern needs to be re-derived.
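Deriving the argument from the synth output instead of typing it can be sketched as follows. This is a minimal illustration with hypothetical function names. It mirrors the jq lookup by prefix, adds a resource-type filter, and fails loudly unless exactly one candidate is found.

```typescript
// Only the resource Type is needed to pick the import target.
interface SynthTemplate {
  Resources: Record<string, { Type: string }>;
}

// The single logical ID with the given prefix and resource type.
function findLogicalId(
  template: SynthTemplate,
  prefix: string,
  resourceType: string,
): string {
  const matches = Object.keys(template.Resources).filter(
    (id) => id.indexOf(prefix) === 0 && template.Resources[id].Type === resourceType,
  );
  if (matches.length !== 1) {
    throw new Error(
      `expected exactly one ${resourceType} starting with ${prefix}, found ${matches.length}`,
    );
  }
  return matches[0];
}

// Value for --resources-to-import in the shorthand form used in this document.
function resourcesToImportArg(
  resourceType: string,
  logicalId: string,
  hostedZoneId: string,
): string {
  return `ResourceType=${resourceType},LogicalResourceId=${logicalId},ResourceIdentifier={Id=${hostedZoneId}}`;
}

// Abbreviated stand-in for scratch/synth.json.
const synthSample: SynthTemplate = {
  Resources: {
    CDKMetadata: { Type: "AWS::CDK::Metadata" },
    ArdamailsZone1DCDDC15: { Type: "AWS::Route53::HostedZone" },
  },
};

const zoneType = "AWS::Route53::HostedZone";
const zoneLogicalId = findLogicalId(synthSample, "ArdamailsZone", zoneType);
const importArg = resourcesToImportArg(zoneType, zoneLogicalId, "Z0721066239FWCD47EJDX");
```

If the construct path moves and the suffix changes, the lookup still returns whatever ID the current synth output contains, so the import argument stays in step with the template.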
L-6: The Phase 2 / Phase 3 NS-delegation ownership split is structural, not stylistic
DQ-R1-006 was settled before Phase 2 implementation began, but the implementation pass made the rationale concrete. A Root-stack-writes pattern (Option A) would have meant Phase 2’s deploy could not complete until Phase 3’s Corporate stack and Phase 4’s partition stacks had each provisioned their child zones, because each Root NS-record write would need a nameServers token sourced from a downstream stack.
Instead, the existing WriteNSRecordsToUpstreamDns construct (mirrored from the arda.cards family pattern, where every IngressStack writes its own NS records into Root) means Phase 2’s Root deliverable depends only on the assume-role IAM target (already present in RootDnsStack) and the new mail-root zone. Phase 3 and Phase 4 then own their own NS writes in their own stacks.
The pattern’s value isn’t visible until the second consumer arrives (Phase 3); the arda.cards family is the existence proof.
Take-away: when the choice is between “centralised in the parent” and “distributed to each child”, consider deploy-order coupling. If the centralised option requires the parent to wait on every future child, the distributed option pays back its slight extra-construct-per-child cost the first time a new child arrives.
L-7: git mv plus a class rename is invisible to CFN; the id argument is everything
Tasks T-I1 and T-I2 renamed the folder (apps/rootConfiguration/ → apps/Root/), the file (root-configuration-stack.ts → root-dns-stack.ts), and the class (RootConfigurationStack → RootDnsStack). None of those produce CFN-template-visible changes, because the CDK id argument ("RootConfiguration") is the source of the CFN stack name, not the file path or the class name.
The inline comment above the constructor call (// CFN stack name MUST remain "RootConfiguration" -- changing it would force CloudFormation to delete and recreate the stack.) is the operator-readable artifact that defends against a future “let me also rename the id while I’m here” edit. V-IAC-003 grep-asserts the comment’s presence so the safety net travels in the test suite.
Take-away: code-side renames and CFN-side renames are independent axes. The convention in the cdk-infrastructure skill (“CFN stack-name immutability is grep-tested at the call site”) is the right discipline, and Phase 2 inherits it from the existing arda.cards family stacks.
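A grep-style guard of this kind can be sketched as a plain string check over the entry-point source. The sketch below is not the V-IAC-003 implementation: the function name and the constructor call in the sample are hypothetical, and the check on the id literal goes one step beyond the comment-presence assertion described above.

```typescript
// Returns one message per missing safeguard; empty means the guard is intact.
function stackIdGuardProblems(source: string, stackId: string): string[] {
  const problems: string[] = [];

  // 1. The operator-readable warning must still be present.
  const guard = `CFN stack name MUST remain "${stackId}"`;
  if (source.indexOf(guard) === -1) {
    problems.push("guard comment is missing");
  }

  // 2. Assumption: the id is passed as a double-quoted literal on a code line.
  //    Comment lines are skipped so the guard text cannot satisfy this check.
  const idLiteral = `"${stackId}"`;
  const codeLines = source
    .split("\n")
    .filter((line) => line.trim().indexOf("//") !== 0);
  if (!codeLines.some((line) => line.indexOf(idLiteral) !== -1)) {
    problems.push("stack id literal is missing from the code");
  }
  return problems;
}

// Hypothetical entry-point excerpt; the constructor arguments are illustrative.
const entrySource = [
  '// CFN stack name MUST remain "RootConfiguration" -- changing it would force CloudFormation to delete and recreate the stack.',
  'new RootDnsStack(app, "RootConfiguration", {});',
].join("\n");

const guardProblems = stackIdGuardProblems(entrySource, "RootConfiguration");
```

Renaming the class or moving the file leaves guardProblems empty, while changing the id literal or deleting the comment produces a message, which matches the two independent axes described above.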
L-8: The two-phase deploy is benign for a downstream consumer
Phase 3 (Corporate) and Phase 4 (per-partition) consumers of the arda-ardamails-zone CFN export do not need to know that the Phase 2 zone was imported rather than created. The export name resolves to the same hosted zone ID (Z0721066239FWCD47EJDX) either way; CDK’s cdk.Fn.importValue("arda-ardamails-zone") returns a Token that resolves at deploy time. The one-time import detour is invisible at the export-consumer interface.
Take-away: a CFN IMPORT pattern is a deploy-time concern, not a consumer-API concern. Downstream phases can rely on the export contract without inheriting the import complexity.
What worked well
- STOP gates per task in the spec (T-I1..T-I7) caught the duplication-risk case at the right moment: cdk-diff cleared, but the next gate was a domain-context check that the operator could chase with one command.
- CFN describe-change-set JSON output as the deploy-time gospel (Verification step D in the implementation checklist). When a hand-built import template was being staged, CFN’s own report (Action, Replacement, Scope) was the unambiguous final word on whether the operation was read-only.
- Strict-equality unit test pattern (root-dns-stack.test.ts V-ROOT-001 strict-match). Codifies the imported zone’s properties + retention policies so future CDK edits that would drift from the import target fail at test time, before deploy. Carries forward to every future “imported resource” surface.
- Per-phase worktree convention (CLAUDE.md). The Phase-2 implementation surface (folder rename, stack rename, new zone, IMPORT detour) was contained in phase-2/infrastructure/; Phase-1 PR review on PR #69 / PR #446 proceeded in phase-1/... worktrees concurrently. The convention scaled cleanly.
- Scratch-directory discipline. The hand-built import template (scratch/importv2.json), the change-set JSON (scratch/change-set.json), and the deploy logs all stayed under infrastructure/scratch/, never committed and never leaked into the diff. Standard convention; paid off here.
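The describe-change-set check named in the list above can be turned into a small gate over the saved JSON. This is a hedged sketch with hypothetical names; it reads only the Changes, ResourceChange, Action, LogicalResourceId, Replacement and Scope fields of the output and treats anything other than a pure Import as a problem.

```typescript
// Fields read from `aws cloudformation describe-change-set` JSON output.
interface ResourceChange {
  Action: string;
  LogicalResourceId: string;
  Replacement?: string | null;
  Scope?: string[];
}

interface DescribeChangeSetOutput {
  Changes: { Type: string; ResourceChange?: ResourceChange }[];
}

// Returns one message per deviation from a read-only import of expectedIds.
function readOnlyImportProblems(
  output: DescribeChangeSetOutput,
  expectedIds: string[],
): string[] {
  const problems: string[] = [];
  const seen: string[] = [];
  for (const change of output.Changes) {
    const rc = change.ResourceChange;
    if (rc === undefined) continue;
    const id = rc.LogicalResourceId;
    seen.push(id);
    if (expectedIds.indexOf(id) === -1) problems.push(`${id}: unexpected change`);
    if (rc.Action !== "Import") problems.push(`${id}: Action is ${rc.Action}`);
    if ((rc.Scope ?? []).length > 0) problems.push(`${id}: Scope is not empty`);
    if (rc.Replacement != null) problems.push(`${id}: Replacement is ${rc.Replacement}`);
  }
  for (const id of expectedIds) {
    if (seen.indexOf(id) === -1) problems.push(`${id}: not in the change-set`);
  }
  return problems;
}

// Mirrors the report quoted in L-2: Action Import, Replacement null, Scope [].
const cleanChangeSet: DescribeChangeSetOutput = {
  Changes: [
    {
      Type: "Resource",
      ResourceChange: {
        Action: "Import",
        LogicalResourceId: "ArdamailsZone1DCDDC15",
        Replacement: null,
        Scope: [],
      },
    },
  ],
};

const changeSetProblems = readOnlyImportProblems(cleanChangeSet, ["ArdamailsZone1DCDDC15"]);
```

For the sample, changeSetProblems is empty. A change-set whose Scope contains Properties, as in the comment mismatch from L-3, would yield a message instead.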
Copyright: © Arda Systems 2025-2026, All rights reserved