Deployment Orchestration (amm.sh)
The amm.sh script (“Arda Money Making”) is the top-level deployment orchestrator for the Arda platform. It provisions an entire Infrastructure and one or more Partitions in a single run, coordinating CDK, CloudFormation, Helm, and kubectl commands in sequence.
Command Line

    ./amm.sh [--profile <aws_profile>] [--region <aws_region>] <infrastructure> <partition...>

| Argument | Required | Description |
|---|---|---|
| --profile <profile> | No | Sets AWS_PROFILE. Defaults to Admin-<infrastructure> via AWS_DEFAULT_PROFILE when running locally. |
| --region <region> | No | Sets AWS_REGION. When omitted, the region is inferred from the AWS profile. |
| <infrastructure> | Yes | The Infrastructure name: Alpha001, Alpha002, or SandboxKyle002. |
| <partition...> | Yes | One or more Partition names, or the keyword all. |
The all keyword expands to a predefined list per Infrastructure:
| Infrastructure | all expands to |
|---|---|
| Alpha001 | demo, prod |
| Alpha002 | dev, stage |
| SandboxKyle002 | kyle |
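
The argument handling and the all expansion described above can be sketched as follows. This is a minimal illustration, not the actual amm.sh source: the function names parse_args and expand_partitions are hypothetical, and the profile default is applied unconditionally here, whereas the real script applies it only when running locally.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the documented argument handling; not the real amm.sh.

# Expand the "all" keyword using the per-Infrastructure table above.
expand_partitions() {
  local infrastructure="$1"; shift
  if [ "${1:-}" != "all" ]; then
    echo "$*"
    return 0
  fi
  case "$infrastructure" in
    Alpha001)       echo "demo prod" ;;
    Alpha002)       echo "dev stage" ;;
    SandboxKyle002) echo "kyle" ;;
    *) echo "unknown infrastructure: $infrastructure" >&2; return 1 ;;
  esac
}

# Parse options and positionals into globals:
# profile, region, infrastructure, partitions.
parse_args() {
  profile="" region=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --profile|--region)
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        if [ "$1" = "--profile" ]; then profile="$2"; else region="$2"; fi
        shift 2 ;;
      --*) echo "unknown option: $1" >&2; return 2 ;;
      *) break ;;
    esac
  done
  # Both <infrastructure> and at least one <partition> are required.
  [ "$#" -ge 2 ] || { echo "usage: ./amm.sh [--profile P] [--region R] <infrastructure> <partition...>" >&2; return 2; }
  infrastructure="$1"; shift
  partitions="$(expand_partitions "$infrastructure" "$@")" || return 1
  # Assumed simplification: default the profile to Admin-<infrastructure>.
  [ -n "$profile" ] || profile="Admin-${infrastructure}"
}
```

Calling parse_args Alpha001 all would leave partitions set to the two Alpha001 partitions and profile set to Admin-Alpha001.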
Examples:

    # Deploy Alpha002 infrastructure + dev partition (local, with SSO)
    ./amm.sh Alpha002 dev

    # Deploy Alpha001 with both partitions
    ./amm.sh Alpha001 all

    # Explicit profile and region
    ./amm.sh --profile Admin-Alpha002 --region us-east-1 Alpha002 dev stage

GitHub Actions Workflow
The amm.yml workflow provides a workflow_dispatch trigger with a dropdown of Infrastructure/Partition combinations:
- Alpha001/demo, Alpha001/prod
- Alpha002/dev (default), Alpha002/stage
- SandboxKyle002/kyle
The workflow:
- Splits the environment input into infrastructure and partition.
- Fetches AWS account ID and region from the purpose-configuration-action using a locator URL.
- Assumes the IAM role <Infrastructure>-I-GitHubActionInfrastructure via OIDC (id-token: write).
- Runs npm install, then invokes ./amm.sh <infrastructure> <partition>.
Secrets are passed as environment variables — the script detects GITHUB_ACTIONS=true and skips the local 1Password / SSO login paths.
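
The first workflow step, splitting an input such as Alpha002/dev, can be expressed with plain parameter expansion. How amm.yml actually performs the split is not shown in this document, so the following is only a sketch; split_environment is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split "<infrastructure>/<partition>" into two globals.
split_environment() {
  local environment="$1"
  case "$environment" in
    */*) ;;  # contains a slash, so it can be split
    *) echo "expected <infrastructure>/<partition>, got: $environment" >&2; return 1 ;;
  esac
  infrastructure="${environment%%/*}"  # text before the first slash
  partition="${environment#*/}"        # text after the first slash
}

# Usage:
#   split_environment "Alpha002/dev" && ./amm.sh "$infrastructure" "$partition"
```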
Required Credentials and Secrets
Local runs (interactive)
The script uses 1Password CLI (op read) to resolve secrets at runtime. The operator must be signed into 1Password and have access to the following vaults:
| Vault | Secrets | Used for |
|---|---|---|
| Arda-SystemsOAM | Amplify_GitHub_AccessToken, GPR-Read token | Amplify GitHub integration, GitHub Packages auth |
| Arda-ProdOAM | ARDA-SIGNUP-KEY | HubSpot signup authentication |
| Arda-StageOAM | HubSpot/client_secret, HubSpot/private_access_token, Pylon/widget_secret | Third-party integrations |
| Per-partition vault | ARDA-API-KEY | Partition API key |
The per-partition vault is resolved via the PARTITION_VAULT_MAP:
| Partition | 1Password Vault |
|---|---|
| dev | Arda-DevOAM |
| stage | Arda-StageOAM |
| demo | Arda-DemoOAM |
| prod | Arda-SystemsOAM |
| kyle | Arda-SandboxKyle |
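
A lookup over this map might look like the sketch below. The vault names come from the table; the helper name arda_api_key_ref and the field name credential in the secret reference are assumptions, since the document does not show how the item is laid out in 1Password.

```bash
#!/usr/bin/env bash
# Partition -> 1Password vault, copied from the table above.
declare -A PARTITION_VAULT_MAP=(
  [dev]="Arda-DevOAM"
  [stage]="Arda-StageOAM"
  [demo]="Arda-DemoOAM"
  [prod]="Arda-SystemsOAM"
  [kyle]="Arda-SandboxKyle"
)

# Hypothetical helper: print the op:// secret reference for a partition.
arda_api_key_ref() {
  local partition="$1"
  local vault="${PARTITION_VAULT_MAP[$partition]:-}"
  if [[ -z "$vault" ]]; then
    echo "no vault mapped for partition: $partition" >&2
    return 1
  fi
  # Item name from the vault table; the field name "credential" is assumed.
  echo "op://${vault}/ARDA-API-KEY/credential"
}

# Usage (requires a signed-in 1Password CLI):
#   ARDA_API_KEY="$(op read "$(arda_api_key_ref dev)")"
```

Failing on an unmapped partition, rather than falling back to a default vault, keeps a typo in the partition name from reading the wrong key.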
AWS authentication uses SSO — the script calls aws sso login before the Infrastructure step and again before each Partition step.
GitHub Actions runs
All secrets are stored as GitHub Actions repository secrets:
| Secret | Value |
|---|---|
| AMPLIFY_GITHUB_ACCESSTOKEN | GitHub PAT for Amplify source access |
| ARDA_API_KEY_<partition> | Per-partition API key (e.g., ARDA_API_KEY_dev) |
| ARDA_SIGNUP_KEY_KYLE | HubSpot signup key |
| HUBSPOT_CLIENT_KEY_STAGE | HubSpot client secret |
| HUBSPOT_PAT_STAGE | HubSpot private access token |
| PYLON_WIDGET_KEY_STAGE | Pylon widget secret |
| GPR_READ_KEY | GitHub Packages read token |
The IAM role is assumed via OIDC federation (role-to-assume), not long-lived credentials.
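
Because CI runs skip 1Password entirely, a missing secret only surfaces when the script first uses it. A fail-fast check could be sketched as below. This is not part of amm.sh as documented: missing_env_vars is a hypothetical helper, and apart from ARDA_API_KEY the environment variable names the script reads are not listed here, so EXAMPLE_TOKEN is a placeholder.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the names of unset or empty variables.
# Returns non-zero when at least one is missing.
missing_env_vars() {
  local name missing=()
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion; unset and empty are treated alike.
    [[ -n "${!name:-}" ]] || missing+=("$name")
  done
  echo "${missing[*]}"
  (( ${#missing[@]} == 0 ))
}

# Usage in CI (variable list is illustrative):
#   if [[ "${GITHUB_ACTIONS:-}" == "true" ]]; then
#     missing="$(missing_env_vars ARDA_API_KEY EXAMPLE_TOKEN)" \
#       || { echo "missing secrets: ${missing}" >&2; exit 1; }
#   fi
```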
Dry Run and Validation
The script does not have a --dry-run flag. Each tool it orchestrates has its own preview mechanism that must be invoked individually.
CDK: synth and diff
synth generates CloudFormation templates without deploying. It does not require AWS credentials and runs as part of CI on every push and PR (the synth-each-cdk-app matrix job in ci.yaml).

    # Synth a specific Infrastructure or Partition target
    npm run synth:named -- Alpha002/infra
    npm run synth:named -- Alpha002/dev

diff compares the synthesized templates against the currently deployed stacks. This requires valid AWS credentials.

    npx cdk diff \
      --app 'npx ts-node -r tsconfig-paths/register --prefer-ts-exts src/main/cdk/instances/Alpha002/infra.ts'

CloudFormation: no-execute-changeset
For the raw CloudFormation templates (src/main/cfn/*.cfn.yaml), use --no-execute-changeset to create and inspect a changeset without applying it:

    aws cloudformation deploy \
      --stack-name Alpha002-dev-Secrets \
      --template-file src/main/cfn/partitionSecrets.cfn.yaml \
      --no-execute-changeset \
      --parameter-overrides Infrastructure=Alpha002 Partition=dev \
        ArdaApiKey=... ArdaSignupKey=... HubspotClientKey=... HubspotPAT=... PylonWidgetKey=...

The changeset appears in the CloudFormation console for review. Delete it after inspection to avoid stale changesets blocking future deploys.
Helm: dry-run

    helm upgrade --install --dry-run \
      --version 4.13.0 \
      --repo https://kubernetes.github.io/ingress-nginx \
      --namespace dev-ingress-nginx \
      --set "controller.ingressClass=dev-nginx" \
      ingress-nginx ingress-nginx

This renders the manifests and validates them against the cluster API without applying changes.
kubectl: dry-run

    kubectl apply --dry-run=client -f <manifest>

Use --dry-run=server for server-side validation (requires cluster connectivity).
CI as a validation gate
The ci.yaml workflow synthesizes every Infrastructure/Partition combination in a matrix:
- Alpha001/infra, Alpha001/demo, Alpha001/prod
- Alpha002/infra, Alpha002/dev, Alpha002/stage
- SandboxKyle002/infra, SandboxKyle002/kyle

This catches CDK compilation errors, construct misconfiguration, and missing exports before any deployment. The all-synth-results job gates the pipeline: all targets must synth successfully for the build to pass.
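
To reproduce the same gate locally before pushing, the matrix can be enumerated in a loop. The sketch below only generates the target names; synth_targets is a hypothetical helper, and the loop in the usage comment relies on the synth:named script shown earlier.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every synth target in the CI matrix, one per line.
synth_targets() {
  local infra part parts
  for infra in Alpha001 Alpha002 SandboxKyle002; do
    echo "${infra}/infra"
    case "$infra" in
      Alpha001)       parts="demo prod" ;;
      Alpha002)       parts="dev stage" ;;
      SandboxKyle002) parts="kyle" ;;
    esac
    for part in $parts; do
      echo "${infra}/${part}"
    done
  done
}

# Usage: stop at the first target that fails to synth.
#   for target in $(synth_targets); do
#     npm run synth:named -- "$target" || break
#   done
```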
Effects
Infrastructure Phase (runs once per invocation)
Pre-existing state assumptions
- Green-field: The AWS account must exist and CDK must have been bootstrapped (cdk bootstrap). The script bootstraps automatically, but a pre-existing CDKToolkit stack from a different bootstrap version may require manual cleanup.
- Upgrade: All prior CloudFormation stacks from the Infrastructure layer must be in a stable state (CREATE_COMPLETE, UPDATE_COMPLETE). ROLLBACK_COMPLETE stacks must be deleted manually before re-running.
Resources created or updated
| Step | Tool | Resources |
|---|---|---|
| CloudWatch logging | CloudFormation (cloudWatch.cfn.yaml) | Log group /arda/oam/deployments with 14-day retention; log stream for the current date |
| CDK bootstrap | cdk bootstrap | CDKToolkit stack (S3 staging bucket, IAM roles) |
| Infrastructure CDK | cdk deploy (all stacks via instances/<Infra>/infra.ts) | VPC, EKS cluster, IAM roles, Route53 hosted zones, NLBs, security groups — everything in the Infrastructure layer |
| EKS kubeconfig | aws eks update-kubeconfig | Local ~/.kube/config entry for the cluster |
| Fluent Bit logging | kubectl apply | aws-observability namespace, aws-logging ConfigMap (Fluent Bit → CloudWatch /<infra>/eks-logs) |
| AWS Load Balancer Controller | Helm (aws-load-balancer-controller v1.13.4) | Namespace aws-load-balancer-controller, LBC deployment, ServiceAccount with IAM role annotation |
| External Secrets Operator | Helm (external-secrets v0.19.1) | Namespace external-secrets, ESO deployment (cluster-scoped CRDs disabled) |
Partition Phase (repeats for each partition)
Pre-existing state assumptions
- Green-field: The Infrastructure phase must have completed successfully. CloudFormation exports from the Infrastructure layer (e.g., <Infra>-I-EksClusterName, NLB target group ARNs) must exist.
- Upgrade: Partition CloudFormation stacks must be in a stable state. For Amplify targets, the <Infra>-<Part>-Amplify stack must exist before the branch/domain stack can be deployed.
Resources created or updated
| Step | Tool | Resources |
|---|---|---|
| Partition CDK | cdk deploy (via instances/<Infra>/<partition>.ts) | Cognito user pools, API Gateway, DynamoDB tables, S3 buckets, Lambda functions, CloudFront distributions — everything in the Partition layer |
| nginx Ingress | Helm (ingress-nginx v4.13.0) | Namespace <partition>-ingress-nginx, nginx controller (2 replicas, ClusterIP), IngressClass <partition>-nginx |
| Target Group Bindings | kubectl apply | TargetGroupBinding CRs linking nginx to the NLB target groups (HTTP port 80, HTTPS port 443). Stale bindings are deleted. |
| Partition secrets | CloudFormation (partitionSecrets.cfn.yaml) | 5 Secrets Manager secrets: ArdaApiKey, ArdaSignupSecretKey, HubspotClientSecret, HubspotPrivateAccessToken, PylonWidgetSecret |
| Amplify (full targets) | CloudFormation | See Amplify deployment |
| Amplify (manual targets) | CloudFormation + AWS CLI | See Amplify deployment |
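
The per-partition naming in the nginx Ingress row (namespace <partition>-ingress-nginx, IngressClass <partition>-nginx) can be made concrete by building the Helm arguments from the partition name. This is a sketch under assumptions, not the script's actual invocation: ingress_helm_args is a hypothetical helper, and --create-namespace, controller.replicaCount, and controller.service.type are assumed ways to obtain the documented namespace, 2 replicas, and ClusterIP service.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the helm arguments for one partition, one per line.
ingress_helm_args() {
  local partition="$1"
  printf '%s\n' \
    upgrade --install --atomic \
    --version 4.13.0 \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace "${partition}-ingress-nginx" --create-namespace \
    --set "controller.ingressClass=${partition}-nginx" \
    --set controller.replicaCount=2 \
    --set controller.service.type=ClusterIP \
    ingress-nginx ingress-nginx
}

# Usage (bash 4+):
#   mapfile -t args < <(ingress_helm_args dev)
#   helm "${args[@]}"
```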
Amplify Deployment
The script handles two Amplify paths depending on whether the Infrastructure:Partition pair is in the AMPLIFY_DEPLOY_TARGETS list.
Full Amplify targets (SandboxKyle002:kyle, Alpha001:demo):
- amplify.cfn.yaml: Creates the Amplify app, IAM service role, compute role, and wires environment variables from CloudFormation exports and Secrets Manager references.
- amplifyBranch.cfn.yaml: Creates the branch resource, domain association, and optionally a PR preview branch (enabled only for dev).
- Compute role workaround: Works around aws-cdk#34992 by calling aws amplify update-app if the compute role ARN drifts.
- Initial deployment: Triggers an Amplify RELEASE job.
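Auto-build is disabled for demo; PR preview is enabled only for dev. Because dev is not among the full Amplify targets listed above, no current full target has PR preview enabled.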
Manual Amplify targets (all other partitions: Alpha001:prod, Alpha002:dev, Alpha002:stage):
- amplifyComputeRole.cfn.yaml: Creates only the IAM compute role (SecretsManager, Cognito, Logging).
- Attaches the role to the existing Amplify app via aws amplify update-app.
- Merges INFRASTRUCTURE, PARTITION, NEXT_PUBLIC_INFRASTRUCTURE, NEXT_PUBLIC_PARTITION, and (if available) CLOUDFRONT_KEY_PAIR_ID into the app’s existing environment variables.
Deployment Flow
Overall Sequence
Decision Logic Reference
The flow diagram above contains several branching points. This section documents the exact logic behind each decision.
“Running locally?” (credential resolution)
Evaluated by checking the GITHUB_ACTIONS environment variable and AWS_DEFAULT_PROFILE:

    if [[ "${GITHUB_ACTIONS:-}" != "true" && (! -v AWS_DEFAULT_PROFILE || -z "${AWS_DEFAULT_PROFILE}") ]]; then
      # Local path: resolve secrets from 1Password, set AWS_DEFAULT_PROFILE
    fi

When GITHUB_ACTIONS=true, the script assumes all secrets are already present in the environment (injected by the workflow’s env block) and skips 1Password resolution entirely. The aws sso login calls throughout the script are also gated on this variable, so they are no-ops in CI.
When AWS_DEFAULT_PROFILE is already set (even outside CI), the script also skips credential resolution, allowing operators to pre-configure their environment.
“Full Amplify target?” (Amplify deployment path)
The script maintains a hardcoded list of Infrastructure:Partition pairs that receive full Amplify deployment (app creation, branch, domain, initial job):

    AMPLIFY_DEPLOY_TARGETS=("SandboxKyle002:kyle" "Alpha001:demo")

The check is a substring match against this array:

    amplify_target="${infrastructure}:${partition}"
    if [[ " ${AMPLIFY_DEPLOY_TARGETS[*]} " == *" ${amplify_target} "* ]]; then
      # Full path: deploy amplify.cfn.yaml + amplifyBranch.cfn.yaml + workaround + initial job
    else
      # Manual path: deploy amplifyComputeRole.cfn.yaml + attach role + merge env vars
    fi

All other partitions (Alpha001:prod, Alpha002:dev, Alpha002:stage) follow the “manual” path: they have Amplify apps created outside this script (e.g., via the AWS console or a separate process), and amm.sh only manages the compute role and environment variables.
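
The membership test can be exercised in isolation. The sketch below wraps the same expression in two hypothetical helpers, is_full_amplify_target and amplify_path, to show why both sides are padded with spaces: the array is joined into one space-separated string, and the padding prevents a partial name from matching.

```bash
#!/usr/bin/env bash
AMPLIFY_DEPLOY_TARGETS=("SandboxKyle002:kyle" "Alpha001:demo")

# Hypothetical wrapper around the documented check.
is_full_amplify_target() {
  local amplify_target="$1:$2"
  # "${arr[*]}" joins the elements with spaces; the surrounding spaces on
  # both sides make the glob match whole entries only.
  [[ " ${AMPLIFY_DEPLOY_TARGETS[*]} " == *" ${amplify_target} "* ]]
}

# Hypothetical helper: name the path a pair would take.
amplify_path() {
  if is_full_amplify_target "$1" "$2"; then
    echo "full"
  else
    echo "manual"
  fi
}

# amplify_path Alpha001 demo   prints "full"
# amplify_path Alpha002 dev    prints "manual"
# amplify_path Alpha001 dem    prints "manual" (no partial match)
```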
Amplify auto-build and PR preview flags
Within the full Amplify path, two boolean flags are derived from the partition name:
| Flag | Default | Exception | Rationale |
|---|---|---|---|
| enable_auto_build | true | false for demo | Demo deployments are triggered manually to control when changes go live |
| enable_pr_preview | false | true for dev | Only the dev partition creates a secondary main branch resource to enable Amplify PR preview builds |
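
Read as code, the table amounts to two defaults with one exception each. The following is a sketch of that derivation; the flag names come from the table, while the function name amplify_flags and the use of a case statement are assumptions about how the script expresses it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set both flags as globals from the partition name.
amplify_flags() {
  local partition="$1"
  enable_auto_build=true    # default
  enable_pr_preview=false   # default
  case "$partition" in
    demo) enable_auto_build=false ;;  # demo builds are triggered manually
    dev)  enable_pr_preview=true ;;   # only dev gets PR preview builds
  esac
}

# Usage:
#   amplify_flags demo
#   echo "auto_build=${enable_auto_build} pr_preview=${enable_pr_preview}"
```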
Amplify repository and branch resolution
The script uses two associative arrays to resolve the GitHub repository and branch. AMPLIFY_APP_REPOS is keyed by Infrastructure:Partition pair; AMPLIFY_BRANCH_NAMES is keyed by partition name only:

    declare -A AMPLIFY_APP_REPOS=(
      [SandboxKyle002:kyle]="Arda-cards/kyle-frontend-app"
      [Alpha001:demo]="Arda-cards/arda-frontend-app"
      [Alpha002:dev]="Arda-cards/arda-frontend-app"
      [Alpha002:stage]="Arda-cards/arda-frontend-app"
      [Alpha001:prod]="Arda-cards/arda-frontend-app"
    )
    declare -A AMPLIFY_BRANCH_NAMES=(
      [dev]="main"
      [stage]="main"
      [demo]="main"
      [prod]="main"
      [kyle]="main"
    )

Currently all partitions deploy the main branch. The AMPLIFY_APP_REPOS map allows different partitions to point at different frontend repositories (e.g., kyle uses a separate fork).
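
A lookup against both maps might be wrapped as below, so that a pair missing from either map fails loudly instead of yielding an empty repository or branch. The map contents are copied from this section; amplify_source and its repo@branch output format are hypothetical.

```bash
#!/usr/bin/env bash
declare -A AMPLIFY_APP_REPOS=(
  [SandboxKyle002:kyle]="Arda-cards/kyle-frontend-app"
  [Alpha001:demo]="Arda-cards/arda-frontend-app"
  [Alpha002:dev]="Arda-cards/arda-frontend-app"
  [Alpha002:stage]="Arda-cards/arda-frontend-app"
  [Alpha001:prod]="Arda-cards/arda-frontend-app"
)
declare -A AMPLIFY_BRANCH_NAMES=(
  [dev]="main" [stage]="main" [demo]="main" [prod]="main" [kyle]="main"
)

# Hypothetical helper: print "<repo>@<branch>" for an Infrastructure/Partition pair.
amplify_source() {
  local infrastructure="$1" partition="$2"
  local repo="${AMPLIFY_APP_REPOS[${infrastructure}:${partition}]:-}"
  local branch="${AMPLIFY_BRANCH_NAMES[${partition}]:-}"
  if [[ -z "$repo" || -z "$branch" ]]; then
    echo "no Amplify source configured for ${infrastructure}:${partition}" >&2
    return 1
  fi
  echo "${repo}@${branch}"
}
```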
Compute role workaround
After deploying the Amplify app and branch stacks, the script checks whether the computeRoleArn on the live Amplify app matches the CloudFormation export. This works around aws-cdk#34992 where CloudFormation silently fails to set the property:

    COMPUTE_ROLE_ARN_VALUE="$(aws amplify get-app --app-id "${APP_ID}" --query "app.computeRoleArn" --output text)"
    if [[ "${COMPUTE_ROLE_ARN_VALUE}" != "${COMPUTE_ROLE_ARN}" ]]; then
      aws amplify update-app --app-id "${APP_ID}" --compute-role-arn "${COMPUTE_ROLE_ARN}"
    fi

This is a conditional fix: it only calls update-app when there is actual drift.
ARDA_API_KEY resolution (per-partition, local only)
Inside the partition loop, the API key is resolved only when running locally and the environment variable is not already set:

    if [[ "${GITHUB_ACTIONS:-}" != "true" && -z "${ARDA_API_KEY:-}" ]]; then
      ARDA_API_KEY="$(resolve_arda_api_key "${partition}")"
    fi

The resolve_arda_api_key function looks up the partition name in PARTITION_VAULT_MAP and calls op read against the corresponding 1Password vault. In CI, ARDA_API_KEY is injected per-partition by the workflow using the secrets[format('ARDA_API_KEY_{0}', partition)] pattern.
CloudFront key pair ID (manual Amplify path)
When merging environment variables for manually-created Amplify apps, the script conditionally includes CLOUDFRONT_KEY_PAIR_ID only if the CloudFormation export exists:

    KEY_PAIR_ID="$(aws cloudformation list-exports --output text \
      --query "Exports[?Name=='${infrastructure}-${partition}-API-ImageCdnSigningKeyId'].Value")"
    if [[ -n "${KEY_PAIR_ID}" && "${KEY_PAIR_ID}" != "None" ]]; then
      # Include CLOUDFRONT_KEY_PAIR_ID in the merged env vars
    fi

This handles partitions that do not have the ImageStorageStack deployed (e.g., early-stage environments without image CDN support).
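
The merge itself is not shown in this document. One way to sketch it in pure bash is to load the app's existing variables into an associative array, overwrite the managed keys, and emit the KEY=value,KEY=value shorthand accepted by aws amplify update-app --environment-variables. The helper name merge_env_vars is hypothetical, and setting the four managed variables to the infrastructure and partition names is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper.
# Args: <infrastructure> <partition> <key_pair_id> [EXISTING_KEY=value ...]
merge_env_vars() {
  local infrastructure="$1" partition="$2" key_pair_id="$3"
  shift 3
  local -A merged=()
  local pair key out=""
  # Start from the app's existing variables so unrelated keys survive.
  for pair in "$@"; do
    merged["${pair%%=*}"]="${pair#*=}"
  done
  # Managed keys overwrite any existing values (assumed values).
  merged[INFRASTRUCTURE]="$infrastructure"
  merged[PARTITION]="$partition"
  merged[NEXT_PUBLIC_INFRASTRUCTURE]="$infrastructure"
  merged[NEXT_PUBLIC_PARTITION]="$partition"
  # Same guard as above: skip when the export is absent ("" or "None").
  if [[ -n "$key_pair_id" && "$key_pair_id" != "None" ]]; then
    merged[CLOUDFRONT_KEY_PAIR_ID]="$key_pair_id"
  fi
  # Sort keys for stable output, then join as KEY=value,KEY=value.
  for key in $(printf '%s\n' "${!merged[@]}" | LC_ALL=C sort); do
    out+="${out:+,}${key}=${merged[$key]}"
  done
  echo "$out"
}

# Usage:
#   aws amplify update-app --app-id "${APP_ID}" \
#     --environment-variables "$(merge_env_vars Alpha002 dev "${KEY_PAIR_ID}" FOO=1)"
```

The shorthand form breaks on values containing commas; a JSON map would be the safer choice if such values are possible.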
Deployment Logging Detail
The script records structured JSON to CloudWatch throughout the run.
Failure Modes and Diagnostics
CDK Bootstrap Failures
| Symptom | Cause | Resolution |
|---|---|---|
| CDKToolkit stack in ROLLBACK_COMPLETE | Previous bootstrap failed mid-way | Delete the CDKToolkit stack manually, then re-run |
| already exists error during bootstrap | Stale CDKToolkit from different bootstrap version | Delete and re-bootstrap, or run cdk bootstrap --force |
CDK Deploy Failures
| Symptom | Cause | Resolution |
|---|---|---|
| Stack in ROLLBACK_COMPLETE | A previous create failed | Delete the stack in CloudFormation console, then re-run |
| UPDATE_ROLLBACK_COMPLETE | A previous update failed and rolled back | The stack is usable; re-run will attempt another update |
| Resource already exists | RETAIN-policy resource survived a rollback | Manually delete the resource (follow the RETAIN cleanup order in the infrastructure repo’s knowledge-base/cdk-construct-patterns.md), then re-run |
| Cross-stack export in use | Trying to remove an export consumed by another stack | Deploy the consuming stack first to remove the dependency |
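
The status rows above can be turned into a small pre-flight check. The sketch below maps a stack status to the action this guide recommends; stack_action and its action labels are hypothetical, and the wait and inspect cases are additions for statuses the table does not cover.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a CloudFormation stack status to a next action.
stack_action() {
  case "$1" in
    CREATE_COMPLETE|UPDATE_COMPLETE|UPDATE_ROLLBACK_COMPLETE)
      echo "proceed" ;;            # stable; a re-run can update the stack
    ROLLBACK_COMPLETE)
      echo "delete-then-rerun" ;;  # a failed create must be deleted first
    *_IN_PROGRESS)
      echo "wait" ;;               # an operation is still running
    *)
      echo "inspect" ;;            # anything else needs a manual look
  esac
}

# Usage (stack name is illustrative):
#   status="$(aws cloudformation describe-stacks --stack-name Alpha002-dev-Secrets \
#     --query 'Stacks[0].StackStatus' --output text)"
#   stack_action "$status"
```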
Helm Failures
| Symptom | Cause | Resolution |
|---|---|---|
| helm upgrade --install times out | Pods not reaching Ready state | Check kubectl get pods -n <namespace>, inspect events and logs |
| --atomic rollback | Helm auto-rolled back a failed release | Inspect helm history <release> -n <namespace> for error details |
| ServiceAccount annotation mismatch | IAM role ARN changed but Helm didn’t update | Delete the SA manually: kubectl delete sa -n <namespace> <sa-name>, then re-run |
Kubernetes / Target Group Binding Failures
| Symptom | Cause | Resolution |
|---|---|---|
| TargetGroupBinding stuck in Progressing | LBC not running or target group ARN invalid | Verify LBC pods are healthy; check the ARN matches the NLB export |
| Stale bindings not deleted | Script only deletes bindings with non-matching ARNs | If ARN matches but binding is broken, delete manually with kubectl delete tgb |
Amplify Failures
| Symptom | Cause | Resolution |
|---|---|---|
| Vendor response doesn't contain <attribute> | CloudFormation export not yet available | Ensure the Partition CDK stacks completed; re-run |
| Compute role not attached | aws-cdk#34992: CloudFormation does not set computeRoleArn | The script works around this; if it persists, run aws amplify update-app manually |
| Initial job fails | Build error in the frontend app | Check Amplify console build logs; this is a frontend issue, not an infrastructure issue |
Credential / Authentication Failures
| Symptom | Cause | Resolution |
|---|---|---|
| op read fails | 1Password CLI not authenticated | Run eval $(op signin) |
| aws sso login hangs | Browser-based SSO flow not completing | Complete the SSO flow in the browser; check ~/.aws/config for the profile |
| ExpiredTokenException | SSO session expired mid-run | The script calls aws sso login before each phase; if it still expires, the run took too long; re-run |
| OIDC role assumption fails (CI) | IAM trust policy doesn’t include the GitHub repo/branch | Update the trust policy on the <Infra>-I-GitHubActionInfrastructure role |
Deployment Log Diagnostics
Every run logs a structured JSON entry to CloudWatch (/arda/oam/deployments). The entry includes:
- status: succeeded, failed, or interrupted
- exit_code: the process exit code
- git.branch, git.commit, git.worktree_dirty: the exact code version deployed
- aws_profile, aws_region: the AWS identity used
- infrastructure, partitions: what was targeted
- version: the git tag matching the deployed commit (if any)
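
How the script derives status from the exit code is not shown here. A plausible sketch is given below: it treats the conventional signal exit codes 130 (SIGINT) and 143 (SIGTERM) as interrupted, which is an assumption, and it emits only a subset of the fields listed above. The function names run_status and log_entry_json are hypothetical, and partitions is rendered as a plain string for simplicity.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an exit code to the documented status values.
run_status() {
  case "$1" in
    0)       echo "succeeded" ;;
    130|143) echo "interrupted" ;;  # 128+SIGINT, 128+SIGTERM (assumed mapping)
    *)       echo "failed" ;;
  esac
}

# Hypothetical helper: build a reduced JSON entry (values must not need escaping).
log_entry_json() {
  local exit_code="$1" infrastructure="$2" partitions="$3"
  printf '{"status":"%s","exit_code":%d,"infrastructure":"%s","partitions":"%s"}\n' \
    "$(run_status "$exit_code")" "$exit_code" "$infrastructure" "$partitions"
}

# Usage: inside an EXIT trap, $? holds the exit code of the run.
#   trap 'log_entry_json "$?" "$infrastructure" "$partitions"' EXIT
```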
Query recent deployments:

    # Failed deployments in the last 24 hours (date -d is GNU date syntax)
    aws logs filter-log-events \
      --log-group-name /arda/oam/deployments \
      --start-time "$(date -u -d '24 hours ago' +%s000)" \
      --filter-pattern '{ $.status = "failed" }'

Modification Guide
Adding a New Infrastructure
- Create CDK instance files in src/main/cdk/instances/<NewInfra>/infra.ts and one file per partition.
- Add the Infrastructure to RUNTIME_ACCOUNTS in src/main/cdk/platform/aws-configuration.ts.
- Add the partition expansion to the all case in amm.sh.
- Add the Infrastructure:Partition entries to AMPLIFY_DEPLOY_TARGETS, AMPLIFY_BRANCH_NAMES, and AMPLIFY_APP_REPOS if Amplify is needed.
- Add the partition → 1Password vault mapping to PARTITION_VAULT_MAP.
- Add the new environment to the amm.yml workflow’s options list.
- Create the IAM role <NewInfra>-I-GitHubActionInfrastructure with OIDC trust for GitHub Actions.
Adding a New Partition to an Existing Infrastructure
- Create src/main/cdk/instances/<Infra>/<partition>.ts.
- Update the all expansion in amm.sh.
- Add the PARTITION_VAULT_MAP entry.
- Add AMPLIFY_BRANCH_NAMES and AMPLIFY_APP_REPOS entries.
- If the partition should auto-deploy via Amplify, add it to AMPLIFY_DEPLOY_TARGETS.
- Add <Infra>/<partition> to the amm.yml workflow options and the ci.yaml synth matrix.
- Create the ARDA_API_KEY_<partition> GitHub Actions secret.
Adding a New Secret
- Add a parameter to partitionSecrets.cfn.yaml with NoEcho: true.
- Add the corresponding AWS::SecretsManager::Secret resource and output/export.
- In amm.sh:
  - Add the op read call in the local-credentials block.
  - Pass it to the aws cloudformation deploy --parameter-overrides for the secrets stack.
- In amm.yml: add the GitHub Actions secret reference to the env block.
- If the secret is consumed by Amplify, add it to the amplify.cfn.yaml EnvironmentVariables.
Best Practices
- Test with cdk synth first. The CI pipeline runs synth for every Infrastructure/Partition combination. Run npm run synth:named -- <Infra>/<target> locally before modifying amm.sh.
- Keep Helm chart versions pinned. All helm upgrade --install calls specify --version. Bump versions deliberately and test in a sandbox first.
- Use --atomic for Helm. All Helm installs use --atomic, which auto-rolls back on failure. Do not remove this flag.
- Respect the deployment order. Infrastructure must complete before any Partition. Secrets must be deployed before Amplify (Amplify references secret ARNs via CloudFormation exports).
- Do not skip aws sso login. The script calls it before each phase to handle session expiry on long runs. Removing these calls will cause failures on multi-partition deployments.
- Preserve the EXIT trap. The log_run_completion trap ensures every run is logged to CloudWatch, even on failure. If you restructure the script, ensure the trap remains installed early and covers all exit paths.
- Idempotency. CloudFormation deploy and Helm upgrade --install are idempotent: they report “no changes” for already-up-to-date resources. However, the script re-executes every step from the beginning on each run; there is no --from-step resume capability. Re-runs are safe but not instant, so expect infrastructure and Helm steps to repeat. See Failure Mode Analysis for known side effects on re-run (e.g., unconditional amplify start-job).
Copyright: © Arda Systems 2025-2026, All rights reserved