
Operations Sentry — Changelog

Closes the Operations Sentry project: end-to-end Sentry observability for the operations Kotlin/Ktor backend (errors, performance, release health) wired so every common-module consumer inherits the behaviour, plus the FE-side trace propagation that joins browser and backend in a single Sentry trace.

Linear umbrella: PDEV-537.

  • Added cards.arda.common.observability package:
    • SentryInit — fail-soft SDK initialisation honouring SENTRY_DSN, SENTRY_ENVIRONMENT, SENTRY_RELEASE, SENTRY_TRACES_SAMPLE_RATE, SENTRY_ENABLE_AUTO_SESSION_TRACKING/SENTRY_AUTO_SESSION_TRACKING, SENTRY_SCRUB_SALT.
    • BoundaryCapture — invoked by runSuspendingBoundary / runBoundary; captures AppError.reportable() outputs and non-AppError throwables.
    • PiiScrubber (beforeSend / beforeSendTransaction filter): deterministic salted hashing for user identifiers, allow-list redaction for headers, request bodies, and span data.
    • Global CoroutineExceptionHandler for last-resort fire-and-forget paths.
  • Component.build() now wires SentryInit and a StatusPages capture interceptor on every consuming component.
  • Logback appender (configured via consumer logback.xml) captures ERROR-level and exception-carrying events; the threshold is set through the appender's minimumEventLevel option.
  • New AppError.reportable(): List<Throwable> policy: Internal.* and Generic are reportable; Invocation.* are not.
  • Bumped arda-common to 8.3.0.
  • Main.kt: installs the SentryRequestSession Ktor application plugin (manual per-request session start/end — the JVM SDK does not emit per-request sessions automatically; see DT-004 supersession).
  • system/batch/CsvUploadService.kt: wraps the fire-and-forget batch coroutine in runSuspendingBoundary (correct nesting: boundary inside, runCatching outside) on an independent SupervisorJob() scope so request-scoped cancellation cannot tear down the batch.
  • helm/templates/secrets.yaml: ExternalSecret be-sentry-scrub-salt keyed by .Values.global.purpose (partition scope) with property: salt to project the JSON-shaped Secrets Manager value.
  • helm/templates/deployment.yaml: sets the Sentry env var block, including both SENTRY_ENABLE_AUTO_SESSION_TRACKING and the canonical SENTRY_AUTO_SESSION_TRACKING (dual until PDEV-538 retires the deprecated name), and disables the bundled OTel agent’s OTLP exporter (OTEL_TRACES_EXPORTER=none, OTEL_METRICS_EXPORTER=none, OTEL_LOGS_EXPORTER=none) to silence the localhost:4318 retry loop.
  • logback.xml: Sentry appender at ERROR level.
  • New PartitionSecrets CFN stack per partition (Alpha002-dev-PartitionSecrets, Alpha002-stage-PartitionSecrets, Alpha001-demo-PartitionSecrets, Alpha001-prod-PartitionSecrets) holding the SentryScrubSalt Secrets Manager secret.
  • Exports follow the -API- cross-repo convention: <partition>-API-SentryScrubSaltArn for the operations Helm chart’s ESO SecretStore lookup, plus the marker-prefixed <partition>-I-SentryScrubSaltArn for cross-stack CDK consumers.
  • SecretValue typing for the override prevents accidental plaintext logging.
  • Stack id is PartitionSecrets (not Secrets) to avoid colliding with the pre-existing non-CDK secrets stack that still holds the six legacy partition credentials (tracked by PDEV-541 for future unification).
  • New helper src/lib/sentry/trace-propagation-targets.ts with an env-keyed BACKEND_HOSTS map covering development / dev, staging / stage, production / prod (Amplify’s NEXT_PUBLIC_DEPLOY_ENV uses the abbreviated forms).
  • All three Sentry init paths (sentry.client.config.ts, sentry.server.config.ts, sentry.edge.config.ts) consume the helper so BFF → BE trace propagation is explicit per-env instead of relying on the Sentry SDK’s default same-origin heuristic.
  • 12 unit tests including alias-parity assertions between abbreviated and long env names.
  • DT-007 closed.
  • New architectural reference current-system/oam/sentry-observability.md covering the boundary capture model, the JVM-SDK-emits-no-per-request-session quirk, and the manual Ktor plugin that compensates.
  • Rewritten how-to process/craft/operations-and-monitoring/sentry-integration.md superseding the stale page (covers SDK init env vars, Logback appender placement, scrub salt provisioning per partition, verification recipes).
  • Roadmap promotion: project moves from roadmap/in-progress/operations-sentry/ to roadmap/completed/operations-sentry/.
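The env-keyed BACKEND_HOSTS map from the frontend entries above could look roughly like the sketch below. This is an illustration under stated assumptions, not the contents of src/lib/sentry/trace-propagation-targets.ts: the host names are placeholders, the function name tracePropagationTargets is hypothetical, and the fallback for an unknown environment (an empty list, so no trace headers are attached) is an assumed design choice. Only BACKEND_HOSTS, the env names, and NEXT_PUBLIC_DEPLOY_ENV come from the changelog.

```typescript
// Placeholder hosts: the real backend host names are not given in this changelog.
const DEV_HOSTS: readonly string[] = ["api.dev.example.invalid"];
const STAGE_HOSTS: readonly string[] = ["api.stage.example.invalid"];
const PROD_HOSTS: readonly string[] = ["api.prod.example.invalid"];

// Long and abbreviated env names share the same array, so the aliases
// cannot drift apart (the property the alias-parity tests assert).
const BACKEND_HOSTS: Record<string, readonly string[]> = {
  development: DEV_HOSTS,
  dev: DEV_HOSTS,
  staging: STAGE_HOSTS,
  stage: STAGE_HOSTS,
  production: PROD_HOSTS,
  prod: PROD_HOSTS,
};

// Hypothetical helper: returns the hosts that should receive trace headers
// for the given deploy env, or an empty list when the env is missing or unknown.
function tracePropagationTargets(env: string | undefined): string[] {
  if (env === undefined) {
    return [];
  }
  // Own-property check so inherited keys such as "constructor" are not
  // mistaken for environment names.
  if (!Object.prototype.hasOwnProperty.call(BACKEND_HOSTS, env)) {
    return [];
  }
  // Return a copy so callers cannot mutate the shared map entries.
  return BACKEND_HOSTS[env].slice();
}
```

Each Sentry init file would then pass the result to the SDK's tracePropagationTargets option, for example tracePropagationTargets(process.env.NEXT_PUBLIC_DEPLOY_ENV), so that trace headers go only to the listed backend hosts.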

Deployed in the order Alpha002-dev → Alpha002-stage → Alpha001-demo → Alpha001-prod across two phases per partition: (1) amm.sh provisioning of the PartitionSecrets CFN stack and SentryScrubSalt; (2) GitHub Actions matrix step rolling the operations Helm release to 2.25.1. All four partitions were verified at the pod level: SENTRY_DSN, SENTRY_ENVIRONMENT, SENTRY_RELEASE, and SENTRY_SCRUB_SALT are populated, both ExternalSecret instances report SecretSynced=True, and both replicas are 1/1 Running on the new chart.

  • PDEV-491 — Documint client-side Sentry HTTP plugin (deferred from this project; outbound spans are already captured by the JVM agent so coverage is preserved).
  • PDEV-538 — Remove the deprecated SENTRY_ENABLE_AUTO_SESSION_TRACKING once all components consume the canonical name.
  • PDEV-541 — Unify the legacy partition-secrets stack with the new CDK-managed PartitionSecrets stack.
  • PDEV-543 / PDEV-544 — Review-thread follow-ups raised during PR triage on operations#172 and documentation#94 respectively; PDEV-544 blocked by PDEV-543.