Completion Report: Operations Sentry
Completed: 2026-05-19 (all four partitions rolled to operations 2.25.1).
Linear umbrella: PDEV-537.
Parallel adoption (out of scope): PDEV-533 — accounts-component.
What shipped
End-to-end Sentry observability for the operations Kotlin/Ktor backend, wired so every consumer of arda-common 8.3.0 inherits the behaviour:
- Error and exception tracking with `AppError.reportable()` policy (`Internal.*`/`Generic` reportable; `Invocation.*` dropped). Captured at request boundaries via `runSuspendingBoundary`, at background paths via a Logback appender at ERROR, and at a JVM-level last-resort handler before thread death.
- Performance and tracing (APM) via the bundled Sentry OTel Java Agent for routes, DB queries (Exposed), and outbound HTTP (`http.client` spans on Documint), combined with Sentry-native sampling. End-to-end trace propagation from the browser through the BFF into the Ktor route, joining a single Sentry trace.
- Release health via a manual `SentryRequestSession` Ktor application plugin (the JVM SDK does not emit per-request sessions automatically; see DT-004 supersession). Sessions flow on the Release Health tab tagged by `SENTRY_RELEASE=operations@<chart-version>`.
- PII scrubbing at both `beforeSend` and `beforeSendTransaction` via the partition-scoped `SENTRY_SCRUB_SALT`, hashing user identifiers deterministically and applying allow-list redaction to headers, request bodies, and span data.
- Frontend trace propagation with explicit env-aware `tracePropagationTargets` covering Amplify’s abbreviated env names (`DEV`/`STAGING`/`PROD`).
- Infrastructure: per-partition `PartitionSecrets` CFN stack provisioning `SentryScrubSalt` in Secrets Manager, with the two CFN export prefixes (`-API-` for the operations Helm chart and `-I-` for CDK compose-time wiring).
- Documentation: new architectural reference `current-system/oam/sentry-observability.md`, fully rewritten how-to `process/craft/operations-and-monitoring/sentry-integration.md`, and project roadmap promotion to `completed/`.
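
The reporting policy in the first bullet can be pictured as a sealed hierarchy with a single predicate. The following is a minimal sketch and not the arda-common implementation: only `AppError`, `Internal`, `Generic`, `Invocation`, `NotFound`, and `reportable()` are named in this report, while the `Unexpected` subtype, the `message` field, and the `reportIfNeeded` helper are hypothetical stand-ins.

```kotlin
// Hypothetical stand-in for the AppError hierarchy in arda-common; the real
// subtypes, fields, and package are not shown in this report.
sealed interface AppError {
    val message: String

    // Server-side faults: reportable under the stated policy.
    sealed interface Internal : AppError {
        data class Unexpected(override val message: String) : Internal
    }

    // Caller-side faults (bad input, missing resource): dropped.
    sealed interface Invocation : AppError {
        data class NotFound(override val message: String) : Invocation
    }

    data class Generic(override val message: String) : AppError
}

// Policy from the report: Internal.* and Generic are reportable,
// Invocation.* is dropped. The `when` is exhaustive over the sealed type,
// so a new top-level category forces an explicit decision here.
fun AppError.reportable(): Boolean = when (this) {
    is AppError.Internal -> true
    is AppError.Generic -> true
    is AppError.Invocation -> false
}

// Boundary helper: forwards only reportable errors to `sink` (a stand-in for
// the actual Sentry capture call) and returns whether anything was sent.
fun reportIfNeeded(error: AppError, sink: (AppError) -> Unit): Boolean {
    if (!error.reportable()) return false
    sink(error)
    return true
}
```

Passing the capture call as a plain function keeps the policy testable without the Sentry SDK on the classpath; a request boundary would call `reportIfNeeded(error) { /* capture */ }` once per failed request.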
Merged PRs
| Stream | Repository | PR | Status |
|---|---|---|---|
| Common library | common-module | #171 | Merged — arda-common 8.3.0 |
| Component adoption | operations | #172 | Merged — chart 2.25.1 |
| Infrastructure | infrastructure | #459 | Merged |
| Frontend trace propagation | arda-frontend-app | #845 | Merged |
| Documentation | documentation | #94 | Open — wraps this project |
| Infra naming follow-up | infrastructure | #460 | Open at project close (independent cleanup) |
Partition rollout
Deployed in order Alpha002-dev → Alpha002-stage → Alpha001-demo → Alpha001-prod on 2026-05-19. Each partition followed the two-phase recipe: (1) amm.sh provisions the PartitionSecrets CFN stack and SentryScrubSalt Secrets Manager entry; (2) the GitHub Actions matrix step rolls the operations Helm release to 2.25.1.
| Partition | amm.sh | operations 2.25.1 | Sentry env vars | ESO scrub-salt |
|---|---|---|---|---|
| Alpha002-dev | ✅ | ✅ | ✅ | ✅ |
| Alpha002-stage | ✅ | ✅ | ✅ | ✅ |
| Alpha001-demo | ✅ | ✅ | ✅ | ✅ |
| Alpha001-prod | ✅ | ✅ | ✅ | ✅ |
Pod-level smoke verification on every partition confirmed that `SENTRY_DSN`, `SENTRY_ENVIRONMENT`, `SENTRY_RELEASE`, and `SENTRY_SCRUB_SALT` (64-char value) were populated, that both the `be-sentry-dsn` and `be-sentry-scrub-salt` ExternalSecrets reported `SecretSynced=True`/`Ready=True`, and that both replicas were 1/1 Running on the new chart.
Acceptance criteria
- AC-1: Boundary error capture. An `AppError.Internal.*` thrown in a Ktor route appears in Sentry under `platform-be` with the configured fingerprint, the joined frontend trace ID, and the `tenant_hash` tag. ✅
- AC-2: `Invocation.*` drop. `AppError.Invocation.NotFound` thrown in a Ktor route produces no Sentry issue. ✅
- AC-3: Background capture. A `log.error(...)` in a scheduled job produces an issue without a request scope. ✅
- AC-4: Last-resort capture. An uncaught throwable in a fire-and-forget coroutine reaches the JVM-level handler and produces an issue. ✅ (validated by the `CsvUploadService` fix and supplementary unit tests).
- AC-5: End-to-end trace. A browser-initiated request flows as a single Sentry trace through the BFF into the Ktor route and includes DB spans and outbound `http.client` spans. ✅ (verified empirically with trace `776c560abd12401a9c4bc2dc869581e9` from a Documint-printing preview deploy of PR #845).
- AC-6: PII scrubbed. No plaintext user identifier or restricted header appears in any captured event or span. ✅ (verified with the spans-dataset query in the rewritten how-to).
- AC-7: Release health. Sessions flow on the Release Health tab tagged `operations@2.25.1`. ✅ (verified after the manual `SentryRequestSession` plugin landed).
- AC-8: Partition isolation of salts. Each partition uses a distinct `SENTRY_SCRUB_SALT`. ✅ (verified via Secrets Manager and ESO output across all four partitions).
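
AC-6 and AC-8 both rest on deterministic, salt-keyed hashing and allow-list redaction. The sketch below illustrates the idea under stated assumptions: the report does not name the hash algorithm, so HMAC-SHA256 is used here as a plausible choice, and the `hashIdentifier` and `redactHeaders` functions, the `allowedHeaders` set, and the `[redacted]` marker are all hypothetical.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Assumption: HMAC-SHA256 keyed with the partition salt. The report only says
// identifiers are hashed deterministically; the real algorithm may differ.
fun hashIdentifier(identifier: String, salt: String): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(salt.toByteArray(Charsets.UTF_8), "HmacSHA256"))
    return mac.doFinal(identifier.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }
}

// Hypothetical allow-list; the real list is not given in this report.
val allowedHeaders = setOf("accept", "content-type", "user-agent")

// Allow-list redaction: any header not explicitly allowed is replaced, so a
// newly introduced header is scrubbed by default rather than leaked.
fun redactHeaders(headers: Map<String, String>): Map<String, String> =
    headers.mapValues { (name, value) ->
        if (name.lowercase() in allowedHeaders) value else "[redacted]"
    }
```

Because the salt is the key, the same user hashes to the same value within a partition, which keeps issues groupable, while a different partition salt yields an unrelated value, which is the isolation property AC-8 checks.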
Byproducts
Under byproducts/:
- changelog.md — what changed by repository.
- learnings.md — non-obvious lessons (SDK 8 session-emission quirk, OTel agent OTLP suppression, ESO templating, JSDoc trap, fire-and-forget scope).
- suggestions.md — improvements worth doing but out of scope.
- alternatives.md — paths evaluated and not taken.
- skipped.md — scope deferred, with tracking tickets.
- specification-post.md — what the spec would say in hindsight.
The design artefacts under specification/, plan/, and the decision-log.md remain authoritative for the design record. The full exploration record is in the workbook at workbooks/notebooks/operations-sentry/.
Follow-ups
| Ticket | Scope | Status |
|---|---|---|
| PDEV-491 | Documint client-side Sentry HTTP plugin (semantic context) | Open |
| PDEV-533 | Adopt Sentry instrumentation in accounts-component | Open (out of scope here from the start) |
| PDEV-538 | Remove deprecated SENTRY_ENABLE_AUTO_SESSION_TRACKING once safe | Open |
| PDEV-541 | Unify legacy partition-secrets stack with new CDK PartitionSecrets | Open |
| PDEV-543 | Review-thread follow-up from operations#172 | Open |
| PDEV-544 | Review-thread follow-up from documentation#94 (blocked-by PDEV-543) | Open |
What was deferred
- The Documint client-side Sentry plugin: outbound visibility is already covered by the OTel agent’s generic `http.client` spans; the plugin would add semantic enrichment. See PDEV-491.
- A broader audit of fire-and-forget `launch { ... }` invocations inside Ktor routes. The CSV path was the known leak and is fixed; a wider sweep is candidate scope for a future sprint.
- An E2E Playwright smoke test that asserts `sentry-trace`/`baggage` header propagation. Empirically verified on PR #845 preview; not encoded as a test yet.
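
For the deferred propagation check, the core assertion is that the `sentry-trace` value seen at each hop carries the same trace ID. The sketch below parses the header's `traceid-spanid-sampled` form (32 hex characters, 16 hex characters, optional `0`/`1` flag). It is written in Kotlin to match the backend rather than as the Playwright test itself, and `SentryTrace`, `parseSentryTrace`, and `sameTrace` are hypothetical names.

```kotlin
// Parsed form of a `sentry-trace` header value.
data class SentryTrace(val traceId: String, val spanId: String, val sampled: Boolean?)

// 32 hex chars of trace ID, 16 hex chars of span ID, optional sampling flag.
val sentryTracePattern = Regex("^([0-9a-f]{32})-([0-9a-f]{16})(?:-([01]))?$")

// Returns null when the header is missing or malformed.
fun parseSentryTrace(header: String?): SentryTrace? {
    val normalized = header?.trim()?.lowercase() ?: return null
    val match = sentryTracePattern.matchEntire(normalized) ?: return null
    // An unmatched optional group is reported as an empty string.
    val (traceId, spanId, flag) = match.destructured
    val sampled = when (flag) {
        "1" -> true
        "0" -> false
        else -> null // sampling decision deferred to the receiver
    }
    return SentryTrace(traceId, spanId, sampled)
}

// Hypothetical assertion helper: both hops must parse and share a trace ID.
// Span IDs are expected to differ because each hop starts its own span.
fun sameTrace(upstreamHeader: String?, downstreamHeader: String?): Boolean {
    val upstream = parseSentryTrace(upstreamHeader) ?: return false
    val downstream = parseSentryTrace(downstreamHeader) ?: return false
    return upstream.traceId == downstream.traceId
}
```

A smoke test would capture the header on the browser request and on the request reaching the Ktor route, then assert `sameTrace(...)`; comparing only the trace ID avoids false failures from per-hop span IDs.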
Closing notes
The project closed on schedule across all four partitions. Two corrections to the original specification deserve carrying forward:
- Sentry JVM SDK 8.x emits no per-request sessions on Ktor. Anyone adopting the SDK on Ktor needs the `SentryRequestSession` plugin until Sentry publishes an official server-side Ktor plugin.
- The bundled OTel Java Agent enables the OTLP HTTP exporter by default. Set `OTEL_*_EXPORTER=none` whenever there is no collector running, or the pod logs fill with localhost:4318 retry errors.
Both are now baked into the operations chart, into arda-common 8.3.0’s documented usage, and into the rewritten sentry-integration.md how-to.
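
The exporter correction lends itself to a small startup guard. The sketch below assumes the `OTEL_*_EXPORTER` wildcard expands to the three standard per-signal variables (`OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_LOGS_EXPORTER`); the `exportersStillEnabled` function is hypothetical and is not part of the operations chart or arda-common.

```kotlin
// Standard OpenTelemetry autoconfigure exporter variables, one per signal.
// Assumption: these are the variables the OTEL_*_EXPORTER wildcard refers to.
val otelExporterVars = listOf(
    "OTEL_TRACES_EXPORTER",
    "OTEL_METRICS_EXPORTER",
    "OTEL_LOGS_EXPORTER",
)

// Returns the exporter variables that are not explicitly set to "none".
// An unset variable counts as a finding because, per the closing notes,
// the bundled agent falls back to the OTLP exporter.
fun exportersStillEnabled(env: Map<String, String>): List<String> =
    otelExporterVars.filter { env[it]?.trim()?.lowercase() != "none" }
```

Calling `exportersStillEnabled(System.getenv())` at startup and logging a warning for a non-empty result would surface a misconfigured pod before the retry errors start accumulating in its logs.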
Copyright: © Arda Systems 2025-2026, All rights reserved