Sentry Integration

This page is the step-by-step recipe for wiring Sentry observability into an Arda component. The design rationale, capture-path topology, and policy decisions live in Sentry Observability — read that first if you have not. This page assumes you have, and goes straight to the moving parts.

Two scenarios are covered:

  • Add a new Kotlin/Ktor component to the platform’s Sentry surface (operations is the worked example; accounts-component will follow the same recipe under PDEV-533).
  • Update the frontend to harden trace propagation (already done; this section is the maintenance reference).

Both halves share the same arda-systems Sentry tenant. The backend events land in the Sentry project whose slug is platform-be (Sentry display name platform-operations); the frontend events land in the arda-frontend project. There is no per-component Sentry project today — Kotlin components share platform-be and are distinguished by the component tag and the release tag ({appName}@{Chart.AppVersion}).

The recipe assumes the component already consumes common-module 8.3.0 or later. If it does not, bump first — almost every step below relies on primitives the observability module ships.

Each consuming component’s chart has its own oam.performance.sentry block. Copy from operations/src/main/helm/values.yaml (the chart-level defaults) and add per-env overrides under values-<env>.yaml. The minimal default in values.yaml:

oam:
  performance:
    sentry:
      enabled: false
      environment: ""            # falls back to application.environment helper
      tracesSampleRate: "0.05"   # safe default; per-env overrides set the real rate
      sessions:
        enabled: false

Per-env overrides (taking operations’ choices as the template):

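| Env   | enabled | tracesSampleRate | sessions.enabled |
|-------|---------|------------------|------------------|
| local | false   | 0.0              | false            |
| dev   | true    | 1.0              | true             |
| stage | true    | 1.0              | true             |
| demo  | true    | 0.2              | true             |
| prod  | true    | 0.2              | true             |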

The dev / stage 1.0 is for diagnostic targets; demo / prod 0.2 matches the frontend’s prod rate so FE-initiated traces survive on the backend side (DT-004 — Session-based release health).

2. Helm template: emit Sentry env vars on the pod

In templates/deployment.yaml, inside the container spec, gate the Sentry env vars on .Values.oam.performance.sentry.enabled and emit them. The block must include:

{{- if .Values.oam.performance.sentry.enabled }}
{{- $sentryEnv := default (include "application.environment" .) .Values.oam.performance.sentry.environment }}
- name: SENTRY_DSN
  valueFrom:
    secretKeyRef:
      name: be-sentry-dsn
      key: dsn
      optional: true
- name: SENTRY_ENVIRONMENT
  value: {{ $sentryEnv | quote }}
- name: SENTRY_RELEASE
  value: {{ printf "%s@%s" (include "application.name" .) .Chart.AppVersion | quote }}
- name: SENTRY_TRACES_SAMPLE_RATE
  value: {{ .Values.oam.performance.sentry.tracesSampleRate | quote }}
- name: SENTRY_ENABLE_AUTO_SESSION_TRACKING
  value: {{ .Values.oam.performance.sentry.sessions.enabled | quote }}
- name: SENTRY_AUTO_SESSION_TRACKING
  value: {{ .Values.oam.performance.sentry.sessions.enabled | quote }}
- name: SENTRY_SCRUB_SALT
  valueFrom:
    secretKeyRef:
      name: be-sentry-scrub-salt
      key: salt
      optional: true
- name: OTEL_TRACES_EXPORTER
  value: "none"
- name: OTEL_METRICS_EXPORTER
  value: "none"
- name: OTEL_LOGS_EXPORTER
  value: "none"
{{- end }}

Notes:

  • SENTRY_DSN and SENTRY_SCRUB_SALT use secretKeyRef.optional: true so the pod stays fail-soft when ESO has not yet reconciled the upstream secret.
  • Both SENTRY_ENABLE_AUTO_SESSION_TRACKING (the SDK-canonical name read by the agent at boot) and the legacy SENTRY_AUTO_SESSION_TRACKING (read by common-module 8.3.0’s SentryInit) are set to the same value. The legacy entry can be removed once common-module ships the canonical-name patch (tracked under PDEV-538).
  • The OTEL_*_EXPORTER=none lines disable the upstream OpenTelemetry pipeline’s OTLP exporters. Sentry’s own span exporter continues to ship spans to Sentry SaaS. See the rationale on Sentry Observability.

In templates/secrets.yaml, two ExternalSecret resources gated on the same sentry.enabled flag.

be-sentry-dsn — infrastructure-scoped DSN, shared across all components on the same Infrastructure:

apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: be-sentry-dsn
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: {{ include "application.name" . | quote }}
  target:
    deletionPolicy: Delete
  data:
    - secretKey: dsn
      remoteRef:
        key: {{ printf "%s-SentryDsn" .Values.global.infrastructure | quote }}
        version: "AWSCURRENT"

be-sentry-scrub-salt — partition-scoped HMAC salt for opaque user IDs:

apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: be-sentry-scrub-salt
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: {{ include "application.name" . | quote }}
  target:
    deletionPolicy: Delete
  data:
    - secretKey: salt
      remoteRef:
        key: {{ printf "%s-%s-SentryScrubSalt" .Values.global.infrastructure .Values.global.purpose | quote }}
        property: salt
        version: "AWSCURRENT"

The remoteRef.key shape matters:

  • DSN uses .Values.global.infrastructure alone (one DSN per Infrastructure → Alpha001-SentryDsn).
  • Scrub salt uses .Values.global.infrastructure + .Values.global.purpose (one per partition → Alpha002-dev-SentryScrubSalt). Do not use .Release.Namespace: the legacy partition stack uses a namespace-scoped convention, but the new partition-secrets stack is partition-scoped. Mixing up the two conventions cost real time during the operations-sentry rollout; this is the one to get right.
  • property: salt on the scrub-salt entry is mandatory. The upstream payload is JSON {"salt": "<value>"}; without property: salt ESO projects the whole JSON into the K8s Secret and the env var would be {"salt":"..."} instead of the raw string.
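
The key shapes and the effect of `property: salt` can be sketched as follows. This is an illustrative model only: `dsnSecretKey`, `scrubSaltSecretKey`, and `projectSecretValue` are hypothetical names, the salt value is invented, and the projection function merely imitates the behaviour described above rather than calling ESO.

```typescript
// Mirrors the two printf templates from the ExternalSecret manifests.
function dsnSecretKey(infrastructure: string): string {
  return `${infrastructure}-SentryDsn`;
}

function scrubSaltSecretKey(infrastructure: string, purpose: string): string {
  return `${infrastructure}-${purpose}-SentryScrubSalt`;
}

// Hypothetical model of what lands in the Kubernetes Secret:
// without `property` the whole upstream document is projected,
// with `property` only the named JSON field is.
function projectSecretValue(upstreamPayload: string, property?: string): string {
  if (property === undefined) {
    return upstreamPayload;
  }
  const parsed: unknown = JSON.parse(upstreamPayload);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("upstream payload is not a JSON object");
  }
  const value = (parsed as Record<string, unknown>)[property];
  if (typeof value !== "string") {
    throw new Error(`property "${property}" is missing or not a string`);
  }
  return value;
}

// Invented payload, shaped like the upstream JSON described above.
const payload = JSON.stringify({ salt: "example-salt" });

console.log(dsnSecretKey("Alpha001"));              // Alpha001-SentryDsn
console.log(scrubSaltSecretKey("Alpha002", "dev")); // Alpha002-dev-SentryScrubSalt
console.log(projectSecretValue(payload, "salt"));   // example-salt
console.log(projectSecretValue(payload));           // {"salt":"example-salt"}
```

The last line is the failure mode: the env var would carry the braces and quotes, not the raw salt.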

In src/main/resources/logback.xml, add the SENTRY appender alongside the existing STDOUT appender and reference it from the root logger:

<appender name="SENTRY" class="io.sentry.logback.SentryAppender">
    <minimumEventLevel>ERROR</minimumEventLevel>
    <minimumBreadcrumbLevel>INFO</minimumBreadcrumbLevel>
</appender>

<root level="INFO">
    <appender-ref ref="STDOUT"/>
    <appender-ref ref="SENTRY"/>
</root>

ERROR-level events and any event carrying a Throwable become captured Sentry events; INFO and above ride along as breadcrumbs on the next captured event. The appender no-ops cleanly when SENTRY_DSN is unset (local builds, tests).

The Sentry Logback dependency ships with common-module 8.3.0+; no per-component dependency entry is needed.

common-module’s SentryInit.init() sets isEnableAutoSessionTracking=true, but on the JVM SDK 8.x this alone emits at most one session per JVM lifecycle — not per request. There is no published sentry-ktor server plugin from Sentry. The documented mechanism for per-request sessions on a non-Spring Java server is manual Sentry.startSession() / Sentry.endSession() at request boundaries.

Until this lifts into common-module (tracked under PDEV-490), each consuming component installs a small Ktor application plugin locally. Add to the component’s Main.kt:

import io.sentry.Sentry
import io.ktor.server.application.*
import io.ktor.server.application.hooks.ResponseSent

private val SentryRequestSession = createApplicationPlugin("SentryRequestSession") {
    onCall { _ ->
        if (Sentry.isEnabled()) Sentry.startSession()
    }
    on(ResponseSent) { _ ->
        if (Sentry.isEnabled()) Sentry.endSession()
    }
}

fun applicationConfigurer(cfgProvider: ConfigurationProvider): Application.() -> Unit = {
    if (pluginOrNull(SentryRequestSession) == null) {
        install(SentryRequestSession)
    }
    // ... existing Component.build(...) and per-module setup
}

The pluginOrNull guard makes the install idempotent — the eventual common-module move does not require a coordinated removal here.

To compile against Sentry, declare an explicit implementation(libs.sentry) in the component’s build.gradle.kts (and sentry = { module = "io.sentry:sentry", version.ref = "..." } in libs.versions.toml), even though the JAR is already on the runtime classpath transitively via common-module. The explicit declaration is needed because common-module declares Sentry with implementation scope, which Gradle does not expose on a consumer’s compile classpath.

6. runBoundary adoption in background paths

Every out-of-request entry point in the component wraps its work in runBoundary("<job-label>") { ... } (or runSuspendingBoundary for suspend entry points). The wrapper is in common-module/.../runtime/observability/BoundaryCapture.kt.

Worked example from operations/.../csvUpload/CsvUploadService.kt:

import cards.arda.common.lib.runtime.observability.runSuspendingBoundary
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.currentCoroutineContext
import kotlinx.coroutines.launch

// Fire-and-forget batch: carry the application context (tenant, MDC,
// call-id) so the launched work logs and queries under the same identity
// as the request, but install a fresh top-level SupervisorJob so the
// batch lifecycle is independent of the Ktor request: client disconnect
// must not cancel an in-flight batch.
CoroutineScope(currentCoroutineContext() + SupervisorJob() + Dispatchers.IO).launch {
    runCatching {
        runSuspendingBoundary("csvupload-process-batch") {
            flowProcessor.process(starterEvent, fl, trkr)
        }
    }.flatten().onFailure { trkr.update(JobEvent.Failed(...)) }
}

The nesting matters: runSuspendingBoundary must wrap the actual work and sit inside runCatching, never around it. If the boundary wraps a runCatching block, that inner catch swallows the throwable and the boundary captures nothing. With runCatching on the outside, it catches the error the boundary rethrows and runs the user-facing tracker update; both paths fire.
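
The ordering argument can be illustrated with a small language-neutral model, written here in TypeScript. Everything in it is a stand-in: `runBoundary` and `runCatching` below are simplified hypothetical versions that only imitate "capture then rethrow" and "catch and return a result", and `captured` replaces the real Sentry capture.

```typescript
// Stand-in for the Sentry capture performed by the real boundary wrapper.
const captured: string[] = [];

// Simplified boundary: records the failure, then rethrows it.
function runBoundary<T>(label: string, work: () => T): T {
  try {
    return work();
  } catch (err) {
    captured.push(`${label}: ${String(err)}`);
    throw err;
  }
}

type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Simplified runCatching: converts any throw into a failed Outcome.
function runCatching<T>(work: () => T): Outcome<T> {
  try {
    return { ok: true, value: work() };
  } catch (error) {
    return { ok: false, error };
  }
}

function failingBatch(): never {
  throw new Error("batch failed");
}

// Correct nesting: the boundary wraps the work, runCatching wraps the boundary.
const right = runCatching(() => runBoundary("csvupload-process-batch", failingBatch));

// Inverted nesting: runCatching swallows the error before the boundary sees it.
const wrong = runBoundary("csvupload-process-batch", () => runCatching(failingBatch));

console.log(captured.length); // 1 (only the correct nesting captured anything)
console.log(right.ok, wrong.ok); // false false (both callers still see the failure)
```

In both cases the caller learns about the failure, so the inverted form looks healthy from the tracker's point of view while silently losing the Sentry event.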

Audit the component for every CoroutineScope(...).launch { ... } outside a request handler and apply the same shape.

The Sentry OTel Java agent is bundled into the operations container by Jib. The relevant pieces:

  • build.gradle.kts declares sentryAgent as a configuration that depends on io.sentry:sentry-opentelemetry-agent and copies the JAR into /app/agents/sentry-otel-agent.jar inside the image.
  • The deployment template appends -javaagent:/app/agents/sentry-otel-agent.jar to JAVA_TOOL_OPTIONS when sentry.enabled is true.

Copy these pieces verbatim from operations when wiring a new component. The agent version is pinned alongside the Sentry SDK so the two never drift.

When SENTRY_DSN is unset (the local profile sets enabled: false), everything no-ops:

  • SentryInit.init() short-circuits at the missing DSN.
  • The Logback appender starts but emits nothing.
  • The SentryRequestSession plugin’s Sentry.isEnabled() guard returns false; startSession/endSession are skipped.
  • The OTel agent is not attached (no -javaagent: line).

Tests run unmodified — the in-process SDK guards on DSN at every entry point.

The frontend Sentry integration was wired in arda-frontend-app#845. For new init paths or new environments, follow the same pattern.

The single source of truth for trace-propagation targets is src/lib/sentry/trace-propagation-targets.ts:

export function tracePropagationTargets(env: string): Array<string | RegExp> {
  const defaults: Array<string | RegExp> = ["localhost", /^\//];
  switch (env) {
    case "production":
      return [...defaults, "api.arda.cards"];
    case "stage":
      return [...defaults, "stage.alpha002.io.arda.cards"];
    case "development":
      return [...defaults, "dev.alpha002.io.arda.cards"];
    default:
      return defaults;
  }
}

To add a new environment, extend the switch with the env-specific backend host. The defaults (localhost, /^\//) match Sentry’s same-origin convention and cover browser→BFF; the env-specific host enables cross-origin browser-direct or BFF→BE propagation.
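
To reason about which outbound requests receive trace headers, a minimal sketch of the matching rule may help. It assumes the commonly documented behaviour of the Sentry JavaScript SDK (string targets match as substrings of the URL, RegExp targets are tested against it); `matchesPropagationTarget` is a hypothetical helper, not an SDK export, and the target lists are copied from the helper above.

```typescript
type Target = string | RegExp;

// Assumed matching rule: substring match for strings, test() for regexes.
function matchesPropagationTarget(url: string, targets: Target[]): boolean {
  return targets.some((target) =>
    typeof target === "string" ? url.includes(target) : target.test(url),
  );
}

// Same values as the "production" branch of tracePropagationTargets.
const defaults: Target[] = ["localhost", /^\//];
const production: Target[] = [...defaults, "api.arda.cards"];

console.log(matchesPropagationTarget("/api/items", defaults));                    // true
console.log(matchesPropagationTarget("https://api.arda.cards/v1/x", defaults));   // false
console.log(matchesPropagationTarget("https://api.arda.cards/v1/x", production)); // true
console.log(matchesPropagationTarget("https://example.com/x", production));       // false
```

The second and third lines show why the env-specific host is needed: with the defaults alone, a cross-origin call to the backend would carry no trace header.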

All three Sentry init paths read from the helper:

  • src/instrumentation-client.ts — browser runtime
  • sentry.server.config.ts — Next.js server runtime
  • sentry.edge.config.ts — Next.js edge runtime

Each path’s Sentry.init({ ... }) includes:

import { tracePropagationTargets } from "@/lib/sentry/trace-propagation-targets";

Sentry.init({
  // ...existing options
  tracePropagationTargets: tracePropagationTargets(env),
});

Where env is the path’s existing environment source (process.env.NEXT_PUBLIC_DEPLOY_ENV in browser; the relevant resolved env in server/edge).

Each acceptance criterion in the operations-sentry verification plan maps to a concrete check. The most useful day-to-day recipes:

POD=$(kubectl --context <Infrastructure> -n <namespace> get pod -l app=<component> -o jsonpath='{.items[0].metadata.name}')

# All Sentry env vars present in the running pod
kubectl --context <Infrastructure> -n <namespace> exec "$POD" -- printenv | grep '^SENTRY_'

# ESO sync of the scrub salt
kubectl --context <Infrastructure> -n <namespace> get externalsecret be-sentry-scrub-salt \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status} ({.reason}){"\n"}{end}'

# Decoded salt length must be 64 bytes
kubectl --context <Infrastructure> -n <namespace> get secret be-sentry-scrub-salt \
  -o jsonpath='{.data.salt}' | base64 -d | wc -c
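
If the first check is run often, the printenv output can be compared against the variable names the Helm template emits. The sketch below is one possible way to do that; `missingEnvNames` is a hypothetical helper, and the sample values (release string, DSN, salt) are invented placeholders.

```typescript
// SENTRY_* names emitted by the deployment template when sentry.enabled is true.
const EXPECTED_ENV = [
  "SENTRY_DSN",
  "SENTRY_ENVIRONMENT",
  "SENTRY_RELEASE",
  "SENTRY_TRACES_SAMPLE_RATE",
  "SENTRY_ENABLE_AUTO_SESSION_TRACKING",
  "SENTRY_AUTO_SESSION_TRACKING",
  "SENTRY_SCRUB_SALT",
] as const;

// Returns the expected names that do not appear in `printenv` output.
function missingEnvNames(printenvOutput: string): string[] {
  const present = new Set<string>();
  for (const line of printenvOutput.split("\n")) {
    const eq = line.indexOf("=");
    if (eq > 0) {
      present.add(line.slice(0, eq));
    }
  }
  return EXPECTED_ENV.filter((name) => !present.has(name));
}

// Invented sample: the scrub salt has not been projected onto the pod yet.
const sample = [
  "SENTRY_DSN=https://public@example.invalid/1",
  "SENTRY_ENVIRONMENT=dev",
  "SENTRY_RELEASE=operations@0.0.0",
  "SENTRY_TRACES_SAMPLE_RATE=1.0",
  "SENTRY_ENABLE_AUTO_SESSION_TRACKING=true",
  "SENTRY_AUTO_SESSION_TRACKING=true",
].join("\n");

console.log(missingEnvNames(sample)); // [ 'SENTRY_SCRUB_SALT' ]
```

Because SENTRY_DSN and SENTRY_SCRUB_SALT are optional secret references, a missing entry for either usually points at the ExternalSecret rather than at the template.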

AC-2 / AC-3 — boundary capture + filtering

Trigger a deliberate AppError.Internal.Implementation via a test endpoint (or known route + payload), then query Sentry:

# Issue present
mcp__claude_ai_Sentry__search_issues(
    organizationSlug='arda-systems',
    projectSlugOrId='platform-be',
    query='environment:<env> boundary:http')

# Invocation.* must NOT produce an Issue: same query, deliberate Invocation
# trigger should add nothing to the Issue list.

TOKEN=$(op read "op://Arda-SystemsOAM/Sentry Service Token/credential")
curl -s -G \
  -H "Authorization: Bearer $TOKEN" \
  --data-urlencode 'project=4511384478351360' \
  --data-urlencode 'environment=<env>' \
  --data-urlencode 'statsPeriod=24h' \
  --data-urlencode 'field=sum(session)' \
  --data-urlencode 'groupBy=session.status' \
  --data-urlencode 'query=release:<appName>@<version>' \
  'https://sentry.io/api/0/organizations/arda-systems/sessions/' | jq '.groups'

A non-zero sum(session) under session.status: ok confirms the per-request session emitter is firing.
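
The check can be scripted against the `.groups` array that jq prints. The sketch below assumes each group has a `by` map holding the grouping key and a `totals` map holding the requested field; verify that shape against a live response before relying on it. `sessionsWithStatus` is a hypothetical helper, the sample payload is invented, and the status label is a parameter so it can follow whatever label the API actually returns.

```typescript
// Assumed shape of one entry in the sessions API `groups` array.
interface SessionGroup {
  by: Record<string, string>;
  totals: Record<string, number>;
}

// Sums sum(session) over all groups carrying the given session.status label.
function sessionsWithStatus(groups: SessionGroup[], status: string): number {
  let total = 0;
  for (const group of groups) {
    if (group.by["session.status"] === status) {
      total += group.totals["sum(session)"] ?? 0;
    }
  }
  return total;
}

// Invented sample, using the "ok" label from the sentence above.
const sampleGroups: SessionGroup[] = [
  { by: { "session.status": "ok" }, totals: { "sum(session)": 42 } },
  { by: { "session.status": "crashed" }, totals: { "sum(session)": 1 } },
];

console.log(sessionsWithStatus(sampleGroups, "ok") > 0); // true
```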

Drive a real action through the UI (Playwright session login + a click that exercises a BE route), capture the outbound request’s sentry-trace header from the browser, then query Sentry for the same trace ID on platform-be:

mcp__claude_ai_Sentry__search_events(
    organizationSlug='arda-systems',
    projectSlug='platform-be',
    dataset='spans',
    query='trace:<trace_id>',
    fields=['transaction', 'span.op', 'release', 'environment'],
    sort='-timestamp')

The presence of BE-side spans on the same trace ID confirms FE→BFF→BE propagation.
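
Extracting the trace ID from the captured header can be done by hand, but a small parser avoids copy errors. The sketch below assumes the usual `sentry-trace` layout of a 32-hex trace ID, a 16-hex span ID, and an optional sampled flag; `parseSentryTrace` and `traceQuery` are hypothetical helpers, stricter than the SDK's own parsing, and the header value is an invented example.

```typescript
interface SentryTrace {
  traceId: string;
  parentSpanId: string;
  sampled: boolean | undefined; // undefined when the flag is absent
}

// <32 hex trace id>-<16 hex span id>[-<0|1>]
const SENTRY_TRACE_PATTERN = /^([0-9a-f]{32})-([0-9a-f]{16})(?:-([01]))?$/;

function parseSentryTrace(header: string): SentryTrace | undefined {
  const match = SENTRY_TRACE_PATTERN.exec(header.trim().toLowerCase());
  if (match === null) {
    return undefined;
  }
  const traceId = match[1];
  const parentSpanId = match[2];
  const flag = match[3];
  if (traceId === undefined || parentSpanId === undefined) {
    return undefined;
  }
  return {
    traceId,
    parentSpanId,
    sampled: flag === undefined ? undefined : flag === "1",
  };
}

// Builds the query string used in the span search above.
function traceQuery(header: string): string | undefined {
  const parsed = parseSentryTrace(header);
  return parsed === undefined ? undefined : `trace:${parsed.traceId}`;
}

console.log(traceQuery("771a43a4192642f0b136d5159a501700-b7ad6b7169203331-1"));
// trace:771a43a4192642f0b136d5159a501700
```

A sampled flag of 0 on the captured header is worth noticing: the backend would then be expected to honour the negative decision, and missing spans would not indicate a propagation fault.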

Trigger a deliberate error with a request body containing a fake JWT, fake email, and X-Tenant-Id: T-12345. In the resulting Sentry event:

  • event.user.id is a 16-char hex string (opaque HMAC).
  • event.user.email, username, ipAddress are absent.
  • event.request.headers contains X-Request-Id only — no Authorization, no X-Tenant-Id.
  • event.tags.tenant_hash equals HMAC-SHA-256(salt, "T-12345") truncated.
  • event.request.data shows *** where the JWT was.
  • DB statements are parameterised in the corresponding transaction events (errors and transactions are separate event types in Sentry — exceptions have event.exception.values[], not event.spans). Query the spans dataset to confirm: span.op:db span.description:"*" on the same trace ID should show every literal replaced with ? per DbStatementRedactor.
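
For the tenant_hash check, the expected value can be computed out of band. The sketch below is an assumption-laden approximation: it treats the salt as a UTF-8 string key, hex-encodes the HMAC-SHA-256 digest, and keeps the first 16 hex characters. The actual PiiScrubber encoding and truncation length are not specified here and may differ; `opaqueHash` is a hypothetical name and the salt is invented.

```typescript
import { createHmac } from "node:crypto";

// Assumed recipe: HMAC-SHA-256(salt, value), hex-encoded, truncated.
function opaqueHash(salt: string, value: string, hexChars: number = 16): string {
  return createHmac("sha256", salt).update(value).digest("hex").slice(0, hexChars);
}

const salt = "example-salt"; // invented; use the decoded Secret value in practice
const expected = opaqueHash(salt, "T-12345");

// A salt projected without `property: salt` is the whole JSON document,
// which produces a stable but different hash.
const fromJsonSalt = opaqueHash(JSON.stringify({ salt }), "T-12345");

console.log(expected.length);           // 16
console.log(expected === fromJsonSalt); // false
```

A stable hash that differs from the out-of-band value is therefore a hint to inspect the projected Secret before suspecting the scrubber itself.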

A log.error(...) from a code path that catches and continues (no throw to a boundary) produces a Sentry event:

mcp__claude_ai_Sentry__search_events(
    organizationSlug='arda-systems',
    projectSlug='platform-be',
    dataset='errors',
    query='release:<appName>@<version> environment:<env> level:error',
    sort='-timestamp')

Independent of the in-code scrubbing, three Sentry-side toggles are kept enabled as defence in depth (PDEV-535 ratified the current state):

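| Toggle                                                            | Setting |
|-------------------------------------------------------------------|---------|
| Data Scrubbers (org/project)                                      | Enabled |
| Default Scrubbers (built-in regexes for credit cards, SSNs, etc.) | Enabled |
| Scrub IP Addresses                                                | Enabled |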

The code-side beforeSend / beforeSendTransaction policy is authoritative; these are a second pass. Do not disable them.

  • SENTRY_SCRUB_SALT missing on the pod even with the ExternalSecret declared. Symptom: kubectl get externalsecret be-sentry-scrub-salt shows Ready=True reason=SecretDeleted. Cause: the remoteRef.key resolves to a name that doesn’t exist in AWS Secrets Manager — usually because .Release.Namespace was used in the template where .Values.global.purpose belongs. The fix is the one-identifier swap; see step 3.
  • localhost:4318 ConnectException flood in pod logs. The upstream OTel pipeline’s OTLP exporters are active. Set OTEL_TRACES_EXPORTER=none (and OTEL_METRICS_EXPORTER=none, OTEL_LOGS_EXPORTER=none) on the pod; see step 2.
  • Zero sessions on platform-be despite enableAutoSessionTracking=true. The JVM SDK doesn’t emit per-request sessions on its own; install the SentryRequestSession Ktor plugin per step 5.
  • runSuspendingBoundary wraps a runCatching block. The inner runCatching swallows every Throwable; the boundary sees no exception and captures nothing. Invert the nesting so the boundary wraps the work and the outer runCatching catches the rethrow.
  • Forgetting property: salt on the scrub-salt ExternalSecret. ESO projects the whole JSON document; SENTRY_SCRUB_SALT ends up {"salt":"abc..."} and PiiScrubber HMAC uses the brace-padded literal. Symptoms: opaque IDs look correct (stable) but don’t match what an out-of-band salt computation produces.

Copyright: (c) Arda Systems 2025-2026, All rights reserved