Analysis: PDEV-490 Operations Performance Improvements

Author: Claude Opus for jmpicnic | Date: 2026-05-19 | Status: Draft

Entry-state analysis of the operations component and common-module persistence layer, scoped to the two endpoints that PDEV-490 targets and the persistence-layer surfaces those endpoints depend on. Establishes the empirical baseline against which requirements define improvement targets, and surfaces the gaps that the specification closes.

The operations component issues bitemporal SELECTs (latest-version-per-eId) against kanban_card and item. Two route handlers fan out from the items-page workload: cardsForItem (single-item) and listWithDetails (page-scoped with per-chunk item-side fan-out). At the measured 2026-05-19 baseline, cardsForItem runs at p50 1,113 ms / p95 2,911 ms on Alpha001-prod; listWithDetails runs at p50 289 ms / p95 2,035 ms. The dominant costs are:

  1. Inner bitemporal subqueries on kanban_card and item planning against only three single-column indexes each — no composite covering the (tenant_id, item_reference_entity_id, eid, effective_as_of DESC, recorded_as_of DESC) access pattern.
  2. A wasted COUNT issued by cardsForItem because the kanban service requests withTotal = true and uses the result only as a non-null sanity check.
  3. A naive JDBC stack — HikariCP wired directly to a single Aurora cluster endpoint, no reader routing, no failover-aware retry. Aurora failovers translate to ~30 s of HTTP 500s while the JVM DNS cache holds the dead endpoint.

The project replaces the JDBC stack with the AWS Advanced JDBC Wrapper (read/write splitting + topology-driven failover + retry-on-typed-exception) in common-module; adds composite bitemporal indexes on kanban_card and item in operations; drops the wasted COUNT in cardsForItem; and surfaces transient failures as HTTP 503 with Retry-After. The bitemporal SELECTs are auto-routed to Aurora reader instances afterward.

This analysis covers:

  • The two target routes (cardsForItem, listWithDetails) — handlers, service methods, and the SQL they emit.
  • The common-module persistence layer that backs them — Persistence.kt, AbstractUniverse.kt, AbstractScopedUniverse.kt, DataSource.kt, the inTransaction boundary, and the StatusPages-installed HTTP error contract.
  • The Flyway migration trees that govern index coverage on kanban_card and item, plus a tenant-id index audit across every ScopedTable consumer in operations.
  • The measured performance baseline for both routes against Alpha001-prod, Alpha002-stage, and Alpha002-dev via Sentry over the trailing five-day window.

It does not cover:

  • Front-end consumers — the items-page front-end consolidation is tracked separately on PDEV-489. PDEV-490 ships the composite index that the front-end work depends on, but the front-end change itself is out of scope here.
  • Aurora cluster configuration (instance class, parameter group, max-connections) — handled by PDEV-479 and already shipped.
  • pg_stat_statements provisioning — handled by PDEV-498 and already shipped.
  • Long-term DB query observability tooling — tracked separately by PDEV-512.

The two routes in scope are declared in operations/src/main/kotlin/cards/arda/operations/resources/kanban/api/rest/KanbanCardEndpoint.kt and implemented in operations/src/main/kotlin/cards/arda/operations/resources/kanban/service/ServiceImpl.kt:

| Route | Service method | Workload shape |
| --- | --- | --- |
| GET /v1/kanban/kanban-card/for-item/{itemEId} | KanbanCardService.cardsForItem(itemRef, asOf) (ServiceImpl.kt:276-293) | One bitemporal SELECT on kanban_card with Filter.Eq(item.eId), plus an unused COUNT. Returns up to 1,000 cards. |
| POST /v1/kanban/kanban-card/details | KanbanCardService.listWithDetails(query, asOf) (ServiceImpl.kt:322-350) | One bitemporal SELECT on kanban_card followed by a chunked per-chunk SELECT on item (25-card chunks, flatMapMerge(concurrency = 25)). Hydrates KanbanCardDetails with full Item.Entity per card. |

cardsForItem issues universe.list(query, asOf, withTotal = true). Tracing through common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/universe/AbstractUniverse.kt:152-180:

  • With withTotal = true, the underlying persistence layer issues a COUNT(*) against the same predicate in addition to the row-returning SELECT.
  • The Kotlin caller (ServiceImpl.kt:287-290) uses the resulting totalCount only as a non-null sanity check (when (pg.totalCount) { null -> Result.failure(AppError.IncompatibleState(...)); else -> Result.success(pg) }). The value never propagates to the HTTP response.
  • The null arm of that when is dead code under today’s withTotal = true: AbstractUniverse.list always materialises a non-null Long into pg.totalCount when withTotal is true. Dropping the flag without also removing the when would invert the dead branch into a 100%-failure regression.

Net per cardsForItem invocation today: 2 SQL statements (1 COUNT + 1 SELECT) on the kanban DB.
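
The coupling between the flag and the when can be illustrated with a minimal, self-contained sketch. Page, list, and sanityCheck below are simplified stand-ins introduced for illustration (assumptions, not the common-module types); only the shape of the null check mirrors the snippet quoted from ServiceImpl.kt:287-290.

```kotlin
// Hypothetical stand-in for the persistence Page type; only totalCount matters here.
data class Page(val results: List<String>, val totalCount: Long?)

// Hypothetical stand-in for universe.list: a total is only produced when withTotal is true.
fun list(rows: List<String>, withTotal: Boolean): Page =
    Page(rows, if (withTotal) rows.size.toLong() else null)

// Mirrors the shape of the sanity check currently wrapped around the result.
fun sanityCheck(pg: Page): Result<Page> = when (pg.totalCount) {
    null -> Result.failure(IllegalStateException("IncompatibleState: missing totalCount"))
    else -> Result.success(pg)
}

fun main() {
    val rows = listOf("card-1", "card-2")
    // Today: withTotal = true, so the null arm is unreachable.
    println(sanityCheck(list(rows, withTotal = true)).isSuccess)  // true
    // Flag flipped but check kept: the null arm becomes the only live branch.
    println(sanityCheck(list(rows, withTotal = false)).isSuccess) // false
}
```

Under these assumptions the second call fails for every input, which is the regression the coupled change has to avoid.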

listWithDetails — chunked per-chunk fan-out

listWithDetails runs an outer kanban SELECT followed by a per-chunk inner item SELECT:

listEntities(query, asOf).flatMap { pageRs ->
    pageRs.results.chunked(25).asFlow().flatMapMerge(concurrency = 25) { chunk ->
        flow {
            val targetItems = chunk.map { it.payload.item.eId }.toSet().toList()
            itemService.listEntities(
                Query(Filter.In(ITEM_TABLE.eId.name, targetItems), Pagination(0, chunk.size)),
                asOf
            ).map { it.results.associate { it.payload.eId to it.payload } }
                .onSuccess { itMap -> emitAll(chunk.asFlow().map { composeDetails(asOf, it, itMap[it.payload.item.eId]) }) }
                .onFailure { emit(Result.failure(it)) }
        }
    }
    // …
}

Net per listWithDetails invocation today: 1 SELECT on kanban_card + ⌈N/25⌉ SELECTs on item (where N is the kanban result size). For a 25-row page that’s 2 SQL statements; for 200 rows that’s 9.
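
The statement count above is a ceiling division plus one. A small sketch of the arithmetic follows; statementCount is a hypothetical helper name, and the chunk size of 25 is taken from the chunked(25) call in the excerpt.

```kotlin
// Statements issued by one listWithDetails call: 1 SELECT on kanban_card
// plus one SELECT on item per chunk of up to chunkSize cards.
fun statementCount(kanbanRows: Int, chunkSize: Int = 25): Int {
    require(kanbanRows >= 0 && chunkSize > 0)
    val itemSelects = (kanbanRows + chunkSize - 1) / chunkSize // ceil(N / chunkSize)
    return 1 + itemSelects
}

fun main() {
    println(statementCount(25))  // 2
    println(statementCount(26))  // 3
    println(statementCount(200)) // 9
}
```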

Both routes ultimately emit the same SQL shape via common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/bitemporal/Persistence.kt. For the kanban_card SELECT:

SELECT bt.*                                   -- ~30 wide columns
FROM kanban_card bt
WHERE bt.id IN (
    SELECT sq.id
    FROM kanban_card sq
    WHERE <user condition>
      AND <tenant constraint>
      AND sq.effective_as_of <= <asOf.effective>
      AND sq.recorded_as_of <= <asOf.recorded>
      AND bt.eId = sq.eId                     -- correlated to outer row
      AND bt.retired = FALSE
    ORDER BY sq.effective_as_of DESC, sq.recorded_as_of DESC
    LIMIT 1
)
ORDER BY bt.recorded_as_of DESC, bt.effective_as_of DESC, bt.id ASC
OFFSET 0 LIMIT 1000

This is the “latest version of each entity at an asOf coordinate” bitemporal pattern. The correlated subquery (bt.eId = sq.eId) forces Postgres to either re-execute the inner query per outer row or unroll it via a hash/merge plan. Plan quality depends entirely on whether a composite index covers the inner-subquery predicate.
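
To make the selection semantics concrete, the sketch below reproduces the same pattern over an in-memory list. It is an illustration only: the Version type and latestVersions function are hypothetical, timestamps are plain Long values for brevity, and the user condition and tenant constraint are omitted.

```kotlin
// Simplified row: one stored version of an entity. Field names follow the SQL columns.
data class Version(
    val id: Int,
    val eId: String,
    val effectiveAsOf: Long,
    val recordedAsOf: Long,
    val retired: Boolean = false,
)

// For each eId, pick the newest version visible at the asOf coordinate
// (effective_as_of DESC, then recorded_as_of DESC, LIMIT 1), then keep it
// only if that version is not retired, as the bt.retired = FALSE check does.
fun latestVersions(rows: List<Version>, asOfEffective: Long, asOfRecorded: Long): List<Version> =
    rows.filter { it.effectiveAsOf <= asOfEffective && it.recordedAsOf <= asOfRecorded }
        .groupBy { it.eId }
        .map { (_, versions) ->
            versions.sortedWith(
                compareByDescending<Version> { it.effectiveAsOf }
                    .thenByDescending { it.recordedAsOf }
            ).first()
        }
        .filter { !it.retired }

fun main() {
    val rows = listOf(
        Version(1, "A", effectiveAsOf = 10, recordedAsOf = 10),
        Version(2, "A", effectiveAsOf = 20, recordedAsOf = 20),
        Version(3, "A", effectiveAsOf = 30, recordedAsOf = 30), // not yet visible at asOf = 25
        Version(4, "B", effectiveAsOf = 5, recordedAsOf = 5, retired = true),
    )
    println(latestVersions(rows, 25, 25).map { it.id }) // [2]
}
```

The grouping step is what the database has to do per eId, which is why an index leading with the filter columns and ending in the two DESC timestamps matters for the inner subquery.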

Current indexes on kanban_card, from operations/src/main/resources/resources/kanban/database/migrations/V001__kanban.sql:50-52:

CREATE INDEX idx_kanban_card_eid ON kanban_card (eid);
CREATE INDEX idx_kanban_card_effective_as_of ON kanban_card (effective_as_of);
CREATE INDEX idx_kanban_card_recorded_as_of ON kanban_card (recorded_as_of);

Three single-column indexes. Subsequent migrations V002–V006 add columns but no further indexes on kanban_card. The tenant_id column exists but is not indexed in the Flyway tree; the AbstractScopedUniverse.kt:27 declaration tenantId.index("TENANT_ID_INDEX") is decorative (Exposed’s schema-emit path is not invoked in any deploy environment — Flyway is authoritative).

Current indexes on item follow the same pattern (single-column eid, effective_as_of, recorded_as_of), with the tenant_id index here actually present via reference/item/database/migrations/V012__bt_indexes.sql:8.

Tenant-id audit across ScopedTable consumers (audit completed 2026-05-18, full results below for reference):

| Module | Table | tenant_id index status |
| --- | --- | --- |
| reference/item | ITEM_TABLE | Present (V012__bt_indexes.sql:8, idx_item_tenant) |
| reference/business-affiliate | BUSINESS_AFFILIATE_TABLE | Present (V001__biz_affiliates.sql:89, idx_ba_tenant_id) |
| system/batch | BATCH_JOB_TABLE | Present, but the migration lives in reference/item/V012__bt_indexes.sql:12 (the misplaced location is left as-is) |
| resources/kanban | KANBAN_CARD_TABLE | Missing |
| resources/facility | FACILITY_TABLE | Missing, deferred (out of scope) |
| resources/station | STATION_TABLE | Missing, deferred (out of scope) |
| procurement/orders | ORDER_HEADER_TABLE | Missing, deferred (out of scope) |

The audit found exactly one PDEV-490-actionable gap: kanban_card is missing its tenant_id index. The migration that adds the composite bitemporal indexes on kanban_card will also add (tenant_id) as a separate index in the same file. The three modules deferred (facility, station, procurement/orders) are deliberately out of scope for PDEV-490; they are candidates for a future per-module hygiene pass.

operations/src/main/resources/application.conf:45-58:

dataSource {
    pool {
        minIdle = 1
        maxPoolSize = 10
        maxLifetime = 1800000
        connectionTimeout = 30000
        validationTimeout = 1000
        idleTimeout = 600000
        initializationFailTimeout = 1
        isAutoCommit = true
        keepAliveTime = 600000
        transactionIsolation = "TRANSACTION_REPEATABLE_READ"
    }
}

The JDBC stack today (common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/DataSource.kt):

  • HikariCP as the application-level pool, one pool per module DB (six pools in operations: kanban, item, businessaffiliates, facility, station, batch).
  • jdbcUrl of the form jdbc:postgresql://<aurora-cluster-writer-endpoint>:<port>/<db>.
  • driverClassName = "org.postgresql.Driver".
  • No read/write splitting — every transaction lands on the writer endpoint.
  • No failover-aware behavior — when Aurora promotes a different writer instance, the JVM DNS cache continues to resolve the cluster endpoint to the previously-promoted instance for ~30 s (per the JVM’s default networkaddress.cache.ttl).

Connection.setReadOnly(true) is propagated by Exposed when callers pass readOnly = true to transaction(...). Today this flag is set but unused at the JDBC layer — it’s a no-op against a writer-endpoint connection.

common-module/lib/src/main/kotlin/cards/arda/common/lib/api/rest/types/HttpResponses.kt:233-250 defines the canonical appErrorResponse mapping. AppError.Internal subtypes (Implementation, Infrastructure, InternalService, IncompatibleState, InternalTimeout, ExternalService) all render as HTTP 500 with the exception message in the body. There is no AppError.Transient branch, no HTTP 503 contract, and no Retry-After header.

Operations-side SQLException handler audit (completed 2026-05-18): grep -rnE 'SQLException|ExposedSQLException|PSQLException' src/main/kotlin/ against the operations worktree returned 0 hits. The canonical StatusPages handler in common-module is the sole HTTP renderer for SQL exceptions.

Today, an Aurora failover triggers the following sequence:

  1. The previously-promoted writer instance becomes unavailable.
  2. HikariCP detects connection failure on the next acquire and starts retrying within the 30 s connectionTimeout window.
  3. The JVM continues to resolve the cluster endpoint to the dead IP for up to 30 s (DNS cache).
  4. Connections continue to fail. HikariCP exhausts its retry budget; transactions surface as org.postgresql.util.PSQLException / org.jetbrains.exposed.exceptions.ExposedSQLException.
  5. The StatusPages handler maps these to HTTP 500 (AppError.Implementation).
  6. The user-visible 5xx window is ~30 s long, all HTTP 500.

There is no graceful-degradation path, no retry-on-transient at the inTransaction boundary, and no Aurora-topology awareness — the JVM does not know that Aurora has promoted a different writer until the DNS cache expires.

Sentry transaction durations on platform-be, trailing 5 days, all environments:

| Route | Env | Count | p50 | p95 | p99 |
| --- | --- | --- | --- | --- | --- |
| GET /v1/kanban/.../kanban-card/for-item/{item-eid} | Alpha001-prod | 4,375 | 1,113 ms | 2,911 ms | 3,677 ms |
| GET /v1/kanban/.../kanban-card/for-item/{item-eid} | Alpha002-dev | 1,140 | 553 ms | 1,610 ms | 2,173 ms |
| GET /v1/kanban/.../kanban-card/for-item/{item-eid} | Alpha002-stage | 142 | 694 ms | 1,725 ms | 1,854 ms |
| POST /v1/kanban/.../kanban-card/details | Alpha001-prod | 24,755 | 289 ms | 2,035 ms | 3,215 ms |
| POST /v1/kanban/.../kanban-card/details | Alpha002-dev | 761 | 1,213 ms | 2,081 ms | 2,672 ms |
| POST /v1/kanban/.../kanban-card/details | Alpha002-stage | 70 | 680 ms | 2,110 ms | 2,182 ms |

For reference, lighter sibling kanban-card routes on Alpha001-prod (no fan-out, no wide-row hydration):

| Route | p50 | p95 |
| --- | --- | --- |
| POST .../kanban-card/details/{status} | 119 ms | 195 ms |
| POST .../kanban-card/query | 76 ms | 145 ms |
| GET .../kanban-card/{entity-id} | 6 ms | 7 ms |

These sibling routes establish what the kanban-card SQL surface looks like when the inner subquery isn’t the dominant cost — single-digit-millisecond simple lookups, ~100–200 ms for filtered listings without per-row hydration.

Connection-timeout signal: zero Sentry events for connectionTimeout, SQLTransientConnectionException, HikariPool, or the broader connection term across the errors and logs datasets in the trailing 4 days (sanity check: 1,120,365 spans on Alpha001-prod over the same window confirm that instrumentation is live). The writer-side connection pool is not under saturation pressure today.

HPA configuration (operations/src/main/helm/values-*.yaml, working tree at origin/main 2026-05-19):

| Environment | minReplicas | maxReplicas |
| --- | --- | --- |
| values-prod.yaml | 2 | 8 |
| values-stage.yaml | 2 | 4 |
| values-demo.yaml | 2 | 4 |
| values-dev.yaml | 2 | 4 |
| values-local.yaml | 1 | 2 |
| chart default | 2 | 4 |

Only prod runs at the upper maxReplicas = 8.

PDEV-490 changes the persistence layer and the kanban-side SQL surface in coordinated steps:

common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/DataSource.kt wires HikariCP through the AWS Advanced JDBC Wrapper (software.amazon.jdbc:aws-advanced-jdbc-wrapper:4.0.1):

  • jdbcUrl template changes from jdbc:postgresql://… to jdbc:aws-wrapper:postgresql://….
  • driverClassName = "software.amazon.jdbc.Driver".
  • Plugin pipeline: auroraInitialConnection, failover2, efm2, readWriteSplitting.
  • HikariConfig.exceptionOverrideClassName = "software.amazon.jdbc.util.HikariCPSQLException" so HikariCP cooperates with wrapper-emitted failover exceptions instead of evicting healthy connections.
  • Aurora-tuning properties: failoverClusterTopologyRefreshRateMs = 2000, failoverReaderConnectTimeoutMs = 5000, failoverWriterReconnectIntervalMs = 2000, loadBalanceReadOnlyTraffic = true.
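
As a rough illustration of the settings listed above, the sketch below assembles the wrapper-scheme URL and a java.util.Properties bag. It assumes the plugin list is passed through the wrapper's wrapperPlugins connection parameter; the plugin codes and tuning values are copied from this section as written and are not verified against the wrapper release. The function names, host, port, and database are placeholders, and how the properties are handed to HikariCP is left to DataSource.kt.

```kotlin
import java.util.Properties

// Builds the wrapper-scheme JDBC URL; host, port and db are placeholders.
fun wrapperJdbcUrl(host: String, port: Int, db: String): String =
    "jdbc:aws-wrapper:postgresql://$host:$port/$db"

// Collects the driver-level settings named in this section.
// Plugin codes and values are taken from the text, not from the wrapper docs.
fun wrapperProperties(): Properties = Properties().apply {
    setProperty("wrapperPlugins", "auroraInitialConnection,failover2,efm2,readWriteSplitting")
    setProperty("failoverClusterTopologyRefreshRateMs", "2000")
    setProperty("failoverReaderConnectTimeoutMs", "5000")
    setProperty("failoverWriterReconnectIntervalMs", "2000")
    setProperty("loadBalanceReadOnlyTraffic", "true")
}

fun main() {
    println(wrapperJdbcUrl("example-cluster-endpoint", 5432, "kanban"))
    wrapperProperties().forEach { (key, value) -> println("$key=$value") }
}
```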

After this lands, Connection.setReadOnly(true) (which Exposed already calls on transaction(readOnly = true)) becomes meaningful — the wrapper’s readWriteSplitting plugin routes read-only physical connections to an Aurora reader instance; writes land on the writer instance. The application-level HikariCP pool, its size, and its caller-facing surface are unchanged.

Two new Flyway migrations:

  • operations/src/main/resources/resources/kanban/database/migrations/V007__kanban_card_bitemporal_indexes.sql — adds three indexes on kanban_card in a single file: the two composite bitemporal indexes ((eid, effective_as_of DESC, recorded_as_of DESC) and (tenant_id, item_reference_entity_id, eid, effective_as_of DESC, recorded_as_of DESC)) plus the missing (tenant_id) index. None carry a WHERE retired = FALSE partial predicate.
  • operations/src/main/resources/reference/item/database/migrations/V*__item_bitemporal_indexes.sql — adds the composite bitemporal index on item matching the same shape. Existing idx_item_tenant stands.

All indexes use CREATE INDEX CONCURRENTLY, which means each statement must run outside a Flyway transaction (one statement per migration file or executeInTransaction = false on the migration).
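
For orientation, the three kanban_card statements described above can be rendered with a small helper such as the one below, for example inside a migration test. This is a sketch under assumptions: the index names are hypothetical, and only the table, column lists, and the absence of a partial predicate come from the text.

```kotlin
// Renders one CREATE INDEX CONCURRENTLY statement. No WHERE clause is emitted,
// matching the decision not to use a partial predicate.
fun concurrentIndexDdl(name: String, table: String, columns: List<String>): String =
    "CREATE INDEX CONCURRENTLY $name ON $table (${columns.joinToString(", ")});"

// The three kanban_card indexes described for V007. Index names are hypothetical.
fun kanbanCardIndexDdl(): List<String> = listOf(
    concurrentIndexDdl(
        "idx_kanban_card_bt_eid", "kanban_card",
        listOf("eid", "effective_as_of DESC", "recorded_as_of DESC"),
    ),
    concurrentIndexDdl(
        "idx_kanban_card_bt_tenant_item", "kanban_card",
        listOf("tenant_id", "item_reference_entity_id", "eid", "effective_as_of DESC", "recorded_as_of DESC"),
    ),
    concurrentIndexDdl("idx_kanban_card_tenant_id", "kanban_card", listOf("tenant_id")),
)

fun main() {
    kanbanCardIndexDdl().forEach(::println)
}
```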

ServiceImpl.kt:276-293 collapses to:

override suspend fun cardsForItem(itemReference: ItemReference, asOf: TimeCoordinates)
    : Result<Page<KanbanCard, KanbanCardMetadata>> = inTransaction(db, readOnly = true) {
    universe.list(
        Query(Filter.Eq(KANBAN_CARD_TABLE.item.eId.name, itemReference.eId), Pagination(0, 1000)),
        asOf,
        includeDeleted = false,
        withTotal = false
    )()
}

Two coupled changes that must land together: flip withTotal = true → false AND delete the flatMap { … when (pg.totalCount) … } block.

common-module gains:

  • A new AppError.Transient sealed branch under AppError.Internal, with three subtypes wrapping the wrapper’s typed exceptions: FailoverSucceeded (over FailoverSuccessSQLException), TransactionStateUnknown (over TransactionStateUnknownSQLException), FailoverFailed (over FailoverFailedSQLException).
  • New branches on the existing Throwable.normalizeToAppError() extension (at common-module/lib/src/main/kotlin/cards/arda/common/lib/lang/errors/AppError.kt:192) that walk the cause chain (unwrapping ExposedSQLException and HikariCP wrapping) to detect the three wrapper exception classes. No separate adapter class is introduced; classification stays in the canonical normalizeToAppError function.
  • StatusPages rendering of AppError.Transient as HTTP 503 with header Retry-After: 2.
  • A retry policy at the inTransactionAsync / inTransactionSync boundary that catches the three transient types, retries up to PoolConfig.maxAttempts - 1 additional times with PoolConfig.backoffMs ms between attempts, and surfaces AppError.Transient once retries exhaust.
  • New PoolConfig fields maxAttempts (default 2) and backoffMs (default 300).
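
The classification and retry behaviour described in the list above could look roughly like the following. This is a minimal sketch, not the common-module implementation: the three exception classes are local stand-ins for the wrapper's typed exceptions, findTransient and withTransientRetry are hypothetical names, and a blocking Thread.sleep stands in for whatever the inTransactionAsync / inTransactionSync boundary would actually use.

```kotlin
import java.sql.SQLException

// Local stand-ins for the wrapper's typed exceptions (the real classes live in the wrapper library).
class FailoverSuccessSQLException : SQLException("failover succeeded")
class TransactionStateUnknownSQLException : SQLException("transaction state unknown")
class FailoverFailedSQLException : SQLException("failover failed")

// Walks the cause chain (depth-bounded to guard against cycles) looking for a transient exception.
fun Throwable.findTransient(): SQLException? {
    var current: Throwable? = this
    var depth = 0
    while (current != null && depth < 32) {
        when (current) {
            is FailoverSuccessSQLException,
            is TransactionStateUnknownSQLException,
            is FailoverFailedSQLException -> return current as SQLException
        }
        current = current.cause
        depth++
    }
    return null
}

// Runs block up to maxAttempts times, sleeping backoffMs between attempts.
// Only failures that classify as transient are retried.
fun <T> withTransientRetry(maxAttempts: Int = 2, backoffMs: Long = 300, block: () -> T): Result<T> {
    require(maxAttempts >= 1)
    var last: Throwable? = null
    repeat(maxAttempts) { attempt ->
        try {
            return Result.success(block())
        } catch (e: Exception) {
            if (e.findTransient() == null) return Result.failure(e) // non-transient: no retry
            last = e
            if (attempt < maxAttempts - 1) Thread.sleep(backoffMs)
        }
    }
    return Result.failure(last!!) // retries exhausted: caller maps this to the transient error
}

fun main() {
    var calls = 0
    val result = withTransientRetry(maxAttempts = 2, backoffMs = 10) {
        calls++
        if (calls == 1) throw RuntimeException("wrapped", FailoverSuccessSQLException())
        "ok"
    }
    println("$calls ${result.getOrNull()}") // 2 ok
}
```

With the defaults of maxAttempts = 2 and backoffMs = 300, a call makes at most one additional attempt after a transient failure before the error is surfaced.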

Operations consumes the new release by:

  • Bumping the common-module pin in operations/gradle/libs.versions.toml.
  • Updating application.conf dataSource.jdbcUrl to the jdbc:aws-wrapper:postgresql://… scheme.
  • Adding the explicit dataSource.pool.maxAttempts = 2 and dataSource.pool.backoffMs = 300 knobs in application.conf (defaults match common-module; explicit values document the env contract).

common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/universe/AbstractScopedUniverse.kt:27 — the tenantId.index("TENANT_ID_INDEX") call is removed; the column declaration becomes plain uuid(ScopedMetadata.COLUMN_TENANT_ID). No runtime change (Exposed’s schema-emit was never relied on); the decorative declaration is removed so future readers don’t infer a guarantee that doesn’t exist. Flyway is the single authoritative source for indexes.

| Area | Current | Target | Gap closed by |
| --- | --- | --- | --- |
| Bitemporal SELECT plan on kanban_card | Sequential / single-column index lookup on the correlated subquery | Index scan on the composite (tenant_id, item_reference_entity_id, eid, effective_as_of DESC, recorded_as_of DESC) | Wave 1 kanban Flyway PR |
| Bitemporal SELECT plan on item | Sequential / single-column index lookup | Index scan on the composite | Wave 1 item Flyway PR |
| cardsForItem SQL count | 2 statements (1 COUNT + 1 SELECT) | 1 statement (SELECT only) | Wave 1 kanban Kotlin change |
| Tenant-id index on kanban_card | Missing | Present | Wave 1 kanban Flyway PR (consolidated with the bitemporal-index migration) |
| Decorative TENANT_ID_INDEX declaration | Present at AbstractScopedUniverse.kt:27 | Removed | Wave 2 common-module release |
| Read/write splitting | None; all transactions hit writer | Read-only transactions auto-route to Aurora reader instance via wrapper’s readWriteSplitting plugin | Wave 2 common-module release |
| Failover detection latency | ~30 s (JVM DNS cache–bound) | ~2–5 s (Aurora topology API via wrapper’s failover2 plugin) | Wave 2 common-module release |
| Transient SQL HTTP contract | HTTP 500 with raw exception body | HTTP 503 with Retry-After: 2 | Wave 2 common-module release |
| Retry on transient | None | In-process retry with maxAttempts=2, backoffMs=300 at the inTransaction boundary | Wave 2 common-module release |
| Operations consumer wiring | Default JDBC scheme, no retry knobs | jdbc:aws-wrapper:postgresql://… scheme, explicit retry knobs in application.conf | Wave 3 operations PR |
| Documentation | No site pages on wrapper / bitemporal-index pattern / Flyway-authoritative convention / 503 contract; no runbooks for the wrapper deploy or the synthetic-failover test | All four site pages and all three runbooks present | Wave 4 documentation PR |
| Synthetic-failover acceptance test | Not exercised | Procedure documented; passes on dev before promotion | Wave 5 dev failover test |

These adjacent surfaces are deliberately untouched by PDEV-490:

  • The items-page front-end consumer of listWithDetails. Tracked on PDEV-489. The front-end resolution path (consolidate the two per-row backend calls into one page-level /v1/kanban/kanban-card/query call with Filter.In(item_reference_entity_id, [eIds…])) does not require any new back-end route — it uses an existing one. PDEV-490 ships the composite kanban-card index that the new front-end SQL plan needs, but the front-end implementation itself is not part of this project.
  • listWithDetails chunked-fan-out refactor. A previously proposed refactor (listWithDetails collapses the per-chunk inTransaction into a single up-front Filter.In fetch) was cancelled when the front-end resolution moved off this route entirely. Remaining callers (ItemDetailsPanel.fetchCards, ManageCardsPanel.fetchCards) are single-item flows where the chunk-vs-fetch tradeoff has no forcing function.
  • A new summary/for-items aggregate route on the kanban service. Cancelled. The front-end consolidation onto the existing /v1/kanban/kanban-card/query route renders the new aggregate redundant.
  • Pool-size tuning on item and kanban DBs. The wrapper’s read/write split removes the writer-pool ceiling pressure that would have driven a tuning pass. The current maxPoolSize = 10 stays. Sentry shows zero connection-timeout pressure in the trailing 4 days across all environments.
  • HPA maxReplicas reduction. Was a fallback under the originally-considered writer-pool budget pressure; the wrapper’s read/write split removes the budget pressure. No change to HPA.
  • JVM DNS TTL helm chart change. Was relevant under the original DNS-cache-bound failover detection; the wrapper bypasses DNS for failover detection (uses the Aurora topology API). The chart-level networkaddress.cache.ttl override is not added.
  • transactionIsolation evaluation (REPEATABLE_READ → READ_COMMITTED on read-only paths). Filed as Linear PDEV-534 to run after PDEV-490 ships, so the post-wrapper, post-index baseline is the reference point.
  • RDS Proxy adoption. Closed as won’t-do (Linear PDEV-499); the wrapper is incompatible with RDS Proxy by design.
  • Service-level read cache on kanban_card / item. Deferred; revisit only after the new indexes have soaked and pg_stat_statements still shows headroom.
  • cardsForItem bulk-handler cleanup on the items page. Three items-page bulk handlers (handleDeleteMultipleItems, handlePrintSelectedCards, handlePreviewSelectedCards) still loop per selected item against cardsForItem; user-initiated, latency tolerable. Future ticket.
  • Per-module tenant_id Flyway migrations for FACILITY_TABLE, STATION_TABLE, ORDER_HEADER_TABLE. The audit surfaced these; deferred to a future per-module hygiene pass. The misplaced BATCH_JOB migration (declared in the item module’s tree) is also accepted as-is.

PDEV-490 is low risk by construction — most changes are additive (new indexes, new error branch) or coupled by design (the cardsForItem two-line change). Failure modes worth pinning:

  • Coupled K12 regression. If the withTotal = true → false flag flip ships without removing the surrounding flatMap { … when (pg.totalCount) … } block, every cardsForItem call returns HTTP 500 (the previously-dead Result.failure(AppError.IncompatibleState) arm becomes the live branch). Mitigation: the change is documented as a coupled two-line change; verification asserts both arms cover zero-row and multi-row cases.
  • Wrapper jdbcUrl scheme regression. The jdbcUrl scheme change is breaking. If a consumer of common-module (today only operations; future: accounts-component) bumps the common-module pin without updating its jdbcUrl, the new driver class cannot resolve and the pod fails on startup. Mitigation: the change is documented in the common-module release CHANGELOG as Changed with explicit “Consumers must update jdbcUrl”; operations consumer PR ships both the pin bump and the scheme change in the same PR.
  • Reader-endpoint topology discovery. The wrapper’s topology cache is built lazily on first connection. The first request after a pod cold start may pay a topology-discovery cost. Mitigation: auroraInitialConnection plugin in the pipeline; failoverClusterTopologyRefreshRateMs = 2000 keeps the cache fresh post-discovery.
  • CREATE INDEX CONCURRENTLY on busy tables. The kanban-card and item migrations use CONCURRENTLY so they don’t lock the table. On a sufficiently active table the index build can fail and leave an invalid index behind (pg_index.indisvalid = false), which requires manual cleanup. Mitigation: ship to dev first; on failure, drop the invalid index and rerun. The migration is idempotent at the CREATE INDEX IF NOT EXISTS level only for indexes that completed: an invalid index still holds its name, so IF NOT EXISTS skips it rather than rebuilding it.
  • Wrapper compatibility with Exposed. The wrapper hooks into the JDBC Connection.setReadOnly lifecycle. Exposed at version 0.60.0 (pinned in common-module/gradle/libs.versions.toml:11) sets readOnly before autoCommit, which is the ordering the wrapper expects. The two-line ordering was verified by source inspection of ThreadLocalTransactionManager.kt:131-161 and JdbcConnectionImpl.kt:46-50 during the design phase.

The dev synthetic-failover test gates promotion beyond dev; demo / stage / prod each take a standard per-environment soak window after that.

  • operations/src/main/kotlin/cards/arda/operations/resources/kanban/api/rest/KanbanCardEndpoint.kt — route declarations.
  • operations/src/main/kotlin/cards/arda/operations/resources/kanban/service/ServiceImpl.kt:276-293 (cardsForItem), ServiceImpl.kt:322-350 (listWithDetails) — service implementations.
  • operations/src/main/kotlin/cards/arda/operations/resources/kanban/persistence/KanbanCardPersistence.kt:24-34 (KANBAN_CARD_TABLE declaration, including the item_reference component).
  • operations/src/main/kotlin/cards/arda/operations/reference/item/domain/persistence/ItemReferenceComponent.kt:24 — the item_reference_entity_id column declaration consumed by Filter.In(KANBAN_CARD_TABLE.item.eId.name, …).
  • operations/src/main/resources/resources/kanban/database/migrations/V001__kanban.sql:50-52 — current indexes on kanban_card.
  • operations/src/main/resources/reference/item/database/migrations/V012__bt_indexes.sql:8 — current idx_item_tenant; V012__bt_indexes.sql:12 — misplaced idx_batch_job_tenant (out of scope to fix).
  • operations/src/main/resources/application.conf:45-58 (dataSource.pool block).
  • operations/src/main/helm/values-prod.yaml:14-15 — prod HPA minReplicas, maxReplicas.
  • common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/bitemporal/Persistence.kt — bitemporal SQL emitter (self-alias bt at line 88; selection condition at lines 214-215; asOfCondition helper at lines 92-95).
  • common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/universe/AbstractUniverse.kt:152-180 (the list(…, withTotal) method with the COUNT + SELECT logic).
  • common-module/lib/src/main/kotlin/cards/arda/common/lib/persistence/universe/AbstractScopedUniverse.kt:27 — decorative tenantId.index("TENANT_ID_INDEX") declaration.
  • common-module/lib/src/main/kotlin/cards/arda/common/lib/api/rest/types/HttpResponses.kt:233-250 (appErrorResponse and internalErrorResponse mapping).
  • common-module/gradle/libs.versions.toml:11 — Exposed version pin (0.60.0).

Copyright: (c) Arda Systems 2025-2026, All rights reserved