Requirements: PDEV-490 Operations Performance Improvements

Author: Claude Opus for jmpicnic | Date: 2026-05-19 | Status: Draft

Functional and non-functional requirements for PDEV-490. Each requirement is written in EARS form (Easy Approach to Requirements Syntax) — ubiquitous, event-driven, state-driven, unwanted-behavior, or complex — and carries a Source column tracing back to the PDEV-490 goal, analysis, or an external standard. Verification methods and acceptance criteria for each requirement live in the verification matrix.

EARS pattern summary used in this document:

  • Ubiquitous: The <system> shall <behaviour>.
  • Event-driven: When <trigger>, the <system> shall <behaviour>.
  • State-driven: While <state>, the <system> shall <behaviour>.
  • Unwanted-behaviour: If <unwanted condition>, then the <system> shall <behaviour>.
  • Complex: combinations of the above.

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-001 | The common-module build shall declare a dependency on the AWS Advanced JDBC Wrapper artifact (software.amazon.jdbc:aws-advanced-jdbc-wrapper:4.0.1) so the wrapper’s classes are available on the classpath of every consumer. | Goal § Common Module; analysis § Target state / JDBC stack. |
| REQ-PDEV490-002 | The common-module DataSource shall treat the consumer-supplied jdbcUrl (via DbConfig.url) as authoritative and shall detect wrapper opt-in by testing whether that URL starts with the prefix jdbc:aws-wrapper:. URL construction is not in common-module’s scope. | Goal § Common Module; analysis § Target state / JDBC stack. |
| REQ-PDEV490-003 | When the jdbcUrl indicates wrapper opt-in (REQ-PDEV490-002), the common-module DataSource shall set HikariCP’s driverClassName to software.amazon.jdbc.Driver. Otherwise, driverClassName is left unset (JDBC auto-discovery resolves the PostgreSQL driver). | Analysis § Target state / JDBC stack. |
| REQ-PDEV490-004 | When the jdbcUrl indicates wrapper opt-in (REQ-PDEV490-002), the common-module DataSource shall set the wrapper plugin pipeline auroraInitialConnection,failover2,efm2,readWriteSplitting via the wrapperPlugins HikariCP dataSourceProperty. Otherwise, no wrapper plugin property is set. | Goal § Common Module; analysis § Target state / JDBC stack. |
| REQ-PDEV490-005 | When the jdbcUrl indicates wrapper opt-in (REQ-PDEV490-002), the common-module DataSource shall set HikariCP’s exceptionOverrideClassName to software.amazon.jdbc.util.HikariCPSQLException so HikariCP cooperates with the wrapper’s failover exception classes. Otherwise, exceptionOverrideClassName is left at its default. | Analysis § Target state / JDBC stack. |
| REQ-PDEV490-006 | When the jdbcUrl indicates wrapper opt-in (REQ-PDEV490-002), the common-module DataSource shall set Aurora-tuning properties on the wrapper as follows: failoverClusterTopologyRefreshRateMs = 2000, failoverReaderConnectTimeoutMs = 5000, failoverWriterReconnectIntervalMs = 2000, loadBalanceReadOnlyTraffic = true, readerInitialConnectionHostSelectorStrategy = leastConnections. Otherwise, no wrapper tuning property is set. | Analysis § Target state / JDBC stack. |
| REQ-PDEV490-007 | The operations component shall consume the common-module release containing REQ-PDEV490-001 through REQ-PDEV490-006 by updating the commonModule version pin in operations/gradle/libs.versions.toml. | Goal § Operations / General-Configuration. |
| REQ-PDEV490-008 | The operations component’s application.conf shall set dataSource.jdbcUrl to a value matching the jdbc:aws-wrapper:postgresql://… scheme, thereby opting into the wrapper code path on common-module’s DataSource. | Goal § Operations / General-Configuration. |
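
The opt-in rule in REQ-PDEV490-002 through REQ-PDEV490-006 can be sketched as a pure function of the jdbcUrl. This is a minimal illustration, not the common-module implementation: the `WrapperSettings` holder and the `wrapperSettingsFor` function are hypothetical names, and the real DataSource applies these values to a HikariCP configuration rather than returning them.

```kotlin
// Hypothetical holder for the HikariCP settings derived from the jdbcUrl.
data class WrapperSettings(
    val driverClassName: String?,            // null = leave unset (JDBC auto-discovery)
    val exceptionOverrideClassName: String?, // null = leave at the HikariCP default
    val dataSourceProperties: Map<String, String>,
)

// Prefix named in REQ-PDEV490-002.
val WRAPPER_PREFIX = "jdbc:aws-wrapper:"

// Hypothetical helper: derives every wrapper-related setting from the URL alone.
fun wrapperSettingsFor(jdbcUrl: String): WrapperSettings {
    if (!jdbcUrl.startsWith(WRAPPER_PREFIX)) {
        // Plain PostgreSQL URL: nothing wrapper-specific is set.
        return WrapperSettings(null, null, emptyMap())
    }
    return WrapperSettings(
        driverClassName = "software.amazon.jdbc.Driver",
        exceptionOverrideClassName = "software.amazon.jdbc.util.HikariCPSQLException",
        dataSourceProperties = mapOf(
            "wrapperPlugins" to "auroraInitialConnection,failover2,efm2,readWriteSplitting",
            "failoverClusterTopologyRefreshRateMs" to "2000",
            "failoverReaderConnectTimeoutMs" to "5000",
            "failoverWriterReconnectIntervalMs" to "2000",
            "loadBalanceReadOnlyTraffic" to "true",
            "readerInitialConnectionHostSelectorStrategy" to "leastConnections",
        ),
    )
}
```

Keeping the decision in one URL-driven function mirrors the requirement that consumers on jdbc:postgresql: URLs see no wrapper-specific configuration at all.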

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-010 | When the application opens a transaction with readOnly = true, the wrapper’s readWriteSplitting plugin shall route the underlying physical connection to an Aurora reader instance. | Goal § Common Module; analysis § Target state / JDBC stack. |
| REQ-PDEV490-011 | When the application opens a transaction with readOnly = false (or unset), the wrapper shall route the underlying physical connection to the Aurora writer instance. | Goal § Common Module; analysis § Target state / JDBC stack. |
| REQ-PDEV490-012 | The application-level HikariCP pool size, eviction policy, and caller-facing API surface shall not change as a consequence of the wrapper adoption. | Analysis § Target state / JDBC stack. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-020 | The operations repository shall include a new Flyway migration in operations/src/main/resources/resources/kanban/database/migrations/ that adds the composite index (eid, effective_as_of DESC, recorded_as_of DESC) on kanban_card. | Goal § Kanban Module; analysis § Target state / Composite bitemporal indexes. |
| REQ-PDEV490-021 | The same Flyway migration shall add the composite index (tenant_id, item_reference_entity_id, eid, effective_as_of DESC, recorded_as_of DESC) on kanban_card. | Goal § Kanban Module; analysis § Target state / Composite bitemporal indexes. |
| REQ-PDEV490-022 | The same Flyway migration shall add the single-column index (tenant_id) on kanban_card, closing the tenant_id-audit gap identified for KANBAN_CARD_TABLE. | Goal § Kanban Module; analysis § Current state / Index coverage. |
| REQ-PDEV490-023 | The new kanban_card indexes shall be created using CREATE INDEX CONCURRENTLY and shall not carry a WHERE retired = FALSE partial-index predicate. | Analysis § Target state / Composite bitemporal indexes; workbook decision T17. |
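
To make the index shapes in REQ-PDEV490-020 through REQ-PDEV490-023 concrete, the sketch below renders the corresponding DDL statements as strings. It is only an illustration: the real migration is a Flyway SQL file, the `concurrentIndexDdl` helper is hypothetical, and the index names are invented here because the requirements do not specify them.

```kotlin
// Hypothetical helper: renders a CREATE INDEX CONCURRENTLY statement with no
// WHERE clause, i.e. a full (non-partial) index as required by REQ-PDEV490-023.
fun concurrentIndexDdl(indexName: String, table: String, columns: List<String>): String =
    "CREATE INDEX CONCURRENTLY $indexName ON $table (${columns.joinToString(", ")});"

// Index names below are assumptions made for illustration only.
val kanbanCardIndexDdl: List<String> = listOf(
    // REQ-PDEV490-020
    concurrentIndexDdl(
        "idx_kanban_card_eid_bitemporal", "kanban_card",
        listOf("eid", "effective_as_of DESC", "recorded_as_of DESC"),
    ),
    // REQ-PDEV490-021
    concurrentIndexDdl(
        "idx_kanban_card_tenant_item_bitemporal", "kanban_card",
        listOf("tenant_id", "item_reference_entity_id", "eid", "effective_as_of DESC", "recorded_as_of DESC"),
    ),
    // REQ-PDEV490-022
    concurrentIndexDdl("idx_kanban_card_tenant_id", "kanban_card", listOf("tenant_id")),
)
```

PostgreSQL does not allow CREATE INDEX CONCURRENTLY inside a transaction block, which is why a migration carrying these statements has to run non-transactionally.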

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-025 | The operations repository shall include a new Flyway migration in operations/src/main/resources/reference/item/database/migrations/ that adds the composite bitemporal index on item, matching the shape of the kanban_card indexes: it shall cover (tenant_id, eid, effective_as_of DESC, recorded_as_of DESC), with an optional (eid, effective_as_of DESC, recorded_as_of DESC) companion if data-driven verification justifies it. | Goal § Item Module; analysis § Target state / Composite bitemporal indexes. |
| REQ-PDEV490-026 | The new item indexes shall be created using CREATE INDEX CONCURRENTLY and shall not carry a WHERE retired = FALSE partial-index predicate. | Workbook decision T17; analysis § Target state / Composite bitemporal indexes. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-030 | The cardsForItem method on KanbanCardService shall call universe.list(…) with withTotal = false. | Goal § Kanban Module; analysis § Current state / cardsForItem. |
| REQ-PDEV490-031 | The cardsForItem method shall not contain the flatMap { … when (pg.totalCount) … } block: the method body collapses to a single inTransaction { universe.list(…)() } expression. | Goal § Kanban Module; analysis § Current state / cardsForItem. |
| REQ-PDEV490-032 | The two changes in REQ-PDEV490-030 and REQ-PDEV490-031 shall ship in a single coupled commit; an intermediate state in which the flag is flipped but the flatMap / when block remains is forbidden because it converts the previously-dead AppError.IncompatibleState branch into a 100%-failure regression. | Analysis § Risks and constraints; workbook decisions K12-commit, T18-resolved-by-K12. |
| REQ-PDEV490-033 | When cardsForItem is invoked for an item with zero cards, the system shall return HTTP 200 with a Page whose records is an empty list and totalCount is null. | Analysis § Current state / cardsForItem; workbook decision T18-resolved-by-K12. |
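
The effect of the withTotal = false flip can be sketched with stand-in types. Everything here is an assumption made for illustration: `Page` and `FakeUniverse` are simplified stand-ins, the real universe.list(…) takes more arguments and returns a value that is invoked inside inTransaction, and the real method lives on KanbanCardService.

```kotlin
// Simplified stand-in for the real Page type; totalCount is null when no total was requested.
data class Page<T>(val records: List<T>, val totalCount: Long?)

// Hypothetical in-memory universe that counts how many COUNT-shaped queries it would issue.
class FakeUniverse<T>(private val rows: List<T>) {
    var countQueries: Int = 0
        private set

    // Mimics the withTotal flag: the total is computed only when explicitly requested.
    fun list(withTotal: Boolean): Page<T> {
        val total: Long? = if (withTotal) {
            countQueries++
            rows.size.toLong()
        } else {
            null
        }
        return Page(rows, total)
    }
}

// Sketch of the collapsed method body: a single list call and no branching on totalCount.
fun <T> cardsForItem(universe: FakeUniverse<T>): Page<T> = universe.list(withTotal = false)
```

Because nothing downstream inspects totalCount any more, an item with zero cards yields an empty page rather than an error branch.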

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-040 | The common-module shall declare a new AppError.Transient sealed branch under AppError.Internal with three subtypes: AppError.Transient.FailoverSucceeded, AppError.Transient.TransactionStateUnknown, AppError.Transient.FailoverFailed. | Goal § Common Module; analysis § Target state / AppError.Transient + HTTP 503. |
| REQ-PDEV490-041 | The common-module Throwable.normalizeToAppError() function shall classify each of FailoverSuccessSQLException, TransactionStateUnknownSQLException, and FailoverFailedSQLException to the corresponding AppError.Transient subtype declared in REQ-PDEV490-040, including the case where the wrapper SQL exception is wrapped inside an ExposedSQLException. The classification is added as new branches on the existing normalizeToAppError function (no separate adapter class is introduced). | Analysis § Target state / AppError.Transient + HTTP 503; common-module convention at AppError.kt:192. |
| REQ-PDEV490-042 | When StatusPages receives an AppError.Transient, it shall render the response as HTTP 503 with header Retry-After: 2. | Goal § Common Module § Success Criteria; analysis § Target state / AppError.Transient + HTTP 503. |
| REQ-PDEV490-043 | When StatusPages receives an AppError.Internal subtype that is not AppError.Transient, it shall continue to render the response as HTTP 500 (the existing contract). | Analysis § Current state / Error rendering. |
| REQ-PDEV490-044 | The HTTP 503 response body shall match the existing ErrorResponse shape used for HTTP 500 responses, with code = 503 and a message string sourced from the underlying AppError.Transient.cause.message. | Analysis § Target state / AppError.Transient + HTTP 503. |
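
A minimal sketch of the classification and rendering rules in REQ-PDEV490-040 through REQ-PDEV490-043 follows. It rests on several assumptions: the three exception classes are local stand-ins that only share their simple names with the wrapper's classes, the `AppError` hierarchy is reduced to the branches named above, and `HttpError` and `render` are hypothetical substitutes for the StatusPages handler.

```kotlin
import java.sql.SQLException

// Local stand-ins for the wrapper's failover exceptions (the real classes ship with the wrapper).
class FailoverSuccessSQLException(message: String) : SQLException(message)
class TransactionStateUnknownSQLException(message: String) : SQLException(message)
class FailoverFailedSQLException(message: String) : SQLException(message)

// Reduced sketch of the error hierarchy: Transient is a sealed branch under Internal.
sealed class AppError(val cause: Throwable) {
    open class Internal(cause: Throwable) : AppError(cause)
    sealed class Transient(cause: Throwable) : Internal(cause) {
        class FailoverSucceeded(cause: Throwable) : Transient(cause)
        class TransactionStateUnknown(cause: Throwable) : Transient(cause)
        class FailoverFailed(cause: Throwable) : Transient(cause)
    }
}

// Walks the cause chain (bounded, to guard against cycles) so that a wrapper
// exception nested inside another exception is still classified as transient.
fun Throwable.normalizeToAppError(): AppError {
    val match: AppError? = generateSequence(this) { it.cause }
        .take(16)
        .firstNotNullOfOrNull { t ->
            when (t) {
                is FailoverSuccessSQLException -> AppError.Transient.FailoverSucceeded(t)
                is TransactionStateUnknownSQLException -> AppError.Transient.TransactionStateUnknown(t)
                is FailoverFailedSQLException -> AppError.Transient.FailoverFailed(t)
                else -> null
            }
        }
    return match ?: AppError.Internal(this)
}

// Hypothetical rendering result standing in for the StatusPages response.
data class HttpError(val status: Int, val headers: Map<String, String>, val message: String?)

fun render(error: AppError): HttpError = when (error) {
    is AppError.Transient -> HttpError(503, mapOf("Retry-After" to "2"), error.cause.message)
    else -> HttpError(500, emptyMap(), error.cause.message)
}
```

In this sketch a generic RuntimeException plays the role of ExposedSQLException as the outer wrapper; the point is only that classification looks through the cause chain rather than at the outermost throwable.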

Retry policy at the inTransaction boundary

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-050 | The common-module PoolConfig data class shall expose two new fields: maxAttempts: Int (default 2) and backoffMs: Long (default 300). | Goal § Common Module; analysis § Target state / Retry policy. |
| REQ-PDEV490-051 | When inTransactionAsync or inTransactionSync catches a throwable that Throwable.normalizeToAppError() classifies as an AppError.Transient, the system shall retry the transaction block up to PoolConfig.maxAttempts - 1 additional times, separated by PoolConfig.backoffMs milliseconds. | Goal § Common Module; analysis § Target state / Retry policy. |
| REQ-PDEV490-052 | While retrying per REQ-PDEV490-051, the system shall not retry on any throwable other than the three transient classes declared in REQ-PDEV490-040. Non-transient throwables shall surface immediately. | Analysis § Target state / Retry policy. |
| REQ-PDEV490-053 | If a transaction block fails on every attempt up to PoolConfig.maxAttempts, then the system shall surface the final AppError.Transient, which StatusPages shall render per REQ-PDEV490-042. | Analysis § Target state / AppError.Transient + HTTP 503; analysis § Target state / Retry policy. |
| REQ-PDEV490-054 | The operations component’s application.conf shall declare dataSource.pool.maxAttempts = 2 and dataSource.pool.backoffMs = 300 as explicit values matching the common-module defaults. | Goal § Operations / General-Configuration. |
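
The retry rules in REQ-PDEV490-051 through REQ-PDEV490-053 can be sketched as a small blocking loop. This is an illustration under stated assumptions: `PoolConfig` is reduced to its two new fields, `withTransientRetry` and `TransientFailure` are hypothetical names, and the `isTransient` predicate stands in for the check that normalizeToAppError() yields an AppError.Transient. The real inTransactionAsync variant would presumably suspend rather than block during the backoff.

```kotlin
// Reduced stand-in for PoolConfig carrying only the two fields added by REQ-PDEV490-050.
data class PoolConfig(val maxAttempts: Int = 2, val backoffMs: Long = 300)

// Hypothetical marker exception standing in for the three transient failover classes.
class TransientFailure(message: String) : RuntimeException(message)

// Runs block up to maxAttempts times in total. Only throwables accepted by
// isTransient are retried; anything else surfaces immediately.
fun <T> withTransientRetry(
    config: PoolConfig,
    isTransient: (Throwable) -> Boolean,
    block: () -> T,
): T {
    require(config.maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var lastError: Throwable? = null
    for (attempt in 1..config.maxAttempts) {
        try {
            return block()
        } catch (t: Throwable) {
            if (!isTransient(t)) throw t
            lastError = t
            // Back off only when another attempt will follow.
            if (attempt < config.maxAttempts) Thread.sleep(config.backoffMs)
        }
    }
    // Every attempt failed with a transient error: surface the final one.
    throw checkNotNull(lastError)
}
```

With the defaults (maxAttempts = 2, backoffMs = 300), a block that fails once with a transient error is re-run a single time after 300 ms, which matches the "maxAttempts - 1 additional times" wording.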

Decorative declaration removal and Flyway mixed mode

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-060 | The common-module AbstractScopedUniverse class shall declare its tenantId column as uuid(ScopedMetadata.COLUMN_TENANT_ID) with no chained .index(...) call. | Goal § Common Module; workbook decision T23-flyway-authoritative-for-indexes. |
| REQ-PDEV490-061 | The removal in REQ-PDEV490-060 shall not be accompanied by any other change to Exposed-level index declarations on ScopedTable subclasses; Flyway remains the single authoritative source of database indexes. | Workbook decision T23-flyway-authoritative-for-indexes. |
| REQ-PDEV490-062 | The common-module DbMigration class shall configure Flyway with mixed=true on the FluentConfiguration chain, in addition to the existing group=true. This permits transactional and non-transactional migrations to coexist in a single migration group, which is required so the operations V007 (CREATE INDEX CONCURRENTLY, non-transactional) can apply alongside the existing transactional migration tree V001..V006 on fresh test-container DBs. The change is purely additive: migrations that are already strictly transactional continue to run in their own transactions. | Goal § Constraints / K16; discovered during W1.3 first-of-kind sidecar validation on 2026-05-19. |
| REQ-PDEV490-063 | The common-module DataSource class shall register every HikariDataSource produced by newSqlDataSource() into a JVM-wide registry on its companion object. The registry shall be internal to common-module:lib so consuming modules cannot reach it from production code. A new test-only helper cards.arda.common.lib.testing.persistence.PoolRegistry.closeAllPoolsForTests() shall close every registered pool (idempotent), and ContainerizedPostgres.stop() shall invoke it before stopping the Postgres test container. The ForTests suffix, the testing/persistence package, and an explicit “DO NOT CALL FROM PRODUCTION CODE” KDoc together make accidental misuse from production awkward. Without this fix, HikariCP daemon threads survive across test classes and keep retrying dead Testcontainers ports; accumulated across ~50 test classes, this hangs the JVM under Gradle parallel test execution on multi-core machines. | Goal § Constraints / K17; discovered during W1.3 verification on 2026-05-19. |
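
The registry behaviour described in REQ-PDEV490-063 can be sketched as follows. This is a simplified stand-in rather than the common-module code: `PoolRegistrySketch` is a hypothetical name, AutoCloseable stands in for HikariDataSource, and the visibility restrictions (internal registry, test-only helper package) are not modelled.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical JVM-wide registry; AutoCloseable stands in for HikariDataSource.
object PoolRegistrySketch {
    private val pools = CopyOnWriteArrayList<AutoCloseable>()

    // Records the pool and returns it, so a factory can register in one expression.
    fun <T : AutoCloseable> register(pool: T): T {
        pools.add(pool)
        return pool
    }

    // Closes and forgets every registered pool. A second call finds nothing
    // left to close, which is what makes the helper idempotent.
    fun closeAllPoolsForTests() {
        val snapshot = pools.toList()
        pools.removeAll(snapshot)
        // One failing close must not prevent the remaining pools from closing.
        snapshot.forEach { pool -> runCatching { pool.close() } }
    }
}
```

Removing pools from the registry before closing them keeps repeated calls cheap and avoids closing the same pool twice when several test classes stop their containers in sequence.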

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-070 | When the Aurora cluster promotes a different writer instance (planned or unplanned failover), the wrapper’s failover2 plugin shall detect the topology change via the Aurora topology API within the rate configured by failoverClusterTopologyRefreshRateMs. | Goal § Success Criteria; analysis § Target state / JDBC stack. |
| REQ-PDEV490-071 | During the failover window, transactions that succeed against the newly-promoted writer shall surface as HTTP 200 to the caller (the in-process retry per REQ-PDEV490-051 absorbs the failure). | Goal § Success Criteria; analysis § Target state / Retry policy. |
| REQ-PDEV490-072 | If a transaction fails on all retries within the failover window, then the system shall surface HTTP 503 with Retry-After per REQ-PDEV490-042 (not HTTP 500). | Goal § Success Criteria; analysis § Target state / AppError.Transient + HTTP 503. |
| REQ-PDEV490-073 | The wrapper’s failover detection shall not depend on JVM DNS-cache expiry: the JVM-level networkaddress.cache.ttl setting shall remain at its current value (no helm-chart change introduced by PDEV-490). | Analysis § Out-of-scope surfaces. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-080 | The operations repository shall not contain any explicit handler for java.sql.SQLException, org.jetbrains.exposed.exceptions.ExposedSQLException, or org.postgresql.util.PSQLException outside of common-module-installed surfaces. Verified on 2026-05-18 by grep -rnE 'SQLException\|ExposedSQLException\|PSQLException' src/main/kotlin/ returning zero hits. | Goal § Operations / General-Configuration; analysis § Current state / Error rendering. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-085 | The tenant_id Flyway-coverage audit across ScopedTable consumers in operations shall be considered complete on PDEV-490; the dispositions recorded on 2026-05-18 (OK / Do-not-touch / Add-migration / Skip) shall be honoured as authoritative for this project. The only follow-up index added by PDEV-490 is the kanban_card.tenant_id index per REQ-PDEV490-022. | Goal § Operations / General-Configuration; analysis § Current state / Index coverage. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-090 | The documentation repository shall include a new architecture page documenting the AWS Advanced JDBC Wrapper integration (plugin pipeline, jdbcUrl scheme, failover-detection model, HikariCP-cooperation contract). | Goal § Documentation updates. |
| REQ-PDEV490-091 | The documentation repository shall include a new pattern page on bitemporal composite indexing covering the (tenant_id, …, eid, effective_as_of DESC, recorded_as_of DESC) shape and the workbook-decided rationale for not using a WHERE retired = FALSE partial predicate. | Goal § Documentation updates; workbook decision T17-no-partial-index-predicate. |
| REQ-PDEV490-092 | The documentation repository shall include a new convention page stating that Flyway is the single authoritative source for database indexes; Exposed-level .index(…) declarations are decorative and must not be added. | Goal § Documentation updates; workbook decision T23-flyway-authoritative-for-indexes. |
| REQ-PDEV490-093 | The documentation repository shall include or update an error-contract page covering the HTTP 503 + Retry-After: 2 surface emitted by AppError.Transient. | Goal § Documentation updates. |
| REQ-PDEV490-094 | The documentation repository shall include three new runbooks under process/operation-notes/ (or equivalent): the Aurora synthetic-failover test (dev) procedure, the AWS JDBC Wrapper deploy notes, and the Aurora wrapper troubleshooting guide. | Goal § Documentation updates; goal § Constraints (documentation runbooks land before the synthetic-failover test). |
| REQ-PDEV490-095 | The Wave 4 documentation PR shall merge before the Wave 5 dev synthetic-failover test executes. Wave 3 and Wave 4 can proceed in parallel; if the actual Wave 3 deployed shape diverges from the spec, the runbook is refined post-Wave-3-deploy via a follow-up commit. | Goal § Constraints. |

The performance requirements below are framed around the Sentry platform-be project on Alpha001-prod as the measurement surface. The baseline figures referenced were captured on 2026-05-19 over a trailing five-day Sentry window.

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-NFR-001 | When PDEV-490 has been promoted to Alpha001-prod and has soaked for at least seven days, the route GET /v1/kanban/kanban-card/for-item/{itemEId} shall achieve transaction-duration p50 ≤ 250 ms and p95 ≤ 1,000 ms, measured over a rolling seven-day Sentry window on the platform-be project. Baseline at 2026-05-19: p50 = 1,113 ms, p95 = 2,911 ms. | Goal § Success Criteria #1; analysis § Current state / Measured baseline. |
| REQ-PDEV490-NFR-002 | When PDEV-490 has been promoted to Alpha001-prod and has soaked for at least seven days and the items-page front-end consumer has been migrated off this route (out of scope for PDEV-490; tracked on PDEV-489), the route POST /v1/kanban/kanban-card/details shall achieve transaction-duration p50 ≤ 250 ms and p95 ≤ 1,000 ms, measured over a rolling seven-day Sentry window on the platform-be project. Baseline at 2026-05-19: p50 = 289 ms, p95 = 2,035 ms. PDEV-490 alone is not expected to satisfy the p95 component of this requirement while the items-page fan-out is still being driven against this route; the conditional on PDEV-489 is explicit and is not part of this project’s deliverables. | Goal § Success Criteria #1; analysis § Current state / Measured baseline; analysis § Out-of-scope surfaces. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-NFR-010 | When a controlled Aurora failover is triggered against a development cluster while a steady-load probe runs against operations-dev, the user-visible HTTP 5xx window shall be no longer than 5 seconds (down from approximately 30 seconds today). | Goal § Success Criteria #3; analysis § Current state / Failover behavior. |
| REQ-PDEV490-NFR-011 | While the HTTP 5xx window per REQ-PDEV490-NFR-010 is in effect, HTTP 503 responses shall dominate the 5xx distribution and HTTP 500 responses shall be essentially zero. | Goal § Success Criteria #2; analysis § Target state / AppError.Transient + HTTP 503. |
| REQ-PDEV490-NFR-012 | After PDEV-490 has been promoted to Alpha001-prod, the wrapper’s failover detection latency shall not exceed approximately 5 seconds (down from approximately 30 seconds today), as measured indirectly via the user-visible 5xx window in REQ-PDEV490-NFR-010. | Goal § Success Criteria #3. |
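
One way to evaluate REQ-PDEV490-NFR-010 and REQ-PDEV490-NFR-011 from steady-load probe output is sketched below. The measurement definition is an assumption, since the requirements do not fix one: here the 5xx window is taken as the time between the first and the last 5xx response observed by the probe, and `ProbeSample`, `FailoverWindow`, and `summarize5xx` are hypothetical names.

```kotlin
// One probe observation: when the response arrived and its HTTP status.
data class ProbeSample(val timestampMs: Long, val status: Int)

// Summary of the user-visible error window seen by the probe.
data class FailoverWindow(val durationMs: Long, val count503: Int, val count500: Int)

// Returns null when the probe saw no 5xx responses at all.
fun summarize5xx(samples: List<ProbeSample>): FailoverWindow? {
    val errors = samples.filter { it.status in 500..599 }
    if (errors.isEmpty()) return null
    val first = errors.minOf { it.timestampMs }
    val last = errors.maxOf { it.timestampMs }
    return FailoverWindow(
        durationMs = last - first,
        count503 = errors.count { it.status == 503 },
        count500 = errors.count { it.status == 500 },
    )
}
```

Under this definition, a run satisfies the two requirements when durationMs is at most 5,000, count503 accounts for most of the 5xx responses, and count500 is zero or close to it.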

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-NFR-020 | After REQ-PDEV490-020 and REQ-PDEV490-021 have shipped, EXPLAIN ANALYZE on the kanban_card bitemporal SELECT shall show an index scan or index-only scan on the inner correlated subquery (not a sequential scan, and not a generic single-column index lookup). | Goal § Success Criteria #4; analysis § Current state / Bitemporal SQL pattern; analysis § Current state / Index coverage. |
| REQ-PDEV490-NFR-021 | After REQ-PDEV490-025 has shipped, EXPLAIN ANALYZE on the item bitemporal SELECT shall similarly show an index scan or index-only scan on the inner correlated subquery. | Goal § Success Criteria #4; analysis § Current state / Index coverage. |
| REQ-PDEV490-NFR-022 | After REQ-PDEV490-030 ships, the COUNT-shaped SQL statement formerly issued by cardsForItem shall no longer appear in the top entries of pg_stat_statements ranked by call frequency on the kanban DB in any environment. | Goal § Success Criteria; analysis § Current state / cardsForItem. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-NFR-030 | common-module shall build successfully (make clean build) and all existing tests shall pass at each phase gate. | Process / dev-workflows.md. |
| REQ-PDEV490-NFR-031 | operations shall build successfully (make clean build) and all existing tests shall pass at each phase gate. | Process / dev-workflows.md. |
| REQ-PDEV490-NFR-032 | Code coverage shall meet or exceed targets configured in Gradle build scripts for both common-module and operations. | Process / dev-workflows.md. |
| REQ-PDEV490-NFR-033 | documentation shall build successfully (make pr-checks) at the Wave 4 phase gate. | Process / documentation make pr-checks. |

| ID | Requirement | Source |
| --- | --- | --- |
| REQ-PDEV490-NFR-040 | Each of the three PDEV-490 PRs (one common-module PR, one combined operations PR, one documentation PR) shall include a CHANGELOG entry (direct-edit CHANGELOG.md for common-module and operations; PR-body ## CHANGELOG section for documentation) describing the user-visible / API-visible behaviour change from the consumer’s perspective. | Process / workspace/instructions/claude/rules/changelog.md (workspace rule, generalised; outside the documentation site). |
| REQ-PDEV490-NFR-041 | The common-module CHANGELOG entry shall present the wrapper integration as Added (not Changed / breaking): consumers on jdbc:postgresql: URLs see zero behaviour change; consumers opt into wrapper routing by changing their own application.conf URL to jdbc:aws-wrapper:postgresql://… on their own deploy schedule. The mixed=true change is also called out under Added. | Goal § Constraints; analysis § Risks and constraints. |
| REQ-PDEV490-NFR-042 | The operations PR’s CHANGELOG entry shall describe the HTTP 500 → HTTP 503 status-code remap for transient SQL failures as a Changed entry, because the remap is visible to downstream consumers, the BFF, and monitoring dashboards keyed on 5xx by code. | Goal § Success Criteria #2. |

Every requirement above maps to one of the five EARS templates. Distribution:

| Template | Count | Examples |
| --- | --- | --- |
| Ubiquitous (The X shall Y) | 38 | REQ-PDEV490-001, REQ-PDEV490-020, REQ-PDEV490-040 |
| Event-driven (When trigger, the X shall Y) | 9 | REQ-PDEV490-010, REQ-PDEV490-033, REQ-PDEV490-051, REQ-PDEV490-070, REQ-PDEV490-NFR-001, REQ-PDEV490-NFR-010 |
| State-driven (While state, the X shall Y) | 2 | REQ-PDEV490-052, REQ-PDEV490-NFR-011 |
| Unwanted behaviour (If unwanted, then the X shall Y) | 2 | REQ-PDEV490-053, REQ-PDEV490-072 |
| Complex | 1 | REQ-PDEV490-NFR-002 (event-driven with an “and” composition that includes an explicit out-of-scope dependency) |

  • PDEV-490 goal — project goal, success criteria, constraints.
  • analysis.md — entry-state analysis (where each requirement’s “current” comes from).
  • specification.md — phased implementation plan (how each requirement is realised).
  • verification.md — bidirectional traceability between requirements and acceptance criteria.
  • EARS reference: Alistair Mavin et al., Easy Approach to Requirements Syntax (EARS), 17th IEEE International Requirements Engineering Conference, 2009.

Copyright: (c) Arda Systems 2025-2026, All rights reserved