Skip to content

PDEV-498 — postgres-database-initializer changes for pg_stat_statements

Implementation proposal for the changes PDEV-498 requires in the postgres-database-initializer image so that pg_stat_statements becomes queryable from every application database the initializer creates on each Aurora cluster.

PDEV-498 (sub-issue of PDEV-479) needs CREATE EXTENSION IF NOT EXISTS pg_stat_statements; to run on each Aurora cluster as part of pod startup. Loading the library at cluster level is PDEV-479’s responsibility (shared_preload_libraries in a custom DB-cluster parameter group; instance restart required); this ticket lights up the queryable view so PDEV-490 can read aggregate query stats to drive the SQL optimisations (composite bitemporal indexes, dropping unused COUNTs, etc.).

pg_stat_statements’ underlying tracking is cluster-wide: the shared-memory hash collects queries from every database on the cluster regardless of where CREATE EXTENSION has been run. CREATE EXTENSION controls where the view exists (which DB a connected user can SELECT * FROM pg_stat_statements from), not what is tracked.

Practical implication: operators connect to the app DB they’re debugging (shop_access, etc.), not to the postgres admin DB. Creating the extension only in postgres would mean an extra disconnect/reconnect every time someone wants to see slow-query data — with no offsetting benefit, since the underlying data is the same. Creating it in every app DB the initializer creates is the right convention.

Earlier drafts of PDEV-498 prescribed admin-DB-only creation; that was overly conservative and is superseded by this proposal.

  • entrypoint.sh runs psql with -f values.sql -f setup.sql -f create.sql (or destroy.sql on down).
  • create.sql creates the app DB + owner role + role for permissions, then \c :"database_name" switches to that DB and creates any per-DB extensions (pg_trgm, btree_gin always; plus anything in values.properties extensions=…).
  • The extension list is normalised in entrypoint.sh (lines 37–52): always-included floor + user-supplied additions, deduplicated and sorted.
  • setup.sql sets ON_ERROR_STOP on for the whole session — any individual SQL command failure aborts the script.

Add pg_stat_statements to the existing always-included extensions floor. No new property, no new code path. The extension joins pg_trgm and btree_gin in the default list that gets created in every app DB the initializer creates.

echo "pg_trgm,btree_gin,$(sed -n -e 's/^extensions=//p' "${values}")" |
echo "pg_trgm,btree_gin,pg_stat_statements,$(sed -n -e 's/^extensions=//p' "${values}")" |

That’s the single substantive change. Everything else in the image already works: the extension renders into :'extensions', gets created via the existing \gexec loop in create.sql lines 61–64, and is now visible to anyone connecting to the app DB with read privileges on pg_stat_statements (typically rds_superuser / pg_read_all_stats).

pg_stat_statements is the first member of the default extensions floor with a cluster-level prerequisiteshared_preload_libraries must include it (set via PDEV-479’s DB-cluster parameter group + instance restart). If the prerequisite isn’t satisfied, CREATE EXTENSION errors with ERROR: pg_stat_statements must be loaded via shared_preload_libraries. pg_trgm and btree_gin are stock-Postgres extensions with no preload dependency, so this is a new failure mode for the image.

The deliberate choice is to let this fail loud: the existing ON_ERROR_STOP on semantics will abort the init container, K8s will retry, the pod stays in Init:CrashLoopBackOff until the cluster parameter group is applied.

Rationale:

  • The failure happens at most once per cluster, ever — once shared_preload_libraries is set on a cluster, every subsequent pod-start succeeds.
  • The failure surfaces in dev first (the lowest-impact env), which immediately prompts a review of cluster configuration across the rest of the platform — exactly the right escalation timing.
  • The alternative (fail-soft, warn-and-continue) would risk silently rolling out without the observability data ever materialising; PDEV-490’s work would then quietly proceed without pg_stat_statements data and produce sub-optimal SQL fixes.
  • Low practical blast radius: this is a one-time bootstrap coordination, not an ongoing reliability concern.

The image’s existing semantics (ON_ERROR_STOP on, init container crash-loops on error) are correct here; no special-casing needed.

One line. Add pg_stat_statements to the default extensions floor on line 39 (between btree_gin and the user-supplied $(sed …) expansion).

for extension in $(
echo "pg_trgm,btree_gin,$(sed -n -e 's/^extensions=//p' "${values}")" |
echo "pg_trgm,btree_gin,pg_stat_statements,$(sed -n -e 's/^extensions=//p' "${values}")" |
awk -F',' '{ for (i = 1; i <= NF; i++) { gsub(/^[[:space:]]+|[[:space:]]+$/, "", $i); if ($i != "") print $i } }' |
sort -u \
); do

The existing \gexec loop on lines 61–64 already iterates the comma-separated :'extensions' value and emits CREATE EXTENSION IF NOT EXISTS for each entry. With pg_stat_statements now in the floor, it’s just another item the loop handles.

Empirical correction added during implementation: On modern Postgres (16+), CREATE EXTENSION pg_stat_statements succeeds without shared_preload_libraries — the schema objects install cleanly, but the underlying shared-memory hash is never allocated. Queries against pg_stat_statements then error with pg_stat_statements must be loaded via shared_preload_libraries at query time, not at create time. Verified empirically against postgres:16-alpine3.20 during the implementation of this ticket.

To achieve the fail-loud-at-init-time design intent (so the failure surfaces before downstream consumers query and find silent emptiness), add an explicit verify-query immediately after the extension-creation \gexec block:

-- Verify pg_stat_statements is functional, not just installed.
DO $$
BEGIN
IF EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'pg_stat_statements') THEN
PERFORM count(*) FROM pg_stat_statements;
END IF;
END;
$$;

Without preload, the PERFORM raises the “must be loaded” error; ON_ERROR_STOP on (from setup.sql) aborts the init script; pod enters Init:CrashLoopBackOff — fail-loud as designed.

No change. Existing behaviour is correct — the script drops the app DB and roles, not the extensions inside it (they go with the DB).

Update the extensions row of the values.properties table to mention the new floor member, and add a short paragraph about the cluster prerequisite:

PropertyRequiredDescription
extensionsnoComma-separated names of Postgres extensions to create; pg_trgm, btree_gin, and pg_stat_statements are always included.

New paragraph (placement: right after the values.properties table):

Cluster prerequisite for pg_stat_statements. The image always attempts to create the pg_stat_statements extension. This requires the cluster to have pg_stat_statements listed in its shared_preload_libraries (typically set via an Aurora DB-cluster parameter group + instance restart). If the prerequisite is not satisfied, the init container fails with an error and the pod enters Init:CrashLoopBackOff until the cluster configuration is corrected. This is a one-time bootstrap concern, not an ongoing reliability concern — once configured, every subsequent initialization succeeds.

Two complementary test fixtures:

(a) Happy-path fixture — configure the test Postgres with shared_preload_libraries=pg_stat_statements via a command: override in compose.yaml (e.g., command: ['postgres', '-c', 'shared_preload_libraries=pg_stat_statements']). Assert that after initializer runs, pg_stat_statements is present in pg_extension inside the app DB (joined with pg_namespace if necessary to confirm the schema). Use the existing values_all.properties / values_minimal.properties fixtures unchanged.

(b) Prerequisite-missing fixture — run against a stock postgres:16 (no preload override) and assert the initializer exits non-zero with pg_stat_statements must be loaded in stderr. Confirms the fail-loud contract.

Concrete shape depends on compose.yaml and tests.sh conventions (read these before finalising the test layout).

The repo uses the direct-edit CHANGELOG model (visible from .github/clq/changemap.json). Suggested entry:

Added pg_stat_statements to the default extensions floor — the extension is now created in every database the initializer provisions. Cluster prerequisite: shared_preload_libraries must include pg_stat_statements (deliberately fail-loud if not set). Required by Arda PDEV-498 to expose cluster-wide query observability from any application database.

ScenarioBehaviour
Cluster has shared_preload_libraries=pg_stat_statements (PDEV-479 deployed)Extension created in the app DB; visible from SELECT * FROM pg_extension WHERE extname = 'pg_stat_statements'.
Cluster does not have the parameter group applied yetCREATE EXTENSION errors; ON_ERROR_STOP on aborts the init script; pod stays in Init:CrashLoopBackOff. Surfaces clearly in kubectl describe pod and pod logs. Deploy-order escalation prompts cluster-config review.
Master user lacks rds_superuser / sufficient privilegeSame as above — CREATE EXTENSION errors, init crash-loops.
extensions=foo,bar user-supplied additionsJoined with the floor (now including pg_stat_statements); deduplicated; sorted; each created in turn. No change from existing behaviour.
Extension name with invalid charactersValidation in entrypoint.sh (existing regex) rejects before reaching SQL. No change.

This image change ships independently and produces a new image tag. The operations chart (and any other chart using this initializer) picks up the new image via its own bump-the-tag flow — not part of this ticket. Coordination requirement:

  1. PDEV-479 lands the DB-cluster parameter group on each Aurora cluster (with the instance restart) before any chart that references the new initializer image deploys to that cluster.
  2. If a chart redeploy lands first by accident, the operations pod crash-loops on Init. Fix is to apply PDEV-479’s parameter group and let K8s retry.

PDEV-498 already links blocks: PDEV-490 to express the downstream consumer relationship. The PDEV-479 ↔ PDEV-498 deploy ordering is a soft sequencing concern (recoverable, not a hard data-loss risk).

Why this is simpler than the original proposal

Section titled “Why this is simpler than the original proposal”

The earlier draft of this doc proposed a separate cluster_extensions property and an admin-DB-only CREATE block before the \c switch. That added a new concept to the image, a new SQL code path, and a fail-soft DO/EXCEPTION wrapper — all in service of an outcome that’s worse (the view only visible from postgres, not from app DBs). This revised design drops all of that complexity and reduces the implementation to a one-line entrypoint.sh change plus tests + docs.

  • Cluster-level parameter group — PDEV-479’s responsibility.
  • CREATE EXTENSION in the postgres admin DB — not pursued. Every app DB gets the view; operators read from there. If a future need surfaces for admin-DB visibility (e.g., a tooling integration that connects only to postgres), revisit then.
  • pg_stat_statements.track, pg_stat_statements.max tuning — covered by parent PDEV-479’s parameter-group change.
  • Operations chart init-container image-tag bump — handled by the normal release pipeline, not by this ticket or change.