Specification: Product Slow — Front-End Items Page Performance
This specification pins the design decisions identified in
goal.md as load-bearing for the
PDEV-489 sub-issue stack: the
shape of ItemCardsContext (which PDEV-235 introduces and PDEV-548 / PDEV-549
consume), the AG Grid SSRM integration point (which determines when the
batched kanban-card query fires and how it interacts with the block cache),
and the freshness model (which determines how cross-session staleness is
bounded without re-introducing the per-row fan-out the project is here to
eliminate).
Everything else — file lists, line-by-line edits, test additions — is implementation detail and lives in the per-sub-issue PRs.
1. ItemCardsContext
1.1 What it is
ItemCardsContext is the page-scoped shared store of kanban-card data
keyed by item eid, owned by the /items page. It exists so that every
component on the page (grid row cells, detail panel, bulk-action handlers)
reads the same kanban-card dataset for a given item without each issuing its
own network request.
“Page-scoped” means one mount of /items in one browser tab, not “one
user session”. The context is React state inside the page component tree;
its lifetime is the lifetime of that React root for /items:
- One user, two tabs of /items → two independent stores, two independent batched fetches. No cross-tab sharing.
- Navigate away and back (e.g. /items → /scan → /items) → the page unmounts, the store is dropped, the next mount starts cold.
- Reload → cold.
- Across users → there is no sharing surface; each user’s browser has its own React tree.
ItemCardsContext is the single source of truth on the page, not a
freshness guarantee against the backend. It enforces consistency within
the page (all consumers see the same cards for a given eid) but does
not, on its own, bound staleness vs. the backend. The freshness model in
§3 is what bounds staleness; the context is the substrate it operates on.
No kanban-card caching wider than the page mount exists today, and this
project explicitly adds none: kanban-card data is too change-heavy to
cache at the BFF or in localStorage. The
items list’s separate Next.js unstable_cache caches items/query-ssrm
responses on the BFF, not kanban-cards, and is not affected by this work.
1.2 Why it is needed
Three independent code paths on /items need the same kanban-card data for
the same items, at overlapping times:
- Grid row cells (QuickActionsCell in columnPresets.tsx) read safeCards.length, inOrderQueueCount, printedCount, and pick the candidateCard for the print/preview/order-queue buttons on every row.
- ItemDetailsPanel reads the same cards (full card list) when the panel opens for a row.
- Bulk handlers (handleDeleteMultipleItems, handlePrintSelectedCards, handlePreviewSelectedCards in page.tsx) read cards for every selected item to gate deletion, choose labels to print, and compose preview sheets.
Without a shared store, each path issues its own request — and the grid
path issues one per row. Routing all three through ItemCardsContext
turns the rendered page into a single batched read and turns subsequent
consumers (panel, bulk action) into reads against an already-warm store,
which the freshness model then layers refresh behavior on top of.
The context is not a general-purpose kanban-card cache: it is the
backing store of the /items page’s rendered state. Off-page consumers
(e.g. the standalone kanban page, the print preview window) own their own
data acquisition.
1.3 Design criteria
The shape is constrained by the consumers, the SSRM lifecycle, the
acceptance criteria in goal.md, and the freshness model in §3. The
criteria are pinned explicitly here so that the four PRs in the stack do not drift:
1. Page-scoped lifetime, not request-scoped. The store outlives any single SSRM block fetch and any single panel-open event. It is cleared only when the /items page unmounts or when the tenant / active-tab / filter-tokens combination changes (any change that invalidates the set of items the grid is showing).
2. Keyed by item eid (entity ID), not row index. The grid is server-side; row indices are not stable across sorts, filter changes, or pagination. eid is the only stable handle the BFF and operations share with the frontend.
3. Entry shape carries a client-side fetchedAt. Each entry is { cards: KanbanCardResult[], fetchedAt: number }. fetchedAt is written from Date.now() on the client at the moment the response resolves — never from asOf.recorded or any server timestamp — so clock skew between client and server cannot make every entry permanently “stale” and induce a refresh loop. fetchedAt is the substrate for the TTL check in §3.
4. Populated by batched fetch, not per-item. The introducer (PDEV-235) replaces ensureCardsForItem(eid) per row with a single ensureCardsForItems(eids[]) call per SSRM block. Per-item entry points remain on the surface as thin wrappers for callers that genuinely act on one eid (the detail panel’s refresh-after-mutation path); they must not be the population path for the page load.
5. Idempotent on overlap, in-flight deduplicated. Two SSRM blocks (or a block plus a panel open plus a focus-refresh sweep) may request overlapping eid sets concurrently. The context must deduplicate in-flight requests by eid and never issue two parallel kanban-card queries for the same eid. The existing cardFetchPromisesRef per-item dedup generalizes naturally to a per-eid in-flight map that the batched call writes into once per member.
6. Invalidation on per-item mutation, not on every change. When a user prints, previews, moves a card through the order queue, deletes an item, or receives cards in the panel, the context refreshes only the affected eids (via refreshCardsForItem(eid) or its batched form refreshCardsForItems(eids[])). The grid does not invalidate the entire store on a single mutation.
7. No persistence. Kanban-card state is too change-heavy to persist; the context is in-memory React state for the lifetime of the page mount.
8. Empty-result semantics are first-class. An item with zero cards has its eid mapped to { cards: [], fetchedAt }, not absent from the store. Consumers distinguish “not yet fetched” (map[eid] === undefined) from “fetched, no cards” (map[eid].cards.length === 0). PDEV-490 K12 (withTotal = false) guarantees the batched query returns an empty array rather than the legacy IncompatibleState 500, so the absent-vs-empty distinction is meaningful.
9. Failure is sticky-empty by default, retry on mutation or focus. If the batched kanban-card query fails for a block, the context records the eids as { cards: [], fetchedAt: now } and logs the error; consumers render as if the items had no cards rather than perpetually spinning. The next user-initiated refresh path on one of those items (mutation, panel-open, focus-refresh) can recover.
10. Freshness at every edit surface, staleness tolerated only at display. Every code path where a user is about to update state — opening the detail panel for an item, initiating a bulk mutation — refreshes the affected eids before acting (subject to the policy in §3). Display-only reads (grid cells, bulk print/preview) may read stale data. This preserves the pre-project safety net for editing while keeping the page-load latency win.
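The batched-populate, in-flight dedup, client-clock fetchedAt, and absent-vs-empty criteria above can be sketched together as a small store. This is a minimal non-React sketch under stated assumptions: createCardStore, BatchFetch, and the reduced KanbanCardResult are hypothetical names used for illustration only, and the failure policy is deliberately left out.

```typescript
// Reduced stand-in for the real card type; only rId is assumed here.
interface KanbanCardResult { rId: string }
interface Entry { cards: KanbanCardResult[]; fetchedAt: number }
// Hypothetical batched client: one request for many eids.
type BatchFetch = (eids: string[]) => Promise<Record<string, KanbanCardResult[]>>;

// Hypothetical factory; the real context would hold this state in React.
function createCardStore(
  fetchBatch: BatchFetch,
  ttlMs: number = 30000,
  now: () => number = Date.now,
) {
  const map: Record<string, Entry> = {};
  const inFlight = new Map<string, Promise<void>>();

  async function ensureCardsForItems(eids: string[]): Promise<void> {
    const waits: Promise<void>[] = [];
    const toFetch: string[] = [];
    for (const eid of Array.from(new Set(eids))) {
      const pending = inFlight.get(eid);
      if (pending) {
        waits.push(pending); // join the request already in flight
        continue;
      }
      const entry = map[eid];
      if (entry && now() - entry.fetchedAt <= ttlMs) continue; // fresh: no-op
      toFetch.push(eid);
    }
    if (toFetch.length > 0) {
      const clear = () => toFetch.forEach((eid) => inFlight.delete(eid));
      const batch = fetchBatch(toFetch).then(
        (result) => {
          const fetchedAt = now(); // client clock at response resolution
          for (const eid of toFetch) {
            // Items missing from the response are stored as "fetched, no cards".
            map[eid] = { cards: result[eid] ?? [], fetchedAt };
          }
          clear();
        },
        (err) => {
          clear();
          throw err; // failure policy is out of scope for this sketch
        },
      );
      toFetch.forEach((eid) => inFlight.set(eid, batch));
      waits.push(batch);
    }
    await Promise.all(waits);
  }

  return { map, ensureCardsForItems };
}
```

Two overlapping calls such as ensureCardsForItems(["a", "b"]) and ensureCardsForItems(["b", "c"]) issued back to back would, in this sketch, produce one request for a and b and one for c, with the second call waiting on the first for b.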
1.4 Public surface
The context value (after this project):

    interface ItemCardsContextType {
      /** eid → { cards, fetchedAt }. Undefined means "not yet fetched"; a present
       *  entry with cards.length === 0 means "fetched, no cards". */
      itemCardsMap: Record<string, { cards: KanbanCardResult[]; fetchedAt: number }>;

      /** Batched populate. Fetches eids that are absent or whose entries are
       *  older than the TTL (§3). Deduplicates concurrent calls per eid. No-op
       *  for already-fresh eids. */
      ensureCardsForItems: (itemEntityIds: string[]) => Promise<void>;

      /** Batched refresh. Always fetches, ignoring TTL. Overwrites entries for
       *  the given eids. Used by mutation completion, focus-refresh, and the
       *  panel's parallel refresh on open. */
      refreshCardsForItems: (itemEntityIds: string[]) => Promise<void>;

      /** Single-item wrappers over the batched calls. Kept on the surface for
       *  callers that genuinely act on one eid (e.g. the detail panel's
       *  refresh-after-receive-card path). */
      ensureCardsForItem: (itemEntityId: string) => Promise<void>;
      refreshCardsForItem: (itemEntityId: string) => Promise<void>;

      onOpenItemDetails?: (item: items.Item) => void;
      bulkPrintingCards?: Set<string>;
      bulkPrintingLabels?: Set<string>;
    }

Alongside the context, the same module exports the freshness hook used by edit surfaces:

    /** Read with optional debounce. Returns cached data immediately if present;
     *  triggers refreshCardsForItem(eid) on mount/eid-change; holds caller's
     *  paint for up to debounceMs (default 0) waiting for the refresh to land.
     *  After debounceMs (or immediately if 0), returns cached + isStale flag;
     *  consumers can render a banner when isStale flips and rId differs. */
    function useFreshRead(
      itemEntityId: string,
      opts?: { debounceMs?: number },
    ): { cards: KanbanCardResult[] | undefined; isStale: boolean; refresh: () => Promise<void> };

The single-item wrappers exist because the detail panel’s
“refresh-after-receive-card” path is genuinely per-item; forcing every
caller to wrap an eid in an array would be noise.
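The "hold the paint for up to debounceMs" behavior described for useFreshRead amounts to a race between the refresh and a timer. The following is a sketch of that race outside React, not the hook itself; firstPaint and FirstPaint are hypothetical names, and the reduced KanbanCardResult is an assumption.

```typescript
// Reduced stand-in for the real card type; only rId is assumed here.
interface KanbanCardResult { rId: string }

type FirstPaint =
  | { source: "fresh"; cards: KanbanCardResult[] }
  | { source: "cache"; cards: KanbanCardResult[] | undefined };

// Hypothetical helper: resolves with fresh cards if the refresh lands within
// debounceMs, otherwise falls through to whatever was cached.
function firstPaint(
  cached: KanbanCardResult[] | undefined,
  refresh: Promise<KanbanCardResult[]>,
  debounceMs: number = 0,
): Promise<FirstPaint> {
  const fromCache: FirstPaint = { source: "cache", cards: cached };
  const fresh = refresh.then(
    (cards): FirstPaint => ({ source: "fresh", cards }),
    (): FirstPaint => fromCache, // a failed refresh also falls back to cache
  );
  let timer: ReturnType<typeof setTimeout> | undefined;
  const fallback = new Promise<FirstPaint>((resolve) => {
    timer = setTimeout(() => resolve(fromCache), debounceMs);
  });
  return Promise.race([fresh, fallback]).then((result) => {
    clearTimeout(timer); // do not leave the timer pending once decided
    return result;
  });
}
```

When the result is "cache", the refresh is still in flight; comparing its eventual result against the cached cards is a separate step from the first paint.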
1.5 Consumer contract
Per-consumer behavior under the freshness model (§3):
| Consumer | Today | After this project |
|---|---|---|
| QuickActionsCell (per row, display-only) | useEffect calls ensureCardsForItem(eid) on mount + dependency change | Reads itemCardsMap[eid]?.cards synchronously. No effects. Block-level fetch is owned by the SSRM datasource. If the entry is stale-by-TTL at read time, the read enqueues a coalesced batch refresh (§3) without blocking the paint. |
| ItemDetailsPanel — on open | Local fetchCards calls cardsForItem directly | useFreshRead(eid, { debounceMs: 200 }). Paints from cache for instant render, awaits refresh up to 200ms, then paints. After resolution: if rIds differ vs. cached, surface the banner (§3.4). Net round-trip count: 1 (same as today). |
| ItemDetailsPanel — after in-panel mutation (Add to order queue, onReceiveCard, etc.) | fetchCards + delayed refetches at 300ms / 1000ms / 500ms / 1500ms | refreshCardsForItem(eid) + the same delayed refetches. Existing behavior preserved; only the call surface changes. |
| ItemDetailsPanel — refreshItemCards window event | fetchCards + delayed refetches | refreshCardsForItem(eid) + same delayed refetches. |
| handleDeleteMultipleItems (mutating) | Loops cardsForItem per selected eid | await refreshCardsForItems(selectedEids) before proceeding, with a visible progress indicator. If any selected eid’s rId set differs vs. the cached version, abort with a banner (“Selection changed — refresh and retry?”) and let the user re-trigger. |
| handlePrintSelectedCards, handlePreviewSelectedCards (non-mutating) | Loops cardsForItem per selected eid | Reads from itemCardsMap. No refresh, no debounce. Worst case: a duplicate label sheet or a preview of a since-changed card. Accepted risk. |
2. AG Grid SSRM integration point
2.1 Where the batched call lives
The batched kanban-card/query call is issued inside the SSRM
datasource’s getRows callback, after the
/items/query-ssrm response resolves and before params.success is
called for the block. The sequence per block:
1. getRows(params) invoked by AG Grid with the block’s startRow, endRow, sort model, and filter model.
2. Datasource calls /items/query-ssrm → receives rows for the block plus total count + filter options.
3. Datasource extracts the eid set from the returned rows and calls ensureCardsForItems(eids) on the page-level context.
4. Once the kanban-card store is populated, datasource calls params.success({ rowData, rowCount }).
Steps 3 and 4 run in series — params.success waits on the
kanban-card fetch — because the per-row cells read
itemCardsMap[eid]?.cards synchronously on render. If success fires
before the store is populated, every row briefly renders with
safeCards = [] and then re-renders, which both flickers the buttons
and triggers spurious cell-render telemetry.
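The per-block sequence can be sketched as follows. To stay self-contained, the sketch declares a reduced GetRowsParams rather than importing AG Grid's types; Row, QueryItems, and createDatasource are hypothetical names, and ensureCardsForItems is assumed to resolve even when the card fetch fails, because the context records the failure itself.

```typescript
interface Row { eid: string }

// Reduced stand-in for the AG Grid getRows parameter; only the members used
// in this sketch are declared.
interface GetRowsParams {
  request: { startRow: number; endRow: number };
  success(result: { rowData: Row[]; rowCount?: number }): void;
  fail(): void;
}

// Hypothetical wrapper over the /items/query-ssrm call.
type QueryItems = (
  startRow: number,
  endRow: number,
) => Promise<{ rows: Row[]; totalCount: number }>;

function createDatasource(
  queryItems: QueryItems,
  ensureCardsForItems: (eids: string[]) => Promise<void>,
) {
  return {
    async getRows(params: GetRowsParams): Promise<void> {
      const { startRow, endRow } = params.request;
      const page = await queryItems(startRow, endRow).catch(() => undefined);
      if (!page) {
        params.fail(); // no rows, so no kanban-card call
        return;
      }
      // Use the eids actually returned, not the requested range.
      const eids = page.rows.map((row) => row.eid);
      if (eids.length > 0) {
        // Populate the store before success so cells render once, with cards.
        await ensureCardsForItems(eids);
      }
      params.success({ rowData: page.rows, rowCount: page.totalCount });
    },
  };
}
```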
Alternative considered and rejected: fire-and-forget the kanban-card
call after params.success, with rows initially showing a “loading”
state. Rejected because (a) the counts column is the same width whether
data is present or not, so there is no layout cost to waiting; and
(b) the SSRM block fetch itself is the long pole — waiting an extra
~50–300ms for the kanban-card call is dominated by the
items/query-ssrm latency and barely changes the user-visible
time-to-paint.
2.2 Block cache interaction
AG Grid’s SSRM block cache may evict and re-request a block when the
user scrolls back to it. The itemCardsMap is page-scoped, not
block-scoped: when a block is re-fetched, the datasource still
calls ensureCardsForItems(eids), which is a no-op for eids already
in the store with a fresh-by-TTL fetchedAt. Block evictions do not
themselves trigger kanban-card re-fetches.
Cache invalidation (full grid refresh — sort change, filter change,
tenant switch) clears the SSRM block cache and the
itemCardsMap. Partial refresh (single-item mutation) clears neither;
it calls refreshCardsForItem(eid) which overwrites just that entry.
2.3 Error handling
If /items/query-ssrm fails, the SSRM datasource calls
params.fail() and the kanban-card call is not issued — no rows
exist to fetch cards for.
If the batched kanban-card/query fails after items/query-ssrm
succeeds:
- The datasource still calls params.success with the row data — the grid must render rows, since the items themselves are loaded.
- The context records each eid in the failed batch as { cards: [], fetchedAt: now } (per §1.3 #9).
- The failure is logged via the existing error-logging path (not per-row console.error).
- The next user mutation, panel-open, or focus-refresh on an affected row triggers refreshCardsForItem(eid), which can recover.
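The sticky-empty failure path can be illustrated with a small populate helper. This is a sketch under assumptions: fetchIntoMap, BatchFetch, and logError are hypothetical names, and the real context would write into React state rather than mutate a plain object.

```typescript
// Reduced stand-in for the real card type; only rId is assumed here.
interface KanbanCardResult { rId: string }
interface Entry { cards: KanbanCardResult[]; fetchedAt: number }
type BatchFetch = (eids: string[]) => Promise<Record<string, KanbanCardResult[]>>;

// Hypothetical helper: populates entries for one batch and never rejects.
async function fetchIntoMap(
  map: Record<string, Entry>,
  eids: string[],
  fetchBatch: BatchFetch,
  logError: (err: unknown, eids: string[]) => void,
  now: () => number = Date.now,
): Promise<void> {
  try {
    const result = await fetchBatch(eids);
    const fetchedAt = now();
    for (const eid of eids) {
      map[eid] = { cards: result[eid] ?? [], fetchedAt };
    }
  } catch (err) {
    logError(err, eids); // one log entry per failed batch, not per row
    const fetchedAt = now();
    for (const eid of eids) {
      // Sticky-empty: rows render as "no cards" instead of spinning.
      map[eid] = { cards: [], fetchedAt };
    }
  }
}
```

Because the failed entries carry a current fetchedAt, a TTL-respecting populate call treats them as fresh until the TTL elapses; recovery before then relies on a refresh path that ignores the TTL.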
The current IncompatibleState 500 → “no cards” branch in
getKanbanCardsForItem becomes dead code once operations#173 lands
(PDEV-490 K12); PDEV-235 removes it.
2.4 Block boundary edge cases
- Block size mismatch. AG Grid’s default SSRM block size is 100. The 60-row test tenant fits in one block, so the project’s “2 round-trips per block” target is “2 round-trips total” in the common case. For a 500-row tenant: 5 items/query-ssrm + 5 batched kanban-card/query = 10 round-trips, which is still O(blocks) not O(rows).
- Partial last block. The last block may return fewer rows than requested. The batched call uses the actual eid set from the response, not the requested range — no empty-eid calls.
- Empty block. If items/query-ssrm returns zero rows (filtered-to-empty state), the datasource skips the ensureCardsForItems call entirely and calls params.success({ rowData: [], rowCount }) directly.
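The round-trip arithmetic above can be written down as a small function. It is a sketch that assumes every block of the tenant is eventually loaded and that no entry goes stale in between; roundTripsForFullLoad is a hypothetical name.

```typescript
// Round-trips needed to load every block once: one items/query-ssrm plus one
// batched kanban-card/query per non-empty block.
function roundTripsForFullLoad(rowCount: number, blockSize: number = 100): number {
  if (rowCount <= 0) return 1; // empty result: items query only, card call skipped
  const blocks = Math.ceil(rowCount / blockSize);
  return 2 * blocks;
}

const testTenant = roundTripsForFullLoad(60); // one block
const largeTenant = roundTripsForFullLoad(500); // five blocks
```

The count grows with the number of blocks, whereas a per-row fan-out grows with rowCount.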
3. Freshness and concurrency model
3.1 The trade-off and the boundary
The pre-project code refreshed kanban-card data on every panel open and every bulk action — expensive but safe; cross-session staleness was bounded to the time between user interactions on a row. The naïve “replace fetches with cache reads” form of this project would have collapsed page-load traffic and dropped that safety. The freshness model below restores the safety at the edit surfaces while keeping the page-load batching.
The honest claim is:
Cross-session staleness on /items is bounded by (a) the user’s next interaction with a row (open panel, bulk action, mutation), (b) the next scroll into a block whose entries are stale-by-TTL, (c) the next sort/filter/tab change, (d) the next time the browser tab regains focus, or (e) the TTL window for a row that someone is actively reading — whichever comes first. There is no continuous polling; sub-second freshness for stationary views is out of scope and tracked separately under PDEV-442.
3.2 TTL — on-read, per-eid, coalesced
Each entry in itemCardsMap carries a client-side fetchedAt. The TTL
default is 30 seconds, exposed as a constant so it can be tuned
without reshaping the context. Per-eid rather than global so that an
entry just refreshed by a panel-open or mutation doesn’t get re-fetched
by an adjacent grid read.
Eviction trigger is on-read, not timer-based:
- When a consumer reads map[eid] and finds the entry stale-by-TTL (now - fetchedAt > ttlMs), the read enqueues eid into a coalesced batch.
- The coalescer flushes on the next microtask / animation frame as one Filter.In call for the union of enqueued eids. Up to one refresh batch in flight at any time; further enqueues during flight are queued for the next flush.
- Display-only reads (grid cells) do not block on the refresh — they paint with the stale data; the cell re-renders when the batch resolves and the map updates (stale-while-revalidate).
- Edit-surface reads (useFreshRead) participate in the same coalescer but may also hold their paint for up to debounceMs.
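The on-read trigger and the coalescer can be sketched as below. This is one possible shape under assumptions, not the specified implementation: createRefreshCoalescer and readCards are hypothetical names, the flush is scheduled on a microtask only (no animation-frame variant), and refreshBatch stands in for the batched refresh call.

```typescript
// Reduced stand-in for the real card type; only rId is assumed here.
interface KanbanCardResult { rId: string }
interface Entry { cards: KanbanCardResult[]; fetchedAt: number }

function createRefreshCoalescer(refreshBatch: (eids: string[]) => Promise<void>) {
  const queued = new Set<string>();
  let flushScheduled = false;
  let inFlight = false;

  function flush(): void {
    flushScheduled = false;
    if (inFlight || queued.size === 0) return; // at most one batch in flight
    const eids = Array.from(queued);
    queued.clear();
    inFlight = true;
    const done = () => {
      inFlight = false;
      if (queued.size > 0) schedule(); // eids enqueued during the flight
    };
    refreshBatch(eids).then(done, done);
  }

  function schedule(): void {
    if (flushScheduled) return;
    flushScheduled = true;
    Promise.resolve().then(flush); // next microtask
  }

  return {
    enqueue(eid: string): void {
      queued.add(eid);
      schedule();
    },
  };
}

// Display-only read: returns whatever is cached and never blocks; a stale
// entry is enqueued for refresh as a side effect.
function readCards(
  map: Record<string, Entry>,
  eid: string,
  coalescer: { enqueue(eid: string): void },
  ttlMs: number = 30000,
  now: () => number = Date.now,
): KanbanCardResult[] | undefined {
  const entry = map[eid];
  if (entry && now() - entry.fetchedAt > ttlMs) coalescer.enqueue(eid);
  return entry ? entry.cards : undefined;
}
```

Rendering a block of stale rows in one synchronous pass calls readCards once per row, yet only one refreshBatch call is made for the union of their eids.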
Consequences worth naming:
- A stationary grid does not refresh on TTL alone. With on-read triggers, only rows someone is reading produce refreshes. This is intentional and fits the editing-workbench use case; the dashboard-watcher use case needs push and is out of scope.
- Visibility-API gating is free. A backgrounded tab is not rendering, so it is not reading, so it is not refreshing. No separate visibilitychange check is needed for traffic control.
- Scroll into a stale block produces exactly one batched refresh for the entries the block contains — not one per row.
- TTL writes are client-clock-only. fetchedAt = Date.now() on the client at response resolution. Never asOf.recorded or any server timestamp; client/server clock skew cannot induce a permanent-stale loop.
Rejected alternative: timer-based sweep (setInterval scanning the
map). Costs more code, requires visibility-API gating to avoid
background traffic, and produces refreshes for rows nobody is reading.
A timer has no comparable upside, given that the edit-surface refresh
(§3.4) handles every case where freshness actually matters.
3.3 Refresh-on-focus
A single visibilitychange handler on the /items page calls
refreshCardsForItems(visibleEids) once when the tab transitions from
hidden to visible. visibleEids is the union of eids for rows in the
SSRM block cache, plus the open detail panel’s eid if any. The panel
contribution covers the “left panel open, scrolled row out, switched
tab, came back” case — without it the panel can outlive its block-cache
entry and miss the refresh.
This handler is the one explicit non-interaction refresh trigger in the model. It catches the dominant “user came back from lunch / another tab” pattern without committing to continuous polling. Cost is ~5 LoC plus the coalescer already in place from §3.2.
Behavior:
- On hidden → visible transition, one batched refresh is issued for the visible block(s).
- Coalescing applies — if a scroll-triggered or interaction-triggered refresh is already in flight, the focus refresh joins the next batch rather than racing.
- TTL is irrelevant to this trigger — the focus event always refreshes, on the assumption that any time spent backgrounded is enough to warrant a check.
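A sketch of the focus handler follows. It takes the event source as a parameter so the logic can run outside a browser; VisibilitySource, installFocusRefresh, and visibleEids are hypothetical names, and coalescing with other in-flight refreshes is assumed to happen inside refreshCardsForItems.

```typescript
// The subset of the browser document this sketch relies on.
interface VisibilitySource {
  visibilityState: string;
  addEventListener(type: "visibilitychange", listener: () => void): void;
  removeEventListener(type: "visibilitychange", listener: () => void): void;
}

// Union of eids in the SSRM block cache plus the open panel's eid, if any.
function visibleEids(blockCacheEids: string[], openPanelEid?: string): string[] {
  const all = new Set(blockCacheEids);
  if (openPanelEid) all.add(openPanelEid);
  return Array.from(all);
}

// Installs the handler and returns a cleanup function for unmount.
function installFocusRefresh(
  source: VisibilitySource,
  getVisibleEids: () => string[],
  refreshCardsForItems: (eids: string[]) => Promise<void>,
): () => void {
  let wasHidden = source.visibilityState === "hidden";
  const onChange = () => {
    const hidden = source.visibilityState === "hidden";
    if (wasHidden && !hidden) {
      // hidden → visible: refresh regardless of TTL.
      const eids = getVisibleEids();
      if (eids.length > 0) {
        // Errors are assumed to be logged inside the context.
        refreshCardsForItems(eids).catch(() => undefined);
      }
    }
    wasHidden = hidden;
  };
  source.addEventListener("visibilitychange", onChange);
  return () => source.removeEventListener("visibilitychange", onChange);
}
```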
3.4 Edit-surface refresh and reconciliation by rId
Verified prerequisite: KanbanCardResult already exposes rId
(per-version identifier) and asOf: { effective, recorded } (bitemporal
coordinates) at the top of every card, and the BFF route
src/app/api/arda/kanban/kanban-card/query/route.ts is a pure
passthrough (forwardAsNextResponse(upstream, data)). No BFF change is
required.
Detail panel — open with debounce:
1. Panel opens → useFreshRead(eid, { debounceMs: 200 }) reads cached cards for instant first paint.
2. Hook issues refreshCardsForItem(eid) in parallel.
3. If refresh resolves within 200ms (the expected P50–P70 case given the index from operations#173), the panel paints once with fresh data — no flicker, no banner.
4. If refresh takes longer (P99 tail), the panel falls through at 200ms: paints with cached data, sets isStale: false.
5. When refresh eventually lands, compare the rId set of cached cards to the rId set of fetched cards. Three cases:
   - Identical rId sets → silently update fetchedAt, no banner.
   - Differing rId for one or more cards, added cards, or removed cards → set isStale: true, render the banner.
   - All cards removed (item now has none) → render the banner with adjusted copy.
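The three-way comparison of cached and fetched rId sets can be sketched as a pure function. Reconciliation and reconcileByRId are hypothetical names, and the reduced KanbanCardResult is an assumption; mapping each outcome to banner copy is left to the panel.

```typescript
// Reduced stand-in for the real card type; only rId is assumed here.
interface KanbanCardResult { rId: string }

type Reconciliation = "unchanged" | "changed" | "all-removed";

function reconcileByRId(
  cached: KanbanCardResult[],
  fetched: KanbanCardResult[],
): Reconciliation {
  const before = new Set(cached.map((card) => card.rId));
  const after = new Set(fetched.map((card) => card.rId));
  const identical =
    before.size === after.size &&
    Array.from(before).every((rId) => after.has(rId));
  if (identical) return "unchanged"; // update fetchedAt silently, no banner
  if (after.size === 0) return "all-removed"; // banner with adjusted copy
  return "changed"; // a new version, an added card, or a removed card
}
```

Comparing sets rather than arrays keeps the result independent of the order in which the backend returns cards.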
Banner — sticky, dismissible, with [Refresh]:
- Copy: “This item was updated. [Refresh]”
- Default: the displayed cards are not replaced. User input fields in the panel are untouched.
- [Refresh] action:
  - If the panel has unsaved edits, show a small confirm: “Discard unsaved changes and load the latest?” On confirm, apply server state to the form; on cancel, dismiss the confirm but leave the banner.
  - If no unsaved edits, apply server state immediately.
- Dismiss action: hide the banner. If the user later saves, the save proceeds with last-write-wins semantics — they were warned.
This is the simple correctness contract: the user always knows when they’re looking at superseded data; clobbering is possible but never silent. Field-level merge UI is explicitly out of scope and would land as a separate ticket under PDEV-442.
Bulk-mutation handlers — refresh-then-act:
handleDeleteMultipleItems (and any other bulk path that mutates
card state):
1. Show progress indicator: “Checking selection…”.
2. await refreshCardsForItems(selectedEids).
3. Compare the rId set per eid against the cached version (captured immediately before step 2).
4. If any rId set differs (a card was added, removed, or moved), abort with a banner: “Selection changed — refresh and retry?” and do not proceed with the mutation.
5. If all rId sets match, proceed with the mutation as today.
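The refresh-then-act sequence can be sketched as a generic wrapper. This is an illustration under assumptions: refreshThenAct, rIdKey, and readCards are hypothetical names, the progress indicator and banner are left to the caller, and an absent entry is treated the same as an empty one when fingerprinting.

```typescript
// Reduced stand-in for the real card type; only rId is assumed here.
interface KanbanCardResult { rId: string }

// Order-independent fingerprint of an item's card versions.
function rIdKey(cards: KanbanCardResult[] | undefined): string {
  return JSON.stringify((cards ?? []).map((card) => card.rId).sort());
}

async function refreshThenAct(
  selectedEids: string[],
  readCards: (eid: string) => KanbanCardResult[] | undefined,
  refreshCardsForItems: (eids: string[]) => Promise<void>,
  mutate: () => Promise<void>,
): Promise<"proceeded" | "aborted"> {
  // Capture the cached version immediately before the refresh.
  const before = new Map<string, string>();
  for (const eid of selectedEids) before.set(eid, rIdKey(readCards(eid)));

  await refreshCardsForItems(selectedEids);

  // Any difference means the selection changed underneath the user.
  const changed = selectedEids.some(
    (eid) => before.get(eid) !== rIdKey(readCards(eid)),
  );
  if (changed) return "aborted"; // caller shows the banner, no mutation
  await mutate();
  return "proceeded";
}
```

Returning a definite "proceeded" or "aborted" is what distinguishes this path from the debounced read: the handler needs the comparison result before it acts.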
Bulk handlers do not use useFreshRead — they need a definite
before/after on the refresh, not a debounce.
3.5 Display-only reads stay simple
Grid cells render map[eid]?.cards ?? [] directly. They participate in
the on-read TTL coalescer (so scrolling into stale blocks triggers a
batched refresh) but do not debounce or block, and they do not surface
any banner on rId mismatch — the visible count changing IS the
freshness signal.
handlePrintSelectedCards and handlePreviewSelectedCards follow the
same pattern: cache-read, no refresh. Accepted risk: a card may be
printed or previewed that has since moved server-side. Worst case is a
duplicate label sheet or a preview of a stale state — both
operationally tolerable, neither a correctness issue.
4. Stack consumer sketch
Brief sketch of how each PR in the stack consumes the API and the freshness model above, so the surface is right the first time:
- PDEV-235 — Introduces ensureCardsForItems / refreshCardsForItems with the { cards, fetchedAt } entry shape, the on-read TTL coalescer (§3.2), the refresh-on-focus handler (§3.3), and the SSRM datasource integration (§2.1). Removes the per-row useEffect → ensureCardsForItem chain in QuickActionsCell. Removes the IncompatibleState 500 dead branch.
- PDEV-548 — Introduces the useFreshRead hook (§1.4) and the banner component (§3.4). Routes ItemDetailsPanel through useFreshRead(eid, { debounceMs: 200 }) for open; keeps the existing mutation-driven fetchCards calls and rewires them onto refreshCardsForItem(eid). Implements the rId-set diff and the banner with [Refresh] action including the unsaved-edits confirm.
- PDEV-549 — Splits bulk handlers by mutation intent. Rewires mutating handlers (handleDeleteMultipleItems) onto await refreshCardsForItems(selectedEids) + the rId-set check and abort-with-banner pattern from §3.4. Rewires non-mutating handlers (handlePrintSelectedCards, handlePreviewSelectedCards) as cache-only reads (§3.5).
- PDEV-550 — Independent of the context API and the freshness model; removes the 10+ console.log lines per request in api/arda/kanban/query-details-by-item and api/arda/kanban/query. Parks at the top of the stack so the log-cleanup review does not gate the latency-critical changes.
5. Out of scope for this specification
- Latency baseline numbers. Measure with the existing Sentry + CloudWatch surface; record results in the per-PR verification notes, not here.
- getKanbanCardsForItems BFF client signature. A cardsForItems(eids[]) helper that issues a single Filter.In POST to /v1/kanban/kanban-card/query belongs in src/lib/ardaClient.ts; its exact signature is implementation detail for the PDEV-235 PR.
- TTL tuning. 30s is the starting default; the value is a tuning decision based on observed staleness pain, not a specification decision. Exposed as a constant for adjustment.
- Field-level merge UI on concurrent edits. The banner + [Refresh] pattern is the agreed correctness contract; a richer merge UI is a separate ticket under PDEV-442.
- Periodic polling and server push (SSE / WebSocket). Sub-second freshness for stationary views is the dashboard-watcher use case, not the editor’s workbench /items serves. Tracked under PDEV-442 as separate work.
- Test strategy. Each PR carries its own unit-test additions; the acceptance criteria in goal.md are the verification surface, not a test plan.
Copyright: (c) Arda Systems 2025-2026, All rights reserved