ADR-003: Concurrent-Edit Detection Strategy

Author: Miguel Pinilla
Date: 2026-06-02
Status: Accepted

Context

The Arda backend already enforces optimistic locking on individual mutations through If-Match on the kanban-card record id. A mutation against a stale card fails with a server-side conflict. That is the right backstop, but it surfaces during the destructive call — and for a bulk operation, it surfaces after some per-card calls have already succeeded. A user who selects fifty items and clicks Delete sees a partial-failure toast with no easy recovery.

The detection strategy in the SPA needs to do two related things: (1) tell the user the data on screen is no longer current (read-side detection — informs the user) and (2) catch the case where a destructive bulk click was issued against a selection that went stale between selection and click (preflight detection — protects the user). The two need to use a common primitive so they cannot disagree about what “stale” means.

Decision Drivers

Common primitive for both cases. The read-side and preflight detectors must agree on the definition of “this item’s cards changed”.
No backend round-trip on the hot read path. The detector must reuse data the cache already holds; it cannot make every grid-cell render do extra work.
Cheap to invalidate on a known producer. When a producer publishes that an item changed, the detector must converge to the new state without further network probing.
Tolerant of fetch failures. A transient transport error must not produce a false-positive verdict that the user cannot clear.
Backstop assumed, not duplicated. Server-side optimistic locking on per-card mutations stays as it is; the SPA’s job is to surface the condition earlier and more usefully.

Options Considered

Option A: Server-side optimistic locking alone (status quo before this ADR)

Description: Continue to rely on the backend’s per-card If-Match enforcement; do nothing in the SPA.
Pros: Zero SPA cost. The conflict is enforced authoritatively.
Cons: Bulk operations partially succeed before the conflict is raised. The user has no signal that the data on screen is stale until they try to act and one of the calls fails. The user-visible failure mode is poor.

Option B: ETag on every read; If-None-Match conditional refresh

Description: The BFF returns an ETag on the cards query; the SPA stores it and issues conditional refreshes that return 304 when the backend reports no change.
Pros: Standardised mechanism. Saves payload bytes on no-change refreshes.
Cons: Adds a round-trip just to discover that nothing changed, on the very path where the cache wants a cheap point read. The operations service has to compute and stamp the ETag on every list response. It does not by itself surface “the data is stale” to the user; a consumer-side signal still has to be layered on top. On Arda’s read patterns, the saved payload does not justify the cost.

Option C: Full CRDT or event-sourced merge at the SPA

Description: The SPA reconstructs item-card state from a per-tenant event stream and merges concurrent writes locally.
Pros: Strongest model; correctness is automatic.
Cons: Takes months of design work, requires a backend event stream that does not exist, and goes far beyond what the use case needs. The user wants a banner, not a CRDT.

Option D: rId-set diff at the consumer hook, plus snapshot-and-diff preflight on bulk actions (this ADR)

Description: The cache already stores kanban-card records keyed by item entity id. Each card carries an rId — the bitemporal record identifier of the card wrapper, advanced by every server-side mutation. The set of rId values for an item’s cards is therefore a fingerprint of “what server state am I looking at”. The detector uses two variants of this:
- Read-side detector: useFreshRead snapshots rIdSet(getCards(eid)) on mount, fires refreshCardsForItem(eid), and on resolution diffs the fresh result against the snapshot. Mismatch flips isStale = true on the panel.
- Preflight detector: useBulkSelectionStaleGuard.armAndCheck(items) snapshots the rId set per selected eid, fires refreshCardsForItems(eids), and on resolution diffs each eid’s fresh rId set against its snapshot. Any mismatch aborts the destructive call and raises a stale-selection banner.
Pros: Uses data the cache already holds (rId) and the refresh path the cache already exposes (refreshCardsForItems). No new backend contract. One semantic primitive — same-rId-set means same state — used in both detection paths. Naturally tolerant of fetch failures (the diff is skipped for eids whose refresh produced no verdict, so transient errors do not raise false positives).
Cons: The detector is consumer-driven, not push-driven: the cache advances under the detector only when a producer publishes or a poll tick fires. This option is layered with ADR-001 to close that loop.
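The rId-set primitive described in Option D can be sketched as follows. This is a minimal illustration, not the actual hook code: the CardRecord shape and the sameRIdSet helper are assumptions made for the example, and only the names rId, payload.eId, and rIdSet come from this ADR.

```typescript
// Hypothetical card shape: only the fields the detector needs.
interface CardRecord {
  rId: string;               // record id; advances on every server-side mutation
  payload: { eId: string };  // stable card identity; survives mutations
}

// Fingerprint of "what server state am I looking at" for one item's cards.
function rIdSet(cards: readonly CardRecord[]): Set<string> {
  return new Set(cards.map((card) => card.rId));
}

// Assumed helper: two fingerprints match only if they hold exactly the same rIds.
function sameRIdSet(a: ReadonlySet<string>, b: ReadonlySet<string>): boolean {
  return a.size === b.size && Array.from(a).every((rId) => b.has(rId));
}

// Snapshot taken on mount (or when a bulk action is armed).
const snapshot = rIdSet([
  { rId: "r-1", payload: { eId: "card-a" } },
  { rId: "r-2", payload: { eId: "card-b" } },
]);

// Result of the refresh: card-b was mutated elsewhere, so its rId advanced.
const fresh = rIdSet([
  { rId: "r-1", payload: { eId: "card-a" } },
  { rId: "r-3", payload: { eId: "card-b" } },
]);

const isStale = !sameRIdSet(snapshot, fresh); // true
```

Both detectors would run the same comparison; they differ only in when the snapshot is taken and in what happens on a mismatch (banner on the panel versus aborting the bulk call).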

Decision

We chose Option D because it solves both the read-side and preflight cases with one common primitive (the rId set), reuses the cache and refresh contract that already exists, and integrates with the invalidation mechanism from ADR-001 without imposing a new backend contract.

The primitive is the wrapper card.rId, not card.payload.eId. The two are distinct: payload.eId is the card’s stable identity, which survives mutations; rId is the record identifier, which advances on every mutation. An unchanged set of rId values across an item’s cards is exactly the signal “the card lineage I am looking at has not advanced”.
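A small sketch of why the two identifiers must not be unified: diffing on payload.eId would miss an in-place mutation, whereas diffing on rId catches it. The CardWrapper type, the sample ids, and the fingerprint helper are hypothetical and exist only for this illustration.

```typescript
// Hypothetical wrapper shape, reduced to the two identifiers under discussion.
interface CardWrapper {
  rId: string;               // advances on every mutation
  payload: { eId: string };  // stable across mutations
}

// The same card before and after a concurrent edit by another user.
const beforeEdit: CardWrapper[] = [{ rId: "r-10", payload: { eId: "card-a" } }];
const afterEdit: CardWrapper[] = [{ rId: "r-11", payload: { eId: "card-a" } }];

// Order-insensitive fingerprint of a list of ids (illustrative helper).
const fingerprint = (ids: string[]): string => ids.slice().sort().join(",");

const eIdUnchanged =
  fingerprint(beforeEdit.map((c) => c.payload.eId)) ===
  fingerprint(afterEdit.map((c) => c.payload.eId)); // true: the edit is invisible

const rIdUnchanged =
  fingerprint(beforeEdit.map((c) => c.rId)) ===
  fingerprint(afterEdit.map((c) => c.rId)); // false: the edit is detected
```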

Consequences

Positive

Single semantic primitive for both detection paths.
Cache-friendly: every diff costs one refresh and one Set comparison, both of which are O(k) in the small per-item card count.
Tolerant of transport errors: a fetch failure is treated as no-verdict for the affected eid, not as evidence of staleness.
Bulk actions catch the stale-selection case before any destructive call goes out, replacing the partial-failure toast with a clean stale-selection banner.
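The preflight verdict and its tolerance of transport errors can be sketched as a pure function over the per-eid snapshots and refresh outcomes. This is a sketch under assumptions: findStaleEids, RefreshOutcome, and the use of null for a failed refresh are invented for the example and are not the guard's real internals.

```typescript
// null models a refresh that failed in transport: no verdict for that eid.
type RefreshOutcome = ReadonlySet<string> | null;

// Returns the eids whose fresh rId set differs from the snapshot.
// Eids without a verdict are skipped, so a fetch failure is never read as staleness.
function findStaleEids(
  snapshots: ReadonlyMap<string, ReadonlySet<string>>,
  refreshed: ReadonlyMap<string, RefreshOutcome>,
): string[] {
  const stale: string[] = [];
  snapshots.forEach((snap, eid) => {
    const freshSet = refreshed.get(eid);
    if (freshSet == null) return; // missing or failed refresh: no verdict
    const same =
      snap.size === freshSet.size &&
      Array.from(snap).every((rId) => freshSet.has(rId));
    if (!same) stale.push(eid);
  });
  return stale;
}

const snapshots = new Map<string, ReadonlySet<string>>([
  ["item-1", new Set(["r-1"])],
  ["item-2", new Set(["r-2"])],
  ["item-3", new Set(["r-3"])],
]);

const refreshed = new Map<string, RefreshOutcome>([
  ["item-1", new Set(["r-1"])], // unchanged
  ["item-2", new Set(["r-9"])], // mutated by someone else
  ["item-3", null],             // transient transport error
]);

const staleEids = findStaleEids(snapshots, refreshed); // ["item-2"]
const shouldAbort = staleEids.length > 0;              // true: raise the banner, send nothing
```

In this sketch item-3 is left to the backend's If-Match check, which stays the authoritative backstop when the SPA has no verdict.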

Negative

The read-side detector flips isStale through the hook’s own refresh promise. Bus-driven refreshes advance the cache but do not in themselves flip isStale — the next mount-time or user-driven refresh is what surfaces the banner. This is intentional (the banner is anchored to the panel’s own view of the data) but easy to misread on first reading of the code.
The preflight adds one refresh round-trip per bulk-action click. This is a deliberate cost in exchange for the cleaner failure mode.

Neutral

Server-side If-Match enforcement remains the authoritative conflict mechanism. The SPA’s detector is an earlier, friendlier surface; it does not replace the backend check.
Today only bulk delete uses the preflight. Extending it to other destructive bulk actions is mechanical — the guard’s armAndCheck is generic over the input selection.

Follow-Up Actions

Document the payload.eId vs rId distinction prominently in the Data Flow and Caching reference so that future maintainers do not unify the two by accident.
When a new destructive bulk action is added, route it through useBulkSelectionStaleGuard (or evolve the guard if the action’s selection model differs from bulk-delete).
Revisit if and when ETag-on-list lands at the backend: the read-side detector could then short-circuit on a 304 instead of always issuing a full read.
