Phase 2 Decisions

Decision log for the Phase 2 frontend pipeline work. Each entry follows the format DQ-<area>-NNN and records the question, options considered, decision, and rationale. Entries are append-only — supersession is recorded explicitly.

DQ-PIPELINE-001 — PR-body CHANGELOG vs file edit on every PR

Question: Where should contributors author CHANGELOG entries?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Edit `CHANGELOG.md` directly in the PR | Simple; familiar | Frequent merge conflicts on the unreleased section; no easy way to amend post-creation |
| PR-body `## CHANGELOG` section, assembled on merge | No conflicts; PR description is the source of truth; can be amended via comment | Requires assembly automation; CLQ validation; PR template scaffolding |

Decision: PR-body CHANGELOG. Entries live in the PR description; the last ## CHANGELOG section in the description or in author comments wins. On merge, changelog-assembly.yaml extracts the entries, computes the SemVer bump from .github/clq/changemap.json, updates CHANGELOG.md + package.json + package-lock.json, runs CLQ validation, creates the GitHub Release, and triggers Deploy Frontend.

Rationale: The unreleased section was the single largest source of merge-conflict friction in Phase 1. Moving entries to the PR body eliminates the conflict entirely and makes amendments cheap (just edit the PR description).
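
The "last section wins" rule could be sketched as a workflow step along these lines. This is a minimal illustration, not the contents of changelog-assembly.yaml: the step name, the PR_NUMBER variable, the output file name, and the pull_request trigger are all assumptions, and only the PR description is read (author comments are left out for brevity).

```yaml
# Hypothetical step; not taken from changelog-assembly.yaml.
steps:
  - name: Extract the last CHANGELOG section from the PR description
    env:
      GH_TOKEN: ${{ github.token }}                        # token used by the gh CLI
      PR_NUMBER: ${{ github.event.pull_request.number }}   # assumes a pull_request-triggered run
    run: |
      gh pr view "$PR_NUMBER" -R "$GITHUB_REPOSITORY" --json body --jq .body |
        awk '
          # A new "## CHANGELOG" heading discards anything captured so far.
          /^## CHANGELOG[[:space:]]*$/ { buf = ""; capture = 1; next }
          # Any other level-2 heading ends the current section.
          /^## / { capture = 0 }
          capture { buf = buf $0 "\n" }
          END { printf "%s", buf }
        ' > changelog-entries.md   # hypothetical output file
```

Resetting the buffer on every `## CHANGELOG` heading is what makes the last section win; deeper headings such as `### Added` stay inside the captured section.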

Status: Adopted. Documented in knowledge-base/pr-body-changelog.md.

DQ-PIPELINE-002 — Inline quality gate vs polling sibling workflow

Question: How should Deploy Frontend know whether Extended E2E passed for the deploy commit?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Polling `post-merge-e2e.yaml` via `gh api actions/runs?head_sha=…` | Decoupled workflows; simple to add | Polling loop adds 5–10 min wall time; race conditions if the sibling workflow has not started yet; needs an arbitrary timeout |
| Inline build + shards + evaluate jobs in `deploy.yaml` | Single source of truth; no polling; reuses build artifact | Larger `deploy.yaml`; slightly different from the queue-gate pattern |

Decision: Inline. PR #805 replaced the polling implementation from PR #803 with quality-gate-build → quality-gate-{alpha,bravo} → quality-gate-evaluate. The standalone post-merge-e2e.yaml was deleted.

Rationale: Polling introduced ~10 min of wasted wall time and required an arbitrary retry loop (sleep 30 ...; for attempt in $(seq 1 40)). Inlining is operationally simpler, avoids race conditions, and lets quality-gate-evaluate exit 1 so that dependent deploy jobs are skipped naturally via needs.
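
As a rough illustration of the inline chain, the jobs might be wired together as below. Only the job names come from the decision; the npm script names, the artifact name, the dist/ path, and the action versions are assumptions.

```yaml
# Hypothetical excerpt of deploy.yaml; scripts, artifact name, and paths are assumptions.
jobs:
  quality-gate-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build
      - uses: actions/upload-artifact@v4
        with:
          name: frontend-build
          path: dist/

  quality-gate-alpha:
    needs: quality-gate-build            # starts only after the build succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: frontend-build           # reuse the build instead of rebuilding
          path: dist/
      - run: npm ci && npm run e2e:alpha # hypothetical shard script

  quality-gate-bravo:
    needs: quality-gate-build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: frontend-build
          path: dist/
      - run: npm ci && npm run e2e:bravo # hypothetical shard script

  quality-gate-evaluate:
    needs: [quality-gate-alpha, quality-gate-bravo]
    if: ${{ always() }}                  # run even if a shard failed, so the outcome can be reported
    runs-on: ubuntu-latest
    steps:
      - run: ./scripts/evaluate-gate.sh  # hypothetical script; exits 1 to block deploys
```

Because every job lives in one workflow run, the ordering comes from needs alone and no step has to wait for or look up a sibling run.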

Status: Adopted (PR #805). Supersedes the polling implementation (PR #803).

DQ-PIPELINE-003 — quality-gate-evaluate exit-1 vs prod_blocked output

Question: How should the gate signal a block to downstream deploy-demo / deploy-prod?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Output `prod_blocked: true/false`; deploys gate on `if: needs....outputs.prod_blocked != 'true'` | Explicit, observable in step summary | String-typed gate; easy to misspell; depends on `if:` evaluation semantics |
| `exit 1` on failure; deploys naturally skip via `needs` chain | Idiomatic; no string output to maintain; failure is visible in the run UI | Slightly less explicit (the “why” requires reading the evaluate job log) |

Decision: exit 1. The block_reason is still emitted as an output for stage-annotation to surface in the warning, but downstream gating is by job dependency.

Rationale: Native needs semantics are stronger than string-output gating. The annotation job preserves the human-readable reason without re-encoding it as a control signal.
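
One way this could look in practice is sketched below. The job names and the block_reason output come from the decision; the step contents, the message format, and the placeholder deploy step are assumptions, and the shard jobs referenced in needs are defined elsewhere in the workflow.

```yaml
# Hypothetical excerpt of deploy.yaml; step contents are assumptions.
jobs:
  quality-gate-evaluate:
    needs: [quality-gate-alpha, quality-gate-bravo]
    if: ${{ always() }}                   # evaluate even when a shard failed
    runs-on: ubuntu-latest
    outputs:
      block_reason: ${{ steps.evaluate.outputs.block_reason }}
    steps:
      - id: evaluate
        env:
          ALPHA: ${{ needs.quality-gate-alpha.result }}
          BRAVO: ${{ needs.quality-gate-bravo.result }}
        run: |
          if [ "$ALPHA" != "success" ] || [ "$BRAVO" != "success" ]; then
            # Human-readable reason for the annotation job; not used for gating.
            echo "block_reason=E2E shards: alpha=$ALPHA, bravo=$BRAVO" >> "$GITHUB_OUTPUT"
            exit 1
          fi

  deploy-demo:
    needs: quality-gate-evaluate          # no if: required; skipped when evaluate fails
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploying"             # placeholder for the real deploy steps

  stage-annotation:
    needs: quality-gate-evaluate
    if: ${{ always() && needs.quality-gate-evaluate.outputs.block_reason != '' }}
    runs-on: ubuntu-latest
    steps:
      - env:
          REASON: ${{ needs.quality-gate-evaluate.outputs.block_reason }}
        run: echo "::warning title=Deploy blocked::$REASON"
```

The deploy job carries no condition of its own, so there is no string comparison to misspell; only the annotation job reads the output, and only to display it.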

Status: Adopted (PR #805).

DQ-PIPELINE-004 — Run quarantined tests post-merge or skip them

Question: The merge gate excludes @quarantine to avoid blocking on flaky tests. Should they still run anywhere post-merge?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Skip entirely | Simple | Loses flaky-signal data; conflicts with the documented lifecycle (“Post-merge: Run”) |
| Run in a separate non-blocking job in `deploy.yaml` | Restores lifecycle; data flows into the existing `flaky-test-aggregation.yaml` | One extra job; cosmetic delay before workflow is marked complete |
| Run in a separate workflow triggered by `workflow_run` | Most decoupled | Re-introduces the polling pattern that DQ-PIPELINE-002 removed |

Decision: Non-blocking job in deploy.yaml. PR #807 added quality-gate-quarantine with continue-on-error: true at the job level and no downstream needs consumers. It uploads quarantine-results.json for future metrics consumption.

Rationale: The post-merge “still run them” lifecycle is restored without re-coupling deploys to flaky tests. Living in deploy.yaml keeps the deploy-commit SHA available without polling.
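
A non-blocking job of this kind might look roughly like the following. The job name, the job-level continue-on-error, and the quarantine-results.json file come from the decision; the npm script, the artifact name, and the action versions are assumptions.

```yaml
# Hypothetical excerpt of deploy.yaml; the test script and artifact name are assumptions.
jobs:
  quality-gate-quarantine:
    runs-on: ubuntu-latest
    continue-on-error: true              # job-level: a failure here does not fail the workflow run
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run e2e:quarantine      # hypothetical script that runs only @quarantine tests
      - uses: actions/upload-artifact@v4
        if: ${{ always() }}              # upload results even when quarantined tests fail
        with:
          name: quarantine-results
          path: quarantine-results.json
```

No other job lists this one under needs, so its outcome cannot skip or delay a deploy.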

Status: Adopted (PR #807).

DQ-PIPELINE-005 — Pin quality-gate checkouts to the deploy SHA

Question: Should actions/checkout in the quality-gate jobs use the default branch HEAD or pin to the deploy commit?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Default branch HEAD | Simpler; no extra `with:` block | The gate may evaluate a different commit than the one being deployed if main advances during the run |
| Pin to `github.event.workflow_run.head_sha` | Gate and deploy evaluate the same commit | Requires the same expression duplicated on every checkout |

Decision: Pin all quality-gate checkouts (build, shards, quarantine, evaluate) to ${{ github.event_name == 'workflow_run' && github.event.workflow_run.head_sha || github.sha }}. PR #805 introduced the pattern; the P1 review on the same PR caught a missed instance and the fix landed before merge.

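Rationale: Without pinning, the gate could evaluate a different commit than the one being deployed if main moved between assembly and gate execution: a false pass would release untested code, and a false failure would block good code. The cost is one extra with: block per checkout, which is negligible.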
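
Applied to a single checkout step, the pinned expression would look like this. The expression is the one recorded in the decision; the action version is an assumption.

```yaml
# Hypothetical checkout step; repeated in each quality-gate job.
steps:
  - uses: actions/checkout@v4
    with:
      # Acts like a ternary: use the triggering run's commit on workflow_run events,
      # otherwise fall back to the commit of the current event.
      ref: ${{ github.event_name == 'workflow_run' && github.event.workflow_run.head_sha || github.sha }}
```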

Status: Adopted (PR #805).

DQ-PIPELINE-006 — Workflow-level GITHUB_TOKEN for npm ci

Question: How should the deploy workflow authenticate npm ci against GitHub Packages?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Per-step `env: NODE_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}` plus `actions/setup-node` `registry-url` | Explicit per-step | Repeated boilerplate on every `npm ci` step |
| Workflow-level `env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}` matching `.npmrc`’s `_authToken=${GITHUB_TOKEN}` | Single declaration; matches `ci.yaml` and `e2e.yaml` | Token is exposed to all steps in all jobs (acceptable; `secrets.GITHUB_TOKEN` is read-scoped by default) |

Decision: Workflow-level. PR #808 added the missing block. The omission was masked by warm ~/.npm cache hits and surfaced the first time the cache was cold (PR #807’s deploy run).

Rationale: Mirrors the existing pattern in ci.yaml and e2e.yaml. Avoids per-step boilerplate. The implicit secrets.GITHUB_TOKEN is sufficient because the job permissions specify packages: read.
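
A sketch of the workflow-level declaration follows. The env entry and the packages: read permission come from the decision; the exact .npmrc line, the job shown, and the action version are assumptions.

```yaml
# Hypothetical excerpt of deploy.yaml.
# Assumes the repository .npmrc contains a line such as:
#   //npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}
env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}   # visible to every step in every job

jobs:
  quality-gate-build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: read                          # lets the token read from GitHub Packages
    steps:
      - uses: actions/checkout@v4
      - run: npm ci                           # npm expands ${GITHUB_TOKEN} in .npmrc from the environment
```

With a warm ~/.npm cache the install may never contact the registry, which is consistent with the missing block going unnoticed until the first cold-cache run.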

Status: Adopted (PR #808).

DQ-PIPELINE-007 — Tiered checks: PR-fast vs queue-only

Question: Which checks belong on PR push (Fast Gate) vs only inside the merge queue (Queue Gate)?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Run all checks on every PR push | Strongest signal per push | E2E shards add ~10–15 min to every push; high CI cost |
| Run lint/build/unit-tests on PR; defer E2E to the queue | Fast author feedback (~5 min); E2E runs on the rebased commit (more representative) | Two paths to maintain; required-check matrix has to satisfy both events |

Decision: Tiered. Fast Gate = lint, build, unit-tests-coverage, changelog-check, e2e (pass-through), quarantine-check. Queue Gate = e2e sanity + acceptance shards (real). Required checks are configured to accept either path via pass-through summaries.

Rationale: The expensive E2E only matters once the rebased commit is known; pre-merge runs would test stale code. Authors get a tight feedback loop without burning CI minutes on E2E for every push.
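
The pass-through idea could be sketched as a single required check that reports success immediately on PR pushes and does the real work only in the merge queue. The job name e2e matches the check listed in the decision; the summary text, the npm script, and the action version are assumptions.

```yaml
# Hypothetical excerpt; only the e2e check name comes from the decision above.
on:
  pull_request:
  merge_group:

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - name: Pass-through on PR push
        if: ${{ github.event_name == 'pull_request' }}
        run: echo "E2E deferred to the merge queue" >> "$GITHUB_STEP_SUMMARY"
      - uses: actions/checkout@v4
        if: ${{ github.event_name == 'merge_group' }}
      - name: Run E2E on the rebased commit
        if: ${{ github.event_name == 'merge_group' }}
        run: npm ci && npm run e2e            # hypothetical script
```

Keeping the same job name on both events is what lets one required-check entry be satisfied by either path.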

Status: Adopted.

Question: When the agent (or a contributor) needs to run gh against a specific repo from outside its working directory, how?

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| `cd <path> && gh ...` | Familiar | Triggers a directory-change permission prompt in agent harnesses |
| `gh -R <owner>/<repo> ...` | No `cd`; mirrors `git -C` | Slight readability cost the first time |

Decision: gh -R. Documented in workspace/CLAUDE-ROOT.md alongside the existing git -C rule. gh has no -C flag; gh api ... already encodes the repo in the URL path so -R is unnecessary there.

Rationale: Operational convention. Avoids permission prompts and keeps the shell cwd stable.

Status: Adopted (workspace docs).