Token Cipher Capability
The Token Cipher capability lets an L3 service encrypt and decrypt small secrets stored at rest — primarily per-tenant tokens whose plaintext must never leave L3 — under a versioned envelope that survives both algorithm rotation and key-material rotation without re-encrypting historical rows. It ships as a self-contained unit in cards.arda.common.lib.crypto, with no AWS SDK dependency: key material is supplied to the cipher by the caller from an ESO-projected source.
1. Specification
The capability provides three abstractions:
A1 — Encrypt to a versioned envelope. Given a UTF-8 info constant (per-purpose HKDF salt — different consumers derive non-overlapping keys from the same source material), a MaterialRegistry, and a currentVersionId: UUID, produce an envelope string a{N}.k{SM-VERSION-ID}:<base64url-payload> over arbitrary plaintext bytes (base64url, no padding — see the on-disk shape below). The envelope captures both the algorithm version and the source-material version, so historical envelopes remain decryptable when either axis rotates.
A2 — Decrypt a versioned envelope. Given an envelope produced under any prior algorithm version that is still implemented and any source-material version still present in the caller’s registry, recover the original plaintext bytes. Decrypt is read-only against the registry.
A3 — HmacSHA256 wrapper. A small helper (Hmac) wraps javax.crypto.Mac with a uniform Result<T>-shaped API. The cipher uses it internally for HKDF derivation; two pre-existing JDK-Mac call sites in common-module (OpaqueId, S3AssetService) migrate to it for byte-identical behaviour.
The capability has four invariants the L3 service can rely on:
- No application-side calls to AWS Secrets Manager. TokenCipher consults only the in-memory MaterialRegistry. The caller (not the cipher) populates the registry from a single ESO-projected JSON map of every live key-material version, and may mutate the registry at runtime in response to ESO refresh events.
- Plaintext never logged or cached. Per DQ-206, the cipher logs nothing about plaintext, derives the AES key in-stack per call, and caches no derived keys. The registry holds source material (the 64-byte SM input), never derived keys.
- Auth-tag failure is bug-class, not application-recoverable. AES-GCM tag-verification failure on decrypt indicates storage corruption, key-material desync, or active tampering. Surfaced as AppError.Internal.IncompatibleState so it pages on-call.
- Unknown versionId on decrypt is bounded-transient. Surfaced as AppError.Transient.FailoverFailed; the existing transient-retry layer (e.g. an L3 service’s coroutine retry, a webhook re-delivery, a job re-enqueue) absorbs the bounded ESO propagation lag between AWS Secrets Manager and the pod’s projection.
The on-disk envelope shape is a{N}.k{SM-VERSION-ID}:<base64url-payload>:
- a{N}: algorithm version, code-indexed; v1 ships only a1 (AES-256-GCM + HKDF-SHA256). Never retired; bumping N requires a release.
- k{SM-VERSION-ID}: AWS Secrets Manager versionId (UUID) of the source material used at write time. Runtime-indexed via the MaterialRegistry.
- <base64url-payload>: base64url-encoded IV(12) || ciphertext || auth-tag(16).
Malformed envelope shapes — missing :, missing ., missing k prefix, invalid UUID, invalid base64, too-short ciphertext — surface as AppError.Invocation.GeneralValidation. They are caller-input errors, not corruption.
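The shape checks above can be illustrated with a small parser. This is a minimal sketch, not the library's parsing code: `ParsedEnvelope`, `MalformedEnvelope`, and `parseEnvelope` are hypothetical names, and `MalformedEnvelope` merely stands in for AppError.Invocation.GeneralValidation.

```kotlin
import java.util.Base64
import java.util.UUID

// Hypothetical holder for the three envelope fields; not the library's actual type.
class ParsedEnvelope(val algorithm: String, val versionId: UUID, val payload: ByteArray)

// Stand-in for AppError.Invocation.GeneralValidation, which is not reproduced here.
class MalformedEnvelope(message: String) : IllegalArgumentException(message)

// Splits a{N}.k{SM-VERSION-ID}:<base64url-payload> and rejects each malformed shape.
fun parseEnvelope(envelope: String): Result<ParsedEnvelope> = runCatching {
    val colon = envelope.indexOf(':')
    if (colon < 0) throw MalformedEnvelope("missing ':'")
    val header = envelope.substring(0, colon)
    val dot = header.indexOf('.')
    if (dot < 0) throw MalformedEnvelope("missing '.'")
    val algorithm = header.substring(0, dot)
    val keyPart = header.substring(dot + 1)
    if (!keyPart.startsWith("k")) throw MalformedEnvelope("missing 'k' prefix")
    val versionId = try {
        UUID.fromString(keyPart.substring(1))
    } catch (e: IllegalArgumentException) {
        throw MalformedEnvelope("invalid UUID")
    }
    val payload = try {
        // The JDK URL-safe decoder accepts unpadded input.
        Base64.getUrlDecoder().decode(envelope.substring(colon + 1))
    } catch (e: IllegalArgumentException) {
        throw MalformedEnvelope("invalid base64")
    }
    // IV(12) + tag(16) is the shortest payload that can carry an empty plaintext.
    if (payload.size < 12 + 16) throw MalformedEnvelope("too-short ciphertext")
    ParsedEnvelope(algorithm, versionId, payload)
}
```

Every rejection path returns a failed Result rather than throwing to the caller, which mirrors the Result-shaped API the capability describes.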
2. Functional Elements
The capability is one public package (lib/crypto/) containing three caller-facing types and two internal types behind a sealed-interface dispatch on a{N}.
FE-1: TokenCipher
Non-generic class with a private constructor and companion operator fun invoke(info, materials, currentVersionId): Result<TokenCipher> factory. The factory validates that info is non-blank and that currentVersionId is present in the supplied MaterialRegistry; otherwise it returns Result.failure(AppError.Invocation.GeneralValidation). Once constructed, the cipher exposes encrypt(plaintext) and decrypt(envelope); both are pure on the registry’s current contents: neither mutates the registry, and neither calls out to any external system.
FE-2: MaterialRegistry
Thread-safe registry mapping UUID (SM versionId) to 64-byte source material, backed by a ConcurrentHashMap. of(initial) rejects an empty map and any value whose size is not 64 bytes; add(versionId, material) enforces the same length invariant on subsequent additions. get returns a defensive copy. The caller populates the registry at construction time with every live key-material version and may mutate it at runtime in response to ESO refresh events; TokenCipher itself is read-only against it.
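The registry contract described above (length-enforced construction and addition, defensive copies on read) can be sketched as follows. `SketchRegistry` is a hypothetical re-creation for illustration only: the nullable return of `get`, the `Result<Unit>` return of `add`, and the use of IllegalArgumentException in place of the library's AppError types are all assumptions.

```kotlin
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

// Hypothetical re-creation of the described behaviour; the real class may differ.
class SketchRegistry private constructor(private val entries: ConcurrentHashMap<UUID, ByteArray>) {

    // Enforces the 64-byte invariant on runtime additions and stores a private copy.
    fun add(versionId: UUID, material: ByteArray): Result<Unit> =
        if (material.size != MATERIAL_BYTES) {
            Result.failure(IllegalArgumentException("material must be $MATERIAL_BYTES bytes"))
        } else {
            entries[versionId] = material.copyOf()
            Result.success(Unit)
        }

    // Returns a defensive copy so callers cannot mutate the stored material.
    fun get(versionId: UUID): ByteArray? = entries[versionId]?.copyOf()

    fun contains(versionId: UUID): Boolean = entries.containsKey(versionId)

    companion object {
        const val MATERIAL_BYTES = 64

        // Rejects an empty map and any value that is not exactly 64 bytes long.
        fun of(initial: Map<UUID, ByteArray>): Result<SketchRegistry> {
            if (initial.isEmpty()) return Result.failure(IllegalArgumentException("empty map"))
            if (initial.values.any { it.size != MATERIAL_BYTES }) {
                return Result.failure(IllegalArgumentException("material must be $MATERIAL_BYTES bytes"))
            }
            val copy = ConcurrentHashMap<UUID, ByteArray>()
            initial.forEach { (id, bytes) -> copy[id] = bytes.copyOf() }
            return Result.success(SketchRegistry(copy))
        }
    }
}
```

Copying on both write and read means neither the caller's input array nor a previously returned array can alter what the registry holds.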
FE-3: Hmac
Thin wrapper over javax.crypto.Mac for HmacSHA256 with a Result<T> API. Hmac.sha256(key) validates a non-empty key and returns Result<Hmac>; mac(input) returns the 32-byte tag as Result<ByteArray>. Used internally by TokenCipher for the HKDF Extract + Expand steps, and externally by OpaqueId and S3AssetService (migrated from inline Mac.getInstance("HmacSHA256") for DRY and consistent error handling).
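To make the wrapper and its role in HKDF concrete, here is a minimal sketch built directly on javax.crypto.Mac. `SketchHmac` and `hkdfSha256` are hypothetical names, and the sketch follows RFC 5869 for a single 32-byte output block. How the library maps its info constant onto the HKDF salt and info inputs is not stated here, so both are taken as explicit parameters.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical stand-in for the Hmac helper: validates the key, then exposes mac(input).
class SketchHmac private constructor(private val key: ByteArray) {

    // Computes the 32-byte HmacSHA256 tag; a fresh Mac instance per call avoids shared state.
    fun mac(input: ByteArray): Result<ByteArray> = runCatching {
        val mac = Mac.getInstance("HmacSHA256")
        mac.init(SecretKeySpec(key, "HmacSHA256"))
        mac.doFinal(input)
    }

    companion object {
        fun sha256(key: ByteArray): Result<SketchHmac> =
            if (key.isEmpty()) Result.failure(IllegalArgumentException("key must not be empty"))
            else Result.success(SketchHmac(key.copyOf()))
    }
}

// HKDF-SHA256 (RFC 5869) limited to one 32-byte output block.
// Assumption: a non-empty salt is required, because the wrapper rejects empty keys.
fun hkdfSha256(salt: ByteArray, inputKeyMaterial: ByteArray, info: ByteArray): Result<ByteArray> =
    SketchHmac.sha256(salt)
        // Extract: PRK = HMAC(salt, IKM)
        .mapCatching { it.mac(inputKeyMaterial).getOrThrow() }
        .mapCatching { prk -> SketchHmac.sha256(prk).getOrThrow() }
        // Expand: T(1) = HMAC(PRK, info || 0x01), which is already 32 bytes
        .mapCatching { it.mac(info + byteArrayOf(1)).getOrThrow() }
```

Because a single HMAC-SHA256 block is exactly 32 bytes, one Expand iteration is enough to produce an AES-256 key.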
FE-4: EnvelopeAlgorithm (internal sealed interface)
Dispatch type for the a{N} axis. Each implementation declares its version string and provides encrypt(derivedKey, plaintext) / decrypt(derivedKey, ciphertext) over a 32-byte derived key. Adding a2 is a new object EnvelopeAlgorithmA2 : EnvelopeAlgorithm plus a single when arm in TokenCipher.decrypt. Encrypt continues to use the current algorithm; decrypt remains backwards-compatible for as long as the old algorithm object is present in the package.
FE-5: EnvelopeAlgorithmA1 (internal object)
V1 implementation: AES-256-GCM with a 12-byte IV (cryptographically random per encrypt), 16-byte (128-bit) authentication tag, and a 32-byte derived key. Validates derived-key length on both encrypt and decrypt. Raw output layout is IV(12) || ciphertext || tag(16). Auth-tag verification failure is mapped to AppError.Internal.IncompatibleState; too-short ciphertext (below the minimum IV + tag length) is mapped to AppError.Invocation.GeneralValidation (malformed shape: the cipher hasn’t been invoked yet).
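The raw layout and the two decrypt-side failure mappings can be sketched with the JDK's AES/GCM/NoPadding transformation. This is an illustrative sketch, not the internal object: `gcmEncrypt` and `gcmDecrypt` are hypothetical names, IllegalStateException stands in for AppError.Internal.IncompatibleState, and IllegalArgumentException stands in for AppError.Invocation.GeneralValidation.

```kotlin
import java.security.SecureRandom
import javax.crypto.AEADBadTagException
import javax.crypto.Cipher
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

private const val IV_BYTES = 12
private const val TAG_BITS = 128
private val random = SecureRandom()

// Encrypts under a 32-byte key and returns IV(12) || ciphertext || tag(16).
fun gcmEncrypt(derivedKey: ByteArray, plaintext: ByteArray): ByteArray {
    require(derivedKey.size == 32) { "derived key must be 32 bytes" }
    val iv = ByteArray(IV_BYTES).also { random.nextBytes(it) } // fresh random IV per call
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(derivedKey, "AES"), GCMParameterSpec(TAG_BITS, iv))
    return iv + cipher.doFinal(plaintext) // the JDK appends the tag to the ciphertext
}

// Reads the IV from the front of the raw bytes and verifies the tag while decrypting.
fun gcmDecrypt(derivedKey: ByteArray, raw: ByteArray): Result<ByteArray> {
    if (derivedKey.size != 32) {
        return Result.failure(IllegalArgumentException("derived key must be 32 bytes"))
    }
    // Shape check happens before the cipher is invoked.
    if (raw.size < IV_BYTES + TAG_BITS / 8) {
        return Result.failure(IllegalArgumentException("too-short ciphertext"))
    }
    return try {
        val cipher = Cipher.getInstance("AES/GCM/NoPadding")
        val spec = GCMParameterSpec(TAG_BITS, raw, 0, IV_BYTES)
        cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(derivedKey, "AES"), spec)
        Result.success(cipher.doFinal(raw, IV_BYTES, raw.size - IV_BYTES))
    } catch (e: AEADBadTagException) {
        // Stand-in for AppError.Internal.IncompatibleState.
        Result.failure(IllegalStateException("auth-tag verification failed", e))
    }
}
```

Keeping the length check ahead of the cipher call is what lets a truncated payload be reported as malformed input instead of as a tag failure.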
3. Behaviors
3.1 Encrypt: current version, fresh IV per call
3.2 Decrypt: registry-resolved, no external call
3.3 Algorithm-version transition
Until the operator decides to introduce a2, every envelope produced by the system carries the a1. prefix. When a2 lands as a new EnvelopeAlgorithm object:
- Encrypt switches to a2 on the next deploy (the cipher uses the latest EnvelopeAlgorithm for new writes).
- Decrypt routes by the a{N} axis: a1 envelopes continue to decrypt under EnvelopeAlgorithmA1; a2 envelopes decrypt under EnvelopeAlgorithmA2.
- No re-encryption pass is required. Historical envelopes remain readable for as long as their algorithm object remains in the package; retiring a1 requires a coordinated data drain (out of scope for the cipher itself).
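The routing described above might look like the following sketch once a second algorithm exists. All names here (`SketchAlgorithm`, `SketchA1`, `SketchA2`, `algorithmFor`) are hypothetical, a2 does not ship today, and the encrypt/decrypt members are omitted so that only the dispatch is shown.

```kotlin
// Hypothetical shapes; the real interface and objects are internal to the library.
sealed interface SketchAlgorithm {
    val version: String
}

object SketchA1 : SketchAlgorithm {
    override val version = "a1"
}

// Hypothetical future algorithm, present only to illustrate the transition.
object SketchA2 : SketchAlgorithm {
    override val version = "a2"
}

// New writes always use the newest algorithm.
val currentAlgorithm: SketchAlgorithm = SketchA2

// Decrypt-side routing: one `when` arm per implemented a{N}.
fun algorithmFor(prefix: String): Result<SketchAlgorithm> = when (prefix) {
    SketchA1.version -> Result.success(SketchA1)
    SketchA2.version -> Result.success(SketchA2)
    // Stand-in for AppError.Invocation.GeneralValidation.
    else -> Result.failure(IllegalArgumentException("unknown algorithm $prefix"))
}
```

Removing the a1 arm is the code-level equivalent of retiring a1, which is why it would have to wait until no a1 envelopes remain in storage.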
3.4 Key-material rotation
The deployed SM secret holds a JSON map of every live key-material version, projected to the pod as a single ESO mount. On rotation:
- Operator (or future Rotation Lambda — tracked separately, see PDEV-659) writes a new version to the JSON map and updates the current-version pointer.
- ESO projects the refreshed map into the pod’s mounted secret.
- The caller reads the refreshed map (file watcher, scheduled re-read, or pod restart; the choreography is the caller’s choice), calls MaterialRegistry.add(versionId, material) for new entries, and reconstructs TokenCipher if currentVersionId changes.
- New encrypts use the new current; old envelopes continue to decrypt because their materials remain in the registry until the operator explicitly drops them.
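The caller-side part of these steps can be sketched as a single refresh pass. This is an assumption-heavy illustration: `ProjectedMaterials`, `RefreshOutcome`, and `applyRefresh` are hypothetical names, a plain ConcurrentHashMap stands in for MaterialRegistry, and the projected JSON schema is not defined in this document, so the sketch starts from an already parsed map.

```kotlin
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

// Hypothetical parsed form of the projected map plus its current-version pointer.
class ProjectedMaterials(val currentVersionId: UUID, val materials: Map<UUID, ByteArray>)

// Outcome of one refresh pass: which versions were new, and whether the pointer moved.
class RefreshOutcome(val added: Set<UUID>, val currentChanged: Boolean)

fun applyRefresh(
    registry: ConcurrentHashMap<UUID, ByteArray>, // stand-in for MaterialRegistry
    previousCurrent: UUID,
    projected: ProjectedMaterials
): RefreshOutcome {
    val added = mutableSetOf<UUID>()
    for ((id, material) in projected.materials) {
        // putIfAbsent returns null only when the entry was newly inserted.
        if (registry.putIfAbsent(id, material.copyOf()) == null) added += id
    }
    // Existing entries are never removed here, so older envelopes stay decryptable.
    return RefreshOutcome(added, projected.currentVersionId != previousCurrent)
}
```

When `currentChanged` is true, the caller would construct a new cipher for the new current version; when it is false, the existing cipher keeps working because it reads the same registry.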
3.5 Bounded propagation lag
Between step 1 and step 2 above, AWS Secrets Manager holds the new version but the pod’s mount has not refreshed yet. If an L3 service tries to decrypt an envelope that references the new versionId, TokenCipher.decrypt returns Result.failure(AppError.Transient.FailoverFailed(...)) whose cause message names the missing version. The L3 caller’s existing transient-retry layer (Postmark webhook retries, outbound idempotency replays, L4 client retries) fires after timescales that exceed ESO’s reconciliation interval; the next attempt finds the registry refreshed and decrypt succeeds. FailoverFailed is the least-bad fit semantically (existing sealed Transient hierarchy with Aurora-failover-shaped names); a more specific subtype is intentionally not added to avoid the breaking change of extending the sealed hierarchy.
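The capability relies on retry layers that already exist around the caller, so no retry loop ships with it. Purely to illustrate why a transient classification is enough to absorb the lag, here is a minimal in-process sketch; `TransientFailure` stands in for AppError.Transient.FailoverFailed, and `retryTransient` is a hypothetical helper.

```kotlin
// Stand-in for AppError.Transient.FailoverFailed.
class TransientFailure(message: String) : RuntimeException(message)

// Re-runs the attempt while it fails transiently, up to maxAttempts tries in total.
// Any other failure, or a success, is returned immediately.
fun <T> retryTransient(maxAttempts: Int, delayMillis: Long, attempt: () -> Result<T>): Result<T> {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var last = attempt()
    var tries = 1
    while (last.exceptionOrNull() is TransientFailure && tries < maxAttempts) {
        Thread.sleep(delayMillis) // wait for the projection to catch up
        last = attempt()
        tries++
    }
    return last
}
```

The bound matters: if the version is still missing after the last attempt, the transient failure is returned to the caller instead of being retried forever.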
3.6 Failure-mode classification
| Condition on decrypt | Surfaced as | Behaviour |
|---|---|---|
| Auth-tag mismatch | AppError.Internal.IncompatibleState | Bug-class. Pages on-call. Indicates corruption / desync / tampering. |
| Unknown versionId (registry miss) | AppError.Transient.FailoverFailed | Bounded transient. Caller’s existing retry absorbs. |
| Malformed envelope (missing : / . / k, bad UUID, bad base64, too-short ciphertext) | AppError.Invocation.GeneralValidation | Caller-input error. Surfaces to the caller of the caller. |
| Unknown algorithm a{N} | AppError.Invocation.GeneralValidation | Caller-input error (envelope carries an a{N} the deployment does not implement). |
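A consumer could encode this classification as an exhaustive mapping from failure condition to handling policy. The sketch below is hypothetical throughout: `DecryptFailure`, `Handling`, and `handlingFor` are illustrative names that mirror the table rows and are not part of the AppError hierarchy.

```kotlin
import java.util.UUID

// Hypothetical mirror of the four decrypt conditions in the table.
sealed class DecryptFailure {
    object AuthTagMismatch : DecryptFailure()
    class UnknownVersion(val versionId: UUID) : DecryptFailure()
    class MalformedEnvelope(val reason: String) : DecryptFailure()
    class UnknownAlgorithm(val prefix: String) : DecryptFailure()
}

enum class Handling { PAGE_ON_CALL, RETRY_LATER, REJECT_INPUT }

// Exhaustive `when`: adding a failure subtype forces a handling decision at compile time.
fun handlingFor(failure: DecryptFailure): Handling = when (failure) {
    is DecryptFailure.AuthTagMismatch -> Handling.PAGE_ON_CALL
    is DecryptFailure.UnknownVersion -> Handling.RETRY_LATER
    is DecryptFailure.MalformedEnvelope,
    is DecryptFailure.UnknownAlgorithm -> Handling.REJECT_INPUT
}
```

Both caller-input conditions share one arm because the table surfaces them as the same error type.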
4. Verification
| Test scope | What it asserts |
|---|---|
| TokenCipherTest | Factory: blank info → GeneralValidation; missing currentVersionId → GeneralValidation. Round-trips at 0/1/16/1024/65536 bytes. Material-version transition: envelope produced under one version decrypts when that version remains in the registry alongside a newer one. Auth-tag failure on a tampered base64 byte → IncompatibleState. Unknown versionId on decrypt → Transient.FailoverFailed whose message names the missing UUID. Malformed shape (missing :, missing ., invalid UUID, empty input) → GeneralValidation. |
| MaterialRegistryTest | of rejects empty map, rejects non-64-byte values, accepts valid maps. add rejects non-64-byte material. get returns defensive copies. contains reports membership accurately. |
| HmacTest | Empty-key rejection. RFC 4231 known-answer vector for HmacSHA256. Round-trips and deterministic output for fixed inputs. |
| OpaqueIdTest (existing) | After migration to Hmac.sha256, byte-identical output confirmed against canned inputs. |
| S3AssetServiceTest (existing) | After migration to Hmac.sha256, behaviour preserved. |
5. Implementation Artifacts
All paths relative to Arda-cards/common-module/.
Functional Elements
| File | Role |
|---|---|
| lib/src/main/kotlin/cards/arda/common/lib/crypto/TokenCipher.kt | Public class + companion operator fun invoke(info, materials, currentVersionId) factory; envelope parsing; HKDF derivation; dispatch into EnvelopeAlgorithmA1. |
| lib/src/main/kotlin/cards/arda/common/lib/crypto/MaterialRegistry.kt | Thread-safe versionId → 64-byte material store; length-enforced of / add. |
| lib/src/main/kotlin/cards/arda/common/lib/crypto/Hmac.kt | HmacSHA256 wrapper over javax.crypto.Mac with Result<T> API. |
| lib/src/main/kotlin/cards/arda/common/lib/crypto/EnvelopeAlgorithm.kt | Internal sealed interface for the a{N} axis. |
| lib/src/main/kotlin/cards/arda/common/lib/crypto/EnvelopeAlgorithmA1.kt | Internal object; AES-256-GCM with 12-byte IV + 128-bit tag; auth-tag failure → IncompatibleState, too-short ciphertext → GeneralValidation. |
| lib/src/main/kotlin/cards/arda/common/lib/runtime/observability/OpaqueId.kt | Modified: inline Mac.getInstance("HmacSHA256") replaced by Hmac.sha256(...); byte-identical output. |
| lib/src/main/kotlin/cards/arda/common/lib/infra/storage/S3AssetService.kt | Modified: inline Mac.getInstance("HmacSHA256") replaced by Hmac.sha256(...); byte-identical output. |
Verification
| File | Scope |
|---|---|
| lib/src/test/kotlin/cards/arda/common/lib/crypto/TokenCipherTest.kt | Factory invariants; encrypt/decrypt round-trips across sizes; material-version transition; tampered envelope; unknown-versionId; malformed-shape variants. |
| lib/src/test/kotlin/cards/arda/common/lib/crypto/MaterialRegistryTest.kt | Constructor + add length invariant; get defensive copy; contains. |
| lib/src/test/kotlin/cards/arda/common/lib/crypto/HmacTest.kt | Empty-key rejection; RFC 4231 KAT; deterministic output. |
Releases
| Release | Subject | PR |
|---|---|---|
| common-module 11.2.0 | TokenCipher + Hmac + MaterialRegistry in lib/crypto/; OpaqueId and S3AssetService migrated to Hmac. | #183 |
Caller responsibilities (not shipped by common-module)
The caller owns:
- Key delivery. ESO ExternalSecret projecting a single JSON map of every live versionId → material into the pod.
- Registry population. Parsing the projected map and calling MaterialRegistry.of(initial) (or add for runtime refresh).
- Refresh choreography. File watch / scheduled re-read / pod-restart, whichever fits the deployment.
- info constant. A per-purpose UTF-8 string (e.g. "arda.email.serverToken.a1") so different consumers derive non-overlapping AES keys from the same source material.
- Failure handling. Mapping AppError.Transient.FailoverFailed to the consumer’s transient-retry layer; surfacing IncompatibleState to on-call alerting.
Rotation tooling — JSON-map schema, operator rotation script, AWS SM Rotation Lambda, and the disposition of the deployed EmailEncryptionKeyFallbackRole — is tracked separately as PDEV-659.
Copyright: © Arda Systems 2025-2026, All rights reserved