OAuth2 Design Drafts and Notes

This document consolidates design notes and background research on JWT token management for Cognito-based authentication.

Token Size Limits

JWTs are typically sent in the HTTP Authorization header or stored in cookies, so the practical size limits come from the infrastructure that carries them:

Source	Typical Limit
Web servers / load balancers / ALB	8 KB – 16 KB per header
Browser cookies	~4 KB per cookie
Cognito internal limits	Not published; large custom claims may hit thresholds

Recommendation: Keep each token under 4 KB if it may be stored in a cookie, and under 8 KB otherwise. Smaller tokens cost less on every request and are compatible with more infrastructure.
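
To make the size budget concrete, the following is a minimal sketch (in Python for brevity; the Ktor backend would do the equivalent in Kotlin) that estimates the encoded size of a JWT from its claims. The function names, the 4096-byte budget, the RS256 signature length for a 2048-bit key, and the header shape are assumptions made for illustration, not values taken from the Arda configuration.

```python
import base64
import json

# Assumed budget: the ~4 KB cookie limit from the table above.
MAX_TOKEN_BYTES = 4096


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the way JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def estimate_jwt_size(claims: dict, signature_bytes: int = 256) -> int:
    """Estimate the encoded size in bytes of a JWT carrying `claims`.

    Assumes an RS256 signature from a 2048-bit key (256 bytes) and a
    header that holds only `kid` and `alg`; both are illustrative.
    """
    header = {"kid": "x" * 44, "alg": "RS256"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
        b64url(b"\x00" * signature_bytes),
    ]
    return len(".".join(segments))


def within_budget(token: str, limit: int = MAX_TOKEN_BYTES) -> bool:
    """Check an actual encoded token against the byte budget."""
    return len(token.encode("utf-8")) <= limit
```

A check like `within_budget` could run in an integration test after each change to the claim set, so that token growth is caught before it reaches a load balancer or cookie limit.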

What to Include in Tokens

ID Token (Identity)

Good candidates: sub, cognito:username, email, email_verified, name, cognito:groups, high-level roles.

Avoid: sensitive PII, large arrays, frequently-changing data, fine-grained permission lists.

Access Token (Authorization)

Good candidates: sub, OAuth2 scopes, high-level roles/tiers, tenant ID, simplified authorization flags.

Avoid: PII beyond sub, fine-grained per-resource permissions, volatile session data, large embedded structures.

General Recommendations

Keep it lean: Include only what is necessary for the immediate authorization decision at the point of consumption.
Prioritize scopes: Use custom Cognito Resource Server scopes (e.g., orders/read, products/manage) for API access control.
Use Pre Token Generation Lambda: The designated mechanism for injecting custom claims from backend data.
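Validate on backend: Always validate the JWT signature, issuer, expiry, and intended audience server-side. For Cognito, the audience check means the aud claim on ID tokens and the client_id claim on access tokens, which carry no aud claim.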
Consider a separate authorization service: For complex ABAC requirements, store only a high-level role in the token and have the backend perform real-time permission lookups.
Monitor token size: Actively test token sizes after adding claims.
Do not rely solely on token claims for sensitive operations: Always perform a direct authoritative database lookup for high-stakes authorization decisions.
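
The claim checks recommended above can be sketched as follows. This is an illustrative Python sketch, not the Ktor implementation: the function name and the wording of the problem strings are hypothetical. It assumes the token signature has already been verified against the user pool's published keys by a JOSE library, which is deliberately out of scope here; claims from an unverified token must never be passed to it.

```python
import time


def check_access_token_claims(claims, *, issuer, client_id, required_scope, now=None):
    """Return a list of problems; an empty list means the claims are acceptable.

    `claims` must come from a token whose signature was ALREADY verified.
    This function only covers the issuer, token type, client, expiry,
    and scope checks.
    """
    now = time.time() if now is None else now
    problems = []

    if claims.get("iss") != issuer:
        problems.append("issuer mismatch")
    # Cognito marks each token with token_use ("id" or "access").
    if claims.get("token_use") != "access":
        problems.append("not an access token")
    # Cognito access tokens identify the app client in client_id.
    if claims.get("client_id") != client_id:
        problems.append("client_id mismatch")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("token expired")

    # OAuth2 scopes arrive as one space-delimited string.
    granted = set(str(claims.get("scope", "")).split())
    if required_scope not in granted:
        problems.append(f"missing scope {required_scope}")

    return problems
```

Returning a list of problems rather than a boolean is a design choice for this sketch: it makes rejected requests easier to log and debug. A route handler would reject the request whenever the list is non-empty.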

Token Exchange Flow (Authorization Code)

The standard OAuth2 Authorization Code flow with Cognito:

Browser requests a protected resource; the Ktor server redirects to Cognito.
User authenticates via Cognito Hosted UI.
Cognito redirects to the Ktor callback with an authorization code.
Ktor validates the state parameter.
Ktor exchanges the code for ID, Access, and Refresh tokens via a server-to-server POST /oauth2/token.
Ktor validates token signatures, stores tokens in a server-side session, and establishes the user session.
When the ID or access token expires, Ktor uses the refresh token to obtain new ID and access tokens.
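
The redirect, state check, and code exchange steps above can be sketched as request builders. This is a hedged Python illustration rather than the Ktor code: the function names are hypothetical, the domain is whatever hosted-UI domain the user pool uses, and the sketch assumes a confidential app client that authenticates to the token endpoint with HTTP Basic credentials. Nothing is sent over the network; the functions only construct the URL, headers, and form body.

```python
import base64
import hmac
import secrets
from urllib.parse import urlencode


def build_authorize_url(domain, client_id, redirect_uri, scopes):
    """Build the redirect to the authorization endpoint plus a fresh state value."""
    state = secrets.token_urlsafe(32)  # unguessable, stored in the session
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    })
    return f"https://{domain}/oauth2/authorize?{query}", state


def state_matches(expected: str, received: str) -> bool:
    """Compare the stored state with the callback's state in constant time."""
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


def build_token_request(domain, client_id, client_secret, redirect_uri, code):
    """Build the server-to-server POST /oauth2/token request (not sent here)."""
    credentials = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8"))
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Authorization": "Basic " + credentials.decode("ascii"),
    }
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,  # must match the authorize request
        "client_id": client_id,
    })
    return f"https://{domain}/oauth2/token", headers, body
```

The state value returned by `build_authorize_url` would be kept in the server-side session and compared with `state_matches` when the callback arrives, before any code exchange is attempted.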

Sign-Up Flow

The sequence below shows the Cognito-backed sign-up flow as initiated from the Arda frontend via the Amplify SDK.

PlantUML diagram

Key points:

The Pre Sign-up Lambda is optional. It is invoked synchronously before the user record is persisted. It can auto-confirm the user, auto-verify attributes, or reject the sign-up entirely.
The Post Confirmation Lambda is the correct place to create application-side records (user profile, tenant association) because it fires only after Cognito has durably confirmed the user.
Verification can be email or SMS depending on the Cognito User Pool configuration.
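
The two triggers described above might be sketched as follows, in Python for illustration. The allow-listed domain, the `create_profile` callback, and the factory function are hypothetical; the event fields (`request.userAttributes`, `response.autoConfirmUser`, `response.autoVerifyEmail`, `triggerSource`) follow the Cognito trigger event shape as I understand it and should be checked against the current Cognito documentation.

```python
# Hypothetical allow-list: users from these domains are auto-confirmed.
ALLOWED_DOMAINS = {"example.com"}


def pre_sign_up_handler(event, context):
    """Pre Sign-up trigger: auto-confirm, leave as-is, or reject the sign-up."""
    email = event["request"]["userAttributes"].get("email", "")
    if "@" not in email:
        # Raising an error from this trigger rejects the sign-up.
        raise ValueError("An email address is required to sign up")

    domain = email.rpartition("@")[2].lower()
    if domain in ALLOWED_DOMAINS:
        event["response"]["autoConfirmUser"] = True
        event["response"]["autoVerifyEmail"] = True
    return event  # Cognito expects the (possibly modified) event back


def make_post_confirmation_handler(create_profile):
    """Build a Post Confirmation handler around a profile-creation callback.

    `create_profile(sub, attributes)` is a hypothetical application function;
    it should be idempotent in case the trigger is invoked more than once.
    """
    def handler(event, context):
        # The trigger also fires after a forgot-password confirmation;
        # only a confirmed sign-up should create application records.
        if event.get("triggerSource") == "PostConfirmation_ConfirmSignUp":
            attributes = event["request"]["userAttributes"]
            create_profile(attributes["sub"], attributes)
        return event
    return handler
```

Injecting `create_profile` keeps the handler testable without a database; in a deployed Lambda the factory would be called once at module load with the real persistence function.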

Token Augmentation via Pre Token Generation Lambda

What it does

Cognito provides a Pre Token Generation Lambda trigger that it invokes synchronously immediately before issuing tokens. The Lambda can add, override, or suppress claims in the ID token; customizing the access token as well requires the V2 version of the trigger event.

PlantUML diagram

Why this approach was considered

Token augmentation via Lambda was explored because it embeds application-specific claims (such as tenant_id, subscription_tier, and coarse-grained role flags) directly in the JWT. Downstream services then do not need a separate lookup on each request, which avoids the latency of an additional network hop at authorization time.

This is appropriate for stable, high-level claims that change infrequently (e.g., tenant membership, billing tier) and where the overhead of a Lambda invocation at login time is acceptable relative to the per-request savings.
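
A handler of the kind described in this section might look like the sketch below, written in Python for illustration. It assumes the user pool is configured for the V2 trigger event, whose response uses `claimsAndScopeOverrideDetails` with separate `idTokenGeneration` and `accessTokenGeneration` sections. The `lookup_membership` callback and the claim name `role` are hypothetical stand-ins for whatever backend data source and naming Arda settles on.

```python
def make_pre_token_generation_handler(lookup_membership):
    """Build a Pre Token Generation handler (V2 event format assumed).

    `lookup_membership(sub)` is a hypothetical backend lookup returning a
    dict with "tenant_id", "role", and "subscription_tier".
    """
    def handler(event, context):
        sub = event["request"]["userAttributes"]["sub"]
        membership = lookup_membership(sub)

        # Only stable, coarse-grained claims are embedded in the tokens.
        claims = {
            "tenant_id": membership["tenant_id"],
            "role": membership["role"],
            "subscription_tier": membership["subscription_tier"],
        }

        event["response"]["claimsAndScopeOverrideDetails"] = {
            "idTokenGeneration": {"claimsToAddOrOverride": dict(claims)},
            "accessTokenGeneration": {"claimsToAddOrOverride": dict(claims)},
        }
        return event  # Cognito reads the overrides from the returned event
    return handler
```

Because the trigger runs synchronously during token issuance, the lookup sits on the login path; a slow or failing `lookup_membership` would delay or block sign-in, which is one reason to keep the lookup small.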

Separate Claims Server (External Authorization Service)

Architecture

For fine-grained, resource-level access decisions, a separate authorization service is more appropriate than embedding all claims in the token.

PlantUML diagram

Why this is the recommended direction for Arda

The external authorization service pattern is recommended for Arda’s Attribute-Based Access Control (ABAC) requirements for the following reasons:

Token size stays lean. Tokens carry only high-level claims (tenant_id, coarse role). Fine-grained permissions are never embedded in the JWT, keeping tokens well within header-size limits.
Dynamic policy evaluation. Authorization decisions are made at request time against the current policy state. There is no stale-permission problem caused by cached claims in a long-lived token.
Cedar policy language. Cedar supports expressive, auditable policies over entities and attributes. It is a natural fit for Arda’s multi-tenant, role-and-attribute model.
Separation of concerns. Business logic (Ktor) is decoupled from authorization logic (the claims server). Policies can be updated without redeploying the application.
Scalability. The authorization service can be scaled, cached, and evolved independently.

The trade-off is additional infrastructure and an extra network hop on each protected request. This is mitigated by running the authorization service as a sidecar or within the same VPC, and by caching policy decisions for read-heavy, low-sensitivity paths.
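
The decision caching mentioned above could be sketched as a thin wrapper around the call to the authorization service. This is an illustrative Python sketch under stated assumptions: `decide` is a hypothetical callable standing in for the network call to the claims server, the 30-second TTL is arbitrary, and the `cacheable` flag is one possible way for callers to opt sensitive actions out of caching.

```python
import time


class CachedAuthorizer:
    """Cache allow/deny decisions for a short time to save network hops."""

    def __init__(self, decide, ttl_seconds=30.0, clock=time.monotonic):
        self._decide = decide      # hypothetical call to the authorization service
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._cache = {}           # (principal, action, resource) -> (allowed, expires_at)

    def is_allowed(self, principal, action, resource, *, cacheable=True):
        key = (principal, action, resource)
        now = self._clock()

        if cacheable:
            hit = self._cache.get(key)
            if hit is not None and hit[1] > now:
                return hit[0]

        allowed = bool(self._decide(principal, action, resource))

        # Sensitive actions pass cacheable=False and are always re-evaluated.
        if cacheable:
            self._cache[key] = (allowed, now + self._ttl)
        return allowed
```

The TTL bounds how stale a cached decision can be, so it trades freshness for latency. This unbounded dictionary is for illustration only; a real deployment would also need eviction and a way to invalidate entries when policies change.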

Design Decision Summary

Approach	Strengths	Weaknesses
Token augmentation (Pre Token Generation Lambda)	Fewer network hops per request; no additional service to operate; simple to reason about for coarse claims	Bloats token size if overused; Lambda cold starts add latency at login; claims are static until token refresh
Separate claims server (External Authorization Service)	Keeps tokens lean; dynamic policy evaluation at request time; supports expressive ABAC with Cedar; independently deployable	Additional infrastructure; extra network hop per authorized request; requires policy authoring discipline

Arda’s direction: a hybrid approach.

Basic identity claims are embedded in the token via the Pre Token Generation Lambda: tenant_id, top-level role (e.g., OWNER, MEMBER), and subscription tier. These change rarely and are safe to cache for the token's lifetime.
Fine-grained resource access uses the External Authorization Service with Cedar policy evaluation. The Ktor backend consults the service for any action that requires attribute-level or resource-level decisions.

This hybrid keeps the common path (coarse authorization) fast and avoids the overhead of a policy lookup for every request, while retaining the flexibility and auditability of Cedar for the cases that require it.
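
The hybrid split can be sketched as a single decision function: coarse actions are answered from token claims, and everything else is delegated to the external service. The sketch is Python for illustration only. The action names, the role mapping, the tenant-match check, and the `fine_grained_check` callable are all hypothetical; the real division between coarse and fine-grained actions is a policy decision this document does not specify.

```python
# Hypothetical coarse actions that token claims alone can decide,
# mapped to the top-level roles allowed to perform them.
COARSE_ACTIONS = {
    "tenant:view": {"OWNER", "MEMBER"},
    "tenant:manage_billing": {"OWNER"},
}


def authorize(claims, action, resource, fine_grained_check):
    """Decide from token claims when possible, else ask the external service.

    `claims` are verified token claims; `fine_grained_check(principal,
    action, resource)` stands in for the call to the authorization service.
    """
    # Assumed rule: a token never grants access outside its own tenant.
    tenant_id = claims.get("tenant_id")
    if tenant_id is None or tenant_id != resource.get("tenant_id"):
        return False

    allowed_roles = COARSE_ACTIONS.get(action)
    if allowed_roles is not None:
        # Common path: no network hop, decided from the token alone.
        return claims.get("role") in allowed_roles

    # Attribute-level or resource-level decision: delegate to the service.
    return bool(fine_grained_check(claims["sub"], action, resource))
```

In this sketch, unknown actions fall through to the external service rather than being allowed by default, so a new action remains governed by policy even if nobody adds it to the coarse table.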
