Upload Product Images and Managed File Assets

You are a Principal Engineer (see workspace/instructions/claude/agents/principal-engineer.md).
We are running a complex project definition and planning session using the
/complex-project-definition-and-planning skill.
Project directory: workspace/projects/mvp2/12-upload-product-images/
Read project-description.md for full context.
We are in Phase 1: Context Gathering. The project description has been written
and confirmed at a high level, but we are continuing to explore non-functional
constraints and infrastructure decisions before moving to Phase 2 (Design with
Alternatives).
Repositories involved: common-module, operations, infrastructure, api-test.
Reference the existing code patterns documented in the project description.
Key open areas still being explored:
- Security model for asset URLs (DQ-1)
- Bucket strategy and lifecycle (DQ-2)
- CDN architecture (DQ-6)
- Non-functional requirements (performance, cost, operational concerns)

| Phase | Status |
| --- | --- |
| Phase 1: Context Gathering | In progress: project description written, continuing NFR exploration |
| Phase 2: Design with Alternatives | Not started |
| Phase 3: Three-Document Creation | Not started |
| Phase 4: Decision Rounds | Not started |
| Phase 5: Release Planning | Not started |
| Phase 6: Plan Finalization | Not started |

The Arda platform needs a general-purpose file and asset upload mechanism that allows the UI to upload files to a managed S3 bucket. The uploaded assets can then be referenced by entity fields (e.g., Item.imageUrl) and served directly to HTTP clients without the backend server acting as a proxy.

The first use case is uploading product images (PNG, SVG, JPEG) to be set as the imageUrl of Item entities in the Item module of the operations component.

Implement the ability to create and update Items that use product image URLs pointing to a managed S3 bucket. The use case includes:

  1. A workflow for uploading image files via presigned S3 URLs.
  2. Storing the resulting stable URL as Item.imageUrl.
  3. Serving images directly to HTTP clients, ideally via an AWS CDN (CloudFront) to optimize delivery of static assets.

The user-facing behavioral contracts for this project are defined in the product use case documentation. See the Use Cases Analysis (not yet published) for a summary of requirements, decisions, and rationale.

| Use Case | Description | Link |
| --- | --- | --- |
| GEN::MEDIA::0001 | Set Entity Image: unified input surface covering file upload, drag-and-drop, clipboard paste, and URL entry | entity-media.md |
| GEN::MEDIA::0002 | Remove Entity Image: clear image and revert to placeholder | entity-media.md |
| REF::ITM::0003::0010 | Set Item Image During Creation | items.md |
| REF::ITM::0004::0006 | Change or Remove Item Image | items.md |
| REF::ITM::0006::0005 | Image Column in Bulk Import/Export | items.md |

A single UploadBucket is created per partition (e.g., alpha001-prod-partition-upload-bucket) via the BulkStoresStack CDK construct in /infrastructure/src/main/cdk/stacks/purpose/partition-bulk-stores.ts.

Current bucket characteristics:

  • S3-managed encryption (AES256), BLOCK_ALL public access, no versioning.
  • 1-day TTL lifecycle with automatic expiration — designed for ephemeral upload processing (CSV files).
  • Conditional CORS for PUT/POST from whitelisted app URLs.
  • A presigning IAM role (UploadPreSigningRole) that the backend pod assumes to generate presigned URLs.
  • Cross-stack exports: UploadBucketArn, UploadBucketName, UploadPresignRoleArn keyed under ${Infrastructure}-${Purpose}-API-*.

The UploadBucket construct (/infrastructure/src/main/cdk/constructs/storage/public-upload-bucket.ts) is parameterized with name and expirationDays, making it reusable for creating additional buckets with different lifecycle policies.

/operations/src/main/cloudformation/pre-install.cfn.yml imports the bucket ARN and presign role ARN from the infrastructure exports. The pod’s service account role gets s3:GetObject, s3:PutObject, s3:ListBucket and the ability to assume the presigning role.

CsvS3BucketDirectAccess in /common-module/lib/src/main/kotlin/cards/arda/common/lib/infra/storage/CsvS3ObjectDirectService.kt is the only S3 abstraction. It handles:

  • Presigned PUT URL generation with metadata headers (x-amz-meta-tenant-id, x-amz-meta-author).
  • Streaming reads via Flow<RawLine> with batch processing.
  • Compression support (GZIP, BZIP2).

This abstraction is CSV-specific — not a general file/asset service.

operations — CSV Upload Workflow (existing pattern)

The existing CSV upload flow provides the architectural pattern:

  1. CsvUploadService (/operations/src/main/kotlin/cards/arda/operations/common/lib/service/csvUpload/CsvUploadService.kt) orchestrates: generate presigned URL, return job ID, client uploads CSV, server processes rows asynchronously.
  2. JobService / JobTracker (/operations/src/main/kotlin/cards/arda/operations/system/batch/service/JobService.kt) provides async job tracking with status state machine (PENDING, RUNNING, COMPLETED, FAILED).
  3. ItemCsvUploadService maps CSV rows to domain entities including imageUrl validation as URI.
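
The job status state machine mentioned in step 2 can be pictured as a small transition table. This is only an illustrative sketch in TypeScript, not the JobService code: the source names the four states but not the allowed transitions, so the table below (including PENDING to FAILED) is an assumption.

```typescript
// The four states come from the text; the transition table is an assumption.
type JobStatus = "PENDING" | "RUNNING" | "COMPLETED" | "FAILED";

const allowedTransitions: Record<JobStatus, JobStatus[]> = {
  PENDING: ["RUNNING", "FAILED"], // FAILED here models an upload that never arrives
  RUNNING: ["COMPLETED", "FAILED"],
  COMPLETED: [], // terminal
  FAILED: [], // terminal
};

// True when the tracker may move a job from one status to the other.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return allowedTransitions[from].indexOf(to) !== -1;
}

// Returns the next status, or throws if the move is not in the table.
function advance(current: JobStatus, next: JobStatus): JobStatus {
  if (!canTransition(current, next)) {
    throw new Error(`Illegal job transition: ${current} -> ${next}`);
  }
  return next;
}
```

An image upload that reuses this pattern would likely need the same terminal states, so a confirm step can tell "never uploaded" apart from "uploaded and linked".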

The Item entity already has imageUrl: URL? fully implemented at every layer:

| Layer | Type | Location |
| --- | --- | --- |
| Business Entity | URL? | /operations/.../item/business/Item.kt |
| API Input Model | String? | /operations/.../item/api/Model.kt |
| Persistence | url("image_url").nullable() | /operations/.../item/persistence/ItemPersistence.kt |
| CSV Proto | string image_url (URI validated) | /operations/.../item/csv/v1beta1/item_row.proto |

The field is fully wired — it just lacks a mechanism to populate it from an uploaded file.

In scope:

  1. General S3 file access abstraction in common-module: a reusable capability (not CSV-specific) for presigned URL generation, metadata management, and object key structuring. Should support multiple use cases without over-engineering.
  2. S3 object key structure that provides tenant isolation and feature/module namespacing to minimize collisions and enable per-prefix lifecycle policies.
  3. AWS resource creation/configuration — either a new persistent-asset bucket (no TTL expiration) or reconfiguration of the existing bucket with prefix-based lifecycle rules.
  4. CDN integration — CloudFront distribution for serving uploaded assets directly to HTTP clients without backend proxying.
  5. API endpoints for the image upload workflow:
    • Request a presigned upload URL for a product image.
    • Confirm the upload and set Item.imageUrl to the resulting asset URL.
  6. Item module integration — wire the upload workflow into the existing Item create/update flow.

Out of scope:

  • Image processing pipelines (resizing, format conversion, thumbnailing).
  • Bulk image upload (batch processing of multiple images in one operation).
  • UI implementation (frontend upload component — separate project).
  • Migration of existing imageUrl values.

Future Use Cases (inform design, do not implement)

The implementation should be structured to support these future scenarios without requiring architectural changes:

  • User profile images.
  • Order document scans.
  • CSV file uploads for bulk processing (already exists, could be unified).
  • Other static assets referenced by entity fields.

Question: How should access to uploaded assets be secured? Is it possible to restrict content access based on tenant-id?

Considerations:

  • Public URLs via CloudFront: Simple, fast, cacheable. No tenant isolation at the URL level. Object keys would include tenant-id as a path prefix but anyone with the URL could access the content. Acceptable if image content is not sensitive.
  • Presigned GET URLs: Time-limited access, generated by the backend on each request. Provides per-request authorization but defeats CDN caching and requires backend involvement for every image load.
  • CloudFront signed URLs or signed cookies: Tenant-scoped access via CloudFront key pairs. More complex setup but enables CDN caching with access control. Could scope cookies to tenant-specific path prefixes.
  • CloudFront + Origin Access Control (OAC) + Lambda@Edge: Full tenant isolation by validating tenant tokens at the edge. Most secure but most complex.
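
To make the signed-cookie option more concrete, the sketch below builds the CloudFront custom policy document that would scope access to one tenant's path prefix. It assumes a tenant-first key layout, a hypothetical distribution domain (`assets.example.com`), and a tenant-id format limited to letters, digits, and hyphens. Signing the policy with the trusted key group's private key is deliberately left out.

```typescript
// Shape of a CloudFront custom policy with a single statement.
interface CloudFrontCustomPolicy {
  Statement: Array<{
    Resource: string;
    Condition: { DateLessThan: { "AWS:EpochTime": number } };
  }>;
}

// Builds a policy that grants access to /{tenantId}/* until expiresAt.
// distributionDomain is a hypothetical value supplied by the caller.
function buildTenantCookiePolicy(
  distributionDomain: string,
  tenantId: string,
  expiresAt: Date
): CloudFrontCustomPolicy {
  // Assumed tenant-id format; rejects "/" and "*" so the scope cannot be widened.
  if (!/^[A-Za-z0-9-]+$/.test(tenantId)) {
    throw new Error("tenantId must not contain path or wildcard characters");
  }
  return {
    Statement: [
      {
        Resource: `https://${distributionDomain}/${tenantId}/*`,
        Condition: {
          DateLessThan: { "AWS:EpochTime": Math.floor(expiresAt.getTime() / 1000) },
        },
      },
    ],
  };
}

const examplePolicy = buildTenantCookiePolicy(
  "assets.example.com",
  "tenant-123",
  new Date(Date.UTC(2030, 0, 1))
);
const examplePolicyJson = JSON.stringify(examplePolicy);
```

The serialized policy would be base64-encoded and signed by the backend once per session, then delivered as cookies, so individual image requests do not need backend involvement.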

Security risks to evaluate:

  • URL guessability if object keys contain predictable patterns.
  • Cross-tenant data leakage if URLs are shared or logged.
  • Whether product images are considered sensitive data requiring access control.

Use case cross-reference: The HTTPS-only scheme constraint and data: URI rejection are specified in GEN::MEDIA::0001::0004.FS. See Use Cases Analysis (not yet published, key decision: HTTPS-only URL scheme).

Question: How many S3 buckets should exist and how should they be organized?

Options:

  1. One bucket per partition (current state) — all content types share one bucket, differentiated by object key prefix. Lifecycle rules applied per prefix. Simple to manage, but mixes ephemeral and persistent content.
  2. Multiple buckets per partition by lifecycle/purpose — separate buckets for different content lifecycles:
    • ephemeral-upload (current bucket, 1-day TTL for CSV processing).
    • http-assets (persistent, no TTL, CloudFront-fronted, for images and static assets).
    • Future: internal-bulk-storage (longer TTL, no public access, for internal processing).
  3. One bucket per component — each microservice gets its own bucket. Maximum isolation but more infrastructure to manage.
  4. Hybrid — one ephemeral bucket (current) plus one persistent assets bucket per partition, shared across components.

Factors:

  • Each CloudFront cache behavior (path pattern) forwards to a single origin (or origin group), so bucket organization affects CDN routing.
  • IAM policies and presigning roles are per-bucket.
  • The existing UploadBucket construct is parameterized and reusable — adding a second bucket to BulkStoresStack is straightforward.
  • Lifecycle rules can be prefix-based within a single bucket, but separate buckets provide cleaner operational boundaries.
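
For comparison with the multi-bucket options, the following sketch shows what prefix-based lifecycle rules in a single bucket (option 1) might look like, using the field names of the S3 lifecycle configuration. The prefixes `ephemeral-upload/` and `http-assets/` are hypothetical, borrowed from the bucket names in option 2; the 1-day values mirror the current bucket's TTL.

```typescript
// Subset of the S3 lifecycle rule shape used in this sketch.
interface LifecycleRule {
  ID: string;
  Status: "Enabled" | "Disabled";
  Filter: { Prefix: string };
  Expiration?: { Days: number };
  AbortIncompleteMultipartUpload?: { DaysAfterInitiation: number };
}

const singleBucketRules: LifecycleRule[] = [
  // Ephemeral CSV uploads keep the current 1-day TTL.
  { ID: "expire-ephemeral-uploads", Status: "Enabled", Filter: { Prefix: "ephemeral-upload/" }, Expiration: { Days: 1 } },
  // Applies bucket-wide (empty prefix): clean up abandoned multipart uploads.
  { ID: "abort-stale-multipart", Status: "Enabled", Filter: { Prefix: "" }, AbortIncompleteMultipartUpload: { DaysAfterInitiation: 1 } },
  // No expiration rule matches "http-assets/", so those objects persist.
];

// Returns the expiration (in days) that would apply to a key, or undefined
// if no enabled expiration rule matches. When several rules match, the
// shortest one is returned, mirroring how S3 resolves overlapping expirations.
function expirationDaysFor(key: string, rules: LifecycleRule[]): number | undefined {
  let days: number | undefined = undefined;
  for (const rule of rules) {
    if (rule.Status !== "Enabled" || rule.Expiration === undefined) continue;
    if (key.indexOf(rule.Filter.Prefix) !== 0) continue;
    if (days === undefined || rule.Expiration.Days < days) days = rule.Expiration.Days;
  }
  return days;
}
```

The risk this layout carries is visible in the helper: a persistent object written under the wrong prefix silently inherits the 1-day expiration, which is part of the case for separate buckets.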

Question: What hierarchy should S3 object keys use?

Candidates:

  • {tenant-id}/{feature}/{entity-id}/{filename} — tenant-first for IAM policy scoping and prefix-based access control.
  • {feature}/{tenant-id}/{uuid}.{ext} — feature-first for CloudFront path pattern routing and lifecycle rules.
  • {feature}/{tenant-id}/{entity-id}/{uuid}.{ext} — hybrid with entity context for debugging/audit.

Constraints:

  • Must support prefix-based IAM policies for tenant isolation (if required).
  • Must support CloudFront path pattern routing.
  • Must minimize collision risk (UUID component required).
  • Should be predictable enough for the backend to construct without a lookup table.
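
A minimal sketch of a key builder that satisfies these constraints is shown below. It supports the tenant-first and feature-first orderings with a UUID file name, validates each segment so a key cannot escape its prefix, and needs no lookup table. The function name, the segment pattern, and the feature name used in the test values are hypothetical; the extension list follows the image types named for the first use case.

```typescript
type KeyLayout = "tenant-first" | "feature-first";

interface AssetKeyParts {
  tenantId: string;
  feature: string; // e.g. "item-images" (hypothetical feature name)
  uuid: string; // generated by the caller, one per upload
  extension: "png" | "svg" | "jpeg";
}

// Assumed safe-segment rule: letters, digits, hyphens; no "/" or "..".
const SAFE_SEGMENT = /^[A-Za-z0-9][A-Za-z0-9-]*$/;

// Builds an object key deterministically from its parts.
function buildAssetKey(parts: AssetKeyParts, layout: KeyLayout): string {
  for (const segment of [parts.tenantId, parts.feature, parts.uuid]) {
    if (!SAFE_SEGMENT.test(segment)) {
      throw new Error(`Unsafe key segment: ${segment}`);
    }
  }
  const prefix =
    layout === "tenant-first"
      ? `${parts.tenantId}/${parts.feature}`
      : `${parts.feature}/${parts.tenantId}`;
  return `${prefix}/${parts.uuid}.${parts.extension}`;
}
```

Because the UUID is the only varying part under a given tenant and feature, replacing an image produces a new key rather than overwriting the old one.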

Question: What should the new general-purpose S3 abstraction look like?

Considerations:

  • Should generalize presigned URL generation (PUT and GET) beyond CSV files.
  • Should encapsulate the object key structure convention.
  • Should handle metadata (tenant-id, author, content-type, feature context).
  • Should be usable from any module in any component.
  • Should not over-abstract — start with what the image upload use case needs.
  • Relationship to existing CsvS3BucketDirectAccess: complement it, do not replace it (CSV-specific streaming logic remains valuable).

Question: What is the upload-then-link workflow?

Candidates:

  • Two-step: (1) POST to get presigned URL, (2) client uploads to S3, (3) PUT to Item to set imageUrl. Simple, but the Item update is a separate call and the image may be orphaned if step 3 never happens.
  • Upload-and-link: (1) POST to get presigned URL with Item context, (2) client uploads to S3, (3) POST to confirm upload, which validates the S3 object exists and atomically sets Item.imageUrl. Prevents orphaned images.
  • S3 event-driven: Upload triggers S3 event notification, Lambda or SQS consumer validates and links. Most decoupled but most infrastructure.
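
The upload-and-link candidate can be simulated in memory to show where the orphan protection comes from. In this sketch the `uploadedKeys` array stands in for an S3 existence check, `assetBaseUrl` is a hypothetical CDN base URL, and all function and type names are illustrative rather than taken from the codebase. Presigned URL generation and real atomicity are out of scope here.

```typescript
interface ItemRecord { id: string; imageUrl: string | null; }
interface UploadTicket { uploadId: string; itemId: string; objectKey: string; }

interface WorkflowState {
  items: Record<string, ItemRecord>;
  tickets: Record<string, UploadTicket>;
  uploadedKeys: string[]; // stands in for objects present in S3
  assetBaseUrl: string; // hypothetical CDN base URL
}

// Step 1: register the intent to upload an image for a specific Item.
// A real service would also return the presigned upload URL or form.
function requestUpload(state: WorkflowState, uploadId: string, itemId: string, objectKey: string): UploadTicket {
  if (state.items[itemId] === undefined) throw new Error(`Unknown item: ${itemId}`);
  const ticket: UploadTicket = { uploadId, itemId, objectKey };
  state.tickets[uploadId] = ticket;
  return ticket;
}

// Step 3: confirm. The Item is only updated if the object actually exists.
function confirmUpload(state: WorkflowState, uploadId: string): ItemRecord {
  const ticket = state.tickets[uploadId];
  if (ticket === undefined) throw new Error(`Unknown upload: ${uploadId}`);
  if (state.uploadedKeys.indexOf(ticket.objectKey) === -1) {
    throw new Error(`Object not found: ${ticket.objectKey}`);
  }
  const item = state.items[ticket.itemId];
  item.imageUrl = `${state.assetBaseUrl}/${ticket.objectKey}`;
  delete state.tickets[uploadId]; // a ticket can be confirmed only once
  return item;
}
```

Step 2 (the client upload) happens outside the backend, which is why the confirm call has to verify the object rather than trust the client.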

Use case cross-reference: The presigned upload workflow is the internal implementation for the managed upload path in GEN::MEDIA::0001::0006.FS (Confirm and Persist). The user-facing input detection is defined in GEN::MEDIA::0001::0002.FS. See Use Cases Analysis (not yet published, key decision: unified input surface).

Question: How should CloudFront be configured for serving assets?

Considerations:

  • Origin Access Control (OAC) vs. Origin Access Identity (OAI) — OAC is the modern recommended approach.
  • Cache behavior routing by path prefix (e.g., /assets/* routes to the assets bucket).
  • Cache invalidation strategy when an image is replaced.
  • Custom domain and SSL certificate requirements.
  • Whether the CDN is created in the infrastructure CDK or managed separately.
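
The path-prefix routing consideration can be illustrated with a small matcher that mimics how CloudFront selects a cache behavior: patterns are tried in their listed order, `*` matches zero or more characters, `?` matches exactly one, and the default behavior applies when nothing matches. The origin identifiers are hypothetical, and this is a model of the routing rule for reasoning purposes, not CDK code.

```typescript
interface CacheBehavior { pathPattern: string; originId: string; }

// Converts a CloudFront-style path pattern into an anchored regular expression.
function patternToRegExp(pattern: string): RegExp {
  // The leading slash is optional in path patterns.
  const withSlash = pattern.charAt(0) === "/" ? pattern : "/" + pattern;
  // Escape regex metacharacters other than the two wildcards.
  const escaped = withSlash.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Returns the origin for a request path: first matching behavior wins,
// otherwise the default behavior's origin is used.
function resolveOrigin(path: string, behaviors: CacheBehavior[], defaultOriginId: string): string {
  for (const behavior of behaviors) {
    if (patternToRegExp(behavior.pathPattern).test(path)) return behavior.originId;
  }
  return defaultOriginId;
}
```

With a single `/assets/*` behavior pointing at the assets bucket, every other path would keep flowing to whatever origin the default behavior names.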

| Repository | Role | Changes Expected |
| --- | --- | --- |
| common-module | General S3 file access abstraction | New classes in lib/infra/storage/ |
| operations | Item module integration, API endpoints | New upload routes, module wiring |
| infrastructure | S3 bucket creation, CloudFront, IAM | New/updated CDK constructs and stacks |
| api-test | API verification | New Bruno test collections |

  • Existing upload construct: /infrastructure/src/main/cdk/constructs/storage/public-upload-bucket.ts
  • Bulk stores stack: /infrastructure/src/main/cdk/stacks/purpose/partition-bulk-stores.ts
  • CSV upload service: /operations/src/main/kotlin/cards/arda/operations/common/lib/service/csvUpload/CsvUploadService.kt
  • S3 access abstraction: /common-module/lib/src/main/kotlin/cards/arda/common/lib/infra/storage/CsvS3ObjectDirectService.kt
  • Item entity: /operations/src/main/kotlin/cards/arda/operations/reference/item/business/Item.kt
  • Item endpoint: /operations/src/main/kotlin/cards/arda/operations/reference/item/api/rest/ItemEndpoint.kt
  • Pre-install CloudFormation: /operations/src/main/cloudformation/pre-install.cfn.yml
  • Download items spec: /technical-documentation/contents/1_specifications/demo202509/use-cases/download-items.md
  • Module design docs: /technical-documentation/contents/2_design/2_functional/general/module-design/index.md

S3 Bucket Architecture at the system level.

The system is expected to need bulk storage for different purposes.

  • Objects stored will always be referenced by business entities in the system, and their lifecycle and identity will be tied to the business entities that reference or use them.
  • In all cases, the relationship between business entities and the objects they reference will be one-to-many in terms of referential integrity. When additional business entities need access to a bulk object, they will access it (referentially) through the business entity that owns it, regardless of the actual access path to the contents. For example, to retrieve an item’s image, the client entity will request the image from the item entity and will not denormalize the reference except in rare cases.
  • In general, bulk objects will be considered immutable, as changes can be handled by updating references in the business entities that point to them.
  • Bulk storage will under no circumstances be used by different Modules to communicate or exchange data as shared global state.

The different characteristics of the files to be stored:

  • Http Accessible/Internal Use only:

    • Http Accessible assets need to be served over http as-is so that clients can display them or use them in other ways. A typical example is an item’s image, a user picture, a company logo, etc.
      • Http Accessible assets are expected to be served over a CDN and may be large. They can be considered immutable, as changes can be handled by updating references in the business entities that point to them.
      • Access to these assets needs to be partitioned by tenant. The security guarantees and the design to accomplish this are to be defined. The design needs to balance security against leveraging AWS native capabilities (including CDN) without requiring involvement of the backend micro-services.
    • Internal Use Assets are those that will be accessed only by internal backend services that are trusted and have appropriate AWS IAM permissions.
  • Internally Sourced/Externally Uploaded: Http Assets can be uploaded by external clients (users through the UI) or could be generated by internal processes in the system and directly placed in the S3 bucket using AWS SDKs from backend services that have the appropriate AWS IAM permissions.

  • Durable/Ephemeral: Some assets will be long-lived (durable), with a lifecycle explicitly tied to the business entities that reference them. Others will be ephemeral, possibly “single use” (subject to retries), like an uploaded CSV file that is no longer needed once it is processed and can be purged, or a file provided to a user for download that has an expiration date or a “single-download” policy.

Key Questions:

  • Access control for Http Accessible Assets

    • Integrate with API Gateway & Cognito via Lambda Functions?
    • Separate access control?
    • How does it impact CloudFront integration?
    • How does it impact Signed URLs and upload/download performance?
    • Is it possible to partition and secure based on tenant keys?
  • S3 Operational Configuration

    • How many buckets to configure?
    • By Partition, by Component or by Module?
    • How many based on usage characteristics?

Items to research and resolve before moving to Phase 2 (Design with Alternatives). Mark items [x] as they are addressed and summarize findings inline or in linked sections above.

  • CloudFront signed URLs vs. signed cookies — Researched. Signed cookies can be scoped to /{tenant-id}/* via custom policy. Sub-ms edge verification. Not part of cache key, so full CDN caching preserved. Requires RSA/ECDSA trusted key groups. See DQ-002 in decision-log.md.
  • Origin Access Control (OAC) — Researched. OAC is the modern replacement for OAI. Uses SigV4 to sign requests to S3. Bucket policy grants s3:GetObject to cloudfront.amazonaws.com with SourceArn condition. OAC applies to the entire origin (no per-path scoping). See DQ-001, DQ-006.
  • Tenant isolation at the URL level — Options documented in DQ-002. Awaiting decision on whether product images are sensitive enough to require per-tenant access control. (Informs DQ-002)
  • Lambda@Edge / CloudFront Functions — Researched. CloudFront Functions cannot verify RS256/ES256 JWTs (only HMAC). Lambda@Edge can but adds latency and us-east-1 deployment constraint. See DQ-002 Option C.
  • CloudFront pricing model — Researched. For 10K images x 500KB x 50 serves/month (US/EU): ~$0.50 requests + ~$20.23 data transfer = ~$20.73/month. However, CloudFront Always Free tier (1 TB out + 10M requests/month) likely covers this workload entirely ($0.00). Cache invalidation is irrelevant with immutable keys (DQ-008). See decision-log.md DQ-006.
  • S3 storage cost projection — Researched. For 100 tenants x 1,000 images x 500KB = ~47.7 GB: ~$1.10/month storage + $0.52 requests = ~$1.62/month. Orphan waste (10%) adds ~$0.11/month. Negligible cost. See DQ-001.
  • Multi-bucket vs. prefix-based lifecycle — Researched. S3 supports up to 1,000 prefix-scoped lifecycle rules per bucket. Different prefixes can have different expiration/transition rules. Separate buckets provide cleaner IAM and OAC boundaries. See DQ-001.
  • Existing CloudFront constructs — Found. ApiCloudFront construct exists at /infrastructure/src/main/cdk/constructs/xgress/api-cloudfront.ts but is API-specific (no caching, all methods, HTTP origin). No S3-origin CloudFront construct exists — net-new for assets. See DQ-006.
  • Orphaned object cleanup — Researched. Simplest approach: staging prefix + lifecycle rule. Uploads land in staging/{tenant}/{uuid}, backend copies to images/{tenant}/... on confirm, lifecycle rule expires staging/ objects after 7 days. Zero Lambda/SQS infrastructure. Also add a lifecycle rule to abort incomplete multipart uploads after 1 day. See DQ-005, DQ-008.
  • Upload failure handling — Researched. For single-part PUT/POST: if client disconnects mid-upload, S3 does not create the object (requires full body match to Content-Length). For multipart uploads: incomplete parts persist as invisible fragments that incur storage charges — lifecycle rule to abort after 1 day handles this. Presigned URL expiration: client gets 403, must request a new URL. See DQ-005.
  • Monitoring and alerting — Deferred to design phase. S3 metrics, CloudFront cache hit ratios, upload error rates. Existing ApigwDashboard construct provides a pattern for CloudWatch dashboards. (NFR — defer to design)
  • Storage growth and retention — Deferred to design phase. Per-tenant quotas and retention policy for replaced images should be specified in the design document. At projected volumes (~50 GB total), not a blocking concern. (NFR — defer to design)
  • File type and size enforcement: Researched. Critical finding: presigned PUT URLs cannot enforce a size range or a Content-Type prefix server-side; at most they pin the exact header values that were included in the signature. Presigned POST (form-based upload) can enforce both via policy conditions: content-length-range (e.g., 1 byte to 10 MB) and Content-Type starts-with (e.g., image/). S3 validates these server-side. This means DQ-005 should use presigned POST, not presigned PUT. Post-upload Lambda validation (magic byte inspection) provides defense-in-depth against spoofed MIME types.
  • Malware scanning — Researched. GuardDuty Malware Protection for S3: ~$25.79/month for 100K images (post Feb 2025 price reduction). Likely disproportionate for MVP with authenticated B2B users uploading product images. Presigned POST constraints + post-upload magic byte validation covers the realistic threat model. Defer GuardDuty to a later phase; staging prefix pattern supports drop-in enablement later. (NFR — defer)
  • LocalStack/MockAWS compatibility — Researched. MockAWS in common-module uses LocalStack with S3 only (LocalStackContainer.Service.S3). Supports: bucket creation, PutObject, GetObject, HeadObject, presigned URLs with SigV4 signing and custom metadata. CloudFront is not mocked by LocalStack — signed cookie/URL testing requires WireMock or real AWS integration tests. The S3 presigning abstraction is fully testable; CloudFront distribution logic is CDK-level (synthesized, not runtime-tested). See DQ-007.
  • Presigned URL testing — Researched. CsvS3DirectAccessTest in common-module tests presigned PUT URL generation including SigV4 signature validation, custom metadata headers (x-amz-meta-tenant-id, x-amz-meta-author), and URL structure assertions. The S3Presigner pattern is reusable for the new abstraction. Note: presigned POST (recommended for content validation) uses a different signing mechanism than presigned PUT — the test harness will need extension. See DQ-007.
  • Two-phase upload pattern in the industry — Denis’s FileStore design follows the standard pattern (presigned URL -> direct upload -> persist key in entity). Same pattern used by GitHub (release assets), Slack (file uploads), Stripe (file uploads). See DQ-005.
  • Image replacement semantics — Addressed by DQ-008. Recommendation: immutable objects with new UUID keys. Old objects orphaned, cleaned up by lifecycle or periodic scan. Entity bitemporal history tracks which image was active when.
  • Bulk image upload (future) — Assessed. The recommended key structure ({tenant-id}/{feature}/{uuid}.{ext}) and presigned POST workflow accommodate batch uploads without redesign: the client requests N presigned POST forms in parallel, uploads N files to S3, then updates N entities. The abstraction in common-module would expose a batch variant that returns multiple presigned forms. The staging prefix + lifecycle cleanup pattern handles partial batch failures (some images uploaded, some not) naturally. No architectural changes needed. (Informs DQ-004)
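
The presigned POST finding above can be made concrete with a sketch of the policy document S3 evaluates server-side. The bucket name, the key prefix, and the function name are hypothetical; the 10 MB ceiling and the `image/` prefix are the example values from the research notes. A real presigned POST also carries signing fields (algorithm, credential, date, signature) that an AWS SDK helper would add; they are omitted here.

```typescript
// Conditions used in this sketch; S3 supports more than these three forms.
type PostCondition =
  | { bucket: string }
  | ["starts-with", string, string]
  | ["content-length-range", number, number];

interface PostPolicy {
  expiration: string; // ISO 8601 timestamp
  conditions: PostCondition[];
}

// Builds a POST policy that restricts where, what, and how much can be uploaded.
function buildImagePostPolicy(bucket: string, keyPrefix: string, expiresAt: Date, maxBytes: number): PostPolicy {
  return {
    expiration: expiresAt.toISOString(),
    conditions: [
      { bucket },
      ["starts-with", "$key", keyPrefix], // upload must land under this prefix
      ["starts-with", "$Content-Type", "image/"], // declared MIME type only
      ["content-length-range", 1, maxBytes], // size is checked by S3 itself
    ],
  };
}

const stagingPolicy = buildImagePostPolicy(
  "example-assets-bucket", // hypothetical bucket name
  "staging/tenant-123/",
  new Date(Date.UTC(2030, 0, 1)),
  10 * 1024 * 1024
);
```

The Content-Type condition only constrains what the client declares, which is why the notes pair it with post-upload magic byte inspection.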

Copyright: (c) Arda Systems 2025-2026, All rights reserved