
Static Asset Repository

This document describes the FileStore, a static asset repository used by Arda Cloud to store and serve customers’ assets such as images, videos, and documents.

The FileStore provides per-tenant storage and access control, ensuring that each customer’s assets are securely stored and accessible only to authorized users. It is designed to be scalable, reliable, and performant, leveraging cloud storage technologies to meet the needs of Arda Cloud’s customers. Its scope is limited to securely reading and writing files.

The FileStore consists of the following components:

  • Access Control: The FileStore implements access control mechanisms to ensure that only authorized users can access the stored assets. This includes authentication and authorization processes to verify user identities and permissions.
  • API Layer: The FileStore exposes a set of APIs that allow customers to interact with their stored assets. These APIs support operations such as uploading, downloading, and deleting files.
  • Storage Backend: The FileStore uses a cloud storage service (e.g., AWS S3, Azure Blob Storage, Google Cloud Storage) as the underlying storage backend to store customers’ assets. Each tenant has its own isolated storage space within the backend.

Our API Gateway, together with Cognito for authentication and authorization, serves as the access control layer for the FileStore. It ensures that only authenticated and authorized users can access the stored assets: it relies on the JWT in each request to identify the tenant and enforce access control policies, and it redirects unauthenticated requests to the authentication service.
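As a minimal sketch of the identification step, assume the gateway has already validated the JWT and passes its claims to the FileStore as a plain record. The claim name `custom:tenant_id` is a hypothetical placeholder, since this document does not say which claim carries the tenant; `sub` is the standard OIDC subject claim. The names `identifyCaller` and `CallerIdentity` are illustrative only.

```typescript
// Claims as a flat record; assumed to be verified upstream by the gateway.
type JwtClaims = Record<string, unknown>;

interface CallerIdentity {
  tenantId: string;
  userId: string;
}

// Hypothetical claim name: the document does not name the tenant claim.
const TENANT_CLAIM = "custom:tenant_id";

// Read a claim and insist that it is a non-empty string.
function readStringClaim(claims: JwtClaims, claimName: string): string {
  const value = claims[claimName];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`missing or empty claim: ${claimName}`);
  }
  return value;
}

// Derive the tenant and user identity from already-validated claims.
function identifyCaller(claims: JwtClaims): CallerIdentity {
  return {
    tenantId: readStringClaim(claims, TENANT_CLAIM),
    userId: readStringClaim(claims, "sub"),
  };
}
```

Failing fast on a missing claim keeps a request without a tenant from ever reaching the storage layer.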

The API layer is a simple AWS Lambda function that combines the request URL with the tenant ID to form a partitioned namespace in the storage backend. It handles upload and download requests, routing each to the correct storage location within the tenant’s namespace. It redirects read requests to a signed URL and answers write requests with a signed URL, so that clients transfer data directly to and from the storage backend, which improves performance and reduces latency.
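The namespace partitioning could look roughly like the sketch below. The `tenants/<tenantId>/<key>` layout, the allowed tenant ID characters, and the function name `toObjectKey` are assumptions made for illustration; the document does not specify the actual key scheme.

```typescript
// Hypothetical layout: tenants/<tenantId>/<key>.
function toObjectKey(tenantId: string, requestKey: string): string {
  // Assumed tenant ID alphabet; anything else is rejected.
  if (!/^[A-Za-z0-9_-]+$/.test(tenantId)) {
    throw new Error("invalid tenant id");
  }
  // Drop leading slashes, then split the key into path segments.
  const segments = requestKey.replace(/^\/+/, "").split("/");
  // S3 treats keys as opaque strings, but rejecting empty, "." and ".."
  // segments keeps keys unambiguous for tools that read them as paths.
  if (segments.some((segment) => segment === "" || segment === "." || segment === "..")) {
    throw new Error("invalid asset key");
  }
  return `tenants/${tenantId}/${segments.join("/")}`;
}
```

Because the tenant ID comes from the validated token rather than from the URL, a caller cannot address another tenant’s prefix by crafting the request key.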

The storage backend consists of one or more S3 buckets that are accessible only to the Lambda. The number of buckets and their configuration are entirely transparent to the users of the FileStore, who interact with it solely through the API layer.

The interaction diagrams below introduce the generic “Arda Component” as a stand-in for specific components that keep track of document kinds, such as a user headshot for account or a product image for operations.

The diagrams focus on the interactions between the SPA, the API Gateway, the FileStore Lambda, and the S3 storage backend. Other components of Arda Cloud use the same FileStore API to manage their assets but, as trusted components of the system, they might bypass the API Gateway and interact directly with the FileStore Lambda. Opening up this path would be easier if the FileStore Lambda were instead deployed to the cluster as a regular component, which might be considered in the future.

PlantUML diagram

PlantUML diagram

The FileStore API provides the following endpoints:

The upload call requires a JWT for authentication and authorization. The payload specifies the key and the contentType of the asset to be uploaded, and optionally its contentLength and a checksum for integrity verification.

Returns:

  • a 200 with a presigned upload URL for the asset in the storage backend.
  • a 302 redirect to the login page if the user is not authenticated.
  • a 403 if the user is not authorized to access the asset.
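The upload payload described above can be checked with a small validator before any URL is signed. This is a sketch under assumptions: the field names `key`, `contentType`, `contentLength`, and `checksum` come from this document, while the concrete types (a non-negative integer length, a string checksum) and the name `parseUploadRequest` are illustrative guesses.

```typescript
interface UploadRequest {
  key: string;
  contentType: string;
  contentLength?: number; // optional, assumed to be a byte count
  checksum?: string;      // optional, encoding and algorithm not specified here
}

// Validate an already-parsed JSON body and return a typed request.
function parseUploadRequest(body: unknown): UploadRequest {
  if (typeof body !== "object" || body === null) {
    throw new Error("payload must be a JSON object");
  }
  const { key, contentType, contentLength, checksum } = body as Record<string, unknown>;
  if (typeof key !== "string" || key === "") {
    throw new Error("key is required");
  }
  if (typeof contentType !== "string" || contentType === "") {
    throw new Error("contentType is required");
  }
  const request: UploadRequest = { key, contentType };
  if (contentLength !== undefined) {
    if (typeof contentLength !== "number" || !Number.isInteger(contentLength) || contentLength < 0) {
      throw new Error("contentLength must be a non-negative integer");
    }
    request.contentLength = contentLength;
  }
  if (checksum !== undefined) {
    if (typeof checksum !== "string" || checksum === "") {
      throw new Error("checksum must be a non-empty string");
    }
    request.checksum = checksum;
  }
  return request;
}
```

How a validation failure maps to an HTTP status is not specified in this document, so the sketch only throws and leaves the response mapping to the caller.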

The read call requires a JWT for authentication and authorization, and the key query parameter to specify the asset to be retrieved.

As per the S3 API, this call supports two optional query parameters, versionId and versions:

  • The optional versionId query parameter can be used to specify a particular version of the asset to retrieve. If not specified, the latest version of the asset will be retrieved.
  • The optional versions query parameter can be used to retrieve a list of all versions of the asset instead of a specific version. The format of the response payload is still to be defined (TBD).

If both versionId and versions are specified, the request will be rejected with a 400 Bad Request error.
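The rule for these two parameters can be sketched as a small parser that returns either a typed query or a 400 result. The type and function names are illustrative, and treating a missing `key` as a 400 is an assumption: this document only specifies the 400 for the `versionId`/`versions` conflict.

```typescript
type ReadQuery =
  | { kind: "latest"; key: string }
  | { kind: "version"; key: string; versionId: string }
  | { kind: "listVersions"; key: string };

type ReadQueryResult =
  | { ok: true; query: ReadQuery }
  | { ok: false; status: 400; message: string };

function parseReadQuery(params: Record<string, string | undefined>): ReadQueryResult {
  const key = params["key"];
  if (key === undefined || key === "") {
    // Assumption: the document does not state the status for a missing key.
    return { ok: false, status: 400, message: "key is required" };
  }
  const versionId = params["versionId"];
  // `versions` counts as present even without a value, as in the S3 API.
  const wantsVersionList = params["versions"] !== undefined;
  if (versionId !== undefined && wantsVersionList) {
    return { ok: false, status: 400, message: "versionId and versions are mutually exclusive" };
  }
  if (wantsVersionList) {
    return { ok: true, query: { kind: "listVersions", key } };
  }
  if (versionId !== undefined) {
    return { ok: true, query: { kind: "version", key, versionId } };
  }
  return { ok: true, query: { kind: "latest", key } };
}
```

Modeling the three cases as a discriminated union forces the handler to deal with each one explicitly.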

Returns:

  • a 302 redirect to a presigned URL for the asset in the storage backend.
  • a 302 redirect to the login page if the user is not authenticated.
  • a 403 if the user is not authorized to access the asset or if the asset does not exist.

The FileStore initially uses a single S3 bucket to store all tenants’ assets, with a partitioned namespace based on tenant IDs. This approach simplifies management and reduces costs while providing sufficient isolation between tenants. In the future, the FileStore might introduce tenant-specific buckets if required, or shard the storage across multiple buckets. All of this remains transparent to the users of the FileStore, who interact with it solely through the API layer.

The FileStore ensures the key follows best naming practices for S3 objects, including using a consistent naming convention that incorporates tenant IDs and asset types to facilitate organization and retrieval.

Objects in the bucket are versioned to protect against accidental overwrites and deletions, and to allow for recovery of previous versions if needed.

The FileStore maintains a custom object metadata entry, ARDA-USER, to track the user who created or last modified the asset. ARDA-USER is set to the JWT’s OIDC sub claim.
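A minimal sketch of how this metadata entry might be assembled follows. S3 stores user-defined metadata keys in lowercase and transmits them as `x-amz-meta-*` headers, so the key appears as `arda-user` here. The helper names are hypothetical, and how the metadata is attached during a presigned upload is not specified in this document.

```typescript
// S3 lowercases user-defined metadata keys, so ARDA-USER is stored as "arda-user".
const ARDA_USER_METADATA_KEY = "arda-user";

// Build the user-defined metadata for an object from the JWT's sub claim.
function buildObjectMetadata(oidcSub: string): Record<string, string> {
  if (oidcSub.trim() === "") {
    throw new Error("sub claim must not be empty");
  }
  return { [ARDA_USER_METADATA_KEY]: oidcSub };
}

// Convert metadata entries to the HTTP header form used by the S3 REST API.
function toMetadataHeaders(metadata: Record<string, string>): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const entryName of Object.keys(metadata)) {
    headers[`x-amz-meta-${entryName.toLowerCase()}`] = metadata[entryName] as string;
  }
  return headers;
}
```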

[!important] Security is a top priority for the FileStore, and the following section is to be confirmed. Consider it a draft.

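  • Create an S3 bucket:

    import * as cdk from 'aws-cdk-lib';
    import * as s3 from 'aws-cdk-lib/aws-s3';

    // Inside a Stack or Construct; `props.allowedOrigins` is an optional list of frontend origins.
    new s3.Bucket(this, 'SiteBucket', {
      // Never expose objects publicly, and disable ACLs so the bucket owner owns every object.
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      objectOwnership: s3.ObjectOwnership.BUCKET_OWNER_ENFORCED,
      // Require TLS in transit and SSE-S3 encryption at rest.
      enforceSSL: true,
      encryption: s3.BucketEncryption.S3_MANAGED,
      // Keep previous object versions, and keep the bucket if the stack is deleted.
      versioned: true,
      removalPolicy: cdk.RemovalPolicy.RETAIN,
      autoDeleteObjects: false,
      // Allow browser uploads and downloads through presigned URLs.
      cors: [{
        allowedMethods: [s3.HttpMethods.PUT, s3.HttpMethods.GET, s3.HttpMethods.HEAD],
        allowedOrigins: props.allowedOrigins ?? ['https://app.example.com'],
        allowedHeaders: ['*'],
        exposedHeaders: ['ETag'],
        maxAge: 300, // seconds a browser may cache the preflight response
      }],
    });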
  • The CORS rule allows PUT requests from frontend apps that use presigned URLs.
  • Keep Block Public Access on and do not use ACLs.
  • Create a CloudFront Origin Access Control (SigV4) for the bucket.
  • Add a bucket policy allowing only this OAC’s distribution to perform s3:GetObject.
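The bucket policy from the last step could be built as a plain JSON document like the sketch below, which follows the policy shape AWS documents for CloudFront Origin Access Control. The function name and its three inputs are illustrative, and since this section is a draft the statement should be read as a starting point rather than the final policy.

```typescript
interface OacPolicyInput {
  bucketName: string;
  accountId: string;      // AWS account that owns the distribution
  distributionId: string; // CloudFront distribution fronted by the OAC
}

// Build a bucket policy that lets only one CloudFront distribution read objects.
function buildOacReadPolicy(input: OacPolicyInput) {
  const distributionArn =
    `arn:aws:cloudfront::${input.accountId}:distribution/${input.distributionId}`;
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "AllowCloudFrontServicePrincipalReadOnly",
        Effect: "Allow",
        Principal: { Service: "cloudfront.amazonaws.com" },
        Action: "s3:GetObject",
        Resource: `arn:aws:s3:::${input.bucketName}/*`,
        // Restrict the service principal to this specific distribution.
        Condition: { StringEquals: { "AWS:SourceArn": distributionArn } },
      },
    ],
  };
}
```

The `AWS:SourceArn` condition is what narrows the grant from the CloudFront service as a whole to a single distribution.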

Objects are stored with server-side encryption using S3-managed keys (SSE-S3) to ensure data security at rest.

FileStore is the only component in the system with direct access to the S3 bucket. The shape of its IAM policy is TBD, but it will be designed to follow the principle of least privilege, allowing only the necessary permissions for FileStore to perform its operations.