Inspecting Cluster Logs
Arda services run on EKS clusters. Application logs are available from two sources:
- Kubernetes pod logs: live logs from running pods via kubectl logs. These rotate when pods restart and have limited retention.
- AWS CloudWatch Logs: persisted logs collected by fluent-bit. These survive pod restarts and are retained according to the log group policy.
When pod logs have rotated off, CloudWatch is the authoritative source.
Clusters and AWS Profiles
| Cluster | AWS profile | Log group | Region |
|---|---|---|---|
| Alpha001 (Production) | Admin-Alpha1 | /Alpha001/eks-logs | us-east-2 |
| Alpha002 (Dev/Stage) | Admin-Alpha2 | /Alpha002/eks-logs | us-east-2 |
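For scripted investigations, the table above could be captured as a small lookup so that the profile and log group always travel together. This is only a sketch: the Cluster class, the CLUSTERS dictionary, and the aws_log_args helper are hypothetical names, not part of any Arda tooling; the values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    aws_profile: str
    log_group: str
    region: str


# Values copied from the cluster table; keyed by cluster name.
CLUSTERS = {
    c.name: c
    for c in (
        Cluster("Alpha001", "Admin-Alpha1", "/Alpha001/eks-logs", "us-east-2"),
        Cluster("Alpha002", "Admin-Alpha2", "/Alpha002/eks-logs", "us-east-2"),
    )
}


def aws_log_args(cluster_name: str) -> list[str]:
    """Return the --profile and --log-group-name arguments for a cluster."""
    cluster = CLUSTERS[cluster_name]  # raises KeyError for an unknown cluster
    return ["--profile", cluster.aws_profile, "--log-group-name", cluster.log_group]
```

The returned list can be appended to an aws logs subcommand when building an argument list for subprocess.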
Prerequisites
1. AWS SSO login
Authenticate with the AWS profile for your target cluster:

    aws sso login --profile <aws-profile>

2. Set kubectl context

    export AWS_PROFILE=<aws-profile>
    kubectl config use-context <cluster-context>

Verify connectivity:

    kubectl get namespaces

Kubernetes Namespaces
Services are deployed in namespaces following the pattern <env>-<component>. Common namespaces:
| Component | Prod namespace | Dev namespace |
|---|---|---|
| Operations (item, kanban, orders) | prod-operations | dev-operations |
| Item Data Authority | prod-item-data-authority | dev-item-data-authority |
| Accounts | prod-accounts | dev-accounts |
| Ingress | prod-ingress-nginx | dev-ingress-nginx |
| Bastion (DB access) | prod-bastion | dev-bastion |
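The <env>-<component> pattern can be split and rebuilt mechanically, which is handy when a script needs to switch between the prod and dev variants of the same component. The helpers below are a minimal sketch with hypothetical names; they assume the environment part never contains a hyphen, which holds for the prod and dev namespaces in the table.

```python
def split_namespace(namespace: str) -> tuple[str, str]:
    """Split '<env>-<component>' at the first hyphen."""
    env, sep, component = namespace.partition("-")
    if not sep or not env or not component:
        raise ValueError(f"not an <env>-<component> namespace: {namespace!r}")
    return env, component


def build_namespace(env: str, component: str) -> str:
    """Join an environment and a component into a namespace name."""
    return f"{env}-{component}"


# Switch a prod namespace to its dev counterpart.
_, component = split_namespace("prod-item-data-authority")
dev_namespace = build_namespace("dev", component)
```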
Method 1: Kubernetes Pod Logs
Pod logs are the fastest way to inspect a running service. They are not persisted across pod restarts.
List pods in a namespace

    kubectl get pods -n <namespace>

Tail live logs

    kubectl logs -n <namespace> <pod-name> --tail=200 -f

Search recent logs for a pattern

    kubectl logs -n <namespace> <pod-name> --tail=5000 | grep "<pattern>"

Search across all pods in a namespace

    for pod in $(kubectl get pods -n <namespace> -o name); do
      echo "=== $pod ==="
      kubectl logs -n <namespace> "$pod" --tail=5000 2>/dev/null | grep "<pattern>"
    done

Limitations
- Pod logs are lost when a pod restarts or is replaced.
- Log buffer size varies; older entries may have rotated off.
- For historical logs, use CloudWatch (Method 2 below).
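When the matches need further processing, the search across all pods in a namespace could also be scripted. The sketch below is one possible shape, not an established Arda tool: kubectl and search_namespace are hypothetical helper names, and it matches plain substrings, whereas grep interprets the pattern as a regular expression. The runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
from typing import Callable, Iterator


def kubectl(args: list[str]) -> str:
    """Run kubectl and return stdout; a failed call yields an empty string."""
    result = subprocess.run(["kubectl", *args], capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else ""


def search_namespace(
    namespace: str,
    pattern: str,
    run: Callable[[list[str]], str] = kubectl,
    tail: int = 5000,
) -> Iterator[tuple[str, str]]:
    """Yield (pod, line) for every recent log line containing the pattern."""
    # "-o name" returns entries such as "pod/<pod-name>", one per line.
    pods = run(["get", "pods", "-n", namespace, "-o", "name"]).split()
    for pod in pods:
        logs = run(["logs", "-n", namespace, pod, f"--tail={tail}"])
        for line in logs.splitlines():
            if pattern in line:
                yield pod, line
```

Each result carries the pod name, so matches from different pods stay distinguishable after sorting or filtering.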
Method 2: AWS CloudWatch Logs
CloudWatch retains logs collected by fluent-bit from all EKS pods. Each pod’s logs appear as a log stream within the cluster’s log group. Log stream names follow the pattern:

    <namespace>.<pod-name>_<namespace>_<container-name>-<container-id>

Find log streams for a service

    aws logs describe-log-streams \
      --profile <aws-profile> \
      --log-group-name "/<cluster>/eks-logs" \
      --log-stream-name-prefix "<namespace>" \
      --output json | jq '[.logStreams[]] | sort_by(.lastEventTimestamp) | reverse | .[:10][] | {name: .logStreamName, lastEvent: ((.lastEventTimestamp // 0) / 1000 | floor | todate)}'

The API does not allow --order-by LastEventTime together with --log-stream-name-prefix, so this command sorts by last event time in jq and keeps the ten most recent streams.
Replace <cluster> with the cluster name (e.g., Alpha001) and <namespace> with the Kubernetes namespace prefix (e.g., prod-operations).
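To pick a pod or container out of a list of stream names, the naming pattern above can be parsed mechanically. The following is a minimal sketch with hypothetical names (StreamName, parse_stream_name). It assumes the container ID contains no hyphen and that pod, namespace, and container names contain no underscore; the stream name in the usage line is made up for illustration.

```python
from typing import NamedTuple


class StreamName(NamedTuple):
    namespace: str
    pod: str
    container: str
    container_id: str


def parse_stream_name(name: str) -> StreamName:
    """Parse '<namespace>.<pod-name>_<namespace>_<container-name>-<container-id>'."""
    prefix, _, rest = name.partition(".")
    parts = rest.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected log stream name: {name!r}")
    pod, namespace, tail = parts
    if prefix != namespace:
        raise ValueError(f"namespace mismatch in log stream name: {name!r}")
    # The container ID is everything after the last hyphen (assumed hyphen-free).
    container, sep, container_id = tail.rpartition("-")
    if not sep:
        raise ValueError(f"missing container ID in log stream name: {name!r}")
    return StreamName(namespace, pod, container, container_id)


# Illustrative stream name, not taken from a real cluster.
parsed = parse_stream_name(
    "prod-operations.item-api-abc12_prod-operations_item-api-0123abcd"
)
```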
Filter logs by time window and pattern

    aws logs filter-log-events \
      --profile <aws-profile> \
      --log-group-name "/<cluster>/eks-logs" \
      --log-stream-names "<stream-name>" \
      --start-time <epoch-ms> \
      --end-time <epoch-ms> \
      --filter-pattern "<pattern>" \
      --output json > scratch/cw-output.json

The --filter-pattern argument uses CloudWatch filter syntax:
| Pattern form | Example | Meaning |
|---|---|---|
| Quoted literal | "399a4ea4" | Substring match |
| Multiple terms (AND) | "PUT" "item" | Both terms present |
| JSON field match | { $.level = "ERROR" } | Structured log field |
Always redirect output to scratch/ — raw CloudWatch output can be very large.
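As a rough sketch of how the filter-log-events call could be assembled from human-readable timestamps, the Python below converts ISO-8601 UTC times to the epoch milliseconds that --start-time and --end-time expect and builds the argument list. The function names to_epoch_ms and filter_log_events_args are hypothetical; only the CLI flags come from the command shown in this section.

```python
from datetime import datetime, timezone


def to_epoch_ms(iso: str) -> int:
    """Convert a UTC timestamp like 2026-03-03T17:18:50Z to epoch milliseconds."""
    dt = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def filter_log_events_args(
    profile: str,
    cluster: str,
    stream: str,
    start_iso: str,
    end_iso: str,
    pattern: str,
) -> list[str]:
    """Build the argument list for 'aws logs filter-log-events'."""
    return [
        "aws", "logs", "filter-log-events",
        "--profile", profile,
        "--log-group-name", f"/{cluster}/eks-logs",
        "--log-stream-names", stream,
        "--start-time", str(to_epoch_ms(start_iso)),
        "--end-time", str(to_epoch_ms(end_iso)),
        "--filter-pattern", pattern,
        "--output", "json",
    ]
```

Passing the list to subprocess.run with stdout pointed at a file under scratch/ keeps large responses out of the terminal, and avoids shell quoting problems with patterns that contain quotes.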
Parse fluent-bit JSON logs
CloudWatch stores fluent-bit structured JSON. Each event’s message field
contains a JSON envelope with a log field holding the actual application log
line. Use this snippet to extract readable log lines from saved output:
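
    python3 -c "
    import json

    # Load the events saved by filter-log-events
    with open('scratch/cw-output.json') as f:
        data = json.load(f)

    for event in data.get('events', []):
        msg = event.get('message', '')
        try:
            parsed = json.loads(msg)
        except json.JSONDecodeError:
            parsed = None
        # Use the 'log' field of the envelope; fall back to the raw message
        line = parsed.get('log', msg) if isinstance(parsed, dict) else msg
        print(str(line).rstrip())
    " > scratch/cw-parsed.txt

Time conversion helpers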
CloudWatch timestamps are epoch milliseconds. These helpers convert between ISO-8601 timestamps and epoch milliseconds:

    # ISO-8601 date to epoch milliseconds
    date -d "2026-03-03T17:18:50Z" +%s000                               # Linux
    date -j -u -f "%Y-%m-%dT%H:%M:%SZ" "2026-03-03T17:18:50Z" +%s000    # macOS (-u reads the input as UTC)

    # Epoch milliseconds to ISO-8601
    date -r $((1772558330000/1000)) -u "+%Y-%m-%dT%H:%M:%S UTC"         # macOS
    date -d @$((1772558330000/1000)) -u "+%Y-%m-%dT%H:%M:%S UTC"        # Linux

Ingress logs
To find the API call that triggered a specific operation, search the ingress controller logs. These include source IP, HTTP method, path, status code, and user-agent, which makes them useful for tracing which client initiated a request.

    aws logs filter-log-events \
      --profile <aws-profile> \
      --log-group-name "/<cluster>/eks-logs" \
      --log-stream-name-prefix "<env>-ingress-nginx" \
      --start-time <epoch-ms> \
      --end-time <epoch-ms> \
      --filter-pattern "<entity-id-or-path>" \
      --output json > scratch/cw-ingress.json

Operational Notes
- Always use --output json with aws logs commands for reliable downstream parsing.
- Prefer filter-log-events with --filter-pattern over downloading entire log streams: it reduces data transfer and processing time.
- For time-sensitive investigations, narrow the --start-time/--end-time window as tightly as possible before expanding.
- When pod logs are unavailable (rotated, pod restarted), fall back to CloudWatch immediately rather than waiting for the pod to restart.
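The advice to start with a tight time window and widen it only when needed can be sketched as a generator of progressively larger windows around a known incident time. The name expanding_windows and the default one-minute starting half-width are assumptions chosen for illustration, not an Arda convention.

```python
from typing import Iterator


def expanding_windows(
    center_ms: int,
    initial_half_width_ms: int = 60_000,
    factor: int = 2,
    steps: int = 4,
) -> Iterator[tuple[int, int]]:
    """Yield (start_ms, end_ms) windows centered on center_ms, widening each step."""
    half = initial_half_width_ms
    for _ in range(steps):
        yield center_ms - half, center_ms + half
        half *= factor


# Three windows around 2026-03-03T17:18:50Z: 1, 2, then 4 minutes either side.
windows = list(expanding_windows(1772558330000, steps=3))
```

Each pair can be passed as --start-time and --end-time; stop at the first window that returns the events of interest.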
Related Topics
- Accessing Arda APIs: Make authenticated API calls to Arda environments.
- API Testing with Bruno: Run Bruno API tests against local or remote.
Copyright: (c) Arda Systems 2025-2026, All rights reserved