Suleiman Abdulkadir

AWS

EventForge

Hybrid event platform pairing ECS Fargate and Lambda with a Step Functions saga across 16 AWS services

AWS, TypeScript, ECS Fargate, Lambda, Step Functions, EventBridge, SQS, DynamoDB, Cognito, React

16 AWS services

343 tests / 19 properties

Saga with compensation

X-Ray traced

Overview

Hybrid event processing platform that runs containers and serverless together in one system. The REST API runs on ECS Fargate for consistent sub-200ms latency, while bursty background work (email, PDF receipts, webhooks) runs on scale-to-zero Lambda. Orders flow through an EventBridge bus into a Step Functions saga that validates, reserves inventory, charges payment, and compensates automatically on failure. DynamoDB single-table design handles idempotency with conditional writes, and X-Ray traces the full request path end to end.
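The idempotency idea can be sketched without any AWS dependency. The example below is a minimal in-memory stand-in, not the project's code: `FakeTable`, `processOnce`, the `IDEMPOTENCY#` key prefix, and the one-hour expiry are all hypothetical. It assumes the real write uses a condition along the lines of `attribute_not_exists(pk) OR expiresAt < :now`, so that a lock whose TTL has passed no longer blocks a retry.

```typescript
// In-memory stand-in for a single DynamoDB table. Every name here is
// hypothetical and only illustrates the conditional-write pattern.
interface Item {
  pk: string;
  sk: string;
  expiresAt: number; // epoch seconds, plays the role of the TTL attribute
}

class FakeTable {
  private items = new Map<string, Item>();

  // Mimics a put guarded by "attribute_not_exists(pk) OR expiresAt < :now".
  // Returns true when the write is accepted, false when the condition fails.
  putIfAbsent(item: Item, nowSeconds: number): boolean {
    const key = `${item.pk}|${item.sk}`;
    const existing = this.items.get(key);
    if (existing !== undefined && existing.expiresAt >= nowSeconds) {
      return false; // a live lock already holds this key
    }
    this.items.set(key, item);
    return true;
  }
}

// Runs the handler at most once per request id while the lock is live.
function processOnce(
  table: FakeTable,
  requestId: string,
  nowSeconds: number,
  handler: () => void,
): "processed" | "duplicate" {
  const accepted = table.putIfAbsent(
    { pk: `IDEMPOTENCY#${requestId}`, sk: "LOCK", expiresAt: nowSeconds + 3600 },
    nowSeconds,
  );
  if (!accepted) {
    return "duplicate";
  }
  handler();
  return "processed";
}
```

A retried request with the same id is rejected by the write itself, so no separate read-then-write check is needed and two concurrent retries cannot both win.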

Architecture Diagram

Request Flow


Order Workflow (Step Functions Saga)


Background Processing


Design Decisions

  • ECS Fargate for the API because it needs sub-200ms responses at all times. It autoscales between 1-4 tasks on CPU and runs in private subnets behind an ALB.
  • Lambda for background work (email, PDF generation, webhook delivery) because the load is bursty and intermittent. There is no point paying for idle containers when these jobs fire only a few times per hour.
  • Step Functions for the order workflow so retry logic, error handling, and the saga compensation pattern are declarative. The ASL definition is version controlled and X-Ray traced rather than buried in custom code.
  • DynamoDB single-table design because the access patterns are predictable. Conditional writes handle idempotency and TTL cleans up stale locks.
  • EventBridge as the event bus to decouple the API from everything downstream. The API publishes an event and doesn't know or care what consumes it.
  • Each background processor gets its own dead letter queue with a CloudWatch alarm on DLQ depth, so failures surface immediately instead of disappearing silently.
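To show what "declarative" means for the workflow, here is a rough sketch of a saga written in the shape of an Amazon States Language definition, expressed as a typed TypeScript object. The field names (`StartAt`, `States`, `Type`, `Resource`, `Retry`, `Catch`, `Next`, `End`, `ErrorEquals`, `ResultPath`) come from ASL, but the state names, the placeholder ARNs, and the retry numbers are assumptions for illustration, not the project's actual definition. The `danglingTargets` helper is likewise hypothetical: a small check that every transition points at a state that exists.

```typescript
// Types covering only the ASL fields used in this sketch.
interface Retrier { ErrorEquals: string[]; IntervalSeconds: number; MaxAttempts: number; BackoffRate: number }
interface Catcher { ErrorEquals: string[]; ResultPath?: string; Next: string }
interface TaskState { Type: "Task"; Resource: string; Retry?: Retrier[]; Catch?: Catcher[]; Next?: string; End?: boolean }
interface FailState { Type: "Fail"; Error: string; Cause?: string }
type State = TaskState | FailState;
interface StateMachine { StartAt: string; States: Record<string, State> }

// Placeholder ARN builder; REGION and ACCOUNT_ID are deliberately left unfilled.
const fn = (name: string): string => `arn:aws:lambda:REGION:ACCOUNT_ID:function:${name}`;

// Hypothetical saga: happy path on top, compensation path underneath.
const orderSaga: StateMachine = {
  StartAt: "ValidateOrder",
  States: {
    ValidateOrder: {
      Type: "Task", Resource: fn("validate-order"), Next: "ReserveInventory",
      Catch: [{ ErrorEquals: ["States.ALL"], ResultPath: "$.error", Next: "MarkOrderFailed" }],
    },
    ReserveInventory: {
      Type: "Task", Resource: fn("reserve-inventory"), Next: "ChargePayment",
      Retry: [{ ErrorEquals: ["States.TaskFailed"], IntervalSeconds: 2, MaxAttempts: 3, BackoffRate: 2 }],
      Catch: [{ ErrorEquals: ["States.ALL"], ResultPath: "$.error", Next: "MarkOrderFailed" }],
    },
    ChargePayment: {
      Type: "Task", Resource: fn("charge-payment"), Next: "ConfirmOrder",
      // Inventory is already reserved here, so a failure must release it first.
      Catch: [{ ErrorEquals: ["States.ALL"], ResultPath: "$.error", Next: "ReleaseInventory" }],
    },
    ConfirmOrder: { Type: "Task", Resource: fn("confirm-order"), End: true },
    ReleaseInventory: { Type: "Task", Resource: fn("release-inventory"), Next: "MarkOrderFailed" },
    MarkOrderFailed: { Type: "Task", Resource: fn("mark-order-failed"), Next: "OrderFailed" },
    OrderFailed: { Type: "Fail", Error: "OrderFailed", Cause: "Saga compensated" },
  },
};

// Lists every transition whose target state is not defined.
function danglingTargets(sm: StateMachine): string[] {
  const missing: string[] = [];
  const check = (from: string, target: string | undefined): void => {
    if (target !== undefined && !Object.prototype.hasOwnProperty.call(sm.States, target)) {
      missing.push(`${from} -> ${target}`);
    }
  };
  check("StartAt", sm.StartAt);
  for (const name of Object.keys(sm.States)) {
    const state = sm.States[name];
    if (state.Type === "Task") {
      check(name, state.Next);
      for (const catcher of state.Catch ?? []) check(name, catcher.Next);
    }
  }
  return missing;
}
```

Because the whole workflow is data, a check like `danglingTargets(orderSaga)` can run in a unit test before anything is deployed, which is harder to do when retries and compensation live inside handler code.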

Deployment

Deployed as 11 CloudFormation templates wired together as nested stacks. The API ships as a multi-stage Docker build to ECR and runs on ECS Fargate (1-4 tasks, CPU autoscaling) in private subnets behind an ALB, fronted by CloudFront and API Gateway. Ten Lambda functions cover the six workflow steps, three background processors, and one ingestion handler. EventBridge routes events into a Step Functions state machine running the saga, and downstream SQS queues feed each processor with its own dead letter queue. Full observability comes from X-Ray distributed tracing and CloudWatch alarms on DLQ depth. The codebase is a TypeScript monorepo of 5 packages with 343 tests including 19 property-based tests.
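The queue-plus-dead-letter-queue arrangement can be illustrated with a small in-memory simulation. This is a simplified sketch, not the deployed configuration: `QueueWithDlq`, `dlqAlarmState`, and the `maxReceiveCount` of 3 used in the usage note are assumptions, and real SQS moves a message on the receive after its count exceeds the limit rather than immediately after the last failed attempt.

```typescript
interface Message {
  id: string;
  body: string;
  receiveCount: number;
}

// In-memory model of one SQS queue with a redrive policy to a DLQ.
class QueueWithDlq {
  readonly main: Message[] = [];
  readonly deadLetters: Message[] = [];
  private readonly maxReceiveCount: number;

  constructor(maxReceiveCount: number) {
    this.maxReceiveCount = maxReceiveCount;
  }

  send(id: string, body: string): void {
    this.main.push({ id, body, receiveCount: 0 });
  }

  // Delivers every queued message to the handler once. A handler that
  // returns normally "deletes" the message; a handler that throws leaves it
  // for another attempt, or moves it to the DLQ once the limit is reached.
  poll(handler: (message: Message) => void): void {
    const batch = this.main.splice(0, this.main.length);
    for (const message of batch) {
      message.receiveCount += 1;
      try {
        handler(message);
      } catch {
        if (message.receiveCount >= this.maxReceiveCount) {
          this.deadLetters.push(message);
        } else {
          this.main.push(message);
        }
      }
    }
  }
}

// Stand-in for an alarm on DLQ depth: fires as soon as the depth reaches
// the threshold (1 by default, so a single dead letter is enough).
function dlqAlarmState(queue: QueueWithDlq, threshold: number = 1): "OK" | "ALARM" {
  return queue.deadLetters.length >= threshold ? "ALARM" : "OK";
}
```

With `new QueueWithDlq(3)` and a handler that always throws, three calls to `poll` leave the main queue empty, put the message in `deadLetters`, and flip `dlqAlarmState` to `"ALARM"`. Giving each processor its own queue means the alarm also identifies which processor is failing.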

Lessons Learned

The saga pattern is where the real learning happened. It's easy to reserve inventory and charge payment in sequence; the hard part is what happens when payment fails after inventory is already reserved. Step Functions made the compensation path explicit: release the reservation, mark the order failed, and never leave the system in a half-committed state. Splitting compute by workload (Fargate for the latency-sensitive API, Lambda for bursty background jobs) meant the architecture matched the traffic shape instead of forcing one model everywhere. X-Ray tracing across the full path (API to EventBridge to Step Functions to SQS to Lambda) turned 'where did this order get stuck?' from guesswork into a single trace view.
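The compensation behaviour described here can be sketched as a tiny in-process saga runner. This is an illustration of the pattern only; in the project the orchestration is done by Step Functions, and `runSaga`, `placeOrder`, the step names, and the `OrderCtx` fields are all hypothetical.

```typescript
interface SagaStep<C> {
  name: string;
  run: (ctx: C) => void;
  compensate?: (ctx: C) => void; // undo action, if the step changes state
}

interface SagaResult {
  status: "completed" | "compensated";
  failedStep?: string;
  log: string[];
}

// Runs steps in order. On the first failure, undoes the completed steps in
// reverse order so nothing is left half-committed.
function runSaga<C>(steps: SagaStep<C>[], ctx: C): SagaResult {
  const completed: SagaStep<C>[] = [];
  const log: string[] = [];
  for (const step of steps) {
    try {
      step.run(ctx);
      completed.push(step);
      log.push(`ok:${step.name}`);
    } catch {
      log.push(`failed:${step.name}`);
      for (const done of completed.reverse()) {
        if (done.compensate) {
          done.compensate(ctx);
          log.push(`undo:${done.name}`);
        }
      }
      return { status: "compensated", failedStep: step.name, log };
    }
  }
  return { status: "completed", log };
}

// Hypothetical order context and steps.
interface OrderCtx { stock: number; reserved: number; orderStatus: string; cardDeclined: boolean }

const orderSteps: SagaStep<OrderCtx>[] = [
  { name: "validate", run: (c) => { if (c.stock <= 0) throw new Error("out of stock"); } },
  {
    name: "reserveInventory",
    run: (c) => { c.stock -= 1; c.reserved += 1; },
    compensate: (c) => { c.stock += 1; c.reserved -= 1; },
  },
  { name: "chargePayment", run: (c) => { if (c.cardDeclined) throw new Error("card declined"); } },
  { name: "confirm", run: (c) => { c.orderStatus = "CONFIRMED"; } },
];

function placeOrder(ctx: OrderCtx): SagaResult {
  const result = runSaga(orderSteps, ctx);
  if (result.status === "compensated") ctx.orderStatus = "FAILED";
  return result;
}
```

When `cardDeclined` is true, the reservation made by `reserveInventory` is rolled back and the order ends as `FAILED` with stock unchanged, which is the "payment fails after inventory is reserved" case.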
