Help share my friend’s project. I believe this architecture is a fantastic engineering implementation, and I hope fellow experts can provide feedback.
TL;DR
A Hybrid Sessions architecture based on Lambda + S3 + DynamoDB that reduces the cost of stateful Agents from $500–1000/month to $30/month, with no single point of failure.
Project Source Code
Project Overview
This project addresses the dilemma in deploying AI Agents:
- Stateful deployment on dedicated servers costs $500–1000/month
- Serverless is cheap but loses conversation history on every invocation
The solution implements a Hybrid Sessions architecture using Lambda + S3 + DynamoDB, enabling stateful Agents at just $30/month, with no single point of failure.
This architecture breaks the conventional notion that “Serverless = Stateless”. By maintaining session history in S3, Lambda gains the capability of a stateful Agent while retaining the cost benefits of pay-per-use pricing.
State management does not need to be tied to containers—it can be completely decoupled from computation. By combining DynamoDB for mapping, S3 for persistence, and Lambda for elastic compute, this design achieves higher reliability at a much lower cost.
Use Cases
- Telegram / WeChat / Slack Bots (this project uses Telegram Bot as an example; other clients can be implemented)
- AI assistants in SaaS applications (requires rewriting the client)
- Multi-tenant platforms
Technical Details
Why Agents Are Hard to Serverless
Claude Agent SDK maintains conversation state, which is fundamentally different from stateless APIs:
- Each Agent requires a persistent shell, working directory, and full conversation tree
- Lambda environments are stateless: each invocation starts with a clean slate, and
/tmpis wiped after execution - Among the four official recommended deployment models, Hybrid Sessions (ephemeral containers + state restoration) offers the best cost efficiency
Project Architecture
Three-layer storage separation
graph LR
subgraph DynamoDB [DynamoDB Mapping Layer]
d_chat[chat_id:thread_id --> d_session[session_id]
end
subgraph S3 [S3 Persistence Layer]
s_root[session_id/]
s_root --> s_conv[conversation.jsonl]
s_root --> s_debug[debug.txt]
s_root --> s_todos[todos.json]
end
subgraph Lambda [Lambda /tmp Execution Layer]
l_root[/.claude-code/]
l_root --> l_projects[projects/]
l_root --> l_config[Configuration Files]
l_root --> l_sessions[Session Files]
end
d_session --> s_root
d_session --> l_root
DynamoDB: Stores chat_id:thread_id → session_id mappings, only 7 fields (session_key, session_id, chat_id, thread_id, created_at, updated_at, ttl), query time <5ms
- New session: no mapping exists → create new
session_id - Existing session: retrieve
session_id→ trigger S3 state restoration
S3: Stores session files (conversation.jsonl + debug + todos), grouped by session_id
conversation.jsonl: append-only JSONL-formatted conversation history, required by SDK for recovery- Supports unlimited number of sessions, with built-in multi-tenant isolation
/tmp/.claude-code: Lambda’s temporary working directory
- On startup: copy static configurations (e.g.,
agents.json,mcp.json,skills) from Docker image - On request: download the session’s
conversation.jsonlfrom S3 - SDK is pointed to this directory (
CLAUDE_CONFIG_DIR=/tmp/.claude-code) - After request:
/tmpis destroyed, all files disappear