🏆 2025 AI 새싹톤 우수상 🏆 2025 AI SeSACTHON Excellence Award

이코에코(Eco²)
Backend Portfolio Backend Portfolio

Multi-LLM Agent 기반 비동기 분산 클러스터로
서비스 수준의 제품 완성도를 목표로 한 고도화 여정 Evolution journey toward production-ready quality
with Multi-LLM Agent based async distributed cluster

클러스터 노드 📊 Cluster Nodes 📊

2,500 VU

ext-authz 1,477 RPS 📊 ext-authz 1,477 RPS 📊

1000 VU

Scan API (LLM×2) 97.8% · 373 RPM 📊 Scan API (LLM×2) 97.8% · 373 RPM 📊

100+ · 76+

개발 기록 · Knowledge Base Dev Logs · Knowledge Base

-일차

개발 기간 (10.31~01.28) 🖱️ Development (10.31~01.28) 🖱️

Closed

서비스 종료 (12.02~02.25) Service Closed (12.02~02.25)

Service Service

Vision LLM과 Multi-Agent로 환경과 일상을 연결하는 에이전트입니다. An agent connecting environment and daily life through Vision LLM and Multi-Agent.

8개의 도메인 API, 6개의 Worker, 3개의 인프라 피쳐를 설계, 구현, 배포, 운영 중입니다. Designing, implementing, deploying, and operating 8 domain APIs, 6 Workers, and 3 infrastructure features.

🌿 서비스 종료 (Closed)Service Closed

🔐

Auth

API Worker

OAuth 2.0 + JWT 인증 OAuth 2.0 + JWT Auth

💬

Chat

API Worker FE

LangGraph 에이전트 LangGraph Agent

📷

Scan

API Worker

Vision AI 폐기물 분류 Vision AI Waste Classifier

🌍

Character

API Worker

캐릭터 수집 + gRPC Character Collection + gRPC

📍

Location

API FE

Kakao 검색 + PostGIS 공간 쿼리 Kakao Search + PostGIS Spatial

👤

Users

API Worker

프로필 + 캐릭터 소유 Profile + Character Ownership

🖼️

Images

API

S3 + CloudFront CDN S3 + CloudFront CDN

📰

Info

API Worker FE

환경/에너지/AI 뉴스피드 Eco/Energy/AI Newsfeed

🛡️

ext-authz

Infra

Istio 외부 인가 (Go) Istio External Auth (Go)

📡

SSE Gateway

Infra

실시간 이벤트 스트리밍 Real-time Event Streaming

🔀

Event Router

Infra

Redis Streams 라우팅 Redis Streams Routing

Agent-Driven Development Agent-Driven Development

ADR·장애 로그·도메인 규칙을 Knowledge Base로 축적 → 자체 RAG + 27개 커스텀 Skills로 에이전트 온보딩 가속. Sub agent에 Skills를 주입하여 Task 도구 7,500+ 호출이 관측되며, 병렬 포함 멀티파트 작업을 수행합니다. Anthropic·Vercel 등 공식 벤더 Skills도 적극 활용하며, 설계-구현-테스트-관측-검증 피드백 루프를 구성해 사이클을 운용합니다. Accumulates ADRs, incident logs, and domain rules as Knowledge Base → accelerates agent onboarding with self-RAG + 27 custom Skills. Injects Skills into sub agents with 7,500+ Task tool invocations observed, including parallel multi-part execution. Actively leverages official vendor Skills from Anthropic, Vercel, and others. Operates a design-implement-test-observe-verify feedback loop.

➤

📊

📈

🔔

메트릭 Metrics

컨텍스트 주입 Context Injection

🔁

Recursive Self-Improvement

💡 클릭하여 상세 보기 💡 Click for details

🧑‍💻

Developer

설계 & 리뷰 Design & Review

🤖

AI Agent

구현 & 디버깅 Implement & Debug

☸️

Cluster

Runtime Environment Runtime Environment

📊

Observability

피드백 수집 Feedback

🔄 Development Lifecycle 🔄 Development Lifecycle

Research

Foundations

→

Design

Plans

→

Implement

Agent + Code

→

Deploy

GitOps

→

∞

Feedback

Reports → Loop

🎯 Single Source of Truth 🎯 Single Source of Truth

📝 Codebase

→

🔄 GitOps

→

☸️ Cluster State

→

📊 Observability

🧠 Self-RAG Knowledge Base 🧠 Self-RAG Knowledge Base

📁 docs/ — 프로젝트 의사결정 아카이브 — Project decision archive

📚 Foundations 근원 기술 21편 21 core tech

🔧 Applied 응용 기술 10편 10 applied

📋 Plans 설계 문서 10편 10 ADRs

📈 Reports 결과 기록 17편 17 reports

🔥 Troubleshooting 트러블슈팅 16편 16 issues

🛠️ .claude/skills/ — 27개 커스텀 Skills + 벤더 Skills — 27 custom Skills + vendor Skills

langgraph

pipeline

chat-agent

flow

event-driven

architecture

clean-arch

CQRS

k8s-debug

kubectl

redis

patterns

postgres

schema

grpc

service

rag

pipeline

llm-eval

swiss-cheese

mermaid

diagrams

pptx

slide-gen

celery

rabbitmq

prompt

engineering

data-scientist

analysis

resume-tailor

multi-format

skill-creator

meta-skill

+10 more

skills

🏢 공식 벤더 Skills Official Vendor Skills — Anthropic·Vercel 등 검증된 패턴 — Verified patterns from Anthropic, Vercel, etc.

Anthropic

Best Practices

Anthropic

Agent Skills

Vercel

React Best Practices

Anthropic

Effective Harness

docs/는 ADR·장애 로그·도메인 규칙을 담은 Knowledge Base이고, .claude/skills/는 27개 커스텀 Skills + Anthropic·Vercel 벤더 Skills를 Sub agent에 주입하여 온보딩을 가속합니다.
Task 도구 7,500+ 호출(8,251 세션)이 관측되며, 병렬 포함 멀티파트 작업을 수행합니다. 설계-구현-테스트-관측-검증 피드백 루프를 구성해 사이클을 운용합니다.
이 체계는 Anthropic Engineering의 Effective Harness 패턴과 구조적으로 대응합니다. 📊 Anthropic Insights → docs/ is a Knowledge Base of ADRs, incident logs, and domain rules. .claude/skills/ injects 27 custom Skills + vendor Skills from Anthropic and Vercel into sub agents to accelerate onboarding.
7,500+ Task tool invocations observed across 8,251 sessions, including parallel multi-part execution. Operates a design-implement-test-observe-verify feedback loop.
This system structurally corresponds to Anthropic Engineering's Effective Harness pattern. 📊 Anthropic Insights →

프로젝트 카테고리 Project Categories

🤖

Multi-Agent Chat

11 posts

개발 및 E2E 검증 완료, 실시간 배포 중 Dev & E2E Verified, Live Deployed LangGraph Send API Token v2 Function Calling Multi-Model

LangGraph Send API로 11종 서브에이전트를 동적 병렬 라우팅하는 멀티에이전트 시스템입니다. 9분류 Intent 분석 후 Multi-Intent Fanout을 통해 "이 페트병 어떻게 분리해? 근처 수거함도 알려줘" 같은 복합 질의를 단일 요청으로 병렬 처리합니다. Eco² 캐릭터 13종과 대한민국 폐기물 분류체계를 도메인 지식으로 주입하고, Nano Banana Pro 기반 이미지 생성, 사용자 위치 연동 실시간 날씨·수거함·재활용센터 검색, 네이티브 웹 검색까지 지원합니다. 3-Tier Memory(Redis hot + PostgreSQL persistent) 위에 Token v2 스트리밍을 구현하여 연결 단절 후 재접속 시에도 토큰 catch-up이 가능합니다. A multi-agent system that dynamically routes across 11 sub-agents in parallel via LangGraph Send API. After 9-class intent classification, Multi-Intent Fanout decomposes compound queries like "How do I separate this PET bottle? Find nearby collection points too" into parallel branches within a single request. Injects domain knowledge (13 Eco² characters, Korean waste classification system), supports Nano Banana Pro-powered image generation, real-time weather and recycling center search linked to user location, and native web search. Built on 3-Tier Memory (Redis hot + PostgreSQL persistent) with Token v2 streaming, enabling token catch-up even after connection drops.

🔀

LangGraph Workflow 11개 서브에이전트 StateGraph, Send API 병렬 실행, Aggregator 결과 수집 → 상세 보기 LangGraph Workflow 11 subagent StateGraph, Send API parallel execution, Aggregator result collection → View Details

🎯

9분류 Intent + Multi-Intent 키워드 맵 신뢰도 보정, Chain-of-Intent로 연속 질문 문맥 유지 → 상세 보기 9-class Intent + Multi-Intent Keyword map confidence calibration, Chain-of-Intent for continuous query context → View Details

⚡

Send API 병렬 라우팅 Dynamic Router + Enrichment Rules로 waste → weather 자동 추가, Aggregator에서 결과 수집 → 상세 보기 Send API Parallel Routing Dynamic Router + Enrichment Rules auto-add weather for waste, results collected at Aggregator → View Details

🔧

Tool Calling GPT-5.2 Strict Mode + Gemini 3 Function Calling, MOIS·KECO·Kakao API 네이티브 도구 호출 → 상세 보기 Tool Calling GPT-5.2 Strict Mode + Gemini 3 Function Calling, MOIS·KECO·Kakao API native tool invocation → View Details

🧠

Multi-Model Orchestration LLMClientPort 추상화, OpenAI Agents SDK(Primary) + Responses API(Fallback) + Gemini SDK, Provider 자동 추론 및 런타임 모델 전환 → 상세 보기 Multi-Model Orchestration LLMClientPort abstraction, OpenAI Agents SDK (Primary) + Responses API (Fallback) + Gemini SDK, Provider auto-inference & runtime model switching → View Details

🚌

Event Bus Redis Streams + Pub/Sub + State KV 3-Tier, Event Router Consumer Group, SSE Gateway Catch-up → 상세 보기 Event Bus Redis Streams + Pub/Sub + State KV 3-Tier, Event Router Consumer Group, SSE Gateway Catch-up → View Details

🧀

LLM Evaluation Pipeline Swiss Cheese 3-Tier (Code/LLM/Calibration), 5-Axis BARS Rubric, Expert Review 99.8/100 → 상세 보기 LLM Evaluation Pipeline Swiss Cheese 3-Tier (Code/LLM/Calibration), 5-Axis BARS Rubric, Expert Review 99.8/100 → View Details

💾

3-Tier Memory Redis Primary(~1ms) + PostgreSQL Async Sync + ChatState, ReadThroughCheckpointer 패턴 → 상세 보기 3-Tier Memory Redis Primary (~1ms) + PostgreSQL Async Sync + ChatState, ReadThroughCheckpointer pattern → View Details

🔄

Token v2 + Context 압축 XRANGE 복구, 동적 Summarization (272K trigger, 5-Tier 구조) → 상세 보기 Token v2 + Context Compression XRANGE recovery, dynamic Summarization (272K trigger, 5-Tier structure) → View Details

🛡️

Production Resilience NodePolicy (FAIL_OPEN/CLOSE/FALLBACK), Circuit Breaker 5회 → 60s → 상세 보기 Production Resilience NodePolicy (FAIL_OPEN/CLOSE/FALLBACK), Circuit Breaker 5 fails → 60s → View Details

📊

Feedback + Fallback Chain 4-dim RAG 평가, rag → web_search → general_llm 체인 → 상세 보기 Feedback + Fallback Chain 4-dim RAG eval, rag → web_search → general_llm chain → View Details

🔭

LangSmith + OTEL TelemetryConfigPort 추상화, Feature 단위 Run 추적, OTEL Span 통합 → 상세 보기 LangSmith + OTEL TelemetryConfigPort abstraction, Feature-level Run tracking, OTEL Span integration → View Details

📈

LangSmith Token Tracking 11개 LLM 호출 경로별 토큰 추적, usage_metadata 표준화 → 상세 보기 LangSmith Token Tracking Token tracking for 11 LLM call paths, usage_metadata standardization → View Details

🔒

Data Consistency Lamport Clock 순서 보장, cleanup_sequence 메모리 관리, Aggregator 정합성 검증 → 상세 보기 Data Consistency Lamport Clock ordering, cleanup_sequence memory management, Aggregator integrity validation → View Details

☸️

Kubernetes Cluster + GitOps + Service Mesh

8 posts

AWS Terraform ArgoCD Istio DOMA

8개 도메인 API(Auth, Character, Chat, Scan, Location, Users, Image, Info)를 개발하고, 장애 격리와 독립 스케일링을 위해 DOMA 원칙 기반 노드 분리를 설계했습니다. Bridge Ingress 단일 진입점으로 ALB → Istio Gateway → VirtualService 토폴로지를 구성하고, North-South(NodePort)와 East-West(ClusterIP/Calico VXLAN) 트래픽을 분리합니다. ArgoCD App-of-Apps + Sync Wave(00~63)로 CRD → Operator → Instance 순서의 선언적 배포를 자동화하고, Helm Chart + Kustomize Overlay로 환경별 오버레이를 관리합니다. Developed 8 domain APIs (Auth, Character, Chat, Scan, Location, Users, Image, Info) and designed DOMA-based node separation for fault isolation and independent scaling. Configured Bridge Ingress single entry point with ALB → Istio Gateway → VirtualService topology, separating North-South (NodePort) and East-West (ClusterIP/Calico VXLAN) traffic. Automated declarative deployment with ArgoCD App-of-Apps + Sync Wave (00~63) following CRD → Operator → Instance order, managing environment overlays with Helm Chart + Kustomize Overlay.

🏗️

24-nodes 클러스터 도메인별 장애 격리 + 스케일 실험 목적 24-node cluster domain fault isolation + scaling experiments

📊

노드 스펙 24-node EC2 인스턴스 상세 사양 Node Specs 24-node EC2 instance specifications

🏷️

5-Layer + Taints Label 계층 + 스케줄링 정책 5-Layer + Taints Label hierarchy + scheduling policy

🔄

Sync Wave (00~63) 의존성 기반 배포 순서 제어 Sync Wave (00~63) dependency-based deployment ordering

📦

Operators & Helm Charts CR 기반 선언적 관리로 확장성 확보 Operators & Helm Charts CR-based declarative management for extensibility

🕸️

Istio Service Mesh Sidecar Injection + VirtualService 라우팅 Istio Service Mesh Sidecar Injection + VirtualService routing

☁️

AWS 외부 컴포넌트 Route53, ALB, CloudFront, S3, ACM AWS External Components Route53, ALB, CloudFront, S3, ACM

🔗

IRSA 자동화 ExternalSecrets + ExternalDNS + ALB Controller IRSA Automation ExternalSecrets + ExternalDNS + ALB Controller

✅

CI Quality Gate black/ruff/pytest/coverage로 품질 기준 강제 CI Quality Gate enforce quality via black/ruff/pytest/coverage

🔐

Auth Offloading (ext-authz)

4 posts

Envoy Go gRPC Redis HPA Fanout

모든 요청의 인증/인가가 통과하는 Global Choke Point를 별도 서버로 분리해 API 서버 부하를 제거했습니다. Python 대비 동시성 처리에 유리한 Go + gRPC를 선택하고, Redis 병목 해결 후 Local Cache + Fanout으로 클러스터 처리량 Baseline을 확보했습니다. Separated Global Choke Point (auth/authz for all requests) into a dedicated server, eliminating API server load. Chose Go + gRPC for better concurrency over Python, established cluster throughput baseline via Local Cache + Fanout after resolving Redis bottleneck.

🎯

Global Baseline 2,500 VU 기준 RPS 1,200+ (ext-authz 단독) Global Baseline RPS 1,200+ at 2,500 VU (ext-authz only)

🔧

병목 해결 RPS 42 → 1,200 (28배 개선) Bottleneck fix RPS 42 → 1,200 (28x improvement)

📡

Cache 일관성 Fanout 브로드캐스트 + TTL 10s + 버전 스탬프 Cache consistency Fanout broadcast + TTL 10s + version stamp

🔍

🛡️

운영 정책 fail-close, Blacklist TTL 24h, 메모리 cap Ops policy fail-close, Blacklist TTL 24h, memory cap

🔍

📊

Observability

7 posts

ECK Elasticsearch Kibana Fluent Bit Jaeger Kiali

Agent-Driven Development의 Recursive Self-Improvement를 위해 로그-트레이스-메트릭 통합이 필수였습니다. Helm Chart 대신 ECK Operator를 선택해 ES 클러스터 관리 복잡도를 낮추고, ECS 스키마로 8개 도메인의 로그 포맷을 표준화했습니다. Log-trace-metric integration was essential for Agent-Driven Development's Recursive Self-Improvement. Chose ECK Operator over Helm Charts to reduce ES cluster management complexity, standardized log format across 8 domains with ECS schema.

📝

24-nodes 로그 중앙화 Fluent Bit DaemonSet → ES 24-node log centralization Fluent Bit DaemonSet → ES

🔗

trace_id 연결 ECS 로그 → Jaeger 원클릭 trace_id correlation ECS logs → Jaeger one-click

🚨

Alertmanager + Slack 장애 감지 자동화 Alertmanager + Slack automated alerting

🗺️

Kiali + Jaeger 서비스 메시 시각화 Kiali + Jaeger service mesh visualization

📨

Message Queue

15 posts

RabbitMQ Celery Taskiq Gevent Fanout DLQ

LLM API 호출(~12초/req)의 HTTP 타임아웃 회피와 동시 접속 부하 분산을 위해 비동기 Job Queue를 도입했습니다. Kafka 대비 운영 복잡도가 낮은 RabbitMQ를 선택하고, I/O-bound 워크로드에 Gevent, CPU-bound에 Thread Pool로 동시성 전략을 분리했습니다. Introduced async Job Queue to avoid HTTP timeouts from LLM API calls (~12s/req) and distribute concurrent load. Chose RabbitMQ over Kafka for lower operational complexity, separated concurrency strategies: Gevent for I/O-bound, Thread Pool for CPU-bound workloads.

⛓️

4단계 Celery Chain Vision→Rule→Answer→Reward 순차 의존성 4-stage Celery Chain Vision→Rule→Answer→Reward sequential dependency

🔀

동시성 전략 분리 prefork(단순 태스크) / Gevent(동기 I/O) / asyncio(LangGraph) Concurrency split prefork(simple tasks) / Gevent(sync I/O) / asyncio(LangGraph)

🆔

at-least-once + 멱등성 키 task_id 기반 중복 처리 방지 at-least-once + idempotency task_id based deduplication

📡

Topology CR 관리 K8s Operator로 Queue/Exchange 선언적 관리 Topology CR management declarative Queue/Exchange via K8s Operator

⚡

Taskiq Asyncio Worker LangGraph asyncio-native 요구사항 → Taskiq 도입 (Chat) Taskiq Asyncio Worker LangGraph asyncio-native requirement → Taskiq adoption (Chat)

🌊

Event Streams & Scaling

12 posts

Redis Streams Pub/Sub State KV Reclaimer KEDA SSE Gateway

LLM 파이프라인을 동기 요청으로 배포한 초기에는, 10+초 API 지연이 scan-api 스레드를 점유해 동시 처리에 한계가 있었습니다. SSE 전환 후에도 Celery Events 구조(SSE당 RabbitMQ 21연결)는 50 VU에서 연결 341개로 폭증, 503 에러가 발생했습니다. Redis Streams 기반 이벤트 릴레이로 연결 복잡도를 O(n×m) → O(n)으로 개선하고, Streams(영속) + Pub/Sub(실시간) + State KV(복구)로 책임을 분리해 500 VU 100% 완료율을 달성했습니다. When initially deploying the LLM pipeline as synchronous requests, 10+s API latency occupied scan-api threads, limiting concurrent processing. After SSE transition, Celery Events structure (21 RabbitMQ connections per SSE) exploded to 341 connections at 50 VU, causing 503 errors. Switched to Redis Streams-based event relay, improved connection complexity from O(n×m) → O(n), separated responsibilities: Streams (persistence) + Pub/Sub (realtime) + State KV (recovery), achieving 500 VU 100% completion.

🚌

Event Bus Layer O(n×m) → O(n) 연결 복잡도, Streams + Pub/Sub + State KV 분리 Event Bus Layer O(n×m) → O(n) connection complexity, Streams + Pub/Sub + State KV separation

🎯

Scan SSE 지표 500 VU 100% 완료율, 367.9 req/m, E2E p95 83.3초 Scan SSE metrics 500 VU 100% completion, 367.9 req/m, E2E p95 83.3s

📊

LLM 파이프라인 특성 Vision 3초 + Answer 8초 → 총 ~12초 I/O-bound LLM pipeline profile Vision 3s + Answer 8s → total ~12s I/O-bound

🔄

SSE 재연결 복구 State KV에서 마지막 상태 로드 SSE reconnect recovery load last state from State KV

⚖️

KEDA 트리거 RabbitMQ 큐 길이 (10-20msg) Worker 2→4, SSE 연결 수 (100/pod) 1→3 KEDA trigger RabbitMQ queue (10-20msg) Worker 2→4, SSE connections (100/pod) 1→3

🔧

Event Router 내부 Multi-domain (scan+chat) + Lua 멱등성 + XAUTOCLAIM Reclaimer Event Router internals Multi-domain (scan+chat) + Lua idempotency + XAUTOCLAIM Reclaimer

🔄

Eventual Consistency

5 posts

Persistence Offloading Worker Queue Batch Processing Idempotency Async Sync

강결합과 즉시 응답이 필요한 로직(OAuth 플로우 등)을 제외한 모든 영역에 Eventual Consistency를 적용했습니다. Strong Consistency는 Latency를 늘리고, Fire-and-forget은 데이터 유실 위험이 있어, at-least-once + 멱등성 키로 Exactly-once Semantics를 구현했습니다. Application Layer가 Persistence를 Worker에 Offloading하면서 DB 병목 없이 수평 확장이 가능해졌고, API Pod는 빠른 응답만 담당하여 처리량(Throughput)이 크게 향상되었습니다. Applied Eventual Consistency to all areas except tightly-coupled, immediate-response logic (e.g., OAuth flow). Strong consistency increases latency, fire-and-forget risks data loss. Implemented Exactly-once Semantics via at-least-once + idempotency keys. Application Layer offloads persistence to Workers, enabling horizontal scaling without DB bottleneck. API Pods focus solely on fast responses, significantly improving throughput.

📤

Write Offloading API는 이벤트만 발행, DB 쓰기는 Worker가 처리 Write Offloading API publishes events only, Worker handles DB writes

🆔

Deterministic UUID (user_id + scan_id) 기반 멱등성 키 Deterministic UUID (user_id + scan_id) based idempotency key

📡

Fanout 1:N 라우팅 단일 이벤트 → character_worker + users_worker Fanout 1:N routing single event → character_worker + users_worker

🔄

캐시 무효화 Fanout broadcast + TTL 10s + 신규 Worker 부팅 시 워밍 Cache invalidation Fanout broadcast + TTL 10s + warming on new Worker boot

🔁

Chat Async Persistence CheckpointSyncService (batch dedup) + Persistence Consumer (done→PG) Chat Async Persistence CheckpointSyncService (batch dedup) + Persistence Consumer (done→PG)

🏛️

Clean Architecture Migration

16 posts

DOMA DIP Port & Adapter CQRS FastAPI

Fallback Outbox 패턴 도입 후 Integration+Persistence 레이어가 중첩되며 레이어드 아키텍처의 한계에 직면했습니다. Redis 직접 의존 등 DI 위반을 제거하고, Port/Adapter 기반으로 인프라 교체 용이성과 역할 명확화를 확보했습니다. 도메인별 응집으로 비즈니스 요구사항 변경 시 수정 범위가 단일 도메인으로 제한되어, 사이드 이펙트 추적과 코드 리뷰 부담이 크게 줄었습니다. Port 인터페이스만 Mock하면 외부 인프라 없이 도메인 로직을 독립 검증할 수 있어, 테스트 작성 비용이 낮아지고 병렬 개발이 자연스럽게 가능해졌습니다. 인프라 의존성이 Adapter로 캡슐화되어 특정 컴포넌트 장애가 도메인 경계를 넘어 전파되지 않습니다. After introducing Fallback Outbox pattern, Integration+Persistence layers overlapped, revealing layered architecture limits. Eliminated DI violations (e.g., direct Redis dependency), achieved infrastructure swappability and clear role separation via Port/Adapter. Domain-level cohesion limits change scope to single domain, reducing side-effect tracking and code review burden. Mocking only Port interfaces enables independent domain logic verification without external infra, lowering test costs and enabling parallel development. Infrastructure dependencies encapsulated in Adapters ensure component failures don't propagate beyond domain boundaries.

📐

4-Layer DIP Domain이 Port 정의, Infra가 Adapter로 구현 — 결합도 제거 4-Layer DIP Domain defines Port, Infra implements as Adapter — decoupled

🎼

Application Usecase CQRS 기반 Command/Query 분리, Port·Service 오케스트레이션 Application Usecase CQRS-based Command/Query separation, Port·Service orchestration

🔌

Adapter 교체 용이 LLM/Persistence/MQ Adapter 독립 교체 가능 Adapter swappability LLM/Persistence/MQ Adapters independently replaceable

🛡️

LLM 복원력 ~12s I/O-bound (Vision 3s + Answer 8s), Fallback + Checkpoint로 고지연·불안정성 헷징 LLM Resilience ~12s I/O-bound (Vision 3s + Answer 8s), Fallback + Checkpoint hedges latency & instability

📦

기술 응집도 도메인별 설정 분리로 ConfigMap/Secret 관리 단순화 Tech cohesion Domain-specific configs simplify ConfigMap/Secret management

📊

Radon 복잡도 2,120 블록 평균 A (2.28) — 낮은 복잡도 유지 Radon complexity 2,120 blocks avg A (2.28) — low complexity maintained

🧪

테스트 커버리지 Unit 88%+, Integration 핵심 플로우 중심 Test Coverage Unit 88%+, Integration core flow focused

시스템 아키텍처 System Architecture

PWA 기반 프론트엔드와 연동되는 24-Node 분산 클러스터 아키텍처입니다. 24-Node distributed cluster architecture integrated with PWA-based frontend.

🏗️ 전체 시스템 아키텍처 🏗️ System Architecture Overview

📊 데이터 흐름 요약 📊 Data Flow Summary

흐름Flow	경로Path	프로토콜Protocol
🌐 N-S Traffic	User → Route53 → ALB → Istio GW → ext-authz → APIs	HTTPS, gRPC
🔄 E-W Sync	auth ↔ users, character → users (mTLS Envoy Sidecar)	gRPC
🤖 Scan AI	Scan API → RabbitMQ → scan-worker (Vision→Rule→Answer→Reward) → OpenAI API	AMQP, Celery Chain
💬 Chat Agent	Chat API → chat-worker (Intent→TagRetriever→EvalAgent→Fallback) → `YAML inject` + OpenAI API	Taskiq, LangGraph
🎭 Character Batch	scan-worker → character-match → character-worker (batch) → users-worker (UPSERT) → PostgreSQL	AMQP, Batch Insert
📡 SSE Event	Workers → Redis Streams → Event Router → Pub/Sub → SSE GW → Client	XADD, PUBLISH, SSE
🔔 Auth Relay	Auth API → Redis Fallback Outbox → auth-relay → RabbitMQ Fanout → auth-worker → Redis Blacklist	Fallback Outbox
🔄 Cache Broadcast	RabbitMQ Fanout → ext-authz Blacklist + character-match Catalog (all replicas) Local Cache Sync	AMQP Fanout
📊 Metrics	Envoy Sidecars → Prometheus → Grafana / KEDA Autoscaling	Prometheus scrape
📝 Logs	All Pods (stdout) → Fluent Bit DaemonSet → Elasticsearch → Kibana	HTTP (ECS format)
🔍 Traces	Envoy → OTel Collector → Jaeger (trace_id correlation, Kiali viz)	OTLP, Zipkin

📱 서비스 흐름 (PWA → Backend → LLM) 📱 Service Flow (PWA → Backend → LLM)

📱

PWA Client

React + Vite React + Vite

→

🌐

Edge Layer

ALB + Istio

→

🔐

Auth

ext-authz

→

⚙️

7 Services

FastAPI

→

📨

Integration

MQ + Workers

→

🤖

LLM Agent

GPT/Gemini

👁️ Observability Layer — 모든 계층 실시간 모니터링 👁️ Observability Layer — Real-time monitoring across all layers

Prometheus Grafana Jaeger Kibana AlertManager

🤖 LLM Pipeline + Async SSE 🤖 LLM Pipeline + Async SSE

🧠 Scan AI Pipeline (Modular RAG) 🧠 Scan AI Pipeline (Modular RAG)

AI Researcher AI Researcher @taemin-steve

👁️

Vision LLM

Pre-Retrieval

→

📋

Rule Engine

온누리 규정 KB

→

💬

Answer LLM

Post-Retrieval

↑ inject

System Prompt

분류 기준

↑ inject

YAML KB

167개 품목

↓ output

JSON Output

Structured

💡 각 모듈 클릭 시 상세 보기 💡 Click each module for details

🤖 Chat Agent 완료 🤖 Chat Agent Complete

Backend / Infra Backend / Infra @mangowhoiscloud

🎯

Intent

9분류

→

⚡

Router

Send API

→

🔧

Subagents

11종 ∥

→

📦

Aggregator

Collect

→

💬

Answer

Token v2

↑

intent.txt

↑

Enrichment

↑

Function Def

↑

Lamport Clock

↓

notify_token_v2

🧠 Multi-Model (GPT-5.2 / Gemini 3)

⚙️ LangGraph StateGraph

💡 모듈 클릭 시 상세 Click for details

📦 Scan: Celery + Gevent 🔍 클릭하여 상세 📦 Scan: Celery + Gevent 🔍 Click for details

Backend / Infra Backend / Infra @mangowhoiscloud

⛓️

Celery Chain

4-Stage

→

🐰

RabbitMQ

Job Queue

→

🌊

Event Bus

3-Tier Redis

→

📺

SSE

Client

Pool gevent 100

🎯 SLA (VU 500) 100%

E2E p95 83.3s

Throughput 367.9 req/m

🤖 Chat: Taskiq + asyncio 완료 🔍 각 단계 클릭 🤖 Chat: Taskiq + asyncio Complete 🔍 Click each stage

Backend / Infra Backend / Infra @mangowhoiscloud

🔄

Taskiq

asyncio

→

🐰

RabbitMQ

Quorum

→

⚙️

LangGraph

StateGraph

→

🌊

Event Bus

Token v2

→

📺

SSE

Client

Checkpointer ReadThroughCheckpointer

Token XRANGE Recovery

Memory 3-Tier (Redis + PG)

기술 스택 Tech Stack

프로젝트에서 활용한 주요 기술들입니다. Key technologies used in the project.

Infrastructure

AWS EC2 Terraform Ansible Kubernetes Istio ArgoCD

Backend

Python 3.11 FastAPI SQLAlchemy 2.0 Celery Gevent Taskiq Go

Data & Messaging

PostgreSQL Redis Redis Streams RabbitMQ Elasticsearch

Observability

Prometheus Grafana AlertManager Slack Fluent Bit Kibana Jaeger Kiali OpenTelemetry LangSmith

LLM

GPT-5.2 Gemini-3.0-flash OpenAI Agents SDK Gemini SDK LangGraph

Scaling

KEDA HPA Prometheus Adapter

프로젝트 타임라인 Project Timeline

2025.10.30 - 11.20

🏗️ 14-nodes 클러스터 구축 & GitOps 🏗️ 14-node Cluster Setup & GitOps

AWS 14-nodes Kubernetes 클러스터 구축, ArgoCD App-of-Apps + Sync Wave 패턴 적용 Built AWS 14-node Kubernetes cluster, applied ArgoCD App-of-Apps + Sync Wave patterns

2025.11.20 - 11.30

🖥️ 7개 도메인 서버 개발 🖥️ 7 Domain Server Development

Auth, Character, Chat, Scan, Location, Users, Image 7개 마이크로서비스 개발 Developed 7 microservices: Auth, Character, Chat, Scan, Location, Users, Image

2025.12.01 - 12.02

🏆 FE-BE 연동 & 새싹톤 본선 우수상 🏆 FE-BE Integration & SeSACTHON Excellence Award

프론트엔드-백엔드 연동, 2025 AI 새싹톤 본선 우수상 수상 (Top 4) Frontend-Backend integration, 2025 AI SeSACTHON Finals Excellence Award (Top 4)

2025.12.08 - 12.17

🕸️ Istio Service Mesh & Auth Offloading 🕸️ Istio Service Mesh & Auth Offloading

Istio Sidecar Injection, ext-authz 서버 개발, gRPC 마이그레이션, RPS 1,200+ 달성 Istio Sidecar Injection, ext-authz server, gRPC migration, achieved RPS 1,200+

2025.12.18 - 12.25

📨 RabbitMQ + Celery 비동기 아키텍처 📨 RabbitMQ + Celery Async Architecture

AI 파이프라인 비동기화, 4단계 Celery Chain 구현, Gevent Pool 전환 Async AI pipeline, 4-stage Celery Chain, Gevent Pool migration

2025.12.20 - 12.22

📊 Observability 스택 구축 📊 Observability Stack Setup

ECK 기반 EFK 스택, OpenTelemetry 분산 트레이싱, Kiali 시각화 ECK-based EFK stack, OpenTelemetry distributed tracing, Kiali visualization

2025.12.26 - 01.01

🌊 Event Relay Layer & 부하 테스트 🌊 Event Relay Layer & Load Testing

Streams + Pub/Sub + State KV 3-Tier 아키텍처, 500 VU SLA, 600 VU 포화지점 도출 Streams + Pub/Sub + State KV 3-tier architecture, 500 VU SLA, 600 VU saturation point

2025.12.31 - 2026.01.13

🏛️ Clean Architecture 마이그레이션 🏛️ Clean Architecture Migration

7개 도메인 점진적 마이그레이션, DIP/Port & Adapter/CQRS 적용, Info 서비스 추가 개발 (뉴스 피드) Incremental migration of 7 domains, DIP/Port & Adapter/CQRS, Info service addition (News Feed)

2026.01.15

🔄 Cursor → Claude Code 컨텍스트 마이그레이션 🔄 Cursor → Claude Code Context Migration

Sidebar 병렬 세션 + Worktree 분기(Human-in-the-Loop)에서 Task 도구 기반 Agent Fleet 자율 운용으로 전환, 세션 지식을 docs/ KB + .claude/skills/로 코드화하여 영속적 컨텍스트 확보 Migrated from Sidebar parallel sessions + Worktree branching (Human-in-the-Loop) to Task tool-based Agent Fleet autonomous operation, codifying session knowledge into docs/ KB + .claude/skills/ for persistent context

2026.01.13 - 01.25

🤖 Chat 도메인 Agentic Workflow 전환 🤖 Chat Domain Agentic Workflow Transition

LangGraph 기반 Multi-Agent 아키텍처로 Chat 도메인 고도화, 도구 호출 + 상태 관리 + 스트리밍 통합 Upgrading Chat domain to LangGraph-based Multi-Agent architecture, Tool calling + State management + Streaming integration

2026.01.27 - 01.28

📊 VU 1000 부하 테스트 · 프로덕션 레벨 처리량 확보 📊 VU 1000 Load Test · Production-Level Throughput Achieved

Redis Pub/Sub Shard 최적화로 SSE Gateway 연결 O(N)→O(4) 절감, Scan API(LLM×2) 1000 VU 97.8% 완료율 · 373 RPM · p95 173s 달성, 프로덕션 레벨 처리량과 가용 유저풀 확보 Redis Pub/Sub Shard optimization reduced SSE Gateway connections O(N)→O(4), Scan API(LLM×2) 1000 VU 97.8% success · 373 RPM · p95 173s, production-level throughput and available user pool secured

📹 API 동작 📹 API Action

Auth Relay Fallback Outbox

┌──────────────────────────────────────────────────────────────────────────┐ │ Auth Domain Architecture │ ├──────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────┐ ┌─────────────┐ ┌─────────────┐ ┌───────────────┐ │ │ │ Client │──▶│ Auth API │──▶│ OAuth │ │ Users gRPC │ │ │ └──────────┘ └──────┬──────┘ │ Google/Kakao│ │GetOrCreateUser│ │ │ │ └─────────────┘ └───────────────┘ │ │ │ │ │ ┌──────▼──────┐ │ │ │ JWT 발급 │ Access 3h / Refresh 30d │ │ │ (HS256) │ JTI 기반 무효화, Token Rotation │ │ └──────┬──────┘ │ │ │ │ │ ┌─────────────────────┼─────────────────────────────────────────────┐ │ │ │ ▼ Async Layer │ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │ │ │ │ │ Auth Relay │──▶│ RabbitMQ │──▶│ ext-authz (All Pods) │ │ │ │ │ │ (Redis Out- │ │ Fanout │ │ Local Cache Update │ │ │ │ │ │ box Pattern)│ │ (blacklist) │ └─────────────────────────┘ │ │ │ │ └─────────────┘ └─────────────┘ │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────────────┘

Google OAuth

• PKCE (S256)

• openid, email, profile

• access_type=offline

Kakao OAuth

• PKCE (S256)

• Developer Console 기반

• kakao_account.profile

Naver OAuth

• PKCE 미지원

• profile, email

• response 필드 매핑

PKCE Flow

code_verifier (64byte urlsafe) → SHA256 → base64url = code_challenge (S256)

s_access / s_refresh, HttpOnly + Secure + SameSite=Lax, domain: .dev.growbin.app

Token Rotation

Refresh 시 old JTI → Blacklist 발행, new pair 발급 + 세션 등록

Session Store

Redis: user:tokens:{uid} (Set), token:meta:{jti} (JSON), TTL = token exp

Auth Relay

Redis Fallback Outbox → RabbitMQ Fanout → ext-authz Local Cache 동기화

gRPC

Users.GetOrCreateUser() → 신규 시 기본 캐릭터(이코) 자동 부여

📹 API 동작 📹 API Action

HTTP REST

• GET /catalog — 캐릭터 카탈로그

• POST /internal/rewards — 보상 평가

Internal: Istio AuthorizationPolicy

gRPC (CharacterService)

• GetCharacterReward() — 보상 판정

• GetDefaultCharacter() — 기본 캐릭터

• GetCharacterByMatch() — 분류 매칭

Chat Worker, Scan Worker에서 호출

🗄️ 2-Tier Cache Architecture 🗄️ 2-Tier Cache Architecture

┌──────────────────────────────────────────────────────────────────────────┐ │ 2-Tier Local Cache Strategy │ ├──────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ Tier 1: Local In-Memory Cache (Primary) │ │ │ │ ───────────────────────────────────────── │ │ │ │ • Thread-safe Singleton (Double-Checked Locking) │ │ │ │ • ~50KB (13 character records) │ │ │ │ • Zero-latency access (~0.01ms) │ │ │ │ • No Fixed TTL → Event-driven refresh │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ miss │ │ ▼ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ Tier 2: Database Fallback (Secondary) │ │ │ │ ───────────────────────────────────────── │ │ │ │ • PostgreSQL via SQLAlchemy │ │ │ │ • Graceful degradation │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ │ Initialization & Sync │ │ ───────────────────────────────────────── │ │ ┌─────────────┐ Eager Loading ┌─────────────────────────┐ │ │ │ FastAPI │ ──────────────────────▶│ CharacterLocalCache │ │ │ │ Lifespan │ (서버 시작 전) │ (Singleton) │ │ │ └─────────────┘ └─────────────────────────┘ │ │ ▲ │ │ ┌─────────────┐ Event-Based Sync ┌────────┴────────────────┐ │ │ │ RabbitMQ │ ──────────────────────▶│ MQ Consumer (goroutine)│ │ │ │ Fanout │ (Eventual Consist) │ → All Pods Broadcast │ │ │ └─────────────┘ └─────────────────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────────────┘

~0.01ms

Local Cache Lookup

~50KB

13 Records Cached

No TTL

Event-driven Refresh

🏗️ Implementation Patterns 🏗️ Implementation Patterns

패턴Pattern	구현Implementation	설명Description
Decorator	`LocalCachedCatalogReader`	Cache 래핑 → SqlaCharacterReader delegateCache wrapper → SqlaCharacterReader delegate
Singleton	`CharacterLocalCache`	Double-Checked Locking으로 thread-safeThread-safe via Double-Checked Locking
Eager Loading	FastAPI Lifespan	서버 시작 전 DB에서 캐시 워밍업Cache warmup from DB before server start
Eventual Consistency	RabbitMQ Fanout	수 초 지연 허용, Redis 의존성 제거Accept seconds latency, Redis-independent

Reward Policy

disposal_rules=True + insufficiencies=False → middle_category 기반 캐릭터 매칭

Ownership

UNIQUE(user_id, character_code) 멱등성, source: scan-reward / default-onboard

Chat Integration

GetCharacterByMatch(match_label) → CDN 이미지 Reference → Gemini Image Gen

Character Worker

Celery: match_character_task, reward_event_task, grant_default_task

Name Detector

Longest-match-first 알고리즘, aliases 기반 사용자 메시지 캐릭터 감지

CDN Asset

images.dev.growbin.app/character/{cdn_code}.png, CharacterAssetPort 추상화

📚 참고Reference

Character Local Cache 아키텍처 → Character Local Cache Architecture →

📹 API 동작 📹 API Action

🔀 LangGraph Workflow (StateGraph) 🔀 LangGraph Workflow (StateGraph)

__start__ → intent → vision → router → [11 subagents] → aggregator → summarize → answer → __end__ __start__ → intent → vision → router → [11 subagents] → aggregator → summarize → answer → __end__

🔀 Dynamic Router (Send API)

Multi-intent fanout: additional_intents → 병렬 Send

Intent enrichment: waste/bulk_waste → weather 자동 추가

Conditional: user_location → weather 조건부 추가

예: "종이 어떻게 버려? 수거함도" → 3노드 병렬

🔧 Tool Calling (Native SDK)

Agents SDK: @function_tool 데코레이터 → Location Tool

Gemini SDK: google.genai.types.Tool → Web Search

OpenAI SDK: tools=[{"type": "function"}] → 네이티브 호출

LangChain 추상화 배제, 직접 SDK 호출

📦 Aggregator

병렬 결과 수집: 9개 context 필드 병합

Required/Optional 검증: contracts.py SSOT

필수 누락 시: needs_fallback=True 트리거

Progress: 60% → 65% (정보 취합 중)

📝 Summarize (Context Compression)

OpenCode 스타일: context_window - max_output 초과 시

GPT-5.2: 400K context, 트리거 272K, 요약 60K

구조화된 5섹션: 요청/목표/완료/진행/컨텍스트

PRUNE_PROTECT: 40K 토큰 (최근 보호)

🧀 Evaluation (Swiss Cheese 3-Tier)

L1 Code Grader: <50ms, Regex/Token/Keywords (sync)

L2 LLM Judge: 5-Axis BARS Rubric, Self-Consistency 3x

L3 Calibration: CUSUM drift, Krippendorff α≥0.75

C-Grade → Regeneration 트리거, Expert Review 99.8/100

🚀 Send API (Parallel Execution)

Dynamic Fanout: Send(node, state) → N개 서브에이전트 동시 실행

Race Condition 방지: Reducer 패턴 (add/merge) + Aggregator 동기화

State Isolation: 각 브랜치 독립 state, 결과만 병합

예: waste+location+weather → 3노드 병렬 → Aggregator 수집

🎯 9-Class Intent Classification

🧠 Intent Node

LLM 분류 + Chain-of-Intent + Multi-Intent Detection + Keyword Boost

🗑️ waste

폐기물 분리배출 (RAG)

🌍 character

캐릭터 질문 (gRPC)

📍 location

위치 기반 검색 (KakaoAPI)

🛋️ bulk_waste

대형폐기물 (MOIS API)

💰 recyclable_price

재활용 시세 (KECO API)

📦 collection_point

수거함 위치 (KECO API)

🔍 web_search

웹 검색 (GPT 5.2, Gemini-3-flash 네이티브 툴콜링)

🎨 image_generation

이미지 생성 (gemini-3-pro-image + gRPC → CDN)

💬 general

일반 대화 (Fallback)

🏗️ Event Bus 아키텍처 🏗️ Event Bus Architecture

Job Dispatch: RabbitMQ (DIRECT)

Chat API → TaskIQ → chat.process 큐 → Chat Worker → LangGraph

Event Bus: Redis Streams + Pub/Sub

Worker XADD → chat:events:{shard} → Event Router → Pub/Sub → SSE Gateway

📚 상세 문서 📚 Deep Dive

🔀

LangGraph Workflow

Multi-Agent StateGraph

🎯

Intent Details

Confidence, Chain-of-Intent

⚡

Send API Router

Multi-Intent Fanout

📦

Aggregator

Result Collection, Fallback

📝

Summarize

Context Compression

🚌

Event Bus

Streams, Token v2

🔧

Tool Calling

Native Function Calling

🔄

Data Consistency

Optimistic + Eventual

🧠

Multi-Model

LLM Native API Switching

💾

3-Tier Memory

ReadThrough + Async Sync

📡

SSE Gateway

Token v2 Recoverable

🔀

Event Router

Consumer Group + Lua

🧀

LLM Evaluation Pipeline (Swiss Cheese)

3-Tier Graders (Code/LLM/Calibration) | 5-Axis BARS Rubric | Expert Review 99.8/100

L1 <50ms L2 BARS L3 CUSUM

KEDA 오토스케일링 KEDA Autoscaling

Worker: RabbitMQ 큐 길이 (5msg, 1-4 pods) | SSE: 연결 수 (100/pod, 1-3) | Router: Pending (100, 1-2)

네이티브 스트리밍 Native Streaming

astream_events v2 + DynamicProgressTracker (Phase 기반 병렬 서브에이전트 진행률)

Event-First Event-First

Single Write Path → Consumer Group으로 SSE + PostgreSQL 분기

토큰 스트리밍 Token Streaming

Token v2: chat:tokens:{job_id} + 10토큰마다 State 스냅샷 → 재연결 시 catch-up

Resilience Resilience

NodePolicy (FAIL_OPEN/CLOSE/FALLBACK) + Circuit Breaker + NodeExecutor

LangGraph Send API Pipeline

┌──────────────────────────────────────────────────────────────────────────────┐ │ LangGraph Pipeline - Send API Multi-Intent Fanout │ ├──────────────────────────────────────────────────────────────────────────────┤ │ │ │ START → intent_node → [vision_node?] → dynamic_router │ │ │ │ │ ╔══════════════════════════════════╧══════════════════════════════╗│ │ ║ Send API (Parallel Execution) ║│ │ ║ ║│ │ ║ Primary Intent Additional Intents Enrichments ║│ │ ║ ┌─────────────┐ ┌─────────────┐ ┌─────────┐ ║│ │ ║ │ waste_rag │ │ collection │ │ weather │ ║│ │ ║ │ (RAG) │ │ _point │ │ (KMA) │ ║│ │ ║ └──────┬──────┘ └──────┬──────┘ └────┬────┘ ║│ │ ║ │ │ │ ║│ │ ╚═════════╧═════════════════════╧═════════════════════╧══════════╝│ │ │ │ │ merge_node (결과 병합) │ │ │ │ │ response_node (In-Context Learning 응답 생성) │ │ │ │ │ END │ │ │ └──────────────────────────────────────────────────────────────────────────────┘

Dynamic Router 구현

🌤️ ENRICHMENT_RULES

Intent-based:

• waste → +weather (분리배출 팁)

• bulk_waste → +weather (배출일 날씨)

Conditional:

• user_location 존재 시 → +weather

※ exclude_intents: weather, image_generation

🗺️ INTENT_TO_NODE

waste → waste_rag character → character location → location bulk_waste → bulk_waste recyclable_price → recyclable_price collection_point → collection_point web_search → web_search image_generation → image_generation general → general weather → weather

📊 Multi-Intent Example

User: "종이 어떻게 버려? 수거함도 알려줘"

Intent Detection:
primary: waste
additional: [collection_point]

Send API Calls:
→ Send(waste_rag, state)
→ Send(collection_point, state)
→ Send(weather, state) (enrichment)

Parallel Execution:
3개 노드 동시 실행
merge_node에서 수집

Event Bus: Redis Streams + Pub/Sub

┌───────────────────────────────────────────────────────────────────────────────────┐ │ Chat Event Bus Architecture │ ├───────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────┐ XADD ┌────────────────────┐ XREADGROUP │ │ │ Chat Worker │──────────────▶│ Redis Streams │◀──────────────┐ │ │ │ (LangGraph) │ │ chat:events:{shard}│ │ │ │ └────────┬────────┘ │ (4 shards) │ ┌──────────┴───────┐ │ │ │ └────────────────────┘ │ Event Router │ │ │ │ Token v2 │ (Consumer Group) │ │ │ ▼ └──────────┬───────┘ │ │ ┌─────────────────┐ ┌────────────────────┐ │ │ │ │ chat:tokens: │ │ State KV │◀──────────────┤ Lua │ │ │ {job_id} │ │ chat:state:{job_id}│ UPDATE │ Script │ │ └─────────────────┘ └────────────────────┘ │ │ │ │ │ │ PUBLISH │ │ │ Catch-up │ Fallback ▼ │ │ │ │ ┌─────────────────┐ │ │ ▼ ▼ │ Pub/Sub │ │ │ ┌───────────────────────────────────────────────────────┐ │sse:events: │ │ │ │ SSE Gateway │◀─┤ {job_id} │ │ │ │ SUBSCRIBE + State Fallback + Streams Catch-up │ └─────────────────┘ │ │ └────────────────────────────┬──────────────────────────┘ │ │ │ SSE │ │ ▼ │ │ [ Client ] │ │ │ ├───────────────────────────────────────────────────────────────────────────────────┤ │ Checkpoint: Redis Primary + PostgreSQL Async Sync │ │ ┌─────────────────┐ ~1ms ┌──────────────┐ 5s batch ┌──────────────────┐ │ │ │ SyncableRedis │──────────▶│ Redis │───────────▶│ PostgreSQL │ │ │ │ Saver │ SET │ checkpoint │ UPSERT │ (Durable Store) │ │ │ └─────────────────┘ └──────────────┘ └──────────────────┘ │ │ ReadThroughCheckpointer: Redis Hit → return / Miss → PG read → Redis promote │ └───────────────────────────────────────────────────────────────────────────────────┘

Token Streaming v2: Recoverable Stream

📝 Stream Key Structure

chat:tokens:{job_id}

• TTL: 3600s (1시간)

• Max Length: 10,000 entries

• Seq-based ID: 자동 증가

📸 State Snapshot

• 10토큰마다 State 스냅샷 저장

• Key: chat:state:{job_id}

• 재연결 시 catch-up 지원

• Last-Event-ID 헤더 활용

🔄 Catch-up Mechanism

1. 클라이언트 재연결

2. Last-Event-ID 전송

3. XREAD STREAMS ... {last_id}

4. 누락 이벤트 순차 전송

5. 실시간 스트림 재개

📊 Event Types

• progress: 진행 단계

• token: LLM 토큰

• agent_state: 에이전트 상태

• metadata: 메타데이터

• complete: 완료

• error: 오류

KEDA Autoscaling Configuration

Chat Worker

Trigger: RabbitMQ Queue Length

Queue: chat.process

Threshold: 5 messages

Replicas: 1 → 4

SSE Gateway

Trigger: Prometheus

Metric: sse_active_connections

Threshold: 100/pod

Replicas: 1 → 3

Event Router

Trigger: Prometheus

Metric: pending_events

Threshold: 100

Replicas: 1 → 2

┌─────────────────────────────────────────────────────────────────────────────┐
│                     Taskiq Worker Architecture                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Command:                                                                   │
│   taskiq worker chat_worker.main:broker --workers 4 --max-async-tasks 10    │
│                                                                              │
│   ┌──────────────────────────────────────────────────────────────────────┐  │
│   │  Worker Pool (4 processes × 10 async = 40 concurrent tasks)          │  │
│   │  ─────────────────────────────────────────────────────────────────   │  │
│   │                                                                       │  │
│   │   Worker 1        Worker 2        Worker 3        Worker 4           │  │
│   │   ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐          │  │
│   │   │asyncio  │    │asyncio  │    │asyncio  │    │asyncio  │          │  │
│   │   │loop     │    │loop     │    │loop     │    │loop     │          │  │
│   │   │         │    │         │    │         │    │         │          │  │
│   │   │ 10 tasks│    │ 10 tasks│    │ 10 tasks│    │ 10 tasks│          │  │
│   │   └─────────┘    └─────────┘    └─────────┘    └─────────┘          │  │
│   │                                                                       │  │
│   └──────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│   Task Definition:                                                           │
│   ┌──────────────────────────────────────────────────────────────────────┐  │
│   │  @broker.task(task_name="chat.process", timeout=120, max_retries=2)  │  │
│   │  async def process_chat(job_id, session_id, message, image_url=None):│  │
│   │      async with LangGraphApp(...) as app:                            │  │
│   │          await app.ainvoke(...)                                      │  │
│   │          await event_bus.publish(...)                                │  │
│   └──────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

⚡ Celery vs Taskiq 비교 ⚡ Celery vs Taskiq Comparison

항목Aspect	Celery (Scan Worker)	Taskiq (Chat Worker)
동시성 모델Concurrency Model	gevent (greenlet)	asyncio (native)
LangGraph 호환LangGraph Compat	△ (래핑 필요Wrapping needed)	✅ Native async
BrokerBroker	RabbitMQ	RabbitMQ (AioPikaBroker)
결과 반환Result Return	AsyncResult	TaskiqResult
사용처Use Case	Scan Worker (Vision)	Chat Worker (LangGraph)

📊 성능 지표 📊 Performance Metrics

동시 처리 태스크Concurrent Tasks

4 workers × 10 async

120s

태스크 타임아웃Task Timeout

timeout=120

최대 재시도Max Retries

max_retries=2

📚 참고Reference

Taskiq asyncio 워커 설정 → Taskiq asyncio Worker Configuration →

┌─────────────────────────────────────────────────────────────────────────────┐
│                     RabbitMQ Job Submission Flow                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Dual-Path Messaging Architecture                                           │
│   ════════════════════════════════════════════════════════════════════       │
│   Path 1: RabbitMQ (Job Execution)    │   Path 2: Redis (Event Streaming)   │
│   ─────────────────────────────────   │   ─────────────────────────────     │
│   DIRECT exchange: chat_tasks          │   Streams + Pub/Sub (3-Tier)        │
│   Queue: chat.process                  │   Durable → Router → Volatile       │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  chat-api                              chat-worker                   │   │
│   │  ┌─────────────────────┐              ┌─────────────────────┐       │   │
│   │  │ AioPikaBroker       │   AMQP       │ @broker.task        │       │   │
│   │  │ .kiq(job_id,        │ ──────────▶  │ async def process() │       │   │
│   │  │      session_id,    │   JSON       │   → LangGraph       │       │   │
│   │  │      message,       │              │   → Event Publish   │       │   │
│   │  │      image_url?)    │              └─────────────────────┘       │   │
│   │  └─────────────────────┘                                             │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

⚙️ RabbitMQ Topology 구성 ⚙️ RabbitMQ Topology Configuration

구성 요소Component	설정Config	설명Description
Broker	`AioPikaBroker`	RabbitMQ asyncio 클라이언트RabbitMQ asyncio client
Exchange	`chat_tasks (direct)`	Topology CR로 미리 생성Pre-created by Topology CR
Queue	`chat.process`	DLX, TTL 설정 포함With DLX, TTL settings
Routing Key	`chat.process`	Direct 라우팅Direct routing

🔄 Message Flow 🔄 Message Flow

1️⃣

Submit

broker.kiq(job_id, session_id, message)

2️⃣

Route

DIRECT exchange → Queue DIRECT exchange → Queue

3️⃣

Consume

Taskiq Worker 수신 Taskiq Worker consumes

4️⃣

Execute

LangGraph 실행 LangGraph execution

⚡ KEDA Autoscaling ⚡ KEDA Autoscaling

트리거Trigger	메트릭Metric	임계값Threshold	스케일Scale
RabbitMQ	Queue Length	5 messages	1 → 4 replicas

┌─────────────────────────────────────────────────────────────────────────────┐
│                        SSE Gateway Architecture                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌───────────────────┐                      ┌──────────────────────────┐   │
│   │   Event Router    │    PUBLISH           │     SSE Gateway          │   │
│   │  (Consumer Group) │ ────────────────────▶│                          │   │
│   └───────────────────┘                      │   ┌──────────────────┐   │   │
│                                               │   │ In-memory fan-out│   │──▶│ SSE
│   ┌───────────────────┐                      │   └──────────────────┘   │   │
│   │   Redis Streams   │    XRANGE            │   ┌──────────────────┐   │   │
│   │ chat:events:{id}  │ ◀────────────────────│   │ State recovery   │   │   │
│   └───────────────────┘   (catch-up)         │   │ (Redis KV)       │   │   │
│                                               │   └──────────────────┘   │   │
│   ┌───────────────────┐                      │   ┌──────────────────┐   │   │
│   │   Redis Pub/Sub   │    SUBSCRIBE         │   │ Last-Event-ID    │   │   │
│   │ chat:{job_id}     │ ────────────────────▶│   │ header           │   │   │
│   └───────────────────┘                      │   └──────────────────┘   │   │
│                                               └──────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

🔄 Token v2 Recovery Protocol 🔄 Token v2 Recovery Protocol

기능Feature	구현Implementation	설정값Config
Stale Detection	3초 타임아웃 시 자동 재연결Auto-reconnect on 3s timeout	`3s threshold`
Max Reconnections	지수 간격으로 최대 3회 시도3 attempts with exponential spacing	`3 attempts`
Fallback Polling	SSE 실패 시 폴링 전환Polling fallback if SSE fails	`3s interval, 120s max`
Seq Encoding	Stage: STAGE_ORDER×10, Token: 1000+Stage: STAGE_ORDER×10, Token: 1000+	`Last-Event-ID`
Catch-up	XREVRANGE로 누락 이벤트 복구XREVRANGE for missed events	`State KV + Streams`

📊 Consumer Groups (Two-Path) 📊 Consumer Groups (Two-Path)

🚀 eventrouter

목적Purpose: 실시간 SSE 브로드캐스트Real-time SSE broadcast

지연Latency: ~10ms

동작Action: PUBLISH → Pub/Sub

💾 chat-persistence

목적Purpose: PostgreSQL 영구 저장PostgreSQL persistence

지연Latency: ~200ms

동작Action: XACK → INSERT

⚡ KEDA Autoscaling ⚡ KEDA Autoscaling

트리거Trigger	메트릭Metric	임계값Threshold	스케일Scale
Prometheus	sse_active_connections	100/pod	1 → 3 replicas

📚 참고Reference

SSE Token v2 프로토콜 → SSE Token v2 Protocol → Token v2 XRANGE Recovery → Token v2 XRANGE Recovery →

🎯 Intent Node

Multi-Intent ICL (arxiv:2304.11384)

Chain-of-Intent (CIKM '25 논문)

IntentSignals = llm_confidence + keyword_boost + transition_boost

전이 부스트 상한: MAX_TRANSITION_BOOST = 0.15

신뢰도 < 0.6 → general fallback

Cache: SHA256 해싱, TTL 1h

🔀 Dynamic Router (Send API)

Multi-intent fanout: additional_intents → 병렬 Send

Intent enrichment: waste/bulk_waste → weather 자동

Conditional: user_location → weather 조건부

                            "종이 어떻게 버려? 수거함도"

                            → Send(waste_rag) + Send(collection_point) + Send(weather)

📦 Aggregator

병렬 결과 수집: 9개 context 필드 병합

Required/Optional: contracts.py SSOT

필수 누락: needs_fallback=True 트리거

Context Fields: disposal_rules, character_context, location_context, web_search_results, bulk_waste_context, recyclable_price_context, weather_context, collection_point_context, image_generation_context

📝 Summarize (Context Compression)

OpenCode 스타일: context_window - max_output 초과 시

GPT-5.2: 400K context, 트리거 272K, 요약 60K

구조화된 5섹션:

1. 사용자 요청 (원문)
2. 목표 및 기대 결과
3. 완료된 작업
4. 진행 중/남은 작업
5. 중요 컨텍스트

PRUNE_PROTECT: 40K 토큰 보호

11 Subagent Nodes

Node	Intent	Data Source	Protocol
waste_rag	waste	JSON 규정 + Feedback Loop	In-Memory
character	character	Character Service	gRPC
location	location	Kakao Local API	HTTP
bulk_waste	bulk_waste	행정안전부 MOIS API	HTTP
recyclable_price	recyclable_price	한국환경공단 KECO	HTTP
collection_point	collection_point	한국환경공단 KECO	HTTP
weather	weather / enrichment	기상청 KMA API	HTTP
web_search	web_search	DuckDuckGo/Tavily	HTTP
image_generation	image_generation	gemini-3-pro-image + Images gRPC	gRPC
feedback	(after waste_rag)	LLM Evaluator + Fallback	Internal
general	general	Passthrough (LLM Only)	-

왜 LangGraph를 단일 Task로 큐잉하는가?

@broker.task("chat.process")는 LangGraph StateGraph 전체를 하나의 Task 단위로 큐잉합니다. 개별 Task로 분리해 큐잉하지 않은 이유는 아래와 같습니다.

1. 런타임 동적 라우팅 — Send API는 실행 시점에 additional_intents와 Enrichment Rules를 평가해 병렬 노드를 결정. 큐잉 시점에 어떤 서브에이전트가 실행될지 예측 불가.
2. Conditional Edges — route_after_feedback처럼 노드 결과에 따라 다음 경로가 결정되는 조건부 엣지는 DAG를 정적으로 분할할 수 없음.
3. State 일관성 — LangGraph의 ChatState는 모든 노드가 공유하는 단일 상태 객체. 분산 Task 간 상태 동기화는 복잡도와 오버헤드를 증가시킴.

→ gRPC(grpc.aio)는 asyncio 네이티브로 Graph 내부에서 ~1-100ms 응답. Celery는 Fire & Forget에 적합하나, await 기반 결과 대기에는 부적합.

💾 3-Tier Memory (ReadThroughCheckpointer)

// 노드 실행마다 Checkpoint 저장 → 8개 서브에이전트 병렬 시 동시 write

                        Worker → Redis Primary (~1ms, pool 불필요)
                    
                        └─ SyncableRedisSaver: chat:checkpoint:cache:{thread_id} (TTL 24h)
                    
                        └─ Sync Queue 적재 (RPUSH checkpoint:sync:queue)
                    
                        CheckpointSyncService (별도 Deployment, replicas=1)
                    
                        └─ BRPOP → Bulk Upsert PostgreSQL (5초 주기 배치)
                    
                        └─ pool_max_size=5 (단일 인스턴스)
                    
                        Read Miss (Cold Start) → PostgreSQL → Redis promote (LRU)
                    
                        └─ Temporal Locality: 연속 요청 hit rate ≈ 99%+
                    
// Connection Pool: 212 → 33 (84% 감소), psycopg_pool.PoolTimeout 해결

→ 3-Tier Memory 상세 보기

Aggregator Architecture

┌──────────────────────────────────────────────────────────────────────────┐ │ Aggregator Node │ ├──────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ │waste_rag│ │character│ │location │ │bulk_... │ │ weather │ ... │ │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │ │ │ │ │ │ │ │ │ ▼ ▼ ▼ ▼ ▼ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │ LangGraph State Auto-Merge │ │ │ │ disposal_rules | character_ctx | location_ctx | ... │ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │ Aggregator Node │ │ │ │ 1. 수집된 context 필드 확인 (9개) │ │ │ │ 2. Required vs Optional 검증 (contracts.py SSOT) │ │ │ │ 3. 필수 누락 시 needs_fallback=True │ │ │ │ 4. Progress: 60% → 65% │ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ [summarize?] → answer │ │ │ └──────────────────────────────────────────────────────────────────────────┘

9 Context Fields

disposal_rules
RAG 검색 결과

character_context
캐릭터 정보

location_context
장소 정보

web_search_results
웹 검색 결과

bulk_waste_context
대형폐기물 정보

recyclable_price_context
재활용 시세

weather_context
날씨 정보

collection_point_context
수거함 위치

image_generation_context
이미지 생성

Required/Optional Validation

# contracts.py - Single Source of Truth
INTENT_REQUIRED_FIELDS = {
    "waste": {"disposal_rules"},
    "character": {"character_context"},
    "location": {"location_context"},
    "bulk_waste": {"bulk_waste_context"},
    ...
}

# Aggregator 검증 로직
missing_required, missing_optional = validate_missing_fields(
    intent=intent,
    collected_fields=collected_fields,
)
needs_fallback = len(missing_required) > 0

Fallback Trigger

필수 컨텍스트(Required)가 누락되면 needs_fallback=True를 설정하여 Answer Node에서 Fallback Chain(Web Search → General LLM)을 실행합니다.

OpenCode Style Compaction

┌──────────────────────────────────────────────────────────────────────────┐ │ Context Compression Strategy │ ├──────────────────────────────────────────────────────────────────────────┤ │ │ │ Current Tokens │ │ ══════════════════════════════════════════════════════════ 400K │ │ ████████████████████████████████████████░░░░░░░░░░░░░░░░░░ │ │ ↑ ↑ │ │ 272K (trigger) 128K (max_output) │ │ │ │ if tokens > (context_window - max_output): # OpenCode isOverflow │ │ trigger_summarization() │ │ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │ After Compression │ │ │ │ ══════════════════════════════════════════════════════════ │ │ │ │ [Summary 60K] + [PRUNE_PROTECT 40K Recent] + [New Context] │ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────────────┘

Model Context Window Registry

Model	Context Window	Max Output	Trigger Threshold	Summary Tokens
gpt-5.2	400,000	128,000	272,000	60,000 (15%)
gemini-3-flash-preview	1,000,000	64,000	936,000	65,536 (cap)

구조화된 5섹션 요약

📝

1. 사용자 요청

원문 그대로

🎯

2. 목표

기대 결과

✅

3. 완료

수행 완료

🔄

4. 진행 중

남은 작업

⚠️

5. 컨텍스트

제약/실패

PRUNE_PROTECT: 40K 토큰

OpenCode 기준 상수. 최근 도구 출력과 메시지를 보호하여 작업 연속성을 유지합니다. 요약 시 최근 ~80개 메시지는 원문 유지됩니다.

Development Tools

⌨️

Cursor IDE

AI-First Code Editor

📊 Year Recap →

76일

사용 기간

10.66B

토큰 사용량

$7,473

총 비용

Ultra Plan

$200/월

🤖 모델별 사용량

Max Plan 주요 모델 37.5% GPT-5.1 21.6% Agent 16.1% Sonnet 4.5 13.9% Gemini-3 4.9% Others 6.1%

⬇️ Context Migration

세션 지식 → docs/ KB 축적 수동 패턴 → .claude/skills/ 코드화 Sidebar 병렬 + Worktree → Task 도구 병렬

Human-in-the-Loop 오케스트레이션에서 Agent Fleet 자율 운용으로 전환 Transition from Human-in-the-Loop orchestration to Agent Fleet autonomous operation

🧠

Claude Code

Anthropic CLI Agent

📊 Insights Report →

8,251

Sessions

51.4K

Messages

10.2K

Commits

+3.7M

Lines Added

Opus 4.5

Main Model

// Usage Stats (25 days, 8,251 sessions)
sessions/day      330    → 하루 평균 세션 수
messages/day    2,059    → 하루 평균 메시지
files_touched    32,796   → 수정된 파일 수
task_tool_calls   7,527   → 병렬 Task 호출
// Key Patterns (from Insights)
• Cross-System Architecture Analysis: 시스템 간 디자인 패턴 이식
• Parallel Task Orchestration: 멀티 세션 조율, Task 도구 7,500+ 사용
• Structured Documentation: Markdown 44K+ lines, TodoWrite 17K+ 호출
// Interaction Style
Semi-autonomous agent fleet 운영, 최소 사양으로 병렬 작업 시작 → 지능적 탐색 기대, 세션 자주 중단/재개하는 지속적 개발 워크플로우

Agent Workflow

┌─────────────────────────────────────────────────────────────────────────────────┐ │ 🤖 AI Agent Workflow │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌───────────────────┐ ┌───────────────────────────────────────┐ │ │ │ 🧑‍💻 Developer │────────▶│ 📋 Task Input │ │ │ │ (Direction) │ │ "chat-worker 리팩토링" / "K8s 디버깅" │ │ │ └───────────────────┘ └──────────────────┬────────────────────┘ │ │ ↓ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ 🛠️ Skills (27개 모듈) │ │ │ │ ┌────────────┬────────────┬────────────┬─────────────┬────────────┐ │ │ │ │ │langgraph- │clean- │k8s- │prompt- │event- │ │ │ │ │ │pipeline │architecture│debugging │engineering │driven │ │ │ │ │ ├────────────┼────────────┼────────────┼─────────────┼────────────┤ │ │ │ │ │redis- │grpc- │git- │code- │celery- │ │ │ │ │ │patterns │service │workflow │review │rabbitmq │ │ │ │ │ └────────────┴────────────┴────────────┴─────────────┴────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ ↓ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ 📁 Context Sources │ │ │ │ │ │ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │ │ │ │ 📚 docs/ │ │ 💻 Codebase │ │ 📊 Cluster │ │ │ │ │ │ - foundations/ │ │ - Backend │ │ - kubectl │ │ │ │ │ │ - reports/ │ │ - Frontend │ │ - stern logs │ │ │ │ │ │ - blogs/ │ │ - Infra YAML │ │ - ArgoCD │ │ │ │ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ ↓ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ 📤 Outputs │ │ │ │ │ │ │ │ ✅ Code Generation ✅ Refactoring ✅ Documentation │ │ │ │ ✅ Bug Fix ✅ Portfolio Update ✅ K8s Troubleshooting │ │ │ │ ✅ Test Writing ✅ PR Creation ✅ Architecture Docs │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

📦 27개 Custom Skills — Sub Agent에 주입하여 도메인 전문성 부여

langgraph-pipeline clean-architecture k8s-debugging prompt-engineering event-driven redis-patterns grpc-service git-workflow code-review celery-rabbitmq chat-agent-flow rag-pipeline postgres-schema mcp-builder skill-creator doc-coauthoring chat-agent-persistence location-service agent-feature map-feature data-integrity code-quality troubleshooting webapp-testing vercel-react-best-practices resume-writer pdf

Sub agent Skills 주입 + Task 도구 활용

각 Skills는 Sub agent(Task 도구)에 주입되어 도메인 전문성을 부여합니다. Task 도구 7,500+ 호출이 관측되며, 병렬 포함 멀티파트 작업(아키텍처 분석·코드 생성·문서화)을 수행합니다.
Anthropic Insights: "task decomposition과 parallel coordination을 적극 활용하여 Claude를 one-shot 어시스턴트가 아닌 멀티파트 협업자로 운영" 📊 Insights →

Claude Code Cycle Mode

역할별 세션을 분리하고, shift+tab으로 Permission Mode를 순환(Plan → Auto-Accept → Normal)하며 운용합니다. Plan Mode에서 코드베이스를 read-only 분석한 뒤 방향을 확정하고, Auto-Accept로 전환해 무중단 실행합니다.

FE Session

UI 컴포넌트, SSE 스트리밍, IndexedDB 스키마, iOS 호환성

BE Session

도메인 로직, gRPC 서비스, LangGraph 에이전트, Worker 파이프라인

Observability

OTEL 트레이싱, K8s 매니페스트, 배포 파이프라인, 모니터링

운용 흐름:
1. Sub agent에 Skills 주입 + 병렬 Task 분배
2. ⏸ Plan Mode 코드베이스 분석·방향 확정
3. ⏵⏵ Auto-Accept 구현
4. Unit Test 작성
5. 린팅/포매팅 검증
6. Skills 코드리뷰
7. 클러스터 환경 E2E 테스트
8. PR 생성 → 개발자 리뷰 후 머지

Progressive Disclosure Pattern

L1: name + description (~100 words) → 항상 로드
L2: SKILL.md body (<5K words) → 스킬 트리거 시 로드
L3: references/, scripts/ → 필요 시 선택적 로드

류지환 (Jihwan Ryu)

Backend Engineer

Resume GitHub LinkedIn Blog

Tech Stack

Languages

Python, Go, C/C++, Solidity

Backend

FastAPI, LangGraph, Celery, RabbitMQ, Redis, PostgreSQL, gRPC, SSE

Infrastructure

Kubernetes, Istio, Calico VXLAN, Terraform, Ansible

DevOps / Observability

ArgoCD, Helm, GitHub Actions, Prometheus, Grafana, Jaeger, EFK, OTEL, LangSmith

Agent

Claude Code, Cursor

LLM API

OpenAI Agents SDK / Responses API, Gemini API

Education

부산대학교 컴퓨터공학과 학사 (2017–2023)

카카오테크 부트캠프 Backend 과정 (2024)

정보처리기사 (2024) OPIc IH (2024)

🏆 Awards

2025 AI 새싹톤 우수상

서울특별시 주최 · DACON 운영 | 본선 4위/32팀 (예선 181팀)

2025.12

Professional Experience

Company	Period	Title / Role	Description
Rakuten Symphony Korea 정규직 · Cloud BU	2024.12 – 2025.08	Cloud Engineer Jr. Storage Developer (Server) / Full-time	Rakuten CNP v5.5.0 — CNP 분산 스토리지 서버 개발, Rakuten Mobile(JP) + 1&1(DE) 20M 글로벌 사용자 Rakuten OStore v1.0.0 — Object Storage 분산 게이트웨이 개발, MinIO/S3 대체 엔터프라이즈 프로덕션
Eco² Project SeSACTHON 2025	2025.10.31 – 2026.01.28	Backend/Infra Lead Agent-Driven Workflow 설계 · 백엔드/인프라 개발	24노드 K8s 분산 비동기 클러스터 (8 APIs, 6 Workers, 3 Infra) · 설계 → 개발 → 배포 → 운영 📐 Performance Improvement • Auth Offloading — Istio EnvoyFilter + ext-authz(Go) JWT 검증 분리, 48 → 1,500 RPS (31×) • Event Bus Layer — Redis Streams + Pub/Sub 기반 실시간 전달, 동시 부하 600% 개선 ⚙️ Implementation • LangGraph StateGraph — Send API 기반 동적 병렬 라우팅, Multi-Agent 오케스트레이션 • 4-Stage Celery Pipeline — Vision/Rule/Answer/Reward, KEDA 이벤트 기반 오토스케일링 • 3-Tier Memory Architecture — Redis Hot + PostgreSQL Persistent + Summary Compression

Redis Primary + PostgreSQL Async Sync

┌───────────────────────────────────────────────────────────────────────────────┐ │ ReadThroughCheckpointer + CheckpointSyncService │ ├───────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ L1: Redis Primary (~1ms, 20 max connections) │ │ │ │ ├─ Checkpoint: cp:{thread_id}:{ns}:{checkpoint_id} (TTL 24h) │ │ │ │ ├─ Intent Cache: chat:intent:{message_hash} (TTL 3600s) │ │ │ │ ├─ Sync Queue: checkpoint:sync:queue (LPUSH/BRPOP) │ │ │ │ └─ DLQ: checkpoint:sync:dlq (failed events) │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ ↑ Write (Hot Path) ↓ Read Miss (Cold Start Only) │ │ │ │ │ │ Worker (2 pods × 40 tasks) │ pool 1-2 (read-only) │ │ ↓ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ L2: PostgreSQL Persistent (Async Batch Sync) │ │ │ │ ├─ CheckpointSyncService (별도 Deployment, replicas=1) │ │ │ │ ├─ 5초 주기 배치 동기화 (BRPOP → Bulk Upsert, batch 50) │ │ │ │ ├─ pool 1-5 (단일 인스턴스) │ │ │ │ └─ langgraph_checkpoints (permanent retention) │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ ↓ On Read │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ L3: LangGraph ChatState (45+ fields, 9 layers) │ │ │ │ ├─ messages: Annotated[list, add_messages] (accumulator) │ │ │ │ ├─ 10 Context Channels: disposal_rules, location, weather, ... │ │ │ │ ├─ intent_history: list[str] (Chain-of-Intent tracking) │ │ │ │ └─ summary: str (compressed context, 272K trigger) │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ └───────────────────────────────────────────────────────────────────────────────┘

ReadThroughCheckpointer + SyncableRedisSaver

Connection Pool 절감 (psycopg_pool.PoolTimeout 해결)

Before (CachedPostgresSaver)

Worker pool: 192 (4 pods × 4 workers × 12)

Syncer: 없음

전체: 212 connections

8개 서브에이전트 병렬 checkpoint → PoolTimeout

After (ReadThroughCheckpointer)

Worker pool: 8 (Cold start only, size=2)

Syncer pool: 5 (단일 인스턴스)

전체: 33 connections (84% 감소)

Redis 99%+ hit rate (Temporal Locality)

🔥 L1: Redis Primary

Latency: ~1ms

Pattern: Read-Through

TTL: 86400s (Checkpoint), 3600s (Intent)

Worker Hot Path, pool 불필요

🗄️ L2: PostgreSQL (Async)

Sync: 5초 배치

Driver: psycopg_pool

Pool: max=5 (Syncer only)

영속 보장, 단일 Syncer 프로세스

⚡ L3: ChatState

Latency: ~100μs

Scope: Request

Type: TypedDict

현재 턴 컨텍스트

📊 구현 상세: ADR: PostgreSQL Checkpointer → Redis + Async Sync 전환

Token v2 Recoverable Streaming

┌──────────────────────────────────────────────────────────────────────────────┐ │ Token v2 Recoverable Streaming Flow │ ├──────────────────────────────────────────────────────────────────────────────┤ │ │ │ Client SSE Gateway Redis Streams Worker │ │ │ │ │ │ │ │ │ 1. SSE Connect │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ 2. Subscribe │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ │ │ │ │ │ │ 3. XADD token│ │ │ │ │ │◀──────────────┤ │ │ │ 4. SSE: id:1001 │ │ │ │ │ │◀─────────────────────┤◀─────────────────────┤ │ │ │ │ │ │ │ │ │ │ [Connection Lost] │ │ 5. XADD more │ │ │ │ X │ │◀──────────────┤ │ │ │ │ │ │ │ │ │ 6. Reconnect │ │ │ │ │ │ (last_id=1001) │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ 7. XRANGE 1001 + │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ 8. Missed tokens │ │ │ │ │ │◀─────────────────────┤◀─────────────────────┤ │ │ │ │ │ │ │ │ │ │ 9. Continue stream │ │ │ │ │ │◀─────────────────────┤ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────────┘

📤 Write Path (Worker)

1. LLM → astream chunks

2. Redis Streams XADD (seq-based)

3. Redis Pub/Sub PUBLISH (실시간)

4. State snapshot (10토큰마다)

maxlen: 1000, TTL: 1시간

📥 Reconnect Path (Client)

1. Last-Event-ID 헤더 전송

2. SSE Gateway XRANGE catch-up

3. Missing tokens replay

4. XREAD BLOCK 재개

Zero data loss guarantee

Dynamic Context Compression

⚙️ Trigger Conditions

                            threshold = context_window - max_output

                            GPT-5.2: 400K - 128K = 272K

                            Gemini-3: 1M - 64K = 936K

                            Summary size: 15% (min 20K, max 65K)

📋 5-Tier Summary Structure

1. User request (original)

2. Goal & expected result

3. Completed tasks

4. In-progress/remaining

5. Critical context (failures, constraints)

PRUNE_PROTECT: 40K tokens (recent)

NodeExecutor Policy System

NODE_POLICIES Configuration (코드베이스 기반)

Node	Timeout	Retry	CB Threshold	FailMode	Rationale
waste_rag	1000ms	1	5	FALLBACK	로컬 파일 검색 1초 이내
bulk_waste	10000ms	2	5	FALLBACK	MOIS API DEFAULT_TIMEOUT=15s
location	3000ms	2	5	FALLBACK	gRPC PostGIS ~100ms
general	30000ms	2	3	CLOSE	LLM HTTP_TIMEOUT.read=60s
character	3000ms	1	3	OPEN	gRPC LocalCache ~1-3ms
collection_point	10000ms	2	5	FALLBACK	KECO API DEFAULT_TIMEOUT=15s
weather	5000ms	1	3	OPEN	KMA API, 보조 정보
web_search	10000ms	2	5	FALLBACK	DuckDuckGo timeout=10s
recyclable_price	10000ms	2	5	FALLBACK	KECO API, 시세 정보
image_generation	30000ms	1	3	OPEN	DALL-E 10-30초 소요

🎚️ FailMode Enum

FAIL_OPEN: 실패해도 진행 (선택적 노드)

FAIL_CLOSE: 실패하면 전체 실패 (필수 노드)

FAIL_FALLBACK: 실패하면 대체 로직 실행

⚡ Circuit Breaker

States: CLOSED → OPEN → HALF_OPEN

Threshold: cb_threshold 연속 실패

Recovery: 60초 후 HALF_OPEN

allow_request(), record_success(), record_failure()

FALLBACK_CHAIN Configuration

Rule-based Quality Evaluation (4-Dimension)

📊 Scoring Weights

1. 결과 존재 여부0.3

2. 카테고리 매칭0.2

3. 정보 풍부도 (disposal_info)0.2

4. 키워드 매칭률0.3

🏷️ FeedbackQuality Enum

EXCELLENT: score ≥ 0.9

GOOD: 0.7 ≤ score < 0.9

FAIR: 0.4 ≤ score < 0.7

POOR: score < 0.4 → Fallback 트리거

Clarification Messages

💬 Intent별 메시지

waste: "어떤 물건의 분리수거 방법이 궁금하신가요? 🤔"

location: "어떤 위치 정보가 필요하신가요? 📍"

character: "이코에 대해 더 알고 싶으신 점이 있으신가요? 🌱"

general: "질문을 정확히 이해하지 못했어요. 🙏"

🔄 FallbackReason Enum

RAG_NO_RESULT → web_search

RAG_LOW_QUALITY → web_search

INTENT_LOW_CONFIDENCE → clarify

SUBAGENT_FAILURE → retry/skip

Swiss Cheese Model: 단일 Grader는 반드시 실패한다 Swiss Cheese Model: A Single Grader Must Fail

James Reason(1990)의 Swiss Cheese Model을 LLM Agent 평가에 적용. 사고는 단일 실패가 아닌 '다층 방어의 구멍이 동시에 정렬될 때' 발생합니다. Anthropic의 Responsible Scaling Policy와 동일한 Defense-in-Depth 전략을 따릅니다. Applied James Reason's Swiss Cheese Model (1990) to LLM Agent evaluation. Accidents occur not from single failures but when 'multiple defense layers' holes align simultaneously.' Following Anthropic's Responsible Scaling Policy Defense-in-Depth strategy.

직교 슬라이스 (Orthogonal Slices): 각 Grader가 서로 다른 failure mode를 포착. Code Grader는 형식/길이/키워드를, LLM Judge는 의미적 품질을, Calibration Monitor는 시간적 드리프트를 감지. Orthogonal Slices: Each Grader catches different failure modes. Code Grader detects format/length/keywords, LLM Judge evaluates semantic quality, Calibration Monitor detects temporal drift.

다중 평가자 합의 (Multi-Evaluator Consensus): Anthropic의 Bloom Automated Evaluations처럼 단일 평가자 편향을 Self-Consistency 3x + Cross-model 검증으로 완화. Multi-Evaluator Consensus: Like Anthropic's Bloom Automated Evaluations, mitigate single-evaluator bias via Self-Consistency 3x + Cross-model validation.

연속적 모니터링 (Continuous Monitoring): 배포 후에도 CUSUM 통계적 공정 제어로 품질 저하를 조기 탐지. 이전 방어층을 통과한 문제도 최종 방어선에서 포착. Continuous Monitoring: Post-deployment quality degradation early detection via CUSUM statistical process control. Catches issues that passed previous defense layers at the final line.

📊 LangGraph Evaluation Subgraph 📊 LangGraph Evaluation Subgraph

eval_entry : 평가 모드 결정 (sync/async/shadow) : Determines eval mode (sync/async/shadow)

code_grader : L1 동기 평가 (<50ms), Regex/Token/Keywords : L1 sync eval (<50ms), Regex/Token/Keywords

llm_grader : L2 비동기 평가 (~1-2s), 5-Axis BARS Rubric : L2 async eval (~1-2s), 5-Axis BARS Rubric

calibration_check : L3 주기적 드리프트 감지, CUSUM 알고리즘 : L3 periodic drift detection, CUSUM algorithm

eval_aggregator : 3개 Grader 결과 병합, 가중치 적용 : Merges 3 Grader results with weights

eval_decision : 최종 Grade 결정 (S/A/B/C), C-Grade 시 재생성 트리거 : Final Grade decision (S/A/B/C), triggers regeneration on C-Grade

📝 Mermaid 코드 보기 📝 View Mermaid Code

graph TD
    __start__([__start__]) --> eval_entry
    eval_entry -.-> calibration_check
    eval_entry -.-> code_grader
    eval_entry -.-> llm_grader
    calibration_check --> eval_aggregator
    code_grader --> eval_aggregator
    llm_grader --> eval_aggregator
    eval_aggregator --> eval_decision
    eval_decision --> __end__([__end__])

3-Tier Evaluation Architecture

┌─────────────────────────────────────────────────────────────────────────────────────┐ │ LLM Evaluation Pipeline (Swiss Cheese) │ ├─────────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │ │ L1: Code │ │ L2: LLM │ │ L3: Calibration│ │ │ │ Grader │─────▶│ Judge │─────▶│ Monitor │ │ │ │ <50ms │ │ ~1-2s │ │ Periodic │ │ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ │ │ │ │ │ ▼ ▼ ▼ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │ │ Regex, Token │ │ 5-Axis BARS │ │ CUSUM Drift │ │ │ │ Count, Keywords │ │ Structured Eval │ │ Detection │ │ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ │ │ Execution Modes: │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │ sync │ │ async │ │ shadow │ │ │ │ L1 only │ │ L1+L2/L3 │ │ Full eval│ │ │ │ Cost: 0 │ │ Cost:~800│ │Cost:~7500│ │ │ └──────────┘ └──────────┘ └──────────┘ │ └─────────────────────────────────────────────────────────────────────────────────────┘

⚡ L1: Code Grader

• Latency: <50ms (Critical Path)

• Regex, Token Count, Keywords

• Deterministic, Reproducible

• C-Grade → Regeneration 트리거

🧠 L2: LLM Judge (BARS)

• 5-Axis: Faithfulness(0.30), Relevance(0.25), Completeness(0.20), Safety(0.15), Communication(0.10)

• Structured Output (Pydantic)

• Self-Consistency 3x (CV<0.2)

📊 L3: Calibration Monitor

• CUSUM (k=0.5, h=4.0)

• Krippendorff's α ≥0.75 target

• 50-sample Calibration Set

• 0.5-point drift @ 80% power

🎯 5-Axis BARS Rubric (1-5 Scale)

Axis	Weight	Description
Faithfulness	0.30	Factual grounding in context
Relevance	0.25	Direct answer to query
Completeness	0.20	TREC Nugget-based coverage
Safety	0.15→0.25	Risk mitigation (hazmat boost)
Communication	0.10	Clarity and tone
Total	5^5=3,125	→ 4 Grades (S/A/B/C), ~9.61 bits info loss

99.8

Expert Review Score

Review Rounds

≥96

Target Threshold

69→99

Score Progression

📈 Expert Review Progression

• Round 1 (Initial): 69.4/100

• Round 2 (v2): 89.2/100 (+19.8)

• Round 3 (v2.1): 95.4/100 (+6.2)

• Round 4 (v2.2): 98.8/100 (+3.4) ✅

• Round 5 (Gap Fix): 99.8/100 ✅

🎭 Bias Mitigation (7 Techniques)

• Central Tendency: Logprob norm

• Leniency: 1-2pt failure behaviors

• Sycophancy: Separate prompts/axis

• Position: Randomized axis order

• Self-Consistency: 3x runs, CV<0.2

• Verbosity: Length↔Score guard

• Self-Enhancement: Cross-model

📊 Grade Thresholds

S ≥90 / A: 75-89 / B: 55-74 / C <55

💰 Cost Model (10K/day)

L1: $0 / L2: ~$20/mo / Full: ~$190/mo

🔄 Lifecycle Phases

Capability → Graduation → Regression → Refresh

🚨 Alerts

Cost >80%: WARN / CUSUM crit: 2h SLA

📚 Swiss Cheese Model for LLM Evaluation → 📚 LLM-as-Judge 루브릭 설계 → 📚 Chat Eval Pipeline Integration Plan → 📚 ADR: Chat LangGraph Eval Pipeline →

📹 API 동작 📹 API Action

4-Stage Celery Pipeline

👁️

Vision

OpenAI Vision API

~4.5s (40%)

📋

Rule

JSON 규정 검색

~0.3s (5%)

💬

Answer

LLM 응답 생성

~4.8s (45%)

🎁

Reward

캐릭터 매칭

~1.7s (10%)

Total: ~11.3s (LLM I/O-bound 85%)

📈 아키텍처 진화 과정 📈 Architecture Evolution

💀 Phase 0: Sync

Thread Pool: 6 concurrent

100 VU: 0% (150s+ timeout)

GIL 병목, 스레드 점유

❌ Phase 1: Celery Events

SSE : RabbitMQ = 1:21

50 VU → 341 connections

503 Error, 메모리 초과

⚠️ Phase 2: KEDA

50 VU: 35% → 86.3%

CPU 85%, Context Switch

고부하 시 30%로 하락

✅ Phase 3: Event Bus

1000 VU: 97.8% (포화 지점)

SLA: 500 VU 100%

O(n×m) → O(n) 연결

🏗️ Event Bus 아키텍처

4-Stage Celery Pipeline (scan.direct Exchange)

Scan API → scan.vision → scan.rule → scan.answer → scan.reward

Event Bus: Redis Streams + Pub/Sub

Worker XADD → scan:events:{shard} → Event Router → Pub/Sub → SSE Gateway

VU	완료율	처리량	E2E p95	API p95	Snapshot
500	100%	367.9 req/m	83.3s	232ms	⭐ SLA
600	99.7%	358.6 req/m	108.3s	360ms	Grafana
700	99.2%	329.1 req/m	122.3s	444ms	Live
800	99.7%	367.3 req/m	144.6s	734ms	Grafana
900	99.7%	405.5 req/m	149.6s	635ms	Grafana
1000	97.8%	373.4 req/m	173.3s	787ms	⚠️ 포화 지점

📚 상세 문서 📚 Deep Dive

⚙️

Pipeline Details

Celery Chain, Retry

🚌

Event Bus

Streams, Pub/Sub

📊

SSE Metrics

Load Test Results

KEDA 오토스케일링

Worker: RabbitMQ 큐 길이 (10-20msg, 2-4 pods) | SSE: 연결 수 (100/pod) | Router: Pending (100)

Vision API

GPT-5.2, detail: high, Structured Output

Cross-Domain Fanout

reward.events Exchange (Fanout) → character-worker + users-worker 동시 발행

Image Storage

S3 Presigned URL → CloudFront CDN, TTL 30일

📹 API 동작 📹 API Action

🌐 HTTP REST (External)

• GET /locations/centers

lat, lng, zoom → radius/limit 자동 결정

• GET /locations/search

Kakao 키워드 + PostGIS 반경 병합, 50m 중복 제거

• GET /locations/suggest

자동완성 (Kakao accuracy 정렬, max 5)

• GET /locations/centers/{id}

상세 조회 + Kakao place_url 보강

⚡ gRPC (Internal)

• SearchNearby(lat, lng, zoom)

Chat Worker → Location 내부 호출

🗺️ Kakao Local API

• KakaoLocalClientPort (Port/Adapter)

httpx.AsyncClient + Lazy Init + asyncio.Lock

일일 30만 건, place_url 제공

Zoom Policy Service (Continuous 1-20)

Zoom 1

50,000m

limit: 10

Zoom 7

20,000m

limit: 30

Zoom 12

3,000m

limit: 80

Zoom 16

500m

limit: 150

Zoom 20

200m

limit: 200

ZOOM_RADIUS_TABLE + ZOOM_LIMIT_TABLE: 줌 레벨별 반경/개수 연속 매핑

DB-First Merge Search

// SearchByKeywordQuery (DB-First Merge)
Kakao 키워드 검색 (accuracy 정렬) → 중심 좌표 획득
획득 좌표 기반 Kakao 재검색 (distance 정렬)
PostGIS earth_distance() 반경 쿼리 → DB 결과 (메타데이터 풍부)
50m Haversine 좌표 비교로 DB↔Kakao 중복 제거
DB 우선 병합 + Kakao 보충 (음수 ID = Kakao only, source 구분)
distance_km 기준 최종 정렬 → 응답
// DB: 수거품목·운영시간·소개 등 풍부한 메타데이터 보유 → 우선 노출

Suggest (자동완성)

300ms debounce → Kakao accuracy 정렬

max 5건, 서버사이드 (API Key 보호)

Detail 보강

DB 조회 후 Kakao place_url·전화번호 병합

Kakao 실패 시 DB 단독 응답 (Graceful Degradation)

길찾기

Geolocation(maxAge:60s, timeout:3s)

from/현위치/to/목적지 → 권한 거부 시 목적지만

PostGIS Query

earth_distance() + ll_to_earth(), cube/earthdistance extension, Haversine fallback

Kakao 중복 제거

50m Haversine 기반 좌표 비교, DB 우선 병합, 음수 ID로 source 구분

HTTP Client

httpx.AsyncClient, Lazy Init + asyncio.Lock, Graceful Degradation

MapSearchBar

300ms debounce → suggest 드롭다운, Enter/버튼 → full search 분기

MapCard + DetailSheet

source별 아이콘(keco/zerowaste/kakao), 첫 클릭: 선택 · 재클릭: 바텀시트, 수거품목·안내·place_url

StoreCategoryFilter

9개 store_category, 임시 상태 → "결과보기" 클릭 시 API 재호출

PWA Viewport

JS standalone 감지 (CSS media query iOS 불안정), 100dvh + env(safe-area-inset-*)

Chat Integration

location_node HTTP 호출, 반경 5km (radius=5000), limit=5

📹 API 동작 📹 API Action

🌐 HTTP REST (External)

• GET /me - 프로필 조회

• PATCH /me - 프로필 수정

• DELETE /me - 계정 삭제

• GET /me/characters - 소유 캐릭터

• GET /me/characters/{name}/ownership

⚡ gRPC (for Auth)

• GetOrCreateFromOAuth() - OAuth 로그인

• GetUser() - 사용자 조회

• UpdateLoginTime() - 로그인 시각 갱신

Character Reward Flow (스캔 기반 자동 매칭)

스캔 완료 → 분류 결과 (middle_category) → 캐릭터 매칭 → 소유권 확인 → UserCharacter 저장

매칭 로직: 분류 결과의 중분류(middle_category)로 캐릭터 자동 매칭
예: "플라스틱" → 플라스틱 캐릭터, "종이류" → 종이 캐릭터

gRPC Auth Integration

OAuth 로그인 시 GetOrCreateFromOAuth() 호출, UpdateLoginTime() 갱신

Character Ownership

user_characters 다대다 테이블, acquired_at + status (owned/burned/traded)

Reward Trigger

Scan 완료 → Character 도메인 EvaluateRewardCommand 호출

Default Character

첫 로그인 시 기본 캐릭터 자동 부여

📹 API 동작 📹 API Action

Presigned URL Upload Flow

┌──────────────────────────────────────────────────────────────────────────┐ │ S3 Presigned URL Upload Flow │ ├──────────────────────────────────────────────────────────────────────────┤ │ │ │ Client Images API S3 CloudFront │ │ │ │ │ │ │ │ │ 1. Request URL │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ │ │ │ │ │ 2. Presigned URL │ │ │ │ │ │ + CDN URL │ │ │ │ │ │◀─────────────────────┤ │ │ │ │ │ │ │ │ │ │ │ 3. Direct Upload (PUT) │ │ │ │ ├───────────────────────────────────────▶│ │ │ │ │ (15분 만료) │ │ │ │ │ │ │ │ │ │ │ │ 4. Upload OK │ │ │ │ │ │◀───────────────────────────────────────┤ │ │ │ │ │ │ │ │ │ │ 5. Access via CDN │ │ │ │ │ ├─────────────────────────────────────────────────────────────▶│ │ │ │ │ │ │ │ │ │ 6. Cached Image │ │ │ │ │ │◀─────────────────────────────────────────────────────────────┤ │ │ │ └──────────────────────────────────────────────────────────────────────────┘

Image Channels (ImageChannel enum)

📷 scan

scan/{uuid}.{ext}

분류 스캔 이미지

💬 chat

chat/{uuid}.{ext}

채팅 첨부 이미지

👤 my

my/{uuid}.{ext}

프로필/개인 이미지

Presigned URL

PUT only, 15분 만료 (presign_expires_seconds=900)

CloudFront CDN

images.dev.growbin.app, OAI (Origin Access Identity)

S3 Path

{channel}/{uuid}.{ext}, channel: scan | chat | my

Security

IRSA + Bucket Policy (CloudFront OAI only), S3 Direct PUT

📹 API 동작 📹 API Action

CQRS + Cache Aside Architecture

┌──────────────────────────────────────────────────────────────────────────────────────┐ │ Info Domain - CQRS + Cache Aside │ ├──────────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ╔═══════════════════════════════════════════════════════════════════════════════╗ │ │ ║ WRITE PATH (info_worker Pod) ║ │ │ ╠═══════════════════════════════════════════════════════════════════════════════╣ │ │ ║ ║ │ │ ║ ┌─────────────┐ ┌─────────────────────────────────────────────────────┐ ║ │ │ ║ │Beat Sidecar │ │ Celery Worker (gevent -c 100) │ ║ │ │ ║ │ │ │ │ ║ │ │ ║ │ • 5min Naver│───▶│ CollectNewsCommand │ ║ │ │ ║ │ • 30min News│ │ │ │ ║ │ │ ║ │ Data.io │ │ ▼ │ ║ │ │ ║ │ │ │ ┌─────────────┐ ┌─────────────┐ │ ║ │ │ ║ │ │ │ │ Naver API │ │NewsData.io │ API Calls │ ║ │ │ ║ └─────────────┘ │ │ (3x, 1.10s) │ │(3x, 0.69s) │ (1.79s) │ ║ │ │ ║ │ └──────┬──────┘ └──────┬──────┘ │ ║ │ │ ║ emptyDir: │ └────────┬────────┘ │ ║ │ │ ║ /tmp/celerybeat │ ▼ │ ║ │ │ ║ │ ┌─────────────────┐ │ ║ │ │ ║ │ │ OG Extraction │ httpx.Client │ ║ │ │ ║ │ │ 95/110 = 86.4% │ (14.53s) ⚠️ bottleneck │ ║ │ │ ║ │ └────────┬────────┘ │ ║ │ │ ║ │ ▼ │ ║ │ │ ║ │ ┌─────────────────────────────────────────────┐ │ ║ │ │ ║ │ │ PostgreSQL UPSERT (0.20s) │ │ ║ │ │ ║ │ │ psycopg2 + ThreadedConnectionPool(min=2,max=10)│ ║ │ │ ║ │ └────────────────────┬────────────────────────┘ │ ║ │ │ ║ │ ▼ │ ║ │ │ ║ │ ┌─────────────────────────────────────────────┐ │ ║ │ │ ║ │ │ Redis Write-Through (0.04s) │ │ ║ │ │ ║ │ │ TTL: 3600s (news_cache_ttl, 모든 캐시 동일) │ │ ║ │ │ ║ │ └─────────────────────────────────────────────┘ │ ║ │ │ ║ └─────────────────────────────────────────────────────┘ ║ │ │ ║ ║ │ │ ║ Total: 110 articles fetched → 110 cached → 16.95s/task ║ │ │ ╚═══════════════════════════════════════════════════════════════════════════════╝ │ │ │ │ ╔═══════════════════════════════════════════════════════════════════════════════╗ │ │ ║ READ PATH (info API - Cache Aside) ║ │ │ ╠═══════════════════════════════════════════════════════════════════════════════╣ │ │ ║ ║ │ │ ║ ┌──────────┐ ┌─────────────┐ ┌─────────────┐ ║ │ │ ║ │ Client │────▶│ Info API │────▶│ Redis │──── HIT ────▶ Response ║ │ │ ║ │ (React) │ │ (FastAPI) │ │ (Primary) │ source: "redis" ║ │ │ ║ └──────────┘ │ │ └──────┬──────┘ ║ │ │ ║ │ redis.aio │ │ MISS ║ │ │ ║ │ psycopg2 │ ▼ ║ │ │ ║ └─────────────┘ ┌─────────────┐ ║ │ │ ║ │ PostgreSQL │──── Fallback ──▶ Resp ║ │ │ ║ │ (Emergency) │ source: "postgres" ║ │ │ ║ └─────────────┘ ║ │ │ ║ ║ │ │ ║ Zero-downtime: Redis 장애 시에도 PostgreSQL Fallback으로 서비스 유지 ║ │ │ ╚═══════════════════════════════════════════════════════════════════════════════╝ │ │ │ └──────────────────────────────────────────────────────────────────────────────────────┘

110

Articles/Task

86.4%

OG Success

16.95s

Task Duration

14.53s

OG Bottleneck

Tech Stack Decisions

✏️ Worker (Sync)

• psycopg2 - ThreadedConnectionPool

• redis-py - 단순 연산, 비동기 불필요

• httpx.Client - OG 추출 connection pool

• Beat Sidecar - 스케줄링 분리

📖 API (Async)

• redis.asyncio - 비동기 캐시 조회

• Cache-Aside - Redis miss → PostgreSQL fallback

• source 메타데이터 - "redis"/"postgres"

• useInfiniteQuery - 무한스크롤

🐰 RabbitMQ Queue (Topology CR)

Queue: info.collect_news
Type: classic
TTL: 10min
DLX: dlx → dlq.info.collect_news

📂 News Categories

🌍 environment

⚡ energy

🤖 ai

📊 상세 구현: Info 서비스 CQRS + Cache Aside 패턴

87%

P50 지연시간 감소

72%

P99 지연시간 감소

28x

RPS 처리량 향상

>99%

Cache Hit Rate

🔀 Global Choke Point — 왜 ext-authz가 병목인가? 🔀 Global Choke Point — Why ext-authz is the bottleneck?

┌──────────────────────────────────────────────────────────────────────────────┐ │ ext-authz Global Choke Point │ ├──────────────────────────────────────────────────────────────────────────────┤ │ │ │ [Client] │ │ │ │ │ │ 1. HTTPS Request (Cookie: s_access=JWT) │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ Istio Ingress Gateway │ │ │ │ ┌───────────────────────────────────────────────────────────────┐ │ │ │ │ │ EnvoyFilter: Cookie → Header 변환 │ │ │ │ │ │ Authorization: Bearer {JWT} │ │ │ │ │ └───────────────────────────────────────────────────────────────┘ │ │ │ └────────────────────────────┬────────────────────────────────────────┘ │ │ │ │ │ │ 2. gRPC Check (every request) │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ ext-authz (Go gRPC Server) │ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────────┐ │ │ │ │ │ JWT Parse │→ │ JTI Extract │→ │ Blacklist Check (Local/Redis)│ │ │ │ │ │ (HS256) │ │ │ │ IsBlacklisted(jti) → O(1) │ │ │ │ │ └─────────────┘ └─────────────┘ └─────────────────────────────┘ │ │ │ └────────────────────────────┬────────────────────────────────────────┘ │ │ │ │ │ ┌─────────────────┴─────────────────┐ │ │ │ │ │ │ ▼ ▼ │ │ ┌──────────────────┐ ┌──────────────────┐ │ │ │ 3a. OK (200) │ │ 3b. DENY (401) │ │ │ │ + x-user-id │ │ Unauthorized │ │ │ │ + x-auth-provider│ └──────────────────┘ │ │ └────────┬─────────┘ │ │ │ │ │ │ 4. Forward to Backend (with injected headers) │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ Backend Services (scan-api, chat-api, character-api, ...) │ │ │ │ • JWT 파싱 불필요 — 헤더에서 user_id 추출만 │ │ │ │ • 도메인 독립성 확보 — 인증 로직 분리 │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ │ ⚠️ 모든 인증 요청이 ext-authz 통과 필수 → 시스템 처리량 = ext-authz 용량 │ │ │ └──────────────────────────────────────────────────────────────────────────────┘

⚠️ Global Choke Point: 모든 인증 트래픽이 ext-authz를 경유하므로, 시스템 전체 처리량이 ext-authz 성능에 종속됩니다. 이를 해결하기 위해 Local Cache + Fanout Broadcast 패턴으로 Redis 의존성을 제거했습니다.

📈 아키텍처 진화 과정 📈 Architecture Evolution

💀 Phase 0: Shared Module

각 도메인 JWT 파싱

독립 배포 불가

분산 모놀리스

❌ Phase 1: Redis Every Req

P50: 57ms, P99: 80ms

RPS: ~42

Redis 병목

⚠️ Phase 2: Pool + HPA

Pool 20→100, HPA 2-5

RPS: ~1,200 (28×)

P99 Redis 250-350ms

✅ Phase 3: Local Cache

sync.Map + MQ Fanout

RPS: ~1,500, Lookup 2.3µs

Redis 호출 0 (870×)

Local Cache Broadcast Pattern

┌─────────────────────────────────────────────────────────────────────────────┐ │ ext-authz Local Cache Broadcast │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ Pod A │ │ Pod B │ │ Pod C │ │ │ │ ┌─────────┐ │ │ ┌─────────┐ │ │ ┌─────────┐ │ │ │ │ │sync.Map │ │ │ │sync.Map │ │ │ │sync.Map │ │ ~100ns lookup │ │ │ │(Local) │ │ │ │(Local) │ │ │ │(Local) │ │ │ │ │ └────┬────┘ │ │ └────┬────┘ │ │ └────┬────┘ │ │ │ └──────┼──────┘ └──────┼──────┘ └──────┼──────┘ │ │ │ │ │ │ │ └───────────────────┼───────────────────┘ │ │ │ │ │ ┌──────▼──────┐ │ │ │ RabbitMQ │ Fanout Exchange │ │ │ (blacklist │ (blacklist.events) │ │ │ .events) │ │ │ └──────┬──────┘ │ │ │ │ │ ┌──────▼──────┐ │ │ │ Auth Relay │ Redis Fallback Outbox → Fanout Broadcast │ │ │ (Fallback Outbox) │ │ │ └─────────────┘ │ │ │ │ Flow: User Logout → Auth Relay (Fallback Outbox) → RabbitMQ Fanout → All Pods │ │ → sync.Map Update → Next Request: ~100ns (No Redis!) │ │ │ └─────────────────────────────────────────────────────────────────────────────┘

Performance Evolution

Before (v1) - Every Request → Redis

• P50 Latency: 57ms

• P99 Latency: 80ms

• Max RPS: ~1,200

모든 요청이 Redis 조회 필요

After (v2) - Local Cache First

• P50 Latency: 7.5ms

• P99 Latency: 30ms

• Max RPS: ~1,500

Cache Hit 시 Redis 조회 불필요

Go Implementation

Envoy ext_authz gRPC, sync.Map (lock-free), goroutine consumer

JWT Validation

HS256 서명 검증 → JTI Blacklist 체크 (Local → Redis fallback)

Istio Integration

AuthorizationPolicy CUSTOM provider, EnvoyFilter ext-authz-grpc

Fanout Broadcast

RabbitMQ Fanout Exchange, 각 Pod 고유 큐 자동 생성

📚 Auth Offloading 설계 | 성능 최적화 | Local Cache Broadcast | 부하 테스트 결과

Token v2 Recoverable Streaming (XRANGE catch-up)

┌──────────────────────────────────────────────────────────────────────────────┐ │ Token v2 Recoverable Streaming │ ├──────────────────────────────────────────────────────────────────────────────┤ │ │ │ Client SSE Gateway Redis Streams Chat Worker │ │ │ │ │ │ │ │ │ 1. SSE Connect │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ 2. XREAD BLOCK │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │◀───────────────┤ XADD │ │ │ id:1001 "Hello" │◀─────────────────────┤ │ │ │ │◀─────────────────────┤ │ │ │ │ │ │ │◀───────────────┤ XADD │ │ │ id:1002 "World" │◀─────────────────────┤ │ │ │ │◀─────────────────────┤ │ │ │ │ │ │ │ │ │ │ │ [Connection Lost] │ │◀───────────────┤ XADD │ │ │ X │ │ id:1003 "!" │ │ │ │ │ │ (stored) │ │ │ │ │ │ │ │ │ │ 3. Reconnect │ │ │ │ │ │ Last-Event-ID:1002 │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ 4. XRANGE 1002 + │ │ │ │ │ ├─────────────────────▶│ │ │ │ │ │ │ │ │ │ │ 5. Missed: id:1003 │ (catch-up) │ │ │ │ │◀─────────────────────┤◀─────────────────────┤ │ │ │ │ │ │ │ │ │ │ Continue streaming │ XREAD BLOCK │ │ │ │ │◀─────────────────────┤◀─────────────────────┤ │ │ │ │ └──────────────────────────────────────────────────────────────────────────────┘

SSE Event Types

📝 token

LLM 토큰 스트리밍

🔄 token_recovery

재연결 시 토큰 복구

🎯 stage

19개 파이프라인 스테이지

💓 keepalive

15초 연결 유지

🖐️ needs_input

HITL 사용자 입력 대기

❌ error

에러 발생

Stages: queued → intent → vision → waste_rag → character → location → kakao_place → bulk_waste → weather → recyclable_price → collection_point → web_search → image_generation → general → aggregator → feedback → answer → done → needs_input

Multi-Agent SSE Timeline (seq 네임스페이스 분리)

// Stage seq: 0~180 (18 stages × 10) | Token seq: 1000+ (별도 네임스페이스)
// server_id: POD_NAME → Consumer Group 내 consumer 식별 (event-router-0, -1, ...)
T1  seq:0    stage: queued          → 접수 확인
T2  seq:10   stage: intent          → 9분류 Intent 분석
T3  seq:20   stage: vision          → GPT Vision 분류 (이미지 있을 때)
// ─── Send API 병렬 Fanout (intent별 서브에이전트 동시 실행) ───
T4a seq:30   stage: waste_rag       → 폐기물 RAG 검색
T4b seq:40   stage: character       → 캐릭터 도메인 주입
T4c seq:50   stage: location        → 위치 gRPC 조회
T4d seq:60   stage: kakao_place     → Kakao 장소 검색
T4e seq:70   stage: bulk_waste      → 대형폐기물 MOIS
T4f seq:80   stage: weather         → 날씨 조회
T4g seq:90   stage: recyclable_price → KECO 시세
T4h seq:100  stage: collection_point → 수거함 검색
T4i seq:110  stage: web_search      → 네이티브 웹 검색
T4j seq:120  stage: image_generation → 이미지 생성
T4k seq:130  stage: general         → 일반 Fallback
// ─── Aggregation → Answer ───
T5  seq:140  stage: aggregator      → 서브에이전트 결과 병합
T6  seq:150  stage: feedback        → RAG 품질 평가
T7  seq:1000+ type:  token          → LLM 토큰 스트리밍 (N개, answer_node)
T8  seq:160  stage: answer         → 답변 완료
T9  seq:170  stage: done           → 파이프라인 종료
// stream_id = Redis Stream Entry ID → SSE id: 필드 → Last-Event-ID 복구 기준
// T4a~T4k: Send API 동적 생성, intent에 따라 선택적 병렬 실행

ID 관계 정의

chat_id

채팅 세션 UUID

세션 유지 동안 불변

job_id

메시지별 작업 UUID

Shard key + Pub/Sub 채널 + State key

stream_id

Redis Stream Entry ID

SSE id: 필드 → Last-Event-ID 복구 기준

server_id (POD_NAME)

Consumer Group consumer 이름

event-router-0, -1, ... Pod별 할당

Dual Publishing

Worker: XADD (Streams) + PUBLISH (Pub/Sub) 동시 발행

Recovery Flow

Last-Event-ID 헤더 → XRANGE catch-up → XREAD BLOCK 재개

Nginx Config

X-Accel-Buffering: no, proxy_buffering off, chunked transfer

Scaling

Stateless HPA, Pub/Sub 기반 모든 인스턴스 동기화

📊 구현 상세: SSE Gateway 아키텍처 | Token v2 Recoverable Streaming | Event Router/SSE-Gateway 무결성

Event Flow Architecture

┌──────────────────────────────────────────────────────────────────────────────┐ │ Event Router Architecture │ ├──────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Event Producers │ │ │ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │ │ │ │ Auth │ │ Chat │ │ Scan │ │ Info │ │ Users │ │ │ │ │ │ Events │ │ Events │ │ Events │ │ Events │ │ Events │ │ │ │ │ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘ │ │ │ └──────┼────────────┼───────────┼────────────┼────────────┼──────────────┘ │ │ │ │ │ │ │ │ │ └────────────┴───────────┴────────────┴────────────┘ │ │ │ │ │ ┌──────▼──────┐ │ │ │ XADD │ events:{domain} │ │ └──────┬──────┘ │ │ │ │ │ ┌───────────────────────────────▼───────────────────────────────────────┐ │ │ │ Redis Streams │ │ │ │ ┌─────────────────────────────────────────────────────────────────┐ │ │ │ │ │ events:auth │ events:chat │ events:scan │ events:info │ │ │ │ │ └───────────────┴───────────────┴───────────────┴─────────────────┘ │ │ │ └───────────────────────────────┬───────────────────────────────────────┘ │ │ │ │ │ ┌─────────────▼─────────────┐ │ │ │ Consumer Group │ │ │ │ (event-router-group) │ │ │ └─────────────┬─────────────┘ │ │ ┌──────────────────────┼──────────────────────┐ │ │ ▼ ▼ ▼ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ consumer-1 │ │ consumer-2 │ │ consumer-3 │ │ │ │ (Pod A) │ │ (Pod B) │ │ (Pod C) │ │ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ │ │ │ │ │ │ └──────────────────────┴──────────────────────┘ │ │ │ │ │ ┌──────▼──────┐ │ │ │ State KV │ state:{entity}:{id} │ │ │ (Redis Hash)│ │ │ └─────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────────────────┘

Domain Event Types

🔐 auth

blacklist:add, blacklist:remove

💬 chat

status_changed, job_completed, job_failed

📷 scan

queued, vision, rule, answer, reward, done

👤 users

default_character_grant

🐾 character

full_refresh, upsert, delete

📰 info

cache_set, cache_refresh

Consumer Group 특징

Success-Only XACK

성공 시에만 XACK, 실패 시 PEL 유지 → Reclaimer 재처리

Lua Script 원자성

State: 최신 seq만 유지, router:published:{job_id}:{seq} 멱등성 키

Reclaimer (XAUTOCLAIM)

실패 메시지 주기적 재할당, 멀티 도메인 asyncio.gather 병렬 처리

XREADGROUP

BLOCK 5000ms, COUNT 10, 다중 스트림 동시 소비

State KV

Redis Hash, state:{entity}:{id}, HINCRBY 카운터, HSET 상태

ACK 정책

process_event 성공 시에만 XACK, 실패 시 continue (PEL 잔류 → Reclaimer 대상)

Reclaimer

XAUTOCLAIM idle 60s, stream_id/stream_name 주입, 멀티 도메인 병렬 (asyncio.gather)

멱등성

Lua Script: router:published:{job_id}:{seq} 키 존재 시 skip, Pub/Sub 중복 발행 방지

Monitoring

XPENDING lag 추적, domain+shard 라벨 분리, consumer별 메트릭

📊 구현 상세: Event Router/SSE-Gateway 무결성 개선

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Eco² Platform Architecture (24 Nodes + AWS) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────── AWS ──────────────────────────────────────┐ │ │ │ ┌─────────────┐ │ │ │ │ │ Client │ │ │ │ │ │ (PWA/Web) │ │ │ │ │ └──────┬──────┘ │ │ │ │ ┌────────────────────────────┼────────────────────────┐ │ │ │ │ ▼ ▼ ▼ │ │ │ │ ┌──────────────┐ ┌──────────────────┐ ┌──────────────┐ │ │ │ │ │ CloudFront │ │ Route 53 │ │ S3 │ │ │ │ │ │ images.*.app│ │ *.dev.growbin.app│ │ (Images) │ │ │ │ │ └──────┬───────┘ └────────┬─────────┘ └──────────────┘ │ │ │ │ │ │ │ │ │ │ │ ┌────────▼─────────┐ │ │ │ │ │ │ AWS ALB │ (ACM *.growbin.app) │ │ │ │ │ │ SSL Termination │ │ │ │ │ │ └────────┬─────────┘ │ │ │ └─────────┼──────────────────────────┼─────────────────────────────────────┘ │ │ │ │ │ │ ══════════════════════════════════════════════════════════════════════════ │ │ KUBERNETES CLUSTER (25 NODES) │ │ ══════════════════════════════════════════════════════════════════════════ │ │ │ ┌────────▼─────────┐ │ │ │ │ Istio Ingress │ │ │ │ │ Gateway │ │ │ │ └────────┬─────────┘ │ │ │ ┌────────▼─────────┐ │ │ │ │ ext-authz (Go) │ JWT + Redis Blacklist │ │ │ └────────┬─────────┘ │ │ │ ┌────────────────────┼─────────────────────┐ │ │ │ ▼ ▼ ▼ │ │ │ ┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐ │ │ │ │ auth ││ scan ││ chat ││ char ││ loc ││users ││images│ │ │ └─│ API ││ API ││ API ││ API ││ API ││ API ││ API │─────┐ │ │ └──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──────┘ │ │ │ │ │ │ │ │ │ (S3 URL) │ │ │ ▼ ▼ │ │ │ │ │ │ ┌─────────────────────────────────────┐ │ │ │ │ Workers (8종) │ │ │ │ │ scan_worker │ chat_worker (AI) │ │ │ │ │ auth_worker │ auth_relay (Fallback Outbox) │◄──gRPC─────────────┘ │ │ │ character_worker │ char_match_worker│ │ │ │ │ users_worker │ celery_beat (DLQ) │ │ │ │ └─────────────────┬───────────────────┘ │ │ ▼ ▼ │ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │ │ SSE Gateway (Pub/Sub→SSE) │ Event Router (Streams→Pub/Sub) │ KEDA │ │ │ └──────────────────────────────────────────────────────────────────────┘ │ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │ │ RabbitMQ (Quorum) │ Redis Streams │ Redis Pub/Sub │ gRPC (mTLS) │ │ │ └──────────────────────────────────────────────────────────────────────┘ │ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │ │ PostgreSQL (Bitnami) │ Redis Cache │ Redis Auth │ Redis State KV │ │ │ └──────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Layer	Component	Nodes	Purpose
AWS External	Route53, ALB, CloudFront, S3, ACM	-	DNS, L7 LB, CDN, Storage, SSL
Edge	Istio Ingress, ext-authz	k8s-ingress	트래픽 라우팅, JWT 인증
Service	7 APIs + 2 Gateways	k8s-api-*	비즈니스 로직
Integration	RabbitMQ, Redis, gRPC	k8s-worker-*	비동기/이벤트/동기 통신
Persistence	PostgreSQL, Redis (4종)	k8s-data	데이터 영속화
Platform	ArgoCD, KEDA, Observability	k8s-platform	오케스트레이션

🖥️ 24-Node Cluster Topology

Master: 1 node (control-plane)
API Nodes: 8 nodes (auth, users, scan, chat, character, location, image, info)
Worker Nodes: 4 nodes (storage×2, ai×2) — 8종 워커 배포
Data Nodes: 5 nodes (PostgreSQL, Redis×4)
Platform Nodes: 6 nodes (RabbitMQ, Monitoring, Logging, Ingress, SSE, Event Router)

Category	Node Name	Instance	vCPU	RAM	Storage	Purpose
Control Plane	k8s-master	t3.xlarge	4	16GB	80GB	API Server, etcd, Scheduler, Controller
API Nodes	k8s-api-auth	t3.small	2	2GB	20GB	JWT 인증/인가, OAuth
	k8s-api-users	t3.small	2	2GB	20GB	사용자/마이페이지
	k8s-api-scan	t3.medium	2	4GB	30GB	폐기물 스캔 (AI)
	k8s-api-chat	t3.medium	2	4GB	30GB	챗봇 (LangGraph Agent 🔄)
	k8s-api-character	t3.small	2	2GB	20GB	캐릭터 카탈로그
	k8s-api-location	t3.small	2	2GB	20GB	위치/수거함 (PostGIS)
	k8s-api-image	t3.small	2	2GB	20GB	이미지 업로드 (S3)
	k8s-api-info	t3.small	2	2GB	20GB	뉴스 피드 (CQRS + Cache Aside)
Worker Nodes	k8s-worker-storage	t3.medium	2	4GB	40GB	auth/users/character/info Worker
	k8s-worker-storage-2	t3.medium	2	4GB	40GB	Storage Worker HA
	k8s-worker-ai	t3.medium	2	4GB	40GB	scan/chat Worker (GPT)
	k8s-worker-ai-2	t3.medium	2	4GB	40GB	AI Worker HA
Data Nodes	k8s-postgresql	t3.large	2	8GB	80GB	PostgreSQL (7 DBs)
	k8s-redis-auth	t3.medium	2	4GB	20GB	Blacklist, OAuth State
	k8s-redis-streams	t3.small	2	2GB	10GB	SSE Events (4 shards)
	k8s-redis-cache	t3.small	2	2GB	10GB	Celery Result, LRU
	k8s-redis-pubsub	t3.small	2	2GB	10GB	Realtime Broadcast
Platform Nodes	k8s-rabbitmq	t3.medium	2	4GB	40GB	AMQP Message Broker
	k8s-monitoring	t3.large	2	8GB	60GB	Prometheus, Grafana
	k8s-logging	t3.xlarge	4	16GB	100GB	EFK Stack (ES 8GB Heap)
	k8s-ingress-gateway	t3.medium	2	4GB	20GB	Istio Ingress, ext-authz
	k8s-sse-gateway	t3.small	2	2GB	20GB	SSE Gateway (Long-lived)
	k8s-event-router	t3.small	2	2GB	20GB	Streams → Pub/Sub Fan-out

💰 Total Resources

vCPU: 54 cores
RAM: 114 GB
Storage: 880 GB

⚡ Instance Types

t3.xlarge × 2 (16GB)
t3.large × 2 (8GB)
t3.medium × 9 (4GB)
t3.small × 12 (2GB)

🏗️ IaC Management

terraform/main.tf
terraform/modules/ec2/
Ubuntu 22.04 (Jammy)

┌─────────────────────────────────────────────────────────────────────────────────┐ │ 5-Layer Architecture │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ Layer 1 ┌──────────────────────────────────────────────────────┐ │ │ EDGE │ Istio Ingress (v1.24.1) • ext-authz • Cert-Manager │ │ │ tier=network └──────────────────────────┬───────────────────────────┘ │ │ │ │ │ Layer 2 ┌──────────────────────────▼───────────────────────────┐ │ │ SERVICE │ auth │ scan │ chat │ character │ location │ users │ │ │ tier=business │ images │ sse_gateway │ + Workers (8종) │ │ │ └──────────────────────────┬───────────────────────────┘ │ │ │ │ │ Layer 3 ┌──────────────────────────▼───────────────────────────┐ │ │ INTEGRATION │ RabbitMQ • gRPC • Redis Streams/Pub/Sub • Event Router│ │ │ tier=integration└─────────────────────────┬───────────────────────────┘ │ │ │ │ │ Layer 4 ┌──────────────────────────▼───────────────────────────┐ │ │ PERSISTENCE │ PostgreSQL (Bitnami) • Redis Cache/Auth (Spotahome) │ │ │ tier=data └──────────────────────────┬───────────────────────────┘ │ │ │ │ │ Layer 5 ┌──────────────────────────▼───────────────────────────┐ │ │ PLATFORM │ K8s • ArgoCD • Prometheus • KEDA • Jaeger • Kiali │ │ │ tier=platform └──────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Layer	tier Label	구성 요소
1	network	Istio Gateway, ext-authz, Envoy, Cert-Manager
2	business-logic	auth, scan, chat, character, location, users, images, sse-gw + 8 Workers
3	integration	RabbitMQ, gRPC, Redis Streams/Pub/Sub, Event Router
4	data	PostgreSQL, Redis (Cache, Auth, Streams, PubSub)
5	platform	K8s, ArgoCD, Prometheus, KEDA, Jaeger, Kiali, ECK

🔖 Label Hierarchy (5-Key System)

🚫 Taint Policy (NoSchedule)

Taint	Effect	스케줄되는 Pod
domain=auth:NoSchedule	auth 노드 격리	auth-api, ext-authz만 배치
domain=scan:NoSchedule	scan 노드 격리	scan-api만 배치
domain=worker-storage:NoSchedule	Storage Worker 격리	auth/users/character/info Worker
domain=worker-ai:NoSchedule	AI Worker 격리	scan-worker, chat-worker (GPT)
domain=data:NoSchedule	Data 노드 격리	PostgreSQL, Redis StatefulSet
domain=observability:NoSchedule	Monitoring 격리	Prometheus, Grafana, EFK

🔧 Terraform 자동 설정

terraform/main.tf의 kubelet_profiles에서 노드별 Labels & Taints를 user-data로 주입:
--node-labels=role=api,domain=auth,tier=business-logic --register-with-taints=domain=auth:NoSchedule

🔀 GitOps + CI/CD 파이프라인 🔀 GitOps + CI/CD Pipeline

📦 CD (ArgoCD)CD (ArgoCD)

• Terraform → ArgoCD root-app 배포
• Develop 브랜치 Watch → Sync-Wave 순서 배포
• 하위 → 상위 의존성 순으로 자동 배포 • Terraform → ArgoCD root-app deploy
• Develop branch Watch → Sync-Wave order deploy
• Auto-deploy in dependency order (low → high)

🔧 CI (GitHub Actions)CI (GitHub Actions)

• FastAPI Lint Test → Docker Build/Push
• Kustomize Render Test + K3s Test
• Operator + CR Manifest 렌더링 검증 • FastAPI Lint Test → Docker Build/Push
• Kustomize Render Test + K3s Test
• Operator + CR Manifest render validation

📋 Sync-Wave Timeline 📋 Sync-Wave Timeline

Wave	Category	Resources
0	Foundation	CRDs
4-6	Service Mesh	Istio, ArgoCD Image Updater
7-8	Network/Cert	NetworkPolicy, Cert-Manager
10-11	Secrets	External Secrets Operator → CRs
20-23	Monitoring	Prometheus, Grafana, Alerting
24-25	Database	PostgreSQL, Prometheus Adapter
27-28	Cache	Redis Operator → CRs
29-32	Message Queue	RabbitMQ Operator → Topology → CRs
35-36	Autoscaling	KEDA, ScaledObjects
40-43	Application	APIs, Workers, SSE Gateway, Event Router
50	Routing	Istio VirtualService
60-63	Observability	Kiali, Jaeger, ECK

💡 설계 원칙Design Principle

CRD → Operator → Instance 순서로 분리하여 의존성 충돌 방지. Operator 없이 Helm Chart + Kustomize Overlay로 관리하는 스택(PostgreSQL 등)은 운영 복잡도를 낮추기 위해 선택. Separated in CRD → Operator → Instance order to prevent dependency conflicts. Stacks managed with Helm Chart + Kustomize Overlay (e.g., PostgreSQL) chosen to reduce operational complexity.

📚 참고Reference

GitOps #04: ALB Controller, ExternalDNS, ESO 구성 → GitOps #04: ALB Controller, ExternalDNS, ESO Setup →

대상Target	워크플로우Workflow	검증Checks	트리거/특징Trigger/Notes
apps/* API	ci-services.yml	black==24.4.2, ruff==0.6.9, pytest==8.3.3	PR에선 품질 게이트만, push/수동 실행에서만 이미지 빌드/푸시 PR runs quality only, build/push only on push or manual
apps/* Workers	ci-workers.yml	black/ruff + pytest(unit) (integration은 기본 ignore)	Redis 환경변수 주입으로 단위 테스트 중심(빠른 피드백) Unit-test focused for fast feedback (Redis env injected)
SSE Components	ci-sse-components.yml	apps/sse_gateway, apps/event_router: black/ruff/pytest	apps/를 PYTHONPATH로 잡아 모듈 import 정합성 강제 Sets PYTHONPATH=apps to enforce import correctness
Infra	ci-infra.yml	Terraform validate/plan 등 IaC 검증 IaC checks (Terraform validate/plan, etc.)	docs 타입 커밋은 skip 가능(인프라 CI 비용 절감) Can skip on docs-only commits to reduce CI cost
Coverage (리포트)	ci-sonarcloud.yml	pytest-cov → coverage.xml 생성	SonarCloud 연동용(현재는 수동 트리거) For SonarCloud integration (manual trigger currently)

왜 중요한가 Why it matters

• PR 단계에서 포맷/린트/테스트를 강제해 “리뷰는 로직에 집중”하도록 만듭니다.
• 도메인/워커가 늘어날수록 회귀 리스크가 커지므로, Quality Gate는 확장성의 보험입니다.
• Clean Architecture(Port/Adapter)에서 특히 Mock 기반 단위 테스트가 쉬워져 품질 루프가 빨라집니다. • Enforces format/lint/tests on PRs so reviews focus on logic.
• As domains/workers scale, regression risk grows—quality gates are scalability insurance.
• Clean Architecture improves testability (ports mocked), accelerating recursive self-improvement.

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Operators & Helm Charts │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ ┌─ Kubernetes Operators ──────────────────────────────────────────────────┐ │ │ │ ┌───────────────────────┐ ┌───────────────────────┐ │ │ │ │ │ RabbitMQ Cluster │ │ Messaging Topology │ │ │ │ │ │ v2.11.0 │ │ v1.15.0 │ │ │ │ │ └───────────────────────┘ └───────────────────────┘ │ │ │ │ ┌───────────────────────┐ ┌───────────────────────┐ │ │ │ │ │ Redis Operator │ │ External Secrets │ │ │ │ │ │ (Spotahome) │ │ v0.9.11 │ │ │ │ │ │ v3.3.0 │ │ │ │ │ │ │ └───────────────────────┘ └───────────────────────┘ │ │ │ │ ┌───────────────────────┐ ┌───────────────────────┐ │ │ │ │ │ ECK Operator │ │ Cert-Manager │ │ │ │ │ │ v2.11.0 │ │ v1.16.2 │ │ │ │ │ └───────────────────────┘ └───────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ ┌─ Helm Charts ───────────────────────────────────────────────────────────┐ │ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │ │ │ │ kube-prometheus │ │ KEDA │ │ Istio Stack │ │ │ │ │ │ v56.21.1 │ │ v2.16.0 │ │ v1.24.1 │ │ │ │ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │ │ │ │ Grafana │ │ Prometheus │ │ PostgreSQL │ │ │ │ │ │ v8.5.9 │ │ Adapter │ │ v18.1.11 │ │ │ │ │ │ │ │ v4.10.0 │ │ │ │ │ │ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ └─────────────────────────────────────────────────────────────────────────────────┘

Type	Name	Version	용도
Operator	RabbitMQ Cluster	v2.11.0	MQ 클러스터
Operator	Messaging Topology	v1.15.0	Exchange/Queue CRD
Operator	Redis (Spotahome)	v3.3.0	Sentinel HA
Operator	External Secrets	v0.9.11	Secret 동기화
Operator	ECK	v2.11.0	Elasticsearch
Operator	Cert-Manager	v1.16.2	TLS 인증서
Helm	kube-prometheus-stack	v56.21.1	모니터링
Helm	KEDA	v2.16.0	이벤트 기반 스케일링
Helm	Istio	v1.24.1	Service Mesh
Helm	Grafana	v8.5.9	대시보드
Helm	Prometheus Adapter	v4.10.0	Custom Metrics
Helm	PostgreSQL	v18.1.11	데이터베이스

┌─────────────────────────────────────────────────────────────────────────────────┐ │ ext-authz Performance Tuning Timeline │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ AS-IS (Breaking Point) TO-BE (Tuned + HPA) │ │ ───────────────────── ───────────────────── │ │ Peak RPS: ~42 Peak RPS: ~1,200 │ │ Avg Latency: ~125ms Avg Latency: 10-24ms │ │ Success Rate: ~86% Success Rate: ~99.8% │ │ Pod: 1 (CPU 병목) Pod: 3-5 (HPA 자동 확장) │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ 성능 개선 흐름 │ │ │ │ │ │ │ │ Step 1 Step 2 Step 3 │ │ │ │ PoolSize=20 PoolSize=100 PoolSize=500 + HPA │ │ │ │ RPS: ~884 RPS: ~900 RPS: ~950→1,200 │ │ │ │ SR: 90-94% SR: 98-99% SR: 99.5-100% │ │ │ │ │ │ │ │ [Redis Pool]────►[CPU Limits]──────►[HPA 2-5 Pods] │ │ │ │ 병목 해소 버스트 허용 부하 분산 │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ 테스트 환경: 2,500 VU, 250 ramp-ups/s, 30분 지속 │ │ 테스트 날짜: 2025-12-14 ~ 2025-12-15 │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

메트릭	AS-IS (PoolSize=20)	1차 (PoolSize=100)	2차 (PoolSize=500+HPA)	개선율
Peak RPS	~884 (→42 포화)	~900	~1,200	28배
Avg Latency	57-80ms (→125ms)	16-22ms	10-24ms	84% 감소
Success Rate	90-94% (→86%)	98.25-99.5%	99.55-100%	+13.8%p
redis_error	11-54 req/s	3-8 req/s	0-3.5 req/s	93% 감소
Pod Count	1 (CPU 병목)	1	3-5 (HPA)	수평 확장

📊 Grafana 스냅샷 (실제 테스트 데이터)

→ 2000 VU, 200 ramp-ups, 30m 테스트 → 2500 VU, 250 ramp-ups, 30m 테스트

┌─────────────────────────────────────────────────────────────────────────────────┐ │ AWS External Components Architecture │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ External Traffic │ │ │ │ │ │ │ │ │ ┌────────────▼────────────┐ │ │ │ │ │ Route 53 DNS │ │ │ │ │ │ *.dev.growbin.app │ │ │ │ │ │ (Hosted Zone) │ │ │ │ │ └────────────┬────────────┘ │ │ │ │ │ │ │ │ │ ┌────────────────────┼────────────────────┐ │ │ │ │ ▼ ▼ ▼ │ │ │ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ │ │ │ │ CloudFront │ │ AWS ALB │ │ S3 │ │ │ │ │ │ images.*.app │ │ api.*.app │ │ (Origin) │ │ │ │ │ │ CDN + OAI │ │ SSL Termination │ │ Images Bucket │ │ │ │ │ └────────┬─────────┘ └────────┬─────────┘ └────────┬─────────┘ │ │ │ │ │ │ │ │ │ │ │ └────────────────────┤ │ │ │ │ │ ▼ │ │ │ │ │ ┌──────────────────┐ │ │ │ │ │ │ ACM Certificate │ │ │ │ │ │ │ *.growbin.app │◄───────────┘ │ │ │ │ │ (DNS Validated) │ (us-east-1 for CDN) │ │ │ │ └────────┬─────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌────────────────────────────────────┐ │ │ │ │ │ Kubernetes Cluster (24 Nodes) │ │ │ │ │ │ Istio Ingress Gateway │ │ │ │ │ └────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ S3 Bucket Configuration: │ │ ├── Versioning: Enabled │ │ ├── Lifecycle: 30d → Standard-IA, 90d → Delete │ │ ├── Encryption: AES256 (SSE-S3) │ │ ├── Public Access: Blocked (OAI only) │ │ └── CORS: frontend.*.app, localhost:3000/5173 │ │ │ │ CloudFront Configuration: │ │ ├── Price Class: PriceClass_200 (Asia + NA + EU) │ │ ├── Cache TTL: 24h default, 7d max │ │ ├── Origin: S3 via OAI (secure) │ │ └── SSL: TLSv1.2_2021, SNI-only │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

컴포넌트	용도	주요 설정
Route 53	DNS 관리	Hosted Zone: growbin.app, DNS Validation for ACM
AWS ALB	L7 로드밸런서	SSL/TLS Termination, Target: Istio Ingress
CloudFront	이미지 CDN	OAI, 24h/7d TTL, PriceClass_200
S3 Bucket	이미지 저장	Versioning, 30d→IA, 90d 삭제, Pre-signed URL
ACM	SSL 인증서	Wildcard *.growbin.app, DNS Validation

🔧 Terraform IaC 관리

모든 외부 AWS 리소스는 terraform/ 디렉토리에서 선언적으로 관리됩니다:
cloudfront.tf, s3.tf, route53.tf, acm.tf, alb-controller-iam.tf

┌─────────────────────────────────────────────────────────────────────────────────┐ │ IRSA (IAM Roles for Service Accounts) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ │ │ ExternalSecrets │ │ ExternalDNS │ │ ALB Controller │ │ │ │ Operator │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │ │ │ │ │IAM Role │ │ │ │IAM Role │ │ │ │IAM Role │ │ │ │ │ │external- │ │ │ │external- │ │ │ │alb- │ │ │ │ │ │secrets-op │ │ │ │dns │ │ │ │controller │ │ │ │ │ └──────┬───────┘ │ │ └──────┬───────┘ │ │ └──────┬───────┘ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ ▼ │ │ ▼ │ │ ▼ │ │ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │ │ │ │ │SSM Parameter │ │ │ │Route 53 │ │ │ │ELB + EC2 │ │ │ │ │ │Store Read │ │ │ │Change Records│ │ │ │Create/Manage │ │ │ │ │ │SecretsMgr │ │ │ │List Zones │ │ │ │Target Groups │ │ │ │ │ └──────────────┘ │ │ └──────────────┘ │ │ └──────────────┘ │ │ │ └────────┬─────────┘ └────────┬─────────┘ └────────┬─────────┘ │ │ │ │ │ │ │ ▼ ▼ ▼ │ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ │ │K8s ExternalSecret│ │K8s Ingress/Svc │ │K8s Ingress │ │ │ │ ↓ │ │annotations │ │annotations │ │ │ │K8s Secret │ │ ↓ │ │ ↓ │ │ │ │(auto-created) │ │Route53 A Record │ │AWS ALB (auto) │ │ │ └──────────────────┘ └──────────────────┘ └──────────────────┘ │ │ │ │ Workflow: ExternalSecrets │ │ ├── 1. ExternalSecret CR 생성 (spec.secretStoreRef: aws-ssm-store) │ │ ├── 2. Operator가 AWS SSM Parameter Store에서 값 조회 │ │ ├── 3. K8s Secret 자동 생성 (refreshInterval: 1h) │ │ └── 4. Pod가 Secret 마운트하여 환경변수로 사용 │ │ │ │ 예시: /sesacthon/dev/api/auth/jwt-secret → auth-secret (K8s Secret) │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

자동화 도구	ServiceAccount	IAM 권한	동작
ExternalSecrets	platform-system:external-secrets-sa	SSM GetParameter, SecretsManager GetSecretValue	AWS SSM → K8s Secret 자동 동기화
ExternalDNS	platform-system:external-dns	Route53 ChangeResourceRecordSets, ListHostedZones	K8s Ingress/Service → Route53 A 레코드
ALB Controller	kube-system:aws-load-balancer-controller	ELB Create/Modify, EC2 Describe, ACM List	K8s Ingress → AWS ALB 자동 프로비저닝

🔐 IRSA 보안 이점

EC2 Instance Profile 대신 IRSA 사용 → Pod 수준 최소 권한 원칙 적용
terraform/irsa-roles.tf에서 IAM Role 선언적 관리

🚀 향후 발전 방향

EKS OIDC Provider 연동으로 ServiceAccount ↔ IAM Role 자동 매핑 도입 검토

🗺️ 네트워크 토폴로지 다이어그램 🗺️ Network Topology Diagram

🔀 트래픽 플로우 🔀 Traffic Flow

┌─────────────────────────────────────────────────────────────────────────────────┐
│                    Istio Service Mesh Traffic Flow                               │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   [User/Client]                                                                 │
│        │                                                                        │
│        │ HTTPS Request                                                          │
│        ▼                                                                        │
│   ┌─────────────────┐     CNAME/A Record     ┌─────────────────┐               │
│   │  Route53 DNS    │◀──────────────────────▶│ ExternalDNS     │               │
│   │  (Global)       │                         │ Controller      │               │
│   └────────┬────────┘                         └─────────────────┘               │
│            │                                                                    │
│            ▼                                                                    │
│   ┌─────────────────┐     Target Group       ┌─────────────────┐               │
│   │  AWS ALB        │◀──────────────────────▶│ ALB Controller  │               │
│   │  (HTTPS :443)   │     Instance Mode       │ (IRSA)          │               │
│   └────────┬────────┘                         └─────────────────┘               │
│            │                                                                    │
│            │ Forward to NodePort (30xxx)                                        │
│            ▼                                                                    │
│   ┌─────────────────────────────────────────────────────────────────────┐       │
│   │  Ingress Gateway Node (k8s-ingress-gateway)                          │       │
│   │  ┌─────────────────────────────────────────────────────────────┐    │       │
│   │  │  Istio Ingress Gateway Pod (:80)                             │    │       │
│   │  │  ├── Gateway CR (hosts, port binding)                        │    │       │
│   │  │  └── VirtualService (path-based routing)                     │    │       │
│   │  └─────────────────────────────────────────────────────────────┘    │       │
│   └────────┬────────────────────────────────────────────────────────────┘       │
│            │                                                                    │
│            │ xDS Config (Istiod Control Plane)                                  │
│            ▼                                                                    │
│   ┌─────────────────────────────────────────────────────────────────────┐       │
│   │  Worker Node                                                         │       │
│   │  ┌───────────────────────────────────────────────────────────────┐  │       │
│   │  │  ┌─────────────┐         ┌─────────────────────────────────┐  │  │       │
│   │  │  │ Envoy       │ mTLS    │  Application Container          │  │  │       │
│   │  │  │ Sidecar     │────────▶│  (scan-api, chat-api, etc)      │  │  │       │
│   │  │  │ (:15001)    │ localhost│                                 │  │  │       │
│   │  │  └─────────────┘         └─────────────────────────────────┘  │  │       │
│   │  └───────────────────────────────────────────────────────────────┘  │       │
│   └─────────────────────────────────────────────────────────────────────┘       │
│                                                                                 │
│   Before: AWS ALB → K8s Ingress → NodePort → Pod                               │
│   After:  AWS ALB → Istio Gateway → VirtualService → Envoy Sidecar → App       │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

🎯 Sidecar vs Ambient Mesh 선택 🎯 Sidecar vs Ambient Mesh Decision

고려 사항Consideration	Sidecar	Ambient	선택Choice
네임스페이스 격리Namespace Isolation	도메인별 Pod 단위Per-pod per domain	Waypoint per NS	✅ Sidecar
노드 밀도Node Density	노드당 적은 PodFew pods/node	ztunnel 효율 낮음inefficient	✅ Sidecar
Calico CNI	충돌 없음No conflict	CNI 레벨 충돌 위험CNI-level overlap risk	✅ Sidecar
운영 성숙도Operational Maturity	검증됨Proven	Beta (v1.24)	✅ Sidecar

⚙️ GitOps 배포 순서 (Sync Wave) ⚙️ GitOps Deployment Order (Sync Wave)

Wave	컴포넌트Component	노드Node	설명Description
04	istio-base, istiod	k8s-master	CRD + Control PlaneCRDs + Control Plane
05	Istio Ingress Gateway	k8s-ingress-gateway	단일 진입점 (t3.medium)Single entry point (t3.medium)
50	VirtualService, DestinationRule	-	앱 배포 후 라우팅 규칙Routing rules after app deployment

✅ 장점Benefits

• Zero-trust mTLS 보안
• 애플리케이션 코드에서 인증 제거
• 정밀한 트래픽 제어
• Observability 분리 (Kiali, Jaeger) • Zero-trust mTLS security
• Auth removed from app code
• Precise traffic control
• Observability decoupling (Kiali, Jaeger)

⚠️ 비용Trade-offs

• Sidecar당 100MB+ 메모리
• 추가 네트워크 홉
• VirtualService/DestinationRule 관리
• 운영 복잡도 증가 • 100MB+ memory per sidecar
• Additional network hops
• VirtualService/DestinationRule mgmt
• Increased operational complexity

🔧 트러블슈팅 🔧 Troubleshooting

문제Issue	원인Cause	해결Solution
ExternalDNS가 Bridge Ingress 감지 못함ExternalDNS not detecting Bridge Ingress	Istio Gateway 기반 Ingress 미인식Istio Gateway-based Ingress not recognized	`external-dns.alpha.kubernetes.io/managed` 라벨 추가label added
ALB Health Check 실패ALB Health Check failing	Gateway hosts에 ALB 헬스체크 경로 없음Gateway hosts missing ALB healthcheck path	Gateway에 와일드카드 호스트 추가Added wildcard host to Gateway
Envoy-istiod 통신 차단Envoy-istiod communication blocked	Calico NetworkPolicy가 xDS 연결 차단Calico NetworkPolicy blocking xDS	istio-system egress 명시적 허용Explicit egress allow to istio-system

📚 참고Reference

Istio Service Mesh 아키텍처 선택 → Istio Service Mesh Architecture Decision →

🔄 Blacklist 동기화 플로우 🔄 Blacklist Synchronization Flow

로그아웃 시 auth-api가 Redis Fallback Outbox에 이벤트를 적재하고, auth_relay가 이를 폴링하여 RabbitMQ Fanout Exchange로 발행합니다. 모든 ext-authz Pod가 브로드캐스트를 수신해 Local Cache(sync.Map)를 즉시 갱신합니다. 병렬로 auth_worker가 Redis에 영속 저장하여 신규 Pod 부트스트랩 시 일관성을 보장합니다. On logout, auth-api pushes event to Redis Fallback Outbox, auth_relay polls and publishes to RabbitMQ Fanout Exchange. All ext-authz Pods receive broadcast and immediately update Local Cache (sync.Map). In parallel, auth_worker persists to Redis for new Pod bootstrap consistency.

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Blacklist Cache Synchronization Architecture │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────────────────────────────────────────────────────────────────────┐ │ │ │ Event Flow (Logout) │ │ │ │ │ │ │ │ [User Logout] │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌─────────────┐ Redis Fallback Outbox ┌───────────────┐ │ │ │ │ │ auth-api │ ────────────────────────────► │ auth_relay │ │ │ │ │ │ (FastAPI) │ LPUSH outbox:blacklist │ (Outbox 폴링) │ │ │ │ │ └─────────────┘ └───────┬───────┘ │ │ │ │ │ │ │ │ │ │ AMQP Publish │ │ │ │ ▼ │ │ │ │ ┌────────────────────────────────────────┐ │ │ │ │ │ RabbitMQ (Fanout Exchange) │ │ │ │ │ │ blacklist.events │ │ │ │ │ │ type: fanout (broadcast to all) │ │ │ │ │ └──────────────┬─────────────────────────┘ │ │ │ │ │ │ │ │ │ ┌────────────────────────┼────────────────────────┐ │ │ │ │ ▼ ▼ ▼ │ │ │ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ │ │ │ ext-authz │ │ ext-authz │ │ ext-authz │ │ │ │ │ │ Pod #1 │ │ Pod #2 │ │ Pod #3 │ │ │ │ │ │ ┌──────────┐ │ │ ┌──────────┐ │ │ ┌──────────┐ │ │ │ │ │ │ │Local Cache│ │ │ │Local Cache│ │ │ │Local Cache│ │ │ │ │ │ │ │sync.Map │ │ │ │sync.Map │ │ │ │sync.Map │ │ │ │ │ │ │ │O(1) lookup│ │ │ │O(1) lookup│ │ │ │O(1) lookup│ │ │ │ │ │ │ └──────────┘ │ │ └──────────┘ │ │ └──────────┘ │ │ │ │ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ Parallel Path: auth_worker → Redis (영속 저장) │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ auth_worker (Python) │ │ │ │ └── blacklist:{jti} → Redis (TTL = token exp - now, max 24h) │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ 일관성 보장: │ │ ├── Fanout: 모든 Pod에 즉시 브로드캐스트 (ms 단위) │ │ ├── Local Cache: cleanupInterval 60s 주기로 만료 엔트리 삭제 │ │ ├── 신규 Pod: Bootstrap 시 Redis에서 active blacklist 로드 │ │ └── Lazy TTL: 조회 시 만료 체크, 만료 시 삭제 후 miss 반환 │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

컴포넌트	역할	구현
auth_relay	Fallback Outbox → MQ 재발행	Redis List 폴링 → RabbitMQ Publish
Fanout Exchange	브로드캐스트	blacklist.events, type=fanout
Local Cache	O(1) 조회	Go sync.Map, cleanupInterval 60s
auth_worker	Redis 영속화	blacklist:{jti}, TTL = exp - now

⚡ 성능 개선

AS-IS: ext-authz → Redis 네트워크 호출 (~2ms/req)
TO-BE: ext-authz → Local Cache (sync.Map, <0.01ms/req)
→ Redis 병목 제거, RPS 42 → 1,200+ 달성

┌─────────────────────────────────────────────────────────────────────────────────┐ │ ext-authz Operational Policy │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ 1. FAIL-CLOSED POLICY (보안 우선) │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ if err != nil { │ │ │ │ // Redis 오류, 내부 오류 등 모든 예외 상황 │ │ │ │ return denyResponse(StatusCode_InternalServerError) │ │ │ │ // ❌ 절대 allowResponse 반환 X │ │ │ │ } │ │ │ │ │ │ │ │ 정책: "판단 불가 → 차단" (fail-closed) │ │ │ │ 대안: fail-open은 보안 위험 (블랙리스트 우회 가능) │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ 2. BLACKLIST TTL (24시간) │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Access Token 만료: 30분 (JWT exp claim) │ │ │ │ Refresh Token 만료: 24시간 │ │ │ │ │ │ │ │ Blacklist TTL = min(token_exp - now, 24h) │ │ │ │ │ │ │ │ Redis Key: blacklist:{jti} │ │ │ │ Redis Value: {"user_id", "reason", "blacklisted_at", "expires_at"} │ │ │ │ │ │ │ │ RabbitMQ Queue TTL: 86400000ms (24시간) │ │ │ │ → x-message-ttl 설정으로 처리 실패 메시지 자동 만료 │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ 3. MEMORY CAP (Local Cache) │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ 구현: Go sync.Map (thread-safe, O(1)) │ │ │ │ │ │ │ │ 메모리 관리: │ │ │ │ ├── cleanupInterval: 60초 (주기적 만료 엔트리 삭제) │ │ │ │ ├── Lazy deletion: 조회 시 만료 체크 → 즉시 삭제 │ │ │ │ └── Entry size: ~100 bytes/token (jti + expireAt) │ │ │ │ │ │ │ │ 메트릭 (Prometheus): │ │ │ │ ├── ext_authz_blacklist_cache_size (현재 캐시 크기) │ │ │ │ ├── ext_authz_blacklist_cache_hits_total (캐시 히트) │ │ │ │ ├── ext_authz_blacklist_cache_misses_total (캐시 미스) │ │ │ │ └── ext_authz_blacklist_cache_evictions_total (삭제 횟수) │ │ │ │ │ │ │ │ 알림: cache_size > 10,000 → Slack 경고 │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ 4. CONNECTION POOL (Redis) │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ PoolSize: 500 (동시 연결 상한) │ │ │ │ MinIdleConns: 200 (웜 커넥션 유지) │ │ │ │ PoolTimeout: 2s (빠른 실패 → backpressure) │ │ │ │ ReadTimeout: 1s │ │ │ │ WriteTimeout: 1s │ │ │ │ │ │ │ │ → Pool exhaustion 방지, Cold start 지연 최소화 │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

정책	값	이유
Fail Mode	fail-closed	블랙리스트 우회 방지, 보안 우선
Blacklist TTL	min(exp-now, 24h)	Refresh Token 만료 주기 커버
Cleanup Interval	60s	메모리 회수 주기
Pool Size	500 (min idle 200)	고동시성, Cold start 방지

🔒 코드 위치

apps/ext_authz/internal/server/server.go - fail-closed 정책
apps/ext_authz/internal/cache/blacklist.go - Local Cache 구현
apps/ext_authz/internal/config/config.go - Pool 설정
apps/auth_worker/infrastructure/persistence_redis/blacklist_store_redis.py - TTL 계산

📝 Logs

EFK Stack (ECK v2.11.0)
• Fluent Bit → ES v8.11.0
• 500K-1.1M docs/day
• ECS Schema + trace.id

🔗 Traces

Jaeger + OTEL
• Zipkin 9411 (Istio)
• OTLP 4317 (App)
• 7.16% coverage (125K/1.75M)

📈 Metrics

Prometheus + Grafana
• ServiceMonitor CRDs
• Golden Signals (p99, RPS)
• Alertmanager → Slack

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Observability → Agent Feedback Loop (Recursive Self-Improvement) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ Cluster │────►│ Logs │────►│ Kibana │────►│ Developer │ │ │ │ (24 Nodes) │ │ (EFK) │ │ Dashboard │ │ Analysis │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────┬──────┘ │ │ │ │ │ │ │ ┌─────────────┐ ┌─────────────┐ │ │ │ ├────────────►│ Traces │────►│ Jaeger │────────────┤ │ │ │ │ (OTEL/B3) │ │ UI │ │ │ │ │ └─────────────┘ └─────────────┘ │ │ │ │ ▼ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ └────────────►│ Metrics │────►│ Grafana │────►│ AI Agent │ │ │ │ (Prometheus)│ │ Dashboard │ │ Debugging │ │ │ └─────────────┘ └─────────────┘ └──────┬──────┘ │ │ │ │ │ ┌──────────────────────────────────────────────────────────────────▼──────┐ │ │ │ Agent Feedback Integration │ │ │ │ • trace.id 기반 에러 로그 추적 → 근본 원인 분석 │ │ │ │ • Grafana Exemplars → 메트릭 이상치에서 샘플 트레이스 추출 │ │ │ │ • Alertmanager → Slack → Agent 컨텍스트 주입 │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Integration	Source	Target	Agent Feedback 활용
trace.id Correlation	Kibana logs	Jaeger spans	분산 트랜잭션 에러 추적
Grafana Exemplars	Prometheus metrics	Jaeger traces	p99 이상치 원인 분석
Alertmanager	PrometheusRule	Slack #eco_sre	실시간 장애 컨텍스트
Kiali Service Graph	Istio Telemetry	Traffic visualization	서비스 의존성 분석

Observability: Agent의 환경 인식 보조

• 실시간 환경 인식 — Agent가 클러스터 상태를 관찰하고 디버깅 컨텍스트 확보
• trace.id 기반 추적 — 1 요청 → 10~30개 로그를 단일 트레이스로 연결
• 메트릭 컨텍스트 — p99 지연, 에러율, 리소스 사용량 등 환경 정보 제공
※ 핵심은 문서 기반 Self-RAG 컨텍스트, Observability는 환경 인식 보조 역할

🏗️ 24-Node 분산 클러스터 아키텍처 🏗️ 24-Node Distributed Cluster Architecture

Self-managed Kubernetes 클러스터에 7개 도메인 서비스, 8개 워커, 4종 Redis 클러스터를 배치한 마이크로서비스 아키텍처입니다. Microservices architecture with 7 domain services, 8 workers, and 4 Redis clusters deployed on a self-managed Kubernetes cluster.

DOMA

도메인별 독립 서비스 Domain-Oriented MSA

Clean Architecture

Port/Adapter 패턴 Port/Adapter Pattern

4-Tier Layer

Presentation → Infra Presentation → Infra

EDA

MQ + Event Bus MQ + Event Bus

📦 물리 노드 클러스터 📦 Physical Node Cluster

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Physical Node Cluster (AWS EC2 · Terraform) │ │ Master: 1 │ API: 8 │ Worker: 4 │ Data: 5 │ Platform: 6 = 24 Nodes + 1 Master │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ CONTROL PLANE (t3.xlarge · 4vCPU · 16GB) │ │ │ │ ┌─────────────────────────────────────────────────────────────────┐ │ │ │ │ │ k8s-master: API Server, etcd, Scheduler, Controller Manager │ │ │ │ │ └─────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ API NODES (t3.small~medium · 8 nodes) │ │ │ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │ │ │ │ auth │ │ scan │ │ chat │ │ char │ │ loc │ │ users │ │ │ │ │ │ 2GB │ │ 4GB │ │ 4GB │ │ 2GB │ │ 2GB │ │ 2GB │ │ │ │ │ └────────┘ └────────┘ └────────┘ └────────┘ └────────┘ └────────┘ │ │ │ │ ┌────────┐ ┌────────┐ │ │ │ │ │ images │ │ info │ → 도메인별 독립 노드 (DOMA) │ │ │ │ │ 2GB │ │ 2GB │ │ │ │ │ └────────┘ └────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ WORKER NODES (t3.medium · 4 nodes) DATA NODES (t3.small~large · 5) │ │ │ │ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ ┌──────────────────┐│ │ │ │ │ storage × 2 │ │ ai × 2 │ │PostgreSQL│ │ Redis × 4 ││ │ │ │ │ auth/users/ │ │ scan/chat │ │ 8GB │ │ auth│stream│cache││ │ │ │ │ char/info │ │ GPT/Gemini │ │ 7 schemas│ │ pubsub (2GB each)││ │ │ │ └──────────────┘ └──────────────┘ └──────────┘ └──────────────────┘│ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ PLATFORM NODES (t3.small~xlarge · 6 nodes) │ │ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────────┐ │ │ │ │ │ rabbitmq │ │monitoring│ │ logging │ │ ingress │ │sse-gw│evt-rtr │ │ │ │ │ │ 4GB │ │ 8GB │ │ 16GB │ │ 4GB │ │ 2GB │ 2GB │ │ │ │ │ │ Quorum Q │ │Prom/Graf │ │ EFK+ES │ │Istio+authz│ │ SSE │ Streams│ │ │ │ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ Total: 54 vCPU · 114 GB RAM · 880 GB Storage · IaC: terraform/modules/ec2 │ └─────────────────────────────────────────────────────────────────────────────────┘

🔀 시스템 아키텍처 🔀 System Architecture

┌─────────────────────────────────────────────────────────────────────────────────┐ │ System Architecture (5-Layer + EDA) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────── AWS ─────────────────────────────────────┐ │ │ │ Route53 (*.dev.growbin.app) → ALB (ACM SSL) → CloudFront + S3 (images) │ │ │ └────────────────────────────────────┬────────────────────────────────────┘ │ │ │ │ │ ════════════════════════════════════ │ ═══════════════════════════════════ │ │ L1. EDGE ▼ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Istio Ingress Gateway ──► ext-authz (Go gRPC) ──► JWT/Blacklist Check │ │ │ └───────────────────────────────────┬─────────────────────────────────────┘ │ │ │ │ │ L2. SERVICE ▼ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ ┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐┌──────┐ │ │ │ │ │ auth ││ scan ││ chat ││ char ││ loc ││users ││images│ + info │ │ │ │ └──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──┬───┘└──────┘ │ │ │ │ │ MQ │ MQ │ MQ │ MQ │ │ │ │ │ └─────┼───────┼───────┼───────┼───────┼───────┼───────────────────────────┘ │ │ │ │ │ │ │ │ │ │ L3. INTEGRATION ▼ ▼ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ RabbitMQ ◄───► Workers (8종) ◄───► Redis Streams ◄───► Event Router │ │ │ │ (Quorum) scan/chat/auth (4 shards) (Consumer Group) │ │ │ │ char/users/info ▼ │ │ │ │ Redis Pub/Sub │ │ │ │ ▼ │ │ │ │ SSE Gateway │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ L4. PERSISTENCE │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ PostgreSQL (7 schemas) │ Redis Auth │ Redis Cache │ Redis State KV │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ L5. PLATFORM (GitOps + Observability) │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ ArgoCD (App-of-Apps) │ KEDA │ Prometheus/Grafana │ Jaeger │ EFK Stack │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

📋 5-Layer 상세 설명 📋 5-Layer Details

Layer	구성요소Components	역할Role	핵심 기술Key Tech
L1. Edge	Istio Ingress, ext-authz	트래픽 라우팅, JWT 인증/인가Traffic routing, JWT auth	Istio 1.24, Go gRPC, Redis Blacklist
L2. Service	8 Domain APIs + 2 Gateways	비즈니스 로직, SSE 실시간 통신Business logic, SSE realtime	FastAPI, Clean Architecture, CQRS
L3. Integration	RabbitMQ, Workers (8), Event Router	비동기 처리, 이벤트 스트리밍Async processing, event streaming	Celery/Taskiq, Redis Streams, Pub/Sub
L4. Persistence	PostgreSQL, Redis ×4	데이터 영속화, 캐싱, 상태 관리Data persistence, caching, state	7 schemas, Auth/Cache/Streams/Pub/Sub
L5. Platform	ArgoCD, KEDA, Observability Stack	GitOps 배포, 오토스케일링, 모니터링GitOps deploy, autoscaling, monitoring	App-of-Apps, Prometheus/Jaeger/EFK

🔀 EDA 이벤트 흐름 🔀 EDA Event Flow

단계Phase	흐름Flow	설명Description
1. Request	Client → API → RabbitMQ	API가 작업을 MQ에 발행 (Quorum Queue)API publishes task to MQ (Quorum Queue)
2. Process	Worker → Business Logic	Celery/Taskiq 워커가 비동기 처리Celery/Taskiq worker async processing
3. Event	Worker → Redis Streams	처리 완료 이벤트를 스트림에 기록Record completion event to stream
4. Fan-out	Event Router → Redis Pub/Sub	Consumer Group이 이벤트를 Pub/Sub로 전파Consumer Group propagates to Pub/Sub
5. Realtime	SSE Gateway → Client	Pub/Sub 구독 → SSE로 클라이언트에 푸시Subscribe Pub/Sub → Push to client via SSE

Anthropic은 Claude Code로 Claude를 개발하며 90% 이상의 코드를 AI가 작성합니다. 핵심은 CLAUDE.md 자동 주입과 Skills의 Progressive Disclosure입니다. Eco²는 이 방식을 적용하여 문서 기반 Self-RAG 컨텍스트를 축적하며, GitOps(ArgoCD)로 클러스터와 코드베이스가 동기화되어 코드에 대한 이해가 곧 배포 환경에 대한 이해로 이어집니다. Anthropic develops Claude using Claude Code, with over 90% of code written by AI. The key is CLAUDE.md auto-injection and Skills' Progressive Disclosure. Eco² applies this approach, accumulating document-based Self-RAG context. With GitOps(ArgoCD) syncing the cluster with codebase, understanding the code directly translates to understanding the deployment environment.

1. Research

foundations/

→

2. Design

plans/

→

3. Implement

Agent + Skills

→

4. Deploy

GitOps

→

5. Report

reports/ → 🔁

개념Concept	Eco² 적용Implementation	설명Description
CLAUDE.md	`CLAUDE.md`	세션 시작 시 프로젝트 컨텍스트 자동 주입Auto-inject project context at session start
Skills	`/.claude/skills/` (27개)	k8s-debug, load-test, pr-review 등 반복 작업 자동화Automate k8s-debug, load-test, pr-review, etc.
Progressive Disclosure	foundations → plans → reports	필요한 문서만 점진적 로드 (토큰 효율)Load only needed docs progressively (token efficient)
Bash 선호	kubectl, terraform, helm	MCP 50K-134K 토큰 vs Bash 직접 접근MCP 50K-134K tokens vs Bash direct access

🔄 재귀적 자기개선 클러스터 워크플로우 🔄 Recursive Self-Improvement Cluster Workflow

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Context-First Development: Self-RAG + Runtime Environment │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Self-RAG Context (Progressive Disclosure) │ │ │ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ │ │ │ CLAUDE.md │ │ foundations/ │ │ plans/ │ │ reports/ │ │ │ │ │ │ (Auto-load) │ │ (Research) │ │ (Design) │ │ (Feedback) │ │ │ │ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ │ │ └────────────────┴────────────────┴────────────────┘ │ │ │ │ │ │ │ │ │ ┌────────────▼────────────┐ │ │ │ │ │ /.claude/skills/ (17) │ ← Bash로 SKILL.md 로드│ │ │ │ │ Progressive Disclosure │ (MCP 대비 경량) │ │ │ │ └────────────┬────────────┘ │ │ │ └───────────────────────────────────┼─────────────────────────────────────┘ │ │ │ │ │ ┌────────────▼────────────┐ │ │ │ AI Agent (Claude) │ │ │ │ Context Window 최적화 │ │ │ └────────────┬────────────┘ │ │ │ │ │ ════════════════════════════════════▼═════════════════════════════════════ │ │ RUNTIME: develop branch = 24-Node K8s Cluster State │ │ ══════════════════════════════════════════════════════════════════════════ │ │ │ │ ┌───────────────────┬───────────────────┬───────────────────────────────┐ │ │ │ GitOps (ArgoCD) │ 8 Domain APIs │ Observability │ │ │ │ Zero-touch Deploy│ 8 Workers + EDA │ (환경 인식 보조) │ │ │ │ App-of-Apps │ RabbitMQ/Redis │ Prometheus/Jaeger/EFK │ │ │ └───────────────────┴───────────────────┴───────────────────────────────┘ │ │ │ │ │ ┌────────────▼────────────┐ │ │ │ Deploy → Observe → │ │ │ │ Report → Loop (RAG) │ │ │ └─────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Self-RAG Context가 Agent의 판단 근거를 형성하고, Skills가 반복 작업을 자동화하며
Runtime Environment에서 Deploy → Observe → Report 사이클이 순환 Self-RAG Context forms Agent's decision basis, Skills automate repetitive tasks
Deploy → Observe → Report cycle loops in Runtime Environment

💡 핵심 의사결정 💡 Key Design Decisions

⚡ Bash 선호 ⚡ Bash Preferred

MCP는 50K-134K 토큰 소모. Bash는 환경 도구에 직접 접근하여 컨텍스트 효율적 MCP consumes 50K-134K tokens. Bash directly accesses env tools, context efficient

📄 Skills = 파일시스템 📄 Skills = Filesystem

YAML 메타 → SKILL.md → 링크된 파일. 점진적 로드로 Agent 온보딩 가속화 YAML meta → SKILL.md → linked files. Progressive load accelerates Agent onboarding

☸️ GitOps ☸️ GitOps

코드베이스가 클러스터와 sync되어 Desired State로 기능. Agent 작업 범위가 인프라까지 확장되며 state 전이로 시스템 발전 Codebase syncs with cluster as Desired State. Agent scope extends to infra, system evolves through state transitions

🔁 재귀적 자기개선 루프 Recursive Self-Improvement Loop

매 세션의 산출물이 reports/에 축적되고, 이는 다음 세션의 foundations/가 되어 Agent의 판단 근거로 작용
반복 작업은 skills/로 정제되어 코딩 에이전트(세션) 온보딩 비용 없이 즉시 재사용
→ 루프를 거듭할수록 프로젝트 특화된 의사결정 정밀도가 누적 상승 Each session's outputs accumulate in reports/, becoming foundations/ for next session as Agent's decision basis
Repetitive tasks are refined into skills/, instantly reusable without coding agent (session) onboarding cost
→ Each loop iteration compounds project-specific decision precision

왜 Context-First인가? Why Context-First?

┌─────────────────────────────────────────────────────────────────────────────────┐ │ 📉 Context Rot — MIT RLM Research (arxiv.org/pdf/2512.24601v1) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ Context Window Limits: Opus 200K │ GPT 272K │ Gemini 1M (finite resource) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ Token Range │ Performance │ Bar │ Status │ │─────────────────┼─────────────┼────────────────────────────┼───────────────────│ │ 0K - 8K │ 100% │ ████████████████████ │ ✅ baseline │ │ 8K - 16K │ 85% │ █████████████████░░░ │ ⚠️ slight decay │ │ 16K - 32K │ 60% │ ████████████░░░░░░░░ │ ⚠️ noticeable │ │ 32K - 64K │ 40% │ ████████░░░░░░░░░░░░ │ 🔴 significant │ │ 64K - 128K │ 20% │ ████░░░░░░░░░░░░░░░░ │ 🔴 severe │ │ 128K+ │ 10% │ ██░░░░░░░░░░░░░░░░░░ │ ❌ near failure │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ "Claude Code 히스토리가 비대해지거나 ChatGPT와 오래 대화하면 │ │ 마치 모델이... 멍청해지는 것 같은 현상" — MIT RLM Paper │ └─────────────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────────────────────────┐ │ 📉 Context Rot — MIT RLM Research (arxiv.org/pdf/2512.24601v1) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ Context Window Limits: Opus 200K │ GPT 272K │ Gemini 1M (finite resource) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ Token Range │ Performance │ Bar │ Status │ │─────────────────┼─────────────┼────────────────────────────┼───────────────────│ │ 0K - 8K │ 100% │ ████████████████████ │ ✅ baseline │ │ 8K - 16K │ 85% │ █████████████████░░░ │ ⚠️ slight decay │ │ 16K - 32K │ 60% │ ████████████░░░░░░░░ │ ⚠️ noticeable │ │ 32K - 64K │ 40% │ ████████░░░░░░░░░░░░ │ 🔴 significant │ │ 64K - 128K │ 20% │ ████░░░░░░░░░░░░░░░░ │ 🔴 severe │ │ 128K+ │ 10% │ ██░░░░░░░░░░░░░░░░░░ │ ❌ near failure │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ "When Claude Code history gets bloated or you chat with ChatGPT for a long │ │ time, the model gets... dumber" — MIT RLM Paper │ └─────────────────────────────────────────────────────────────────────────────────┘

✅ Solution: Context-First Development
• 세션 컨텍스트 최소화 → foundations/plans/reports 외부 저장소 활용
• Agent가 필요한 정보만 Self-RAG → "신선한" 컨텍스트에서 작업
• Progressive Disclosure로 MCP 대비 85% 토큰 절감
• Minimize session context → Use foundations/plans/reports as external storage
• Agent Self-RAGs only needed info → Works in "fresh" context
• Progressive Disclosure saves 85% tokens vs MCP

┌─────────────────────────────────────────────────────────────────────────────────┐ │ EFK Stack — Elastic Operator (ECK) v2.11.0 · helm.elastic.co │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ 24 Nodes (DaemonSet: Fluent Bit v2.2.0) │ │ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ │ │ │ master │ │ worker-1│ │ worker-2│ │ storage │ │ logging │ ... │ │ │ │ │ fluent │ │ fluent │ │ fluent │ │ fluent │ │ fluent │ │ │ │ │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │ │ │ └───────┼──────────┼──────────┼──────────┼──────────┼─────────────────────┘ │ │ │ │ │ │ │ │ │ └──────────┴──────────┴──────────┴──────────┘ │ │ │ │ │ ┌─────────────────────────▼─────────────────────────┐ │ │ │ elastic-system namespace │ │ │ │ ┌──────────────────────────────────────────┐ │ │ │ │ │ ECK Operator (eck-operator v2.11.0) │ │ │ │ │ │ github.com/elastic/cloud-on-k8s │ │ │ │ │ └────────────────────┬─────────────────────┘ │ │ │ └────────────────────────┼──────────────────────────┘ │ │ │ manages CRDs │ │ ┌───────────▼───────────┐ │ │ │ Elasticsearch CR (ECK)│ │ │ │ v8.11.0 · 50GB · 5Gi │ │ │ │ eco2-logs-es-http │ │ │ └───────────┬───────────┘ │ │ │ │ │ ┌───────────▼───────────┐ │ │ │ Kibana CR (ECK) │ │ │ │ kibana.dev.growbin.app│ │ │ │ 23M+ documents │ │ │ └───────────────────────┘ │ │ │ │ Pipeline: /var/log/containers/*.log → CRI Parser → K8s Filter → │ │ ECS Lua Enrichment → Elasticsearch (logs-YYYY.MM.DD) │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Component	Version	역할	설정
ECK Operator	v2.11.0	ES/Kibana 라이프사이클 관리	Helm (helm.elastic.co), elastic-system ns
Fluent Bit	v2.2.0	로그 수집 (DaemonSet)	CRI Parser, K8s Filter, ECS Lua
Elasticsearch	v8.11.0	로그 저장/검색 (CR)	elasticsearch.k8s.elastic.co/v1, 50GB gp3
Kibana	v8.11.0	로그 시각화 (CR)	kibana.k8s.elastic.co/v1, elasticsearchRef

500K-1.1M

Daily Documents

260-420MB

Daily Index Size

7-13/s

Log Rate

23M+

Total Documents

Index Strategy: Single Pattern + ECS Field Filtering

• Pattern — logs-YYYY.MM.DD (Logstash format, 1 shard, 0 replica)
• Index Template — eco2-logs-ecs with subobjects: false (dot notation 보존: trace.id, span.id)
• Service Filtering — 인덱스 분리 대신 service.name 필드로 8개 도메인 구분 (cross-service trace.id 추적 용이)

왜 Elastic Operator (ECK)인가?

• Operator Pattern — Elasticsearch/Kibana를 CRD로 선언적 관리. TLS 인증서, 클러스터 토폴로지, Rolling Upgrade 자동화
• Elastic Cloud 전환 용이 — 동일한 CRD 스키마로 Self-Managed → Elastic Cloud on K8s (ECK) → Elastic Cloud (SaaS) 마이그레이션 경로 확보
• 공식 지원 — github.com/elastic/cloud-on-k8s (Elastic 공식 Operator)

🔗 Kibana Dashboard

→ logs-eco2-app (23M+ hits)

1.75M

Total Logs

125K

with trace.id

7.16%

Coverage Rate

99.8%

from istio-proxy

┌─────────────────────────────────────────────────────────────────────────────────┐ │ ECS (Elastic Common Schema) v8.11.0 Log Format │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ // Python API Services (FastAPI) - ECSJsonFormatter + OTEL auto-inject │ │ { │ │ "@timestamp": "2026-01-12T08:59:59.876Z", │ │ "message": "Request processed successfully", │ │ "log.level": "info", │ │ "ecs.version": "8.11.0", │ │ "service.name": "scan-api", │ │ "service.version": "1.0.7", │ │ "trace.id": "a5f2c1804ac4397448149b63a67f6cfd", ← OTEL trace.get_span() │ │ "span.id": "1979d304b9", │ │ "labels": { "user_id": "***REDACTED***", "scan_id": "uuid" } ← PII 마스킹 │ │ } │ │ │ │ // Go ext-authz Service - slog (Go 1.21+) ECS │ │ { │ │ "@timestamp": "2026-01-12T08:59:59.876Z", │ │ "message": "Authorization allowed", │ │ "log.level": "INFO", │ │ "service.name": "ext-authz", │ │ "trace.id": "a5f2c1804ac4397448149b63a67f6cfd", ← gRPC metadata B3 추출 │ │ "event.action": "authorization", │ │ "event.outcome": "success" │ │ } │ │ │ │ // Istio Envoy Sidecar - EnvoyFilter Access Log (%TRACE_ID% 자동 생성) │ │ { │ │ "trace.id": "a5f2c1804ac4397448149b63a67f6cfd", │ │ "span.id": "1979d304b9", │ │ "http.request.method": "GET", │ │ "http.response.status_code": 200, │ │ "duration_ms": 18 │ │ } │ │ │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ Kibana → trace.id 클릭 → Jaeger UI (분산 트레이싱 시각화) │ └─────────────────────────────────────────────────────────────────────────────────┘

Service	Language	Tracing SDK	trace_id 주입 방식
auth-api, scan-api, chat-api...	Python	OpenTelemetry	ECSJsonFormatter + trace.get_current_span()
ext-authz	Go 1.21+	OpenTelemetry	slog + gRPC B3 metadata extraction
Istio Sidecar	Envoy	Built-in	EnvoyFilter %TRACE_ID% (99.8%)
Jaeger Collector	-	OTLP	gRPC 4317 → Memory (dev)

🔗 Jaeger Distributed Tracing

→ jaeger.dev.growbin.app

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Alertmanager → Slack Automation Pipeline │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ Prometheus │───►│ AlertRules │───►│Alertmanager │───►│ Slack │ │ │ │ Metrics │ │ (PrometheusRule)│ (groupBy) │ │ #eco_sre │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │ │ AlertmanagerConfig (CR): │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ route: │ │ │ │ groupBy: [alertname, namespace, severity] │ │ │ │ groupWait: 30s │ │ │ │ repeatInterval: 4h │ │ │ │ routes: │ │ │ │ - receiver: slack-critical (severity=critical) 🔴 │ │ │ │ - receiver: slack-warning (severity=warning) 🟠 │ │ │ │ - receiver: 'null' (alertname=Watchdog) 무시 │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ Alert Examples: │ │ • 🟠 [WARNING] CPUThrottlingHigh - character namespace 34% throttling │ │ • 🟠 [WARNING] KubeHpaMaxedOut - HPA max replicas 15분+ 유지 │ │ • 🟠 [WARNING] TargetDown - auth-api targets 100% down │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Severity	Receiver	Color	예시 Alert
🔴 critical	slack-critical	#ff0000	PodCrashLooping, NodeNotReady
🟠 warning	slack-warning	#ffa500	CPUThrottlingHigh, TargetDown
🟡 info	slack-info	#ffff00	KubeHpaMaxedOut
- Watchdog	null	-	무시 (heartbeat)

💬 Slack Channel

→ #eco_sre (Eco² Backend / Infra)

📊 Prometheus Service Dependency DAG: Jaeger 📊 Prometheus Service Dependency DAG: Jaeger

Prometheus가 중앙 허브로 모든 서비스에서 메트릭 수집 — 8개 도메인 API, 3개 Redis 인스턴스, event-router, sse-gateway, ext-authz 등 전체 서비스 토폴로지 Prometheus as central hub collecting metrics from all services — 8 domain APIs, 3 Redis instances, event-router, sse-gateway, ext-authz and full service topology

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Dual-Protocol Tracing: Zipkin (Istio) + OTLP (Application) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Istio Sidecar (Envoy) │ │ │ │ ┌─────────────┐ ┌─────────────┐ │ │ │ │ │ App Sidecar │────── Zipkin Protocol ──────────►│ Jaeger │ │ │ │ │ │ (istio-proxy)│ Port 9411 │ All-in-One │ │ │ │ │ └─────────────┘ │ (512MB) │ │ │ │ └─────────────────────────────────────────────────────┴──────┬──────┴─────┘ │ │ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Application Layer │ │ │ │ ┌─────────────┐ ┌──────▼──────┐ │ │ │ │ │ Python API │────── OTLP/gRPC ─────────────────►│ Jaeger │ │ │ │ │ │ (OTEL SDK) │ Port 4317 │ Collector │ │ │ │ │ └─────────────┘ └─────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ⚠️ NetworkPolicy: Port 9411 필수 (미개방 시 "No service dependencies found") │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Service Mesh Observability (Istio v1.24.1) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Kiali (Service Graph) ─────────────────────────────────────────────────┐ │ │ │ │ │ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │ │ │ Istio │─────►│ ext-authz│─────►│ auth-api │ │ │ │ │ │ Ingress │ │ (Go) │ │ (Python) │ │ │ │ │ └──────────┘ └──────────┘ └────┬─────┘ │ │ │ │ │ │ │ │ │ ┌──────────┐ ┌──────────┐ ┌────▼─────┐ │ │ │ │ │ scan-api │─────►│ RabbitMQ │─────►│ Workers │ │ │ │ │ │ │ │ │ │ (TaskIQ) │ │ │ │ │ └──────────┘ └──────────┘ └──────────┘ │ │ │ │ │ │ │ │ Traffic: ━━━ HTTP/gRPC ─── AMQP ··· Redis │ │ │ │ Health: 🟢 Healthy 🟡 Degraded 🔴 Failure │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Jaeger (Distributed Tracing) ──────────────────────────────────────────┐ │ │ │ │ │ │ │ Trace: a5f2c1804ac4397448149b63a67f6cfd │ │ │ │ ├─ istio-ingressgateway (2ms) │ │ │ │ ├─ ext-authz (5ms) │ │ │ │ ├─ scan-api (18ms) │ │ │ │ │ └─ PostgreSQL query (3ms) │ │ │ │ │ └─ RabbitMQ publish (2ms) │ │ │ │ └─ scan-worker (12,450ms) │ │ │ │ └─ OpenAI Vision API (3,200ms) │ │ │ │ └─ OpenAI Answer API (8,100ms) │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Tool	URL	Protocol	기능
Kiali	kiali.dev.growbin.app	-	Service Graph, Traffic Flow, Health Status
Jaeger	jaeger.dev.growbin.app	Zipkin 9411, OTLP 4317	Distributed Tracing, Span Analysis
Grafana	grafana.dev.growbin.app	-	Metrics Dashboard, Alerting

ServiceEntry: External Dependencies (Mesh 외부 가시화)

Google OAuth
accounts.google.com
www.googleapis.com

Kakao OAuth
kauth.kakao.com
kapi.kakao.com

Naver OAuth
nid.naver.com
openapi.naver.com

OpenAI API
api.openai.com

AWS S3
s3.ap-northeast-2

CloudFront CDN
cdn.growbin.app

🔗 Service Mesh Tools

→ Kiali - Service Mesh Topology → Jaeger - Distributed Tracing

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Scan Pipeline: Celery Chain Architecture │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ Client Request │ │ │ │ │ ▼ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ VISION │────►│ RULE │────►│ ANSWER │────►│ REWARD │ │ │ │ scan.vision│ │ scan.rule │ │ scan.answer │ │ scan.reward │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │ │ │ │ │ │ ▼ ▼ ▼ ▼ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ OpenAI │ │ PostgreSQL │ │ OpenAI │ │ Character │ │ │ │ Vision API │ │ Rule DB │ │ Chat API │ │ Matching │ │ │ │ (~4.5s) │ │ (~0.3s) │ │ (~4.8s) │ │ (~1.7s) │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │ │ Chain Signature: vision.s() | rule.s() | answer.s() | reward.s() │ │ Total Duration: ~11-21s (varies by OpenAI API latency) │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Stage	평균Avg	p99	역할Role
vision	4.5s	6~10s	OpenAI Vision API (이미지 분석)OpenAI Vision API (image analysis)
rule	0.3s	2~3s	로컬 캐시 Rule-Based RetrievalLocal Cache Rule-Based Retrieval
answer	4.8s	8~15s	OpenAI Chat API (답변 생성)OpenAI Chat API (answer generation)
reward	1.7s	3~5s	캐릭터 매칭 + DB 저장Character matching + DB save

📊 Stage 소요시간 비율 📊 Stage Duration Breakdown

answer (45%) ████████████████████ 4.8s

vision (40%) █████████████████ 4.5s

reward (14%) ██████ 1.7s

rule (5%) ██ 0.3s

⚠️ OpenAI API가 전체의 85% 차지 (vision + answer) ⚠️ OpenAI API accounts for 85% of total (vision + answer)

┌─────────────────────────────────────────────────────────────────────────────────────────┐ │ Concurrency Model per Pipeline Characteristics │ ├─────────────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ prefork ──────────────┐ ┌─ Gevent ──────────────────┐ ┌─ asyncio ─────────────┐ │ │ │ OS Process (fork) │ │ Greenlet (monkey-patch) │ │ Coroutine (await) │ │ │ │ ~10-50MB / worker │ │ ~4-8KB / greenlet │ │ ~2-4KB / coroutine │ │ │ │ Context SW: ~1-10ms │ │ Context SW: ~1-10µs │ │ Context SW: ~100ns │ │ │ │ │ │ │ │ │ │ │ │ character-worker │ │ scan-worker │ │ chat-worker │ │ │ │ info-worker │ │ -P gevent -c 100 │ │ Taskiq + aio-pika │ │ │ │ users-worker │ │ │ │ │ │ │ │ │ │ Vision → Retrieval → │ │ LangGraph astream() │ │ │ │ 단순 DB UPSERT, │ │ Answer → Reward │ │ async/await native │ │ │ │ 캐시 갱신 등 │ │ (I/O 65%: OpenAI API) │ │ Multi-Agent 병렬 │ │ │ │ 저레이턴시 태스크 │ │ │ │ │ │ │ └────────────────────────┘ └────────────────────────────┘ └───────────────────────┘ │ │ │ │ Decision Matrix: │ │ ┌────────────────────┬─────────────────────────────────────────────────────────────┐ │ │ │ CPU-bound / 단순 │ → prefork (GIL 우회, 프로세스 격리) │ │ │ │ I/O-bound + 동기 │ → Gevent (기존 동기 코드 monkey-patch로 비동기화) │ │ │ │ I/O-bound + 비동기 │ → asyncio (LangGraph/aiohttp native, 최소 오버헤드) │ │ │ └────────────────────┴─────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────────────┘

항목Item	prefork (Celery)	Gevent (Celery)	asyncio (Taskiq)
실행 단위Execution Unit	OS Process	Greenlet	Coroutine
메모리/단위Memory/Unit	~10-50MB	~4-8KB	~2-4KB
컨텍스트 스위치Context Switch	~1-10ms	~1-10µs	~100ns-1µs
동시 처리Concurrency	6-9 (메모리 제한)6-9 (memory limited)	100+	수천+ (이벤트 루프)1000s+ (event loop)
적용 워커Applied Worker	character, info, users	scan-worker	chat-worker
선택 근거Rationale	단순 DB 태스크, 프로세스 격리Simple DB tasks, process isolation	기존 동기 코드 + 고 I/O (65% OpenAI)Legacy sync code + high I/O (65% OpenAI)	LangGraph async native 극대화LangGraph async native maximization
RPS	충분 (경량 태스크)Sufficient (lightweight tasks)	~4 (OpenAI rate limit)~4 (OpenAI rate limit)	LLM 응답 대기 중 다른 요청 처리Processes other requests during LLM wait

📊 Scan Worker 마이그레이션: prefork → Gevent📊 Scan Worker Migration: prefork → Gevent

❌ Before (prefork)

• -P prefork -c 8

• Memory: ~3.6GB (8 forked processes)

• 1 Worker = 1 Task = 41s blocking

• Measured: 0.0323 RPS

✅ After (Gevent)

• -P gevent -c 100

• Memory: ~500MB (single process)

• I/O 대기 시 자동 greenlet 전환Auto greenlet switch on I/O wait

• Measured: ~4 RPS (OpenAI limit)

⚠️ Gevent + asyncio 이벤트 루프 충돌 — 왜 Taskiq을 별도 도입했는가⚠️ Gevent + asyncio Event Loop Conflict — Why Taskiq was adopted separately

Gevent monkey-patch가 소켓/SSL/select를 덮어쓰면서 asyncio event loop와 충돌 → 98% 요청 실패Gevent monkey-patch overrides socket/SSL/select, conflicting with asyncio event loop → 98% request failure

🏗️ 현재 워커 스택🏗️ Current Worker Stack

Worker	동시성Concurrency	Engine	Broker	파이프라인Pipeline
scan-worker	Gevent (100)	Celery	RabbitMQ	Vision→Retrieval→Answer→Reward
chat-worker	asyncio	Taskiq	RabbitMQ (aio-pika)	LangGraph Multi-Agent astream()
character-worker	prefork	Celery	RabbitMQ	DB UPSERT, cache sync
info-worker	prefork	Celery	RabbitMQ	Beat: news collection (5m/30m)
users-worker	prefork	Celery	RabbitMQ	save_character UPSERT

📈 Grafana Snapshot

→ Gevent 전환 전 분석 (Chain Avg: 41.65초, TTFB p50: 10초)Pre-Gevent analysis (Chain Avg: 41.65s, TTFB p50: 10s)

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Exactly-once Semantics via Idempotency │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Celery 재시도 전략 ──────────────────────────────────────────────────┐ │ │ │ │ │ │ │ @celery_app.task( │ │ │ │ max_retries=5, # 최대 5회 재시도 │ │ │ │ autoretry_for=(Exception,), │ │ │ │ retry_backoff=True, # Exponential backoff │ │ │ │ retry_backoff_max=300, # 최대 5분 대기 │ │ │ │ ) │ │ │ │ │ │ │ │ 재시도 간격: 1초 → 2초 → 4초 → 8초 → 16초 (지수 증가) │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 멱등성 키 패턴 ──────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ # Deterministic UUID 생성 │ │ │ │ idempotency_key = uuid5(NAMESPACE, f"{user_id}:{scan_id}") │ │ │ │ │ │ │ │ # 중복 체크 │ │ │ │ if redis.exists(f"processed:{idempotency_key}"): │ │ │ │ return {"status": "already_processed"} │ │ │ │ │ │ │ │ # 처리 후 마킹 │ │ │ │ redis.setex(f"processed:{idempotency_key}", TTL, "1") │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ DLQ (Dead Letter Queue) 흐름 ─────────────────────────────────────────┐ │ │ │ │ │ │ │ Task 실패 (5회 재시도 소진) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ DLX (Dead Letter Exchange) → DLQ (dlq.scan.vision 등) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ celery-beat: 매 5분 DLQ 재처리 시도 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

전략	설명	효과
at-least-once	실패 시 자동 재시도 (최대 5회)	메시지 유실 방지
Exponential Backoff	1s → 2s → 4s → 8s → 16s	일시 장애 복원력
Idempotency Key	user_id + scan_id 기반	중복 처리 방지
DLQ	최종 실패 메시지 보관	celery-beat 재처리

아키텍처 진화 과정 Architecture Evolution

┌─────────────────────────────────────────────────────────────────────────────────────┐ │ Event Bus Layer Architecture Evolution │ ├─────────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ❌ Phase 0: Celery Events │ ❌ Phase 1-3: Connection-per-XREAD │ │ ───────────────────────── │ ────────────────────────────────── │ │ SSE → RabbitMQ (21 conn/client) │ SSE Pod ↔ Redis XREAD (N coroutines) │ │ 50 VU = 341 connections → 503 error │ CPU 85%, Context switching overhead │ │ │ StatefulSet hash mismatch │ │ │ │ ├─────────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ✅ Phase 4: Event Bus Layer (Current Solution) │ │ ─────────────────────────────────────────────── │ │ │ │ ┌─────────────┐ ┌──────────────────┐ ┌─────────────────────┐ │ │ │ Worker │─────▶│ Redis Streams │─────▶│ Event Router │ │ │ │ (Scan/Chat) │ XADD │ {domain}:events │XREAD │ (Consumer Group) │ │ │ └─────────────┘ │ :{shard 0-3} │ GROUP└─────────────────────┘ │ │ └──────────────────┘ │ │ │ │ PUBLISH │ │ ▼ │ │ ┌─────────────┐ ┌──────────────────┐ ┌─────────────────────┐ │ │ │ Client │◀─────│ SSE Gateway │◀─────│ Pub/Sub │ │ │ │ (Browser) │ SSE │ (4 shard SUB) │ SUB │ sse:events:{shard} │ │ │ └─────────────┘ └──────────────────┘ └─────────────────────┘ │ │ ▲ │ │ │ State Recovery │ │ ┌──────────────────┐ │ │ │ State KV │ │ │ │ {domain}:state │ │ │ │ :{job_id} │ │ │ └──────────────────┘ │ │ │ │ Key Benefit: Pod count ⟂ Shard count (독립적 스케일링 가능) │ │ │ └─────────────────────────────────────────────────────────────────────────────────────┘

저장 + 전달 + 복구 책임 분리 Storage + Delivery + Recovery Separation

💾 Redis Streams (영속)

Key: {domain}:events:{shard}

• 4 shards (0-3)

• XREADGROUP Consumer Group

• AOF + Replication

• rfr-streams-redis 클러스터

📡 Pub/Sub (실시간) Shard 기반

Channel: sse:events:{shard}

• 4 shards (hash(job_id) % 4)

• SSE GW: 4개 shard 구독 → job_id 내부 라우팅

• 연결 수: O(N) → O(4)

• rfr-pubsub-redis 클러스터

🔄 State KV (복구)

Key: {domain}:state:{job_id}

• 마지막 상태 스냅샷

• SSE 재연결 시 catch-up

• seq 기반 중복 방지

• TTL 1h (자동 정리)

Event Router 핵심 컴포넌트 Event Router Core Components

StreamConsumer

• XREADGROUP GROUP eventrouter

• COUNT 100 BLOCK 5000ms

• 4 shards 병렬 소비

• 처리 후 XACK

EventProcessor

• Atomic Lua Script

• Idempotency: router:published:{job_id}:{seq}

• Sequence 검증

• State 업데이트 → PUBLISH

PendingReclaimer

• XAUTOCLAIM 주기: 60s

• Idle threshold: 5분

• 미처리 메시지 복구

• Idempotency 보장

🔀 Pub/Sub Shard 최적화: O(N) → O(4) 🔀 Pub/Sub Shard Optimization: O(N) → O(4)

❌ Before: job_id별 채널

sse:events:{job_id}

• 1,000 VU = 1,000 Pub/Sub 연결

• Redis 연결 한계 (기본 10,000) 도달 가능

• 메모리 사용량 증가 (연결당 ~10KB)

✅ After: shard별 채널

sse:events:{shard} (shard = hash(job_id) % 4)

• 1,000 VU = 4 Pub/Sub 연결

• SSE GW: 시작 시 4개 shard 전체 구독

• 메시지 내 job_id로 내부 라우팅

동시 접속 수Concurrent Users	Streams	Pub/Sub (Before)	Pub/Sub (After)	절감률Reduction
100명	4개 (고정)	100개	4개	96%
1,000명	4개 (고정)	1,000개	4개	99.6%
10,000명	4개 (고정)	10,000개 ⚠️	4개	99.96%

연결 복잡도 비교 Connection Complexity Comparison

구분Phase	연결 구조Connection Structure	50 VU 연결 수50 VU Connections	스케일링Scaling	결과Result
Celery Events	SSE × RabbitMQ = O(n×m)	341개 (1:21 비율)	❌ 불가	503 Error
Connection-per-XREAD	SSE × Coroutine = O(n)	50개	⚠️ CPU 85%	62% 완료율
Event Bus Layer	Router(1) + SSE(SUB) = O(n)	≈20개	✅ HPA/KEDA	100% @ 500VU

VU	완료율Completion	처리량Throughput	E2E p95	Scan API p95	스냅샷Snapshot
500	100%	367.9 req/m	83.3s	232ms	SLA →
600	99.7%	358.6 req/m	108.3s	360ms	→
700	99.2%	329.1 req/m	122.3s	444ms	Live →
800	99.7%	367.3 req/m	144.6s	734ms	→
900	99.7%	405.5 req/m	149.6s	635ms	→
1000	97.8%	373.4 req/m	173.3s	787ms	→

📈 VU 발전 과정 분석 📈 VU Progression Analysis

SLA 기준: 100% 완료율 → 500 VU가 SLA 기준
600-900 VU: 99.2%+ 완료율 유지, Safe Range
1000 VU: 완료율 97.8%로 하락, 포화 지점
테스트 환경: OpenAI Tier 4 (TPM 4M), Worker minReplica=2, maxReplica=5 SLA Criteria: 100% completion → 500 VU is SLA baseline
600-900 VU: 99.2%+ completion maintained, Safe Range
1000 VU: Completion drops to 97.8%, saturation point
Test Env: OpenAI Tier 4 (TPM 4M), Worker minReplica=2, maxReplica=5

┌─────────────────────────────────────────────────────────────────────────────────┐ │ LLM Pipeline I/O-bound 특성 │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ Stage별 소요 시간 (저부하 k6 VU 10 기준) │ │ ───────────────────────────────────────── │ │ │ │ vision (40%) ████████████████████████████████████████ 4.5초 (OpenAI Vision) │ │ answer (45%) ████████████████████████████████████████████ 4.8초 (OpenAI Chat)│ │ rule (3%) ██ 0.3초 (PostgreSQL) │ │ reward (15%) ███████████████ 1.7초 (Character Match) │ │ │ │ ────────────────────────────────────────────────────────────────────────── │ │ Total: ~11-12초 (저부하 평균) / 🎯 SLA 500 VU: p95 83.3초 │ │ │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ⚠️ 핵심 병목: OpenAI API (vision + answer) = 전체의 82% │ │ │ │ 100% I/O-bound 워크로드: │ │ ├─ OpenAI API 호출: 82% (vision 4.5s + answer 4.8s = 9.3s) │ │ ├─ Rule-based retrieval: 3% (PostgreSQL I/O) │ │ └─ Character matching: 15% (DB + gRPC I/O) │ │ │ │ → Gevent (greenlet) 전환으로 100+ 동시 I/O 가능 │ │ → 500 VU에서 100% 완료율 달성 (SLA 기준) │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Stage	평균 (VU 10)Avg (VU 10)	비율Ratio	병목 원인Bottleneck
vision	4.5초	40%	OpenAI Vision API (gpt-5.2)
answer	4.8초	42%	OpenAI Chat API (gpt-5.2)
reward	1.7초	15%	Character Match gRPC
rule	0.3초	3%	PostgreSQL 쿼리
Total	~11.3초	100%	🎯 SLA: 500 VU p95 83.3초

┌─────────────────────────────────────────────────────────────────────────────────┐ │ SSE 재연결 시 누락 이벤트 복구 메커니즘 │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Redis 3종 분리 ────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Streams (영속) Pub/Sub (실시간) State KV (복구) │ │ │ │ scan:events:{shard} sse:events:{shard} scan:state:{job_id} │ │ │ │ XADD (4개 샤드) PUBLISH (4개 샤드) SETEX (TTL 30분) │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 재연결 복구 흐름 ─────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ 1. Client SSE 재연결 (네트워크 끊김 후) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ 2. SSE Gateway: 4개 shard Pub/Sub 구독 (job_id로 내부 라우팅) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ 3. State KV 조회: scan:state:{job_id} │ │ │ │ │ → 현재 stage, seq, status 확인 │ │ │ │ ▼ │ │ │ │ 4. Streams Catch-up (last_seq 이후 이벤트 XRANGE) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ 5. 실시간 Pub/Sub 이벤트 수신 계속 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ 핵심 원칙: "구독 먼저, State 조회 나중" → Race Condition 방지 │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

저장소	용도	TTL
Redis Streams	이벤트 영속 저장 (4개 샤드)	maxlen 50
Redis Pub/Sub	실시간 브로드캐스트	Fire-and-forget
Redis State KV	현재 상태 스냅샷	30분

✅ Race Condition 해결

• Worker와 Event Router의 State 갱신 권한 단일화
• seq 순서와 관계없이 모든 이벤트 Pub/Sub 발행
• Pub/Sub 구독 → State 조회 순서 강제
• Streams catch-up으로 누락 이벤트 복구

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Write Offloading: 판정/저장 분리 │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ❌ Before (판정 + 저장 동시) ✅ After (판정/저장 분리) │ │ ──────────────────────────── ───────────────────────────── │ │ scan_reward_task 시작 scan_reward_task 시작 │ │ │ │ │ │ ├─ 캐릭터 매칭 (판정) ├─ 캐릭터 매칭 (판정) │ │ ├─ character_ownerships INSERT ├─ persist_reward.delay() │ │ ├─ session.commit() │ (Fire & Forget) │ │ ├─ sync_to_users_task.delay() │ │ │ ├─ gRPC 호출 └─ 클라이언트 응답 ✅ (~100ms) │ │ │ │ │ └─ 클라이언트 응답 (~600ms) (비동기, 클라이언트 응답과 무관) │ │ persist_reward_task │ │ │ │ │ ├─ save_ownership.delay() │ │ └─ save_user_character.delay() │ │ │ │ 응답 시간: 600ms → 100ms (83% 감소) │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

항목	Before	After
클라이언트 응답	판정 + DB 저장 완료 후	판정 즉시
DB 저장	순차 (character → users gRPC)	병렬 (Fire & Forget)
users 도메인 연동	gRPC 호출	직접 DB INSERT
실패 시 영향	전체 실패	각자 독립 재시도

📈 핵심 성과

• 응답 시간 83% 감소 (600ms → 100ms)
• gRPC 네트워크 오버헤드 제거
• DB 장애 격리 (응답에 영향 없음)
• 각 Worker 독립적 재시도 (5회, exponential backoff)

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Exactly-once = at-least-once + Idempotency │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Deterministic UUID 생성 ──────────────────────────────────────────────┐ │ │ │ │ │ │ │ # user_id + scan_id → 동일 입력이면 동일 UUID │ │ │ │ idempotency_key = uuid5(NAMESPACE, f"{user_id}:{scan_id}") │ │ │ │ │ │ │ │ 장점: │ │ │ │ • 클라이언트 재시도 시 동일 키 보장 │ │ │ │ • DB Unique Constraint로 중복 방지 │ │ │ │ • Race condition 방어 (동시 요청도 하나만 성공) │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 재시도 전략 ─────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ 실패 → 자동 재시도 5회 (exponential backoff: 1s → 2s → 4s → 8s → 16s)│ │ │ │ → 최종 실패 시 DLQ 보관 │ │ │ │ → celery-beat: 매 5분 DLQ 재처리 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 중복 처리 방어 ──────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ # DB 레벨 방어 │ │ │ │ try: │ │ │ │ await ownership_repo.insert_owned(...) │ │ │ │ await session.commit() │ │ │ │ except IntegrityError: │ │ │ │ # Race condition: 이미 다른 요청이 삽입함 │ │ │ │ return {"saved": False, "reason": "concurrent_insert"} │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

계층	방어 메커니즘	효과
Application	Deterministic UUID (uuid5)	동일 입력 → 동일 키
Cache	Redis SETEX (idempotency 마킹)	빠른 중복 체크
Database	Unique Constraint	최종 중복 방지
Worker	IntegrityError 핸들링	Race condition 방어

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Fanout Exchange: 단일 이벤트 → 다중 Worker │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ scan_reward_task (판정 완료) │ │ │ │ │ ▼ │ │ persist_reward_task.delay() ─── dispatcher │ │ │ │ │ ├────────────────────────────────────────────┐ │ │ │ │ │ │ ▼ ▼ │ │ ┌─────────────────────┐ ┌─────────────────────┐ │ │ │ save_ownership_task │ │ save_user_character │ │ │ │ (reward.persist 큐) │ │ (users.sync 큐) │ │ │ └──────────┬──────────┘ └──────────┬──────────┘ │ │ │ │ │ │ ▼ ▼ │ │ ┌─────────────────────┐ ┌─────────────────────┐ │ │ │ character DB │ │ users DB │ │ │ │ character_ownerships│ │ user_characters │ │ │ │ INSERT │ │ INSERT (직접) │ │ │ └─────────────────────┘ └─────────────────────┘ │ │ │ │ 특징: │ │ • 둘 다 Fire & Forget (발행 후 즉시 반환) │ │ • 하나가 실패해도 다른 하나는 정상 발행 │ │ • 각자 독립적 재시도 (5회, exponential backoff) │ │ • gRPC 제거 → users DB 직접 접근 │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Task	큐	대상 DB	재시도
persist_reward_task	reward.persist	dispatcher	3회
save_ownership_task	reward.persist	character.character_ownerships	5회
save_user_character_task	users.sync	users.user_characters	5회

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Cache Invalidation: Eventual Consistency │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ 캐시 무효화 흐름 ─────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ DB 변경 (character_ownerships INSERT) │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ character.events (Fanout Exchange) │ │ │ │ │ │ │ │ │ ├────────────────────────────────────────────┐ │ │ │ │ ▼ ▼ │ │ │ │ ┌──────────────────┐ ┌──────────────────┐ │ │ │ │ │ character_worker │ │ ext-authz │ │ │ │ │ │ Local Cache 갱신 │ │ Local Cache 갱신 │ │ │ │ │ └──────────────────┘ └──────────────────┘ │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ TTL 기반 자동 만료 ──────────────────────────────────────────────────┐ │ │ │ │ │ │ │ • Local Cache TTL: 10초 │ │ │ │ • 최악의 경우 10초간 stale data 가능 (Eventual Consistency) │ │ │ │ • TTL 만료 후 DB에서 fresh data 로드 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 신규 Worker 부팅 시 워밍 ────────────────────────────────────────────┐ │ │ │ │ │ │ │ Worker 시작 시: │ │ │ │ 1. DB에서 현재 상태 로드 │ │ │ │ 2. Local Cache 초기화 │ │ │ │ 3. Fanout Exchange 구독 시작 │ │ │ │ 4. 이후 이벤트 수신하며 실시간 갱신 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

전략	메커니즘	지연 시간
Fanout Broadcast	Exchange → 모든 구독 Worker	~10ms
TTL 기반 만료	Local Cache TTL 10초	최대 10초
신규 Worker 워밍	부팅 시 DB 로드 + 구독	즉시

⚠️ Trade-off

• 장점: 빠른 응답, 장애 격리, 수평 확장 용이
• 단점: 최대 10초간 stale data 가능
• 적용 기준: Strong Consistency 불필요한 영역에만 적용

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Chat Pipeline - 3중 Eventual Consistency │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ① CheckpointSyncService (LangGraph 체크포인트) │ │ ───────────────────────────────────────────── │ │ chat-worker CheckpointSyncService (replicas=1) │ │ ┌──────────┐ ┌──────────────────────────────────┐ │ │ │ aput() │──RPUSH──→ │ BRPOP (blocking, 5s timeout) │ │ │ │ Redis │ sync:queue │ Batch 수집 (max=50, drain=2s) │ │ │ └──────────┘ │ Dedup: (thread_id, ns) → latest │ │ │ │ Redis read → PG upsert │ │ │ │ 실패 → DLQ (checkpoint:sync:dlq)│ │ │ └──────────────────────────────────┘ │ │ │ │ ② ReadThroughCheckpointer (읽기 경로 최적화) │ │ ───────────────────────────────────────────── │ │ graph.ainvoke() │ │ ┌──────────────┐ ┌──────────────────────────────────────────┐ │ │ │ aget_tuple() │──→ │ Redis hit? → return (hot, ~1ms) │ │ │ │ │ │ Redis miss? → PG read → Redis promote │ │ │ │ │ │ (Temporal Locality) │ │ │ └──────────────┘ └──────────────────────────────────────────┘ │ │ │ │ ③ Chat Persistence Consumer (메시지 영속화) │ │ ───────────────────────────────────────────── │ │ chat:events:{shard} Persistence Consumer (group: chat-persistence) │ │ ┌──────────────┐ ┌──────────────────────────────────┐ │ │ │ done 이벤트 │──XREADGROUP─→│ MessageSaveHandler │ │ │ │ (chat-worker │ │ batch flush: 100개 or 5초 │ │ │ │ 발행) │ │ → chat.messages INSERT │ │ │ └──────────────┘ │ XACK on success │ │ │ └──────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

// checkpoint_sync_service.py — Batch Dedup 핵심 로직

async def _collect_batch(self) -> list[SyncEvent]:
    """BRPOP + drain → batch 수집."""
    batch = []
    first = await self._redis.brpop(SYNC_QUEUE_KEY, timeout=5)
    if first:
        batch.append(parse(first))
        # drain: 추가 메시지를 2초간 수집
        while len(batch) < 50:
            item = await self._redis.rpop(SYNC_QUEUE_KEY)
            if not item: break
            batch.append(parse(item))
    return batch

# Dedup: 동일 (thread_id, checkpoint_ns)는 최신만 유지
dedup = {(e.thread_id, e.ns): e for e in batch}

패턴	Write 경로	Consistency 보장
CheckpointSyncService	Redis → Queue → Batch → PG	DLQ fallback, Redis TTL 24h
ReadThroughCheckpointer	PG miss → Redis promote	Temporal Locality, LRU
Persistence Consumer	Streams → Batch flush → PG	Consumer Group, XACK

💡 설계 원칙

• Redis Primary: 모든 읽기/쓰기의 hot path (~1ms)
• PostgreSQL Eventual: 비동기 배치로 영속화 (latency 무관)
• 장애 격리: PG 불가 시 Queue 축적, 복구 후 자동 재개
• Chat 응답 지연에 DB 쓰기가 영향을 주지 않는 구조

┌─────────────────────────────────────────────────────────────────────────────────┐ │ KEDA + RabbitMQ 기반 Worker 스케일링 │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ❌ CPU 기반 HPA의 한계 ✅ KEDA 이벤트 드리븐 │ │ ──────────────────────────── ────────────────────────── │ │ scan-worker (50 VU 테스트 중) RabbitMQ 큐 길이 기반 │ │ CPU: 37m / 1000m (3.7%) 메시지 10개 이상 → 스케일업 │ │ ← CPU 미사용, HPA 미트리거! scan.vision, scan.answer 모니터링 │ │ │ │ 실제 병목: OpenAI API 대기 Worker 1→5 자동 확장 │ │ (6~10초/요청, I/O-bound) cooldownPeriod: 30초 │ │ pollingInterval: 10초 │ │ │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ ScaledObject 구성 ────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ apiVersion: keda.sh/v1alpha1 │ │ │ │ kind: ScaledObject │ │ │ │ spec: │ │ │ │ scaleTargetRef: scan-worker │ │ │ │ minReplicaCount: 1 │ │ │ │ maxReplicaCount: 10 │ │ │ │ triggers: │ │ │ │ - type: rabbitmq │ │ │ │ metadata: │ │ │ │ queueName: scan.vision │ │ │ │ mode: QueueLength │ │ │ │ value: '10' # 10개 이상 → 스케일업 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

설정	값	설명
minReplicaCount	1	유휴 시 최소 Pod
maxReplicaCount	10	최대 스케일 한계
QueueLength threshold	10	스케일업 트리거
cooldownPeriod	30초	스케일다운 대기
pollingInterval	10초	메트릭 체크 주기

📈 k6 50 VU 테스트 개선

Before (CPU HPA): 35% 완료율
After (KEDA): 86.3% 완료율 → +51.3%p 개선

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Event Router - Multi-domain Processing │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Stream 소비 (Consumer Group: eventrouter) ──────────────────────────────┐ │ │ │ │ │ │ │ stream_configs = [("scan:events", 4), ("chat:events", 4)] │ │ │ │ │ │ │ │ scan:events:0 ──┐ chat:events:0 ──┐ │ │ │ │ scan:events:1 ──┼─→ XREADGROUP chat:events:1 ──┼─→ XREADGROUP │ │ │ │ scan:events:2 ──┤ (Consumer Group) chat:events:2 ──┤ │ │ │ │ scan:events:3 ──┘ chat:events:3 ──┘ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─ Lua Script: UPDATE_STATE_SCRIPT (원자적) ───────────────────────────────┐ │ │ │ │ │ │ │ 1. 멱등성 체크: router:published:{job_id}:{seq} EXISTS? │ │ │ │ └─ YES → return 0 (스킵) │ │ │ │ └─ NO → 계속 │ │ │ │ │ │ │ │ 2. State 조건부 갱신: new_seq > current_seq? │ │ │ │ └─ YES → SETEX {domain}:state:{job_id} (최신 상태 유지) │ │ │ │ └─ NO → State 유지 (더 최신 값 보존) │ │ │ │ │ │ │ │ 3. 처리 마킹: SETEX router:published:{job_id}:{seq} TTL=2h │ │ │ │ │ │ │ │ 4. return 1 → Pub/Sub 발행 (모든 이벤트 전달, 순서 무관) │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─ Reclaimer (XAUTOCLAIM) ─────────────────────────────────────────────────┐ │ │ │ │ │ │ │ 주기: 60초마다 실행 │ │ │ │ 대상: min_idle_ms = 300,000 (5분 이상 미ACK 메시지) │ │ │ │ 처리: asyncio.gather() → 모든 도메인/샤드 병렬 reclaim │ │ │ │ │ │ │ │ scan:events:0 ─┐ │ │ │ │ scan:events:1 ─┤ │ │ │ │ chat:events:0 ─┼─→ XAUTOCLAIM → re-process → ACK │ │ │ │ chat:events:1 ─┤ │ │ │ │ ... ─┘ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

// apps/event_router/core/processor.py — Lua Script (핵심 발췌)

UPDATE_STATE_SCRIPT = """
local state_key = KEYS[1]      -- {domain}:state:{job_id}
local publish_key = KEYS[2]    -- router:published:{job_id}:{seq}

-- 멱등성: 이미 처리했으면 스킵
if redis.call('EXISTS', publish_key) == 1 then
    return 0
end

-- State 조건부 갱신 (더 큰 seq만)
local current = redis.call('GET', state_key)
if current then
    local cur_seq = tonumber(cjson.decode(current).seq) or 0
    if new_seq <= cur_seq then should_update_state = false end
end
if should_update_state then
    redis.call('SETEX', state_key, state_ttl, event_data)
end

-- 처리 마킹 + Pub/Sub 발행 허용
redis.call('SETEX', publish_key, published_ttl, '1')
return 1
"""

컴포넌트	역할	핵심 메커니즘
Consumer	XREADGROUP 기반 이벤트 소비	Multi-domain 8 streams 병렬
Processor	State 갱신 + Pub/Sub 발행	Lua Script 원자적 처리
Reclaimer	Pending 메시지 복구	XAUTOCLAIM (5분 idle)

🔑 수평 확장 보장

• 여러 Pod이 같은 job_id 이벤트를 동시에 처리해도 안전
• 순서가 뒤집혀도 모든 이벤트는 Pub/Sub에 발행됨
• State는 가장 큰 seq로만 갱신 → 최신 상태 항상 유지
• Consumer 장애 시 Reclaimer가 5분 후 자동 재할당

┌─────────────────────────────────────────────────────────────────────────────────┐ │ RabbitMQ Kubernetes Operator + Messaging Topology │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Operator 아키텍처 ────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ RabbitMQ Cluster Operator (v2.12.0) │ │ │ │ └─ RabbitmqCluster CR → eco2-rabbitmq-server (3 replicas, Quorum) │ │ │ │ │ │ │ │ Messaging Topology Operator (v1.16.0) │ │ │ │ └─ Exchange, Queue, Binding CRs → 선언적 토폴로지 관리 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 선언적 토폴로지 예시 ─────────────────────────────────────────────────┐ │ │ │ │ │ │ │ # Exchange CR # Queue CR │ │ │ │ apiVersion: rabbitmq.com/v1beta1 apiVersion: rabbitmq.com/v1beta1 │ │ │ │ kind: Exchange kind: Queue │ │ │ │ metadata: metadata: │ │ │ │ name: scan-exchange name: scan-vision-queue │ │ │ │ spec: spec: │ │ │ │ name: scan.events name: scan.vision │ │ │ │ type: direct type: quorum │ │ │ │ durable: true durable: true │ │ │ │ rabbitmqClusterReference: arguments: │ │ │ │ name: eco2-rabbitmq-server x-dead-letter-exchange: dlx │ │ │ │ x-delivery-limit: 5 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 현재 토폴로지 ────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Exchanges: Queues: │ │ │ │ ├─ scan.events (direct) ├─ scan.vision (quorum) │ │ │ │ ├─ blacklist.events (fanout) ├─ scan.rule (quorum) │ │ │ │ ├─ dlx (direct) ├─ scan.answer (quorum) │ │ │ │ └─ character.events (fanout) ├─ scan.reward (quorum) │ │ │ │ ├─ reward.persist (quorum) │ │ │ │ ├─ users.sync (quorum) │ │ │ │ └─ dlq.* (DLQ 큐들) │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Operator	버전	역할
RabbitMQ Cluster Operator	v2.12.0	RabbitmqCluster CR → 3-node Quorum 클러스터
Messaging Topology Operator	v1.16.0	Exchange, Queue, Binding CR 관리

✅ GitOps 장점

• Queue/Exchange를 YAML로 버전 관리
• ArgoCD가 자동 동기화 (선언적)
• 환경별 Kustomize overlay 적용
• 변경 이력 추적 가능

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Worker 이원 구조: Scan (Celery) vs Chat (Taskiq) │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ scan-worker (Celery + Gevent) ──────────────────────────────────────────┐ │ │ │ │ │ │ │ • 기존 AI 리서처 작성 동기 코드 기반 │ │ │ │ • Gevent 몽키패칭으로 I/O 동시성 확보 │ │ │ │ • Vision→Rule→Answer→Reward 체인 파이프라인 │ │ │ │ • prefork 대비 메모리 3.6GB → 500MB │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ chat-worker (Taskiq + asyncio) ─────────────────────────────────────────┐ │ │ │ │ │ │ │ • LangGraph가 asyncio-native → Taskiq 도입 │ │ │ │ • AioPikaBroker: RabbitMQ Direct Exchange (chat_tasks) │ │ │ │ • 분산 트레이싱: aio-pika + OpenAI + Gemini + LangSmith OTEL │ │ │ │ • W3C traceparent 전파 → Jaeger에서 E2E 추적 │ │ │ │ • process_chat_task: timeout=300s, retries=3 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 공유 인프라 ────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ RabbitMQ 클러스터 (3-node Quorum) │ │ │ │ ├─ scan.events (direct) → scan-worker │ │ │ │ ├─ chat_tasks (direct) → chat-worker │ │ │ │ └─ K8s Topology Operator가 Exchange/Queue 선언적 관리 │ │ │ │ │ │ │ │ Redis Event Bus (이벤트 발행 공유) │ │ │ │ ├─ scan:events:{shard} ← scan-worker │ │ │ │ └─ chat:events:{shard} ← chat-worker │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

// apps/chat_worker/setup/broker.py

broker = AioPikaBroker(
    url=settings.rabbitmq_url,
    declare_exchange=not _is_production,  # 운영: Topology Operator 사용
    exchange_name="chat_tasks",
    exchange_type=ExchangeType.DIRECT,
    routing_key="chat.process",
    queue_name="chat.process",
)

비교	scan-worker (Celery)	chat-worker (Taskiq)
런타임	Gevent (몽키패칭)	asyncio-native
LLM 프레임워크	동기 OpenAI SDK	LangGraph (async)
Exchange	scan.events (direct)	chat_tasks (direct)
이벤트 발행	Redis Streams (scan:events)	Redis Streams (chat:events)
트레이싱	-	aio-pika + OpenAI + Gemini OTEL

💡 선택 근거

• LangGraph의 모든 API가 async/await 기반 → Celery의 동기 워커 모델과 불일치
• Taskiq: asyncio event loop 위에서 네이티브 실행, aio-pika로 RabbitMQ 비동기 소비
• 기존 scan-worker는 동기 코드 기반이므로 Gevent로 유지 (안정성 우선)

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Clean Architecture - 4 Layer DIP │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ 의존성 방향 (바깥 → 안쪽) ──────────────────────────────────────────────┐ │ │ │ │ │ │ │ ┌────────────────────────────────────────────────────────────────┐ │ │ │ │ │ Presentation Layer │ │ │ │ │ │ HTTP Controllers, gRPC Servicers, Schemas │ │ │ │ │ │ └─ presentation/http/, presentation/grpc/ │ │ │ │ │ └────────────────────────────────────────────────────────────────┘ │ │ │ │ ↓ depends on │ │ │ │ ┌────────────────────────────────────────────────────────────────┐ │ │ │ │ │ Application Layer │ │ │ │ │ │ Commands (Write), Queries (Read), DTOs, Ports │ │ │ │ │ │ └─ application/commands/, queries/, ports/ │ │ │ │ │ └────────────────────────────────────────────────────────────────┘ │ │ │ │ ↓ depends on │ │ │ │ ┌────────────────────────────────────────────────────────────────┐ │ │ │ │ │ Domain Layer │ │ │ │ │ │ Entities, Value Objects, Domain Services, Domain Ports │ │ │ │ │ │ └─ domain/entities/, value_objects/, services/ │ │ │ │ │ └────────────────────────────────────────────────────────────────┘ │ │ │ │ ↑ implements │ │ │ │ ┌────────────────────────────────────────────────────────────────┐ │ │ │ │ │ Infrastructure Layer │ │ │ │ │ │ Adapters (Port 구현), DB/Redis/gRPC/MQ Clients │ │ │ │ │ │ └─ infrastructure/persistence_postgres/, persistence_redis/ │ │ │ │ │ └────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ DIP (Dependency Inversion Principle) ───────────────────────────────────┐ │ │ │ │ │ │ │ ❌ Before (직접 의존) ✅ After (인터페이스 의존) │ │ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ │ │ │ Service │──→│Repository│ │ Service │──→│ Port │←──│ Adapter │ │ │ │ │ └─────────┘ └─────────┘ └─────────┘ │(Protocol)│ │(SqlAlchemy)│ │ │ │ │ │ │ │ └─────────┘ └─────────┘ │ │ │ │ App Layer Infra Layer App Layer App Layer Infra Layer │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

레이어Layer	책임Responsibility	구성 요소Components
Presentation	외부 인터페이스External Interface	HTTP Router, gRPC Servicer, Schema
Application	유스케이스 오케스트레이션Use Case Orchestration	Command, Query, DTO, Port
Domain	핵심 비즈니스 로직Core Business Logic	Entity, Value Object, Domain Service
Infrastructure	기술 구현 (Port 구현체)Tech Implementation (Port Impl)	SQLAlchemy Adapter, Redis Adapter, gRPC Client

┌─────────────────────────────────────────────────────────────────────────────────┐ │ CQRS - Command Query Responsibility Segregation │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Application Layer 구조 ─────────────────────────────────────────────────┐ │ │ │ │ │ │ │ application/ │ │ │ │ ├── {feature}/ # 기능별 분리 │ │ │ │ │ ├── commands/ # Write 작업 (Create, Update, Delete)│ │ │ │ │ │ ├── oauth_authorize.py # OAuth 인증 URL 생성 │ │ │ │ │ │ ├── oauth_callback.py # OAuth 콜백 처리 │ │ │ │ │ │ └── logout.py # 로그아웃 │ │ │ │ │ ├── queries/ # Read 작업 │ │ │ │ │ │ └── validate_token.py # 토큰 검증 │ │ │ │ │ ├── dto/ # Request/Response 객체 │ │ │ │ │ └── ports/ # 외부 의존성 인터페이스 │ │ │ │ │ ├── user_command_gateway.py │ │ │ │ │ └── user_query_gateway.py │ │ │ │ └── services/ # 복잡한 비즈니스 로직 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Command/Query 패턴 ─────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ class OAuthCallbackCommand: class ValidateTokenQuery: │ │ │ │ """UseCase (Write)""" """UseCase (Read)""" │ │ │ │ │ │ │ │ def __init__( def __init__( │ │ │ │ self, self, │ │ │ │ user_gateway: UserCommandGateway, token_service: TokenService, │ │ │ │ oauth_client: OAuthClient, blacklist: TokenBlacklist, │ │ │ │ token_service: TokenService ... │ │ │ │ ): ): │ │ │ │ self._user_gateway = ... self._token_service = ... │ │ │ │ │ │ │ │ async def execute(self, req): async def execute(self, req): │ │ │ │ # 1. 유효성 검증 # Port 호출 → DTO 반환 │ │ │ │ # 2. 비즈니스 로직 pass │ │ │ │ # 3. Port 호출 │ │ │ │ # 4. 결과 반환 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Service 역할 분리 ──────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Facade Service (Port 보유) Pure Logic Service (Port 없음) │ │ │ │ ├─ OAuthFlowService ├─ ProfileBuilder │ │ │ │ ├─ TokenService └─ ScoreCalculator │ │ │ │ └─ 외부 시스템 캡슐화 └─ 순수 비즈니스 로직 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

유형Type	역할Role	특징Characteristics	예시Example
Command	상태 변경 (Write)State Change (Write)	Side Effect, Transaction	OAuthCallback, Logout
Query	조회 (Read)Read Only	No Side Effect, Cacheable	ValidateToken, GetProfile
Port	인터페이스 정의Interface Definition	Protocol/ABC, DI	UserCommandGateway
Service	복잡 로직 캡슐화Complex Logic Encapsulation	Facade or Pure	OAuthFlowService

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Port & Adapter - Hexagonal Architecture │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Port/Adapter 매핑 ──────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Driving ┌─────────────────┐ Driven │ │ │ │ Adapters │ Application │ Adapters │ │ │ │ │ Core │ │ │ │ │ ┌───────────┐ Port │ ┌───────────┐ │ Port ┌───────────┐ │ │ │ │ │HTTP Router│ ────────→ │ │ UseCase │ │ ←────── │SQLAlchemy │ │ │ │ │ └───────────┘ │ │ + Domain │ │ │ Adapter │ │ │ │ │ ┌───────────┐ Port │ └───────────┘ │ Port └───────────┘ │ │ │ │ │ gRPC │ ────────→ │ │ ←────── ┌───────────┐ │ │ │ │ │ Servicer │ │ │ │ Redis │ │ │ │ │ └───────────┘ └─────────────────┘ │ Adapter │ │ │ │ │ └───────────┘ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 현재 구현된 Adapter 교체 가능성 ────────────────────────────────────────┐ │ │ │ │ │ │ │ Persistence Adapters: │ │ │ │ ├─ UserQueryGatewaySqla ← PostgreSQL (현재) │ │ │ │ │ └─ 교체 가능: MySQL, MongoDB, DynamoDB │ │ │ │ ├─ TokenBlacklistStoreRedis ← Redis (현재) │ │ │ │ │ └─ 교체 가능: Memcached, In-Memory │ │ │ │ │ │ │ │ │ LLM Adapters (chat_worker): │ │ │ │ ├─ OpenAIAdapter ← GPT-5.2 (현재) │ │ │ │ │ └─ 교체 가능: Anthropic, Gemini, Local LLM │ │ │ │ ├─ VisionAdapter ← GPT-5.2 (현재) │ │ │ │ │ └─ 교체 가능: Claude Vision, Gemini Vision │ │ │ │ │ │ │ │ │ MQ Adapters: │ │ │ │ ├─ RabbitMQPublisher ← RabbitMQ (현재) │ │ │ │ │ └─ 교체 가능: Kafka, Redis Streams, AWS SQS │ │ │ │ ├─ CeleryTaskAdapter ← Celery (현재) │ │ │ │ │ └─ 교체 가능: Dramatiq, Huey │ │ │ │ │ │ │ │ │ Event Adapters: │ │ │ │ ├─ RedisStreamNotifier ← Redis Streams (현재) │ │ │ │ │ └─ 교체 가능: Kafka, RabbitMQ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

Port (인터페이스)Port (Interface)	현재 AdapterCurrent Adapter	교체 가능 옵션Swappable Options
UserQueryGateway	SQLAlchemy (PostgreSQL)	MySQL, MongoDB, DynamoDB
TokenBlacklistStore	Redis Adapter	Memcached, In-Memory
LLMAdapter (chat)	OpenAI GPT-5.2	Anthropic, Gemini, Local LLM
MessagePublisher	RabbitMQ	Kafka, Redis Streams, AWS SQS
ProgressNotifier	Redis Streams	Kafka, RabbitMQ, WebSocket

┌─────────────────────────────────────────────────────────────────────────────────┐ │ LLM Pipeline Resilience - Fallback + Checkpoint │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ 4단계 Celery Chain (~12s total) ────────────────────────────────────────┐ │ │ │ │ │ │ │ Vision (3s) → Rule (0.5s) → Answer (8s) → Reward (0.5s) │ │ │ │ GPT-5.2 Rule Match GPT-5.2 Score Calc │ │ │ │ ↓ ↓ ↓ ↓ │ │ │ │ [Checkpoint] [Checkpoint] [Checkpoint] [Persist] │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Fallback 전략 ──────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Vision Stage: │ │ │ │ ├─ Primary: GPT-5.2 │ │ │ │ ├─ Fallback: GPT-5.2 (비용 증가, 정확도 유지) │ │ │ │ └─ Final: Generic Error Response │ │ │ │ │ │ │ │ Answer Stage: │ │ │ │ ├─ Primary: GPT-5.2 │ │ │ │ ├─ Fallback: GPT-5.2 │ │ │ │ └─ Final: Cached Similar Response │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Checkpoint Recovery ────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ scan:{task_id}:state = { │ │ │ │ "stage": "rule", # 현재 진행 단계 │ │ │ │ "vision_result": {...}, # Vision 결과 캐시 │ │ │ │ "retry_count": 1, # 재시도 횟수 │ │ │ │ "timestamp": "2026-01-15T..." │ │ │ │ } │ │ │ │ │ │ │ │ 장애 발생 시: │ │ │ │ 1. Worker 재시작 → Redis에서 state 조회 │ │ │ │ 2. "rule" stage부터 재개 (Vision 결과 재사용) │ │ │ │ 3. retry_count < 3 → 자동 재시도 │ │ │ │ 4. retry_count >= 3 → DLQ로 이동 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 고지연/불안정 헷지 ──────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Timeout 설정: Rate Limiting: Circuit Breaker: │ │ │ │ ├─ Vision: 10s ├─ 100 RPM/model ├─ 5 failures → open │ │ │ │ ├─ Answer: 30s ├─ 50k TPM/model ├─ 30s cooldown │ │ │ │ └─ Total: 60s soft └─ Queue backpressure └─ Fallback trigger │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

단계Stage	지연Latency	LLMLLM	복원력 전략Resilience Strategy
Vision	~3s	GPT-5.2	Fallback → GPT-5.2, Checkpoint 저장
Rule	~0.5s	-	Rule Engine, 빠른 매칭
Answer	~8s	GPT-5.2	Fallback → GPT-5.2, Streaming
Reward	~0.5s	-	Batch Persist, Idempotency

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Domain-specific ConfigMap/Secret Management │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Namespace별 ConfigMap/Secret 분리 ──────────────────────────────────────┐ │ │ │ │ │ │ │ auth namespace: character namespace: │ │ │ │ ├─ auth-config (ConfigMap) ├─ character-config (ConfigMap) │ │ │ │ │ ├─ JWT_ALGORITHM │ ├─ GRPC_PORT │ │ │ │ │ ├─ OAUTH_REDIRECT_URL │ ├─ CHARACTER_TRAITS_VERSION │ │ │ │ │ └─ TOKEN_EXPIRY │ └─ MATCHING_THRESHOLD │ │ │ │ ├─ auth-secret (Secret) ├─ character-secret (Secret) │ │ │ │ │ ├─ JWT_SECRET │ └─ DB_PASSWORD │ │ │ │ │ ├─ OAUTH_CLIENT_ID │ │ │ │ │ │ └─ OAUTH_CLIENT_SECRET │ │ │ │ │ │ │ │ │ │ │ chat namespace: scan namespace: │ │ │ │ ├─ chat-config ├─ scan-config │ │ │ │ │ ├─ LLM_MODEL │ ├─ VISION_MODEL │ │ │ │ │ ├─ MAX_CONTEXT_LENGTH │ ├─ ANSWER_MODEL │ │ │ │ │ └─ STREAM_TIMEOUT │ └─ REWARD_RULES_PATH │ │ │ │ ├─ chat-secret ├─ scan-secret │ │ │ │ │ └─ OPENAI_API_KEY │ └─ OPENAI_API_KEY │ │ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ External Secrets Operator 통합 ─────────────────────────────────────────┐ │ │ │ │ │ │ │ AWS Secrets Manager External Secret CRD │ │ │ │ ┌─────────────────┐ ┌─────────────────────────────────────┐ │ │ │ │ │ eco2/auth/jwt │ → │ apiVersion: external-secrets.io/v1 │ │ │ │ │ │ eco2/auth/oauth │ │ kind: ExternalSecret │ │ │ │ │ │ eco2/chat/openai│ │ spec: │ │ │ │ │ │ eco2/scan/openai│ │ secretStoreRef: aws-secrets │ │ │ │ │ └─────────────────┘ │ target: auth-secret │ │ │ │ │ ↓ │ data: │ │ │ │ │ IRSA Auth (ServiceAccount) │ - secretKey: JWT_SECRET │ │ │ │ │ │ remoteRef: │ │ │ │ │ │ key: eco2/auth/jwt │ │ │ │ │ └─────────────────────────────────────┘ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 장점 ───────────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ ✅ 도메인 팀 자율성: 각 팀이 자체 ConfigMap/Secret 관리 │ │ │ │ ✅ 변경 격리: auth 설정 변경이 chat에 영향 없음 │ │ │ │ ✅ 보안 강화: 최소 권한 원칙, 도메인별 Secret 접근 제한 │ │ │ │ ✅ 롤백 용이: 도메인 단위로 설정 롤백 가능 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

도메인Domain	ConfigMapConfigMap	SecretSecret	특수 설정Special Config
auth	auth-config	auth-secret (JWT, OAuth)	OAuth Provider별 설정
character	character-config	character-secret	gRPC 포트, 매칭 임계값
chat	chat-config	chat-secret (OpenAI)	LLM 모델, 컨텍스트 길이
scan	scan-config	scan-secret (OpenAI)	Vision/Answer 모델
users	users-config	users-secret	동기화 설정

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Radon Cyclomatic Complexity Analysis │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ 측정 결과 요약 ─────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Total Blocks: 2,120 │ │ │ │ Average Complexity: 2.28 (Grade A) │ │ │ │ │ │ │ │ Grade Distribution: │ │ │ │ ┌────────────────────────────────────────────────────────────┐ │ │ │ │ │ A (1-5) ████████████████████████████████████████ 92.3% │ │ │ │ │ │ B (6-10) █████ 5.8% │ │ │ │ │ │ C (11-20) █ 1.4% │ │ │ │ │ │ D (21-30) ░ 0.4% │ │ │ │ │ │ E (31-40) ░ 0.1% │ │ │ │ │ │ F (41+) ░ 0.0% │ │ │ │ │ └────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ 도메인별 복잡도 ────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Domain Blocks Avg CC Grade Highest Function │ │ │ │ ──────────────────────────────────────────────────────────────────────│ │ │ │ auth 342 2.1 A oauth_callback (8) │ │ │ │ character 287 2.3 A match_traits (7) │ │ │ │ chat_worker 456 2.8 A intent_classifier (12) │ │ │ │ scan_worker 389 2.4 A vision_processor (9) │ │ │ │ users 198 1.9 A sync_character (5) │ │ │ │ location 156 1.7 A nearby_search (4) │ │ │ │ images 134 1.8 A upload_handler (6) │ │ │ │ ext_authz 158 2.2 A check_permission (7) │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Clean Architecture 기여 ────────────────────────────────────────────────┐ │ │ │ │ │ │ │ Before (Layered): After (Clean): │ │ │ │ ├─ AuthService: CC 45 ├─ OAuthCallbackCommand: CC 8 │ │ │ │ │ (God Object) ├─ LogoutCommand: CC 3 │ │ │ │ │ ├─ RefreshTokenCommand: CC 5 │ │ │ │ │ └─ ValidateTokenQuery: CC 4 │ │ │ │ │ │ │ │ │ 낮은 복잡도의 이점: │ │ │ │ ✅ 테스트 커버리지 향상 (단순한 분기) │ │ │ │ ✅ 버그 발생률 감소 (인지 부하 감소) │ │ │ │ ✅ 코드 리뷰 효율성 (작은 함수 단위) │ │ │ │ ✅ 신규 에이전트 온보딩 용이 │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

등급Grade	범위Range	의미Meaning	비율Ratio
A	1-5	단순, 저위험Simple, Low Risk	92.3%
B	6-10	복잡, 중위험Complex, Moderate Risk	5.8%
C	11-20	고복잡High Complexity	1.4%
D-F	21+	리팩토링 필요Refactor Required	0.5%

📈 Radon 측정 명령어

radon cc apps/ -a -s --total-average
radon cc apps/ -a -nc  # A등급만 제외하고 표시

Clean Architecture 전환으로 Port/Usecase 단위가 작아져 Mock 기반 단위 테스트 작성 비용이 낮아졌고, 커버리지 측정이 쉬워졌습니다.
아래 수치는 코드베이스 문서/리포트에 기록된 실측(coverage 출력) 기반입니다. Clean Architecture reduces unit size (ports/usecases), lowering test-writing cost with mocks and making coverage easier to measure.
Numbers below are based on documented measured outputs (coverage reports) in this repo.

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Test Coverage Strategy │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─ Testing Pyramid ──────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ ▲ │ │ │ │ /·\ E2E (Selective) │ │ │ │ /···\ ████ 핵심 플로우만 │ │ │ │ /·····\ │ │ │ │ ───────── │ │ │ │ /·········\ Integration │ │ │ │ /···········\ ████████ DB/MQ 연동 │ │ │ │ ───────────────── │ │ │ │ /·················\ Unit Tests │ │ │ │ /···················\ ████████████████████ 88%+ │ │ │ │ ───────────────────────── │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ │ ┌─ Clean Architecture가 테스트에 미친 영향 ──────────────────────────────┐ │ │ │ │ │ │ │ Before (Layered): After (Clean Architecture): │ │ │ │ ├─ 하나의 Service에 모든 로직 ├─ Command/Query 분리 │ │ │ │ │ → Mock 대상이 많고 복잡 │ → 작은 단위, 쉬운 격리 │ │ │ │ │ → 커버리지 측정 어려움 │ → Port 기반 의존성 주입 │ │ │ │ │ │ → Adapter 교체로 Mock 용이 │ │ │ │ │ │ │ └────────────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────────────┘

📊 앱별 테스트 현황 (2026-01 기준) 📊 Test Status by App (as of 2026-01)

컴포넌트Component	테스트 파일Test Files	테스트 함수Test Functions	주요 대상Target
chat_worker	28	394	LangGraph Nodes, Adapters, Commands
auth_worker	11	69	OAuth, JWT, Commands
auth	10	67	Controllers, Services
auth_relay	10	64	Token Relay, Middleware
location	4	65	gRPC, Queries
character	6	42	Domain, gRPC
sse_gateway	4	31	Event Streaming
scan	3	27	Submit API
users	3	26	Domain, Controllers
scan_worker	4	15	Celery Tasks
event_router	3	13	Redis Streams
Total	86	813	-

⚙️ CI 파이프라인 통합 ⚙️ CI Pipeline Integration

단계Stage	도구Tool	설정Config	출력Output
커버리지 측정Coverage	pytest-cov	`--cov=apps/`	coverage.xml
품질 분석Quality	SonarCloud	`sonar.python.coverage.reportPaths`	대시보드Dashboard
제외 패턴Exclusions	pytest.ini	`--ignore=tests/integration`	Unit만 CI 실행Unit-only in CI

✅ Unit 테스트 전략 ✅ Unit Test Strategy

• Port 인터페이스 Mock으로 격리
• Command/Query 단위로 분리
• 88%+ 커버리지 목표 • Isolate via Port interface Mocks
• Separate by Command/Query units
• Target 88%+ coverage

🔄 Integration 테스트 전략 🔄 Integration Test Strategy

• 핵심 플로우만 선택적 실행
• CI에서 기본 제외 (속도)
• 배포 전 수동 검증 • Selective run for core flows
• Excluded by default in CI (speed)
• Manual verification pre-deploy

┌─────────────────────────────────────────────────────────────────────┐
│                     Vision LLM Stage                                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌──────────────┐      ┌──────────────────┐      ┌─────────────┐  │
│   │  📷 Image    │  ──► │  🧠 GPT-5.2 /    │  ──► │  📝 분류    │  │
│   │  (Base64)    │      │  Gemini 3.0 Pro  │      │  결과       │  │
│   └──────────────┘      └────────┬─────────┘      └─────────────┘  │
│                                  │                                  │
│                                  ▼                                  │
│                       ┌──────────────────────┐                      │
│                       │ 📋 System Prompt     │                      │
│                       │ ────────────────────│                       │
│                       │ • 재활용 가능 여부   │                       │
│                       │ • 분류 기준 제시     │                       │
│                       │ • 출력 포맷 정의     │                       │
│                       └──────────────────────┘                      │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

항목Item	설명Description
모델Model	gpt-5.2 (default) / gemini-3-pro-preview
입력Input	사용자가 촬영한 이미지 (Base64 인코딩)User-captured image (Base64 encoded)
출력Output	품목 분류 결과 (대분류, 소분류, 재질)Item classification (category, subcategory, material)
프롬프트Prompt	분류 기준과 판단 근거를 명시한 시스템 프롬프트System prompt with classification criteria and reasoning guidelines

📊 실측 레이턴시 (Prometheus) 📊 Measured Latency (Prometheus)

조건Condition	Avg	비고Note
단일 요청Single request	6.9s	기준값 (워밍업 후)Baseline (warmed up)
저부하Low Load (k6 VU 10)	4.5s	비동기 Gevent 전환 후After Gevent migration
🎯 SLA (k6 VU 500)	E2E p95: 83.3s (전체 Chain 기준)E2E p95: 83.3s (total Chain)

┌─────────────────────────────────────────────────────────────────────┐
│                     Rule Engine Stage                               │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌──────────────┐      ┌──────────────────┐      ┌─────────────┐  │
│   │  📝 분류     │  ──► │  🔍 Rule Search  │  ──► │  📄 관련    │  │
│   │  결과        │      │ (Rule-based RAG) │      │  규정       │  │
│   └──────────────┘      └────────┬─────────┘      └─────────────┘  │
│                                  │                                  │
│                                  ▼                                  │
│                  ┌──────────────────────────────┐                   │
│                  │ 📁 YAML Knowledge Base       │                   │
│                  │ ─────────────────────────────│                   │
│                  │ item_class_list.yaml         │                   │
│                  │ ├── 대분류: 플라스틱/종이/... │                   │
│                  │ ├── 소분류: PET/PP/PE/...    │                   │
│                  │ └── 규정: 세척/라벨제거/...  │                   │
│                  │                              │                   │
│                  │ 📊 167개 품목 규정 수록       │                   │
│                  └──────────────────────────────┘                   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

항목Item	설명Description
데이터 소스Data Source	온누리 재활용 품목별 규정 (YAML 구조화)On-Nuri recycling regulations per item (YAML structured)
검색 방식Search Method	분류 결과 기반 키워드 매칭 → 관련 규정 추출Keyword matching based on classification → Extract relevant rules
규정 항목Rule Items	세척 여부, 라벨 제거, 분리배출 방법, 특이사항Washing, label removal, disposal method, special notes
품목 수Item Count	200+ 품목items

📊 실측 레이턴시 (Prometheus) 📊 Measured Latency (Prometheus)

조건Condition	Avg	비고Note
단일 요청Single request	0.5ms	기준값 (워밍업 후)Baseline (warmed up)
저부하Low Load (k6 VU 10)	0.3s	비동기 Gevent 전환 후After Gevent migration
🎯 SLA (k6 VU 500)	E2E p95: 83.3s (전체 Chain 기준)E2E p95: 83.3s (total Chain)

┌─────────────────────────────────────────────────────────────────────┐
│                     Answer LLM Stage                                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌──────────────┐      ┌──────────────────┐      ┌─────────────┐  │
│   │  📄 관련     │  ──► │  🧠 gpt-5.2-mini │  ──► │  📤 JSON    │  │
│   │  규정        │      │  / Gemini 3 Flash│      │  Output     │  │
│   └──────────────┘      └────────┬─────────┘      └─────────────┘  │
│                                  │                                  │
│                                  ▼                                  │
│              ┌─────────────────────────────────────┐                │
│              │ 📋 Structured Output Schema         │                │
│              │ ──────────────────────────────────  │                │
│              │ {                                   │                │
│              │   "category": "플라스틱",           │                │
│              │   "subcategory": "PET",             │                │
│              │   "recyclable": true,               │                │
│              │   "instructions": ["세척", "라벨제거"],│               │
│              │   "reward_points": 10,              │                │
│              │   "explanation": "..."              │                │
│              │ }                                   │                │
│              └─────────────────────────────────────┘                │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

항목Item	설명Description
모델Model	gpt-5.2 (default) / gemini-3-flash-preview
입력Input	Vision 분류 결과 + Rule Engine 검색 규정Vision classification + Rule Engine retrieved regulations
출력Output	JSON Structured Output (스키마 강제)JSON Structured Output (schema enforced)

📊 실측 레이턴시 (Prometheus) 📊 Measured Latency (Prometheus)

조건Condition	Avg	비고Note
단일 요청Single request	4.8s	기준값 (워밍업 후)Baseline (warmed up)
저부하Low Load (k6 VU 10)	4.8s	비동기 Gevent 전환 후After Gevent migration

⏱️ 전체 Chain Duration (E2E) ⏱️ Total Chain Duration (E2E)

VU	완료율Completion	E2E p95	상태Status
🎯 500 (SLA)	100%	83.3s	✅ SLA 기준SLA baseline
700	99.2%	122.3s	✅
900	99.7%	149.6s	✅
1000	97.8%	173.3s	⚠️ 포화 지점Saturation

📈 비동기 전환 효과 📈 Async Migration Impact

Vision 4.5s + Rule 0.3s + Answer 4.8s + Reward 1.7s = ~11.3s (저부하 평균). OpenAI API가 전체 파이프라인의 82% (9.3s/11.3s) 차지. Vision 4.5s + Rule 0.3s + Answer 4.8s + Reward 1.7s = ~11.3s (low load avg). OpenAI API accounts for 82% (9.3s/11.3s) of total pipeline.

┌────────────────────────────────────────────────────────────────────────────────────┐
│                        4-Stage Celery Chain Pipeline                                │
├────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                    │
│   ┌────────────┐    ┌────────────┐    ┌────────────┐    ┌────────────┐             │
│   │  👁️ Vision │───▶│  📋 Rule   │───▶│  💬 Answer │───▶│  🎁 Reward │             │
│   │  ~4.5s     │    │  ~0.3s     │    │  ~4.8s     │    │  ~1.7s     │             │
│   │  (40%)     │    │  (5%)      │    │  (45%)     │    │  (10%)     │             │
│   └─────┬──────┘    └─────┬──────┘    └─────┬──────┘    └─────┬──────┘             │
│         │ seq:10-11       │ seq:20-21       │ seq:30-31       │ seq:40-41          │
│         ▼                 ▼                 ▼                 ▼                    │
│   ┌───────────────────────────────────────────────────────────────────┐            │
│   │                    Redis Event Bus (XADD)                          │            │
│   │                scan:events:{shard} (4 shards)                      │            │
│   └───────────────────────────────────────────────────────────────────┘            │
│                                                                                    │
│   Total: ~11.3s (LLM I/O-bound 85%)                                                │
└────────────────────────────────────────────────────────────────────────────────────┘

📊 Stage별 상세 📊 Stage Details

Stage	Queue	처리 내용Processing	외부 의존성External Dependency	Latency
👁️ Vision	`scan.vision`	이미지 분류 (GPT Vision)Image Classification (GPT Vision)	OpenAI API (gpt-5.2)	~4.5s (40%)
📋 Rule	`scan.rule`	규정 검색 (Lite RAG)Regulation Search (Lite RAG)	JSON File (In-Memory)	~0.3s (5%)
💬 Answer	`scan.answer`	응답 생성 (Structured Output)Response Generation (Structured Output)	OpenAI API (gpt-5.2)	~4.8s (45%)
🎁 Reward	`scan.reward`	캐릭터 매칭 + 포인트 지급Character Matching + Points	Character gRPC Service	~1.7s (10%)

🔗 Celery Chain 설정 🔗 Celery Chain Configuration

# Celery Chain: 순차 실행 보장
chain = (
    vision_task.s(task_id, image_url, model)
    | rule_task.s()      # vision 결과 자동 전달
    | answer_task.s()    # rule 결과 자동 전달
    | reward_task.s()    # answer 결과 자동 전달
)

# 실행: apply_async로 비동기 발행
chain.apply_async(task_id=task_id)

# Gevent Pool: 100 coroutines per worker
celery -A scan_worker worker --pool=gevent --concurrency=100

🔄 Retry 전략 🔄 Retry Strategy

설정Setting	값Value	설명Description
`max_retries`	3	최대 재시도 횟수Maximum retry attempts
`default_retry_delay`	5s	재시도 간격 (exponential backoff)Retry interval (exponential backoff)
`acks_late`	true	작업 완료 후 ACK (재시도 보장)ACK after completion (retry guarantee)
`reject_on_worker_lost`	true	워커 실패 시 큐 재전달Requeue on worker failure
`task_time_limit`	300s	하드 타임아웃 (강제 종료)Hard timeout (force kill)
`task_soft_time_limit`	240s	소프트 타임아웃 (SoftTimeLimitExceeded)Soft timeout (SoftTimeLimitExceeded)

📡 Event Sequence (SSE 전송) 📡 Event Sequence (SSE Delivery)

seq: 10-11

vision.start/done

seq: 20-21

rule.start/done

seq: 30-31

answer.start/done

seq: 40-41

reward.start/done

seq: 51

complete

💡 Idempotent Event 처리 💡 Idempotent Event Processing

Event Router가 seq 번호로 중복 검사 → Lua Script로 원자적 업데이트 → published marker로 재전송 방지 Event Router checks seq for duplicates → Atomic update via Lua Script → published marker prevents re-delivery

┌─────────────────────────────────────────────────────────────────────┐
│                  Multi-intent Classification                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌──────────────┐      ┌──────────────────┐      ┌─────────────┐  │
│   │  💬 User     │  ──► │  🧠 LLM          │  ──► │  🏷️ Intent │  │
│   │  Query       │      │  Classifier      │      │  Tags       │  │
│   └──────────────┘      └────────┬─────────┘      └─────────────┘  │
│                                  │                                  │
│                                  ▼                                  │
│      ┌────────────────────────────────────────────────────────┐     │
│      │ 📋 intent.txt (System Prompt)                          │     │
│      │ ─────────────────────────────────────────────────────  │     │
│      │ 4-Class Routing:                                       │     │
│      │ ┌────────────┬────────────┬────────────┬────────────┐ │     │
│      │ │ 🗑️ waste  │ 🎭 character│ 📍 location│ 💭 general │ │     │
│      │ │ 분리배출   │ 캐릭터      │ 수거장소   │ 일반대화   │ │     │
│      │ └────────────┴────────────┴────────────┴────────────┘ │     │
│      │                                                        │     │
│      │ 복합 질문 분해:                                        │     │
│      │ "플라스틱 버리는 곳 알려줘" → [waste, location]        │     │
│      └────────────────────────────────────────────────────────┘     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

분류Class	설명Description	예시Example
🗑️ waste	분리배출 방법 질문Recycling method questions	"페트병 어떻게 버려?""How to dispose PET bottles?"
🎭 character	캐릭터 관련 질문Character-related questions	"초록이가 좋아하는 건?""What does Choroki like?"
📍 location	수거 장소 질문Collection point questions	"근처 재활용 센터""Nearby recycling center"
💭 general	일반 대화/인사General chat/greetings	"안녕!""Hello!"

💡 복합 질문 분해 💡 Compound Query Decomposition

하나의 질문에 여러 의도가 포함된 경우, 각각 별도의 서브에이전트로 라우팅하여 병렬 처리 후 결과 통합 When a query contains multiple intents, route to separate sub-agents for parallel processing, then merge results

┌──────────────────────────────────────────────────────────────────────────┐
│                  TagBasedRetriever Architecture                          │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   ┌──────────────┐                          ┌────────────────────────┐   │
│   │ 💬 Query     │──────────────────────────│ 🔍 Vector Search       │   │
│   │ + Intent     │                          │ (Embedding Similarity) │   │
│   └──────┬───────┘                          └───────────┬────────────┘   │
│          │                                              │                │
│          ▼                                              ▼                │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │ 📁 Structured Data Injection (YAML → Context)                   │   │
│   │ ────────────────────────────────────────────────────────────────│   │
│   │                                                                  │   │
│   │ item_class_list.yaml          situation_tags.yaml               │   │
│   │ ┌─────────────────────┐       ┌─────────────────────┐          │   │
│   │ │ 167개 품목            │       │ 80개 상황 태그       │          │   │
│   │ │ ├── 플라스틱         │       │ ├── 청소            │          │   │
│   │ │ │   ├── PET         │       │ ├── 분리수거        │          │   │
│   │ │ │   ├── PP          │       │ ├── 재활용          │          │   │
│   │ │ │   └── PE          │       │ └── 환경보호        │          │   │
│   │ │ ├── 종이            │       └─────────────────────┘          │   │
│   │ │ └── 금속            │                                         │   │
│   │ └─────────────────────┘                                         │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                    │                                     │
│                                    ▼                                     │
│                        ┌─────────────────────┐                          │
│                        │ 📄 Enriched Context │                          │
│                        │ (Query + YAML data) │                          │
│                        └─────────────────────┘                          │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

파일File	내용Content	용도Purpose
`item_class_list.yaml`	167개 품목 분류 체계200+ item classification	분리배출 질문 시 품목 매칭Item matching for disposal questions
`situation_tags.yaml`	80개 상황 태그100+ situation tags	컨텍스트 기반 검색 강화Contextual search enhancement

🔗 Anthropic Contextual Retrieval 패턴 🔗 Anthropic Contextual Retrieval Pattern

구조화된 YAML 데이터를 LLM 컨텍스트에 직접 주입하여 검색 정확도 향상. 기존 벡터 검색만으로는 놓칠 수 있는 도메인 특화 정보를 보완. Inject structured YAML data directly into LLM context for improved retrieval accuracy. Supplements domain-specific information that vector search alone might miss.

┌──────────────────────────────────────────────────────────────────────────┐
│                    Eval Agent - 4 Phase Evaluation                       │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Phase 1: Citation (근거 검증)                                   │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • 응답이 검색된 문서에 기반하는지 확인                          │   │
│   │  • Rule-based 빠른 판단 (정규식, 키워드 매칭)                    │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                              ▼                                           │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Phase 2: Nugget (완전성 평가)                                   │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • 핵심 정보 포함 여부 체크                                      │   │
│   │  • 누락된 정보 식별                                              │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                              ▼                                           │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Phase 3: Groundedness (검증)                                    │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • LLM 기반 정밀 평가                                            │   │
│   │  • Hallucination 탐지                                            │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                              ▼                                           │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Phase 4: Just-in-Time (다음 액션)                               │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • 품질 점수 기반 분기 결정                                      │   │
│   │  • Pass / Retry / Fallback 판단                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

Phase	평가 항목Evaluation	방식Method	기준Threshold
1. Citation	근거 존재Evidence exists	Rule-based	≥1 참조reference
2. Nugget	정보 완전성Info completeness	Rule + LLM	≥70%
3. Groundedness	사실 검증Fact verification	LLM	≥0.8 score
4. Just-in-Time	다음 액션Next action	Decision Tree	Pass/Retry/Fallback

🔬 Anthropic RAG Evaluation Patterns 🔬 Anthropic RAG Evaluation Patterns

Rule 기반 빠른 판단과 LLM 기반 정밀 평가를 조합하여 비용 효율적인 품질 평가. 저품질 응답은 Fallback Chain으로 전달. Combines rule-based fast judgment with LLM-based precise evaluation for cost-efficient quality assessment. Low-quality responses are passed to Fallback Chain.

┌──────────────────────────────────────────────────────────────────────────┐
│                     Fallback Chain Strategy                              │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   Eval Agent 저품질 판정                                                 │
│        │                                                                 │
│        ▼                                                                 │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Tier 1: RAG Retry (검색 재시도)                                 │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • 쿼리 리라이팅 후 재검색                                       │   │
│   │  • 다른 검색 전략 시도                                           │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│        │ 실패 시                                                         │
│        ▼                                                                 │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Tier 2: Web Search (외부 검색)                                  │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • Tavily/Serper API로 웹 검색                                   │   │
│   │  • 최신 정보 보완                                                │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│        │ 실패 시                                                         │
│        ▼                                                                 │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Tier 3: General LLM (일반 응답)                                 │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • 컨텍스트 없이 LLM 직접 응답                                   │   │
│   │  • "정확한 정보 확인 필요" 안내 포함                             │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│        │ 의도 불명확 시                                                  │
│        ▼                                                                 │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Clarify Template (명확화 요청)                                  │   │
│   │  ───────────────────────────────────────────────────────────────│   │
│   │  • "좀 더 구체적으로 질문해 주세요"                              │   │
│   │  • 예시 질문 제안                                                │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

Tier	전략Strategy	조건Condition	비용Cost
1	RAG Retry	품질 점수 < 임계값Quality score < threshold	낮음Low
2	Web Search	RAG 재시도 실패RAG retry failed	중간Medium
3	General LLM	Web 검색도 실패Web search also failed	높음High
-	Clarify	의도 파악 불가Intent unclear	없음None

🔁 Recursive Self-Improvement Loop 🔁 Recursive Self-Improvement Loop

Fallback 결과도 Eval Agent로 재평가하여 품질 기준 충족 시까지 반복. 무한 루프 방지를 위해 최대 재시도 횟수 제한. Fallback results are re-evaluated by Eval Agent, repeating until quality threshold is met. Max retry limit prevents infinite loops.

┌─────────────────────────────────────────────────────────────────────────────────┐
│                    Scan Worker Infrastructure (Production)                       │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   ┌────────────────────────────────────────────────────────────────────────┐    │
│   │  Celery Worker (gevent pool=100)                                        │    │
│   │  ───────────────────────────────────────────────────────────────────    │    │
│   │  celery -A scan_worker.main worker --pool=gevent --concurrency=100      │    │
│   │  -Q scan.vision,scan.rule,scan.answer,scan.reward                       │    │
│   └────────────────────────────────────────────────────────────────────────┘    │
│                                                                                 │
│   4-Stage Chain + Sequence Events (순차 의존성)                                 │
│   ────────────────────────────────────────────────────────────────────────      │
│   vision (5.16s) → rule (0.05s) → answer (3.69s) → reward (0.22s) → done       │
│     seq:10-11      seq:20-21      seq:30-31        seq:40-41       seq:51      │
│        ↓              ↓              ↓                ↓                         │
│   OpenAI Vision   Lite RAG      OpenAI Chat     Character gRPC                 │
│                  (PostgreSQL)                 (완벽 분류 시에만 지급)            │
│                                                                                 │
│   2-Redis Event Bus Layer (Idempotent Fan-out)                                  │
│   ────────────────────────────────────────────────────────────────────────      │
│                                                                                 │
│   Worker ──XADD──▶ Streams Redis ──────────────▶ Event Router ──PUBLISH──▶     │
│                    (4 shards: scan:events:0~3)       │                          │
│                    (Durable Buffer)                  │   ┌──────────────────┐   │
│                                                      ├──▶│ State KV Update  │   │
│   ┌──────────────────────────────────────────────────┘   │ (Lua Atomic)     │   │
│   │                                                      └──────────────────┘   │
│   │   StreamConsumer          EventProcessor        PendingReclaimer            │
│   │   ───────────────         ──────────────        ────────────────            │
│   │   XREADGROUP              Lua Script            XAUTOCLAIM                  │
│   │   eventrouter             seq 중복 검사          5min idle 재처리            │
│   │   batch count             published marker      (idempotent safe)           │
│   │                           TTL 2h                                            │
│   │                                                                             │
│   └──────────▶ Pub/Sub Redis ──────────────────────▶ SSE Gateway ──▶ Client    │
│                (Volatile Broadcast)                   (5s timeout → catch-up)   │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

🎯 SLA 성능 지표 (k6 VU 500 기준) 🎯 SLA Performance Metrics (k6 VU 500 baseline)

VU	완료율Completion	E2E p95	처리량Throughput	상태Status
🎯 500 (SLA)	100%	83.3s	367.9 req/m	✅ SLA 기준SLA baseline
600	99.7%	108.3s	358.6 req/m	✅
700	99.2%	122.3s	329.1 req/m	✅
800	99.7%	144.6s	367.3 req/m	✅
900	99.7%	149.6s	405.5 req/m	✅
1000	97.8%	173.3s	373.4 req/m	⚠️ 포화 지점Saturation

📊 스테이지별 레이턴시 (저부하 k6 VU 10) 📊 Stage Latency (Low load k6 VU 10)

Stage	Avg	비율Ratio	병목 원인Bottleneck
vision (seq:10-11)	5.16s	57%	OpenAI Vision API (gpt-5.2)
answer (seq:30-31)	3.69s	40%	OpenAI Chat API (gpt-5.2)
reward (seq:40-41)	0.22s	2%	Character gRPC (완벽 분류 시에만)Character gRPC (perfect only)
rule (seq:20-21)	0.05s	1%	Lite RAG (Memory, JSON)
Total	~9.12s	100%	🎯 SLA 500 VU E2E p95 83.3s

⚙️ 인프라 구성 ⚙️ Infrastructure Configuration

구성 요소Component	설정Config	설명Description
Celery Pool	`gevent 100`	100% I/O-bound → greenlet 100개 동시 처리100% I/O-bound → 100 concurrent greenlets
Queues	`scan.vision, rule, answer, reward`	스테이지별 개별 큐 (Topology CR)Per-stage queues (Topology CR)
Redis Streams	`4 shards`	이벤트 샤딩 (균등 분배)Event sharding (balanced)
KEDA Scaling	`1-5 replicas`	RabbitMQ 큐 길이 기반RabbitMQ queue length based

🔄 2-Redis Event Bus Layer 🔄 2-Redis Event Bus Layer

컴포넌트Component	역할Role	상세Details
Streams Redis	내구성 버퍼Durable Buffer	`scan:events:0~3` 4 샤드, State KV 저장4 shards, State KV storage
Pub/Sub Redis	실시간 브로드캐스트Realtime Broadcast	Volatile, 장애 격리 (State 복구 가능)Volatile, fault-isolated (State recoverable)
StreamConsumer	`XREADGROUP`	Consumer Group: eventrouter, batch 소비Consumer Group: eventrouter, batch consume
EventProcessor	Lua Script	seq 중복 검사 + State + published marker 원자적seq dedup + State + published marker atomic
PendingReclaimer	`XAUTOCLAIM`	5분 idle 메시지 재처리 (idempotent safe)5min idle reclaim (idempotent safe)
Published Marker	`router:published:{job_id}:{seq}`	TTL 2h (중복 발행 방지)(prevent duplicate publish)

🎁 Reward 조건Reward Condition

완벽한 분리배출 시에만 캐릭터 리워드 지급 (insufficiencies가 비어있을 때) Character reward only for perfect disposal (when insufficiencies is empty)

📈 스냅샷 링크Snapshot Links

🎯 VU 500 SLA → VU 600 → VU 700 Live → VU 800 → VU 900 → VU 1000 →

┌─────────────────────────────────────────────────────────────────────────────────┐
│                    Chat Worker Infrastructure (Production)                       │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   Dual-Path Messaging Architecture                                              │
│   ════════════════════════════════════════════════════════════════════════      │
│   Path 1: RabbitMQ (Job Execution)    │   Path 2: Redis (Event Streaming)      │
│   ─────────────────────────────────   │   ──────────────────────────────────   │
│   DIRECT exchange: chat_tasks          │   Streams + Pub/Sub (3-Tier)          │
│   Queue: chat.process                  │   Durable → Router → Volatile         │
│                                                                                 │
│   ┌────────────────────────────────────────────────────────────────────────┐    │
│   │  Taskiq Worker (asyncio-native)                                         │    │
│   │  ───────────────────────────────────────────────────────────────────    │    │
│   │  taskiq worker chat_worker.main:broker --workers 4 --max-async-tasks 10 │    │
│   │  @broker.task(task_name="chat.process", timeout=120, max_retries=2)     │    │
│   └────────────────────────────────────────────────────────────────────────┘    │
│                                                                                 │
│   Job Submission Flow (chat-api → chat-worker)                                  │
│   ────────────────────────────────────────────────────────────────────────      │
│   chat-api                             chat-worker                              │
│   ┌─────────────────────┐              ┌─────────────────────┐                 │
│   │ AioPikaBroker       │   AMQP       │ @broker.task        │                 │
│   │ .kiq(job_id,        │ ──────────▶  │ async def process() │                 │
│   │      session_id,    │   JSON       │   → LangGraph       │                 │
│   │      message,       │              │   → Event Publish   │                 │
│   │      image_url?)    │              └─────────────────────┘                 │
│   └─────────────────────┘                                                      │
│                                                                                 │
│   Event Bus Layer (Idempotent Fan-out)                                          │
│   ────────────────────────────────────────────────────────────────────────      │
│   Worker ──XADD──▶ Redis Streams ──▶ Event Router ──PUBLISH──▶ Pub/Sub        │
│                    (chat:events:*)         │                                   │
│                                            │ Lua Script (Atomic)               │
│                                            ├─ seq 중복 검사                     │
│                                            ├─ State KV Update                  │
│                                            └─ router:published:{job}:{seq}     │
│                                                      │                          │
│                                                      ▼                          │
│                                            SSE Gateway ──▶ Client              │
│                                            ├─ In-memory fan-out                │
│                                            ├─ State recovery (Redis KV)        │
│                                            └─ Catch-up (XRANGE)                │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

✅ 실측 성능 및 최적화 ✅ Measured Performance & Optimization

지표Metric	결과Result	상세Detail
동시 처리Concurrency	40 tasks	4 workers × 10 async-tasks (--workers 4 --max-async-tasks 10)
Connection PoolConnection Pool	212→33 (84%↓)	ReadThroughCheckpointer로 PostgreSQL 직접 접근 제거Removed direct PostgreSQL access via ReadThroughCheckpointer
체크포인트 접근Checkpoint Access	~1ms (Redis)	Cold: 10-50ms (PG) → Warm: ~1ms (Redis Cache)Cold: 10-50ms (PG) → Warm: ~1ms (Redis Cache)
멀티턴 검증Multi-turn Verified	5턴 38 steps	40 checkpoints, 45 blobs, 162 writes 검증Verified 40 checkpoints, 45 blobs, 162 writes

⚡ Celery vs Taskiq 선택 이유 ⚡ Why Taskiq over Celery

항목Aspect	Celery (Scan)	Taskiq (Chat)
동시성 모델Concurrency	gevent (greenlet)	asyncio (native)
LangGraph 호환LangGraph	△ (래핑 필요Wrapping needed)	✅ Native async
BrokerBroker	RabbitMQ	RabbitMQ (재사용)
결과 반환Result	AsyncResult	TaskiqResult

⚙️ 인프라 구성 ⚙️ Infrastructure Configuration

구성 요소Component	설정Config	설명Description
Broker	`AioPikaBroker`	RabbitMQ asyncio 클라이언트RabbitMQ asyncio client
Exchange	`chat_tasks (direct)`	Topology CR로 미리 생성Pre-created by Topology CR
Queue	`chat.process`	DLX, TTL 설정 포함With DLX, TTL settings
Checkpointer	`ReadThroughCheckpointer`	Redis Primary + PostgreSQL Async Sync (3-Tier Memory)Redis Primary + PostgreSQL Async Sync (3-Tier Memory)
LLM Models	`gpt-5.2, gemini-3-flash`	Fallback 체인Fallback chain

🌊 Event Bus Layer (3-Tier Redis) 🌊 Event Bus Layer (3-Tier Redis)

Tier	컴포넌트Component	역할Role
1. Streams	`chat:events:{shard}`	Durable Buffer, Worker → XADDDurable Buffer, Worker → XADD
2. Event Router	`Lua Script`	Idempotent 처리, seq 중복검사, State KVIdempotent processing, seq dedup, State KV
3. Pub/Sub	`chat:{job_id}`	Volatile Broadcast → SSE GatewayVolatile Broadcast → SSE Gateway

📺 SSE Gateway Recovery (Token v2) 📺 SSE Gateway Recovery (Token v2)

기능Feature	구현Implementation	설정값Config
Stale Detection	3초 타임아웃 시 자동 재연결Auto-reconnect on 3s timeout	`3s threshold`
Max Reconnections	지수 간격으로 최대 3회 시도3 attempts with exponential spacing	`3 attempts`
Fallback Polling	SSE 실패 시 폴링 전환Polling fallback if SSE fails	`3s interval, 120s max`
Seq Encoding	Stage: STAGE_ORDER×10, Token: 1000+Stage: STAGE_ORDER×10, Token: 1000+	`Last-Event-ID`
Catch-up	XREVRANGE로 누락 이벤트 복구XREVRANGE for missed events	`State KV + Streams`

📚 참고Reference

ReadThroughCheckpointer 최적화 → ReadThroughCheckpointer Optimization → Event Router ACK 정책 → Event Router ACK Policy → Token v2 XRANGE Recovery → Token v2 XRANGE Recovery → 멀티턴 대화 검증 → Multi-turn Conversation Verification → 3-Tier Memory 아키텍처 → 3-Tier Memory Architecture → SSE Token v2 프로토콜 → SSE Token v2 Protocol →

✅ E2E 검증 완료E2E Verification Complete

✅ LangGraph 파이프라인 구현 및 배포 완료
✅ ReadThroughCheckpointer (Redis + PostgreSQL Async Sync)
✅ 멀티턴 대화 검증 (5턴 38 steps, 40 checkpoints)
✅ Connection Pool 최적화 (212→33, 84%↓)
✅ Token v2 스트리밍 + XRANGE Recovery ✅ LangGraph pipeline implemented & deployed
✅ ReadThroughCheckpointer (Redis + PostgreSQL Async Sync)
✅ Multi-turn conversation verified (5 turns, 38 steps, 40 checkpoints)
✅ Connection Pool optimization (212→33, 84%↓)
✅ Token v2 streaming + XRANGE Recovery

┌─────────────────────────────────────────────────────────────────────────────────┐
│                  Chat Observability Architecture                                 │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   ┌─────────────────────────────────────────────────────────────────────────┐   │
│   │                    LangSmith (Feature-level Analysis)                    │   │
│   │   ─────────────────────────────────────────────────────────────────     │   │
│   │   • Per-node latency tracking                                           │   │
│   │   • Token usage & cost monitoring                                       │   │
│   │   • Error tracking & debugging                                          │   │
│   │   • Run metadata (user_id, job_id, intent tags)                        │   │
│   └─────────────────────────────────────────────────────────────────────────┘   │
│                              │                                                  │
│                              │ LANGSMITH_OTEL_ENABLED=true                      │
│                              ▼                                                  │
│   ┌─────────────────────────────────────────────────────────────────────────┐   │
│   │                    Jaeger (Distributed Tracing)                          │   │
│   │   ─────────────────────────────────────────────────────────────────     │   │
│   │   chat-api ───▶ RabbitMQ ───▶ chat-worker ───▶ LangGraph Pipeline       │   │
│   │      │                            │                  │                  │   │
│   │      └─ trace_id propagation ─────┴──────────────────┘                  │   │
│   └─────────────────────────────────────────────────────────────────────────┘   │
│                              │                                                  │
│                              ▼                                                  │
│   ┌─────────────────────────────────────────────────────────────────────────┐   │
│   │                    Prometheus + Grafana (Metrics)                        │   │
│   │   ─────────────────────────────────────────────────────────────────     │   │
│   │   chat_stream_token_latency_seconds    │ Redis XADD delay detection     │   │
│   │   chat_stream_duration_seconds         │ P95 SLO target: 30s            │   │
│   │   chat_stream_active                   │ Concurrent stream count        │   │
│   │   chat_stream_tokens_total             │ Per-node throughput            │   │
│   └─────────────────────────────────────────────────────────────────────────┘   │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

🎯 Clean Architecture: TelemetryConfigPort 🎯 Clean Architecture: TelemetryConfigPort

┌─────────────────────────────────────────────────────────────────┐
│  Application Layer (Ports)                                       │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  TelemetryConfigPort                                       │  │
│  │  ├── create_run_config(tags, metadata) → RunnableConfig   │  │
│  │  └── Abstract: LangSmith 의존성 분리                        │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              │                                   │
│                              ▼                                   │
│  Infrastructure Layer (Adapters)                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  LangSmithTelemetryAdapter  │  NoOpTelemetryAdapter        │  │
│  │  (Production)               │  (Testing)                   │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

📈 Prometheus Metrics 📈 Prometheus Metrics

Metric	Type	용도Purpose	Alert
`chat_stream_token_latency_seconds`	Histogram	Redis XADD 지연 감지Redis XADD delay detection	P95 > 50ms
`chat_stream_duration_seconds`	Histogram	전체 스트림 완료 시간Total stream completion	P95 > 30s
`chat_stream_active`	Gauge	동시 스트림 수Concurrent streams	> 100
`chat_stream_tokens_total`	Counter	노드별 처리량Per-node throughput	-

🏷️ LangSmith Run Config 🏷️ LangSmith Run Config

필드Field	값Value	용도Purpose
tags	`["intent:disposal", "env:prod"]`	UI 필터링UI filtering
metadata.user_id	`UUID`	사용자 추적User tracking
metadata.job_id	`UUID`	작업 상관관계Job correlation
metadata.session_id	`UUID`	대화 세션Conversation session

📚 참고Reference

LangSmith + OpenTelemetry Observability 구현 → LangSmith + OpenTelemetry Observability Implementation →

🔍 해결한 문제 🔍 Problem Solved

Issue #1: 위치 데이터 누락

userLocation이 비동기 geolocation API에서 로드되는데, 클로저가 초기값(undefined)을 캡처

Issue #2: 메시지 소실

페이지네이션 시 Optimistic 메시지와 서버 데이터 불일치 → 덮어쓰기로 메시지 사라짐

Issue #3: 새로고침 손실

메모리 상태(useState)만 사용하여 브라우저 새로고침 시 모든 메시지 손실

Timeline: ─────────────────────────────────────────────────────────────────────
T0: User sends message
T1: Frontend Optimistic Update (즉시)
T2: SSE streaming starts (0.5s)
T3: SSE done event (3s)
T4: Backend DB write starts (3.1s)   ← Eventual Consistency
T5: Backend DB write completes (3.3s)
T6: User scrolls up → API call (4s)  ← Race Condition!
───────────────────────────────────────────────────────────────────────────────
Gap: T1~T5 사이에 프론트엔드는 메시지를 가지고 있지만, 백엔드 DB에는 아직 없음

🏗️ FE-BE 통합 아키텍처 🏗️ FE-BE Integration Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                     Frontend-Backend Integration                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Frontend (Browser)                                                          │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │  React State (Optimistic)                                               │ │
│  │  ┌─────────────┐    ┌──────────────┐    ┌─────────────┐                │ │
│  │  │  Messages   │───▶│  Reconciler  │───▶│  IndexedDB  │                │ │
│  │  │ (client_id) │    │ (30s buffer) │    │ (Persistent)│                │ │
│  │  └─────────────┘    └──────────────┘    └─────────────┘                │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
│                            │                        ▲                        │
│              POST /send    │                        │ GET /messages          │
│                            ▼                        │                        │
│  ────────────────────────────────────────────────────────────────────────── │
│                                                                              │
│  Backend (Kubernetes)                                                        │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │  chat-api → RabbitMQ → chat-worker (LangGraph)                         │ │
│  │                              │                                          │ │
│  │              ┌───────────────┼───────────────┐                          │ │
│  │              ▼               ▼               ▼                          │ │
│  │       Redis Streams    SSE Gateway     PostgreSQL                       │ │
│  │       (chat:events)    (Real-time)    (Eventual Write)                  │ │
│  │              │               ▲                  ▲                       │ │
│  │              └───────────────┘                  │                       │ │
│  │              event-router              chat-consumer                    │ │
│  │                                       (Async Write)                     │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘

🎯 핵심 해결 전략 🎯 Key Solution Strategies

⏱️ 30초 Retention Window

committed 상태이지만 server_id 없는 메시지를 30초간 유지
→ 백엔드 DB 저장 완료까지의 시간 차이 흡수
Why 30초? DB write 평균 200~500ms + 피크 트래픽 고려

🔗 Client ID + Server ID Mapping

client_id: 프론트엔드 UUID (불변, idempotency key)
server_id: 백엔드 DB PK (done 이벤트 후 할당)
Reconcile 시 server_id 우선 사용 → 중복 제거

📦 Cache-Aside Pattern (IndexedDB)

Read: IndexedDB 우선 → 서버 백그라운드 (0ms UX)
Write: Optimistic → SSE done → Reconcile
TTL: 7일 (일반), 30초 (committed + synced)

📍 Location Ref Pattern

useState는 클로저 캡처 → 비동기 완료 전 값 고정
useRef는 항상 최신 값 참조 → geolocation 완료 후 반영
userLocationRef.current로 항상 최신 위치 전송

🔀 Backend: Consumer Group Fan-out 🔀 Backend: Consumer Group Fan-out

Consumer Group	Consumer	용도Purpose	Latency
`eventrouter`	`event-router`	SSE 실시간 전송SSE Real-time	~10ms
`chat-persistence`	`chat-consumer`	PostgreSQL 저장 (5초 Batch)PostgreSQL Write (5s Batch)	~200ms

핵심: 동일 이벤트를 두 Consumer Group이 독립 처리 → 실시간성 + 영속성 분리 Key: Same event processed by two independent Consumer Groups → Real-time + Persistence separated

📚 참고Reference

Optimistic Update (FE) & Eventual Consistency (BE) 통합 트러블슈팅 → Optimistic Update (FE) & Eventual Consistency (BE) Integration →

🔄 Message Status State Machine 🔄 Message Status State Machine

type MessageStatus = 'pending' | 'streaming' | 'committed' | 'failed';

interface AgentMessage {
  client_id: string;    // UUID (프론트엔드 생성, 불변)
  server_id?: string;   // DB PK (백엔드 할당)
  id: string;           // Legacy compat (server_id || client_id)
  role: 'user' | 'assistant';
  content: string;
  created_at: string;
  status: MessageStatus;
}

User Message:      pending ────────────────▶ committed
                      │                           ▲
                      └───────────────────────────┘
                           (실패 시) failed

Assistant Message: streaming ──────────────▶ committed

🔀 Reconcile Algorithm 🔀 Reconcile Algorithm

Message Status	Server_ID	Keep?	이유Reason
`pending`	❌	✅ 항상	전송 중Sending
`streaming`	❌	✅ 항상	SSE 수신 중SSE Receiving
`committed`	✅	❌	서버 버전으로 대체Replace with server
`committed`	❌	✅ 30초 내	Eventual Consistency 버퍼Eventual Consistency buffer
`failed`	❌	✅ 항상	재시도 가능Retry available

💾 IndexedDB Persistence 💾 IndexedDB Persistence

📝 MessageDB Class

saveMessages(chatId, messages)
getMessages(chatId)
cleanup(chatId, options)
Compound Index: [chat_id, created_at] → O(log n)

🧹 Cleanup Policy

TTL (7일): 모든 메시지 자동 삭제
Committed Retention (30초): 동기화 완료 후 삭제
1분 주기: useMessagePersistence 훅이 cleanup

⚡ 성능 최적화 ⚡ Performance Optimization

500ms Throttle

SSE 토큰 스트리밍 시 수십 번 업데이트
Throttle 없이 → 10+ writes/s
500ms throttle → 1 write/s

Compound Index

by-chat-created 복합 인덱스
단일 쿼리로 정렬된 결과
1000+ 메시지에서 10ms 이내

Batch Cleanup

1분 주기 자동 cleanup
실시간 → 매 메시지 검사 (비효율)
배치 → 성능 영향 최소화

📊 통합 데이터 플로우 📊 Integrated Data Flow

Timeline             Frontend                              Backend
────────────────────────────────────────────────────────────────────────────────
T0: User clicks    createUserMessage()
                   └─ client_id: uuid-1
                   └─ status: 'pending'
                   setMessages([...prev, userMsg])
                   IndexedDB.save(userMsg)
                   API.sendMessage(chatId, {...})  ───▶  POST /chat/:id/messages

T1: 0.1s                                                  RabbitMQ.publish()
                                                          └─ queue: chat.process

T2~T5: Streaming   SSE: onmessage                  ◀───   XADD chat:events:0
                   appendStreamingText(token)              LangGraph Pipeline

T6: done event     handleSSEComplete()             ◀───   {stage: "done", ...}
                   ├─ updateUserMessage(uuid-1)
                   │  └─ status: 'committed'
                   │  └─ server_id: 'srv-uuid-1'
                   └─ IndexedDB.save(both)

T7~T8: Async                                              chat-consumer
                                                          └─ PostgreSQL INSERT

T9: Pagination     loadMoreMessages()
                   API.getChatDetail()             ───▶   GET /chat/:id/messages
                   reconcileMessages(local, server) ◀───  {messages: [...]}
                   └─ 중복 제거 + 30s 버퍼 유지
────────────────────────────────────────────────────────────────────────────────

🎨 Status-Driven UI 🎨 Status-Driven UI

pending

🔄 스피너 표시

streaming

⌨️ 타이핑 인디케이터

committed

✅ 정상 표시

failed

🔁 재시도 버튼

🛠️ 기술 스택 요약 🛠️ Tech Stack Summary

Layer	Technology	용도Purpose
Frontend State	React useState	Optimistic UI 상태Optimistic UI state
Frontend Cache	IndexedDB (idb)	영구 저장 + 새로고침 복구Persistent + refresh recovery
Frontend Sync	Reconcile Algorithm	로컬/서버 데이터 병합Local/server merge
Backend Real-time	Redis Streams → Pub/Sub → SSE	SSE 이벤트 전송SSE event delivery
Backend Persistence	PostgreSQL	Eventual WriteEventual Write
Backend Worker	LangGraph + TaskIQ + RabbitMQ	비동기 메시지 처리Async message processing

🔍 Deep Dive 🔍 Deep Dive

🚌

Event Bus Architecture

Redis Streams + Pub/Sub + State KV 3-Tier

💾

3-Tier Memory

Redis Primary + PostgreSQL Async Sync

⚡

Optimistic Update + Eventual Consistency

FE-BE 실시간 동기화, 30s Retention Window

🧠

Multi-LLM Provider

OpenAI, Gemini 동적 선택

📈 KEDA 오토스케일링Autoscaling

⚙️ Chat Worker

RabbitMQ 큐 길이 (5msg)

1-4 pods

📡 SSE Gateway

연결 수 (100/pod)

1-3 pods

🔀 Event Router

Pending 메시지 (100)

1-2 pods

📚 참고Reference

Optimistic Update (FE) & Eventual Consistency (BE) 통합 트러블슈팅 → Optimistic Update (FE) & Eventual Consistency (BE) Integration →

📋 Intent & Sub-agent 매핑 (10개) 📋 Intent & Sub-agent Mapping (10)

Intent	Node	방식 Method
🗑️ waste	waste_rag	RAG (Tag-Based)
🌍 character	character	gRPC
📍 location	location	Tool Calling (Kakao)
🛋️ bulk_waste	bulk_waste	Tool Calling (MOIS)
💰 recyclable_price	recyclable_price	Tool Calling (KECO)
📦 collection_point	collection_point	Tool Calling (KECO)
🌤️ weather	weather	Tool Calling (KMA)
🎨 image_generation	image_generation	Gemini Image API
🔍 web_search	web_search	Native Tool (OpenAI/Gemini)
💬 general	general	LLM Direct

🔧 Tool Calling (5)

location: Kakao Local API

bulk_waste: MOIS 공공 API

recyclable_price: KECO API

collection_point: KECO API

weather: KMA 기상청 API

@function_tool 데코레이터 정의

⚡ Native Tool (2)

web_search: GPT-5.2 / Gemini-3-flash 내장

image_generation: Gemini-3-pro-image

LLM 네이티브 기능, 외부 API 불필요

google.genai.types.Tool 직접 사용

📦 Other Methods (3)

waste_rag: Tag-Based RAG Pipeline

character: gRPC (Characters API)

general: LLM Direct (Fallback)

RAG 167품목 + 80상황 태그 검색

🛠️ SDK 직접 사용 (LangChain 추상화 배제) 🛠️ Direct SDK Usage (No LangChain Abstraction)

OpenAI SDK

tools=[{"type": "function"}]
네이티브 Function Calling Native Function Calling

Gemini SDK

google.genai.types.Tool
Web Search, Image Gen 내장 Built-in Web Search, Image Gen

Agents SDK

@function_tool
데코레이터 기반 Tool 정의 Decorator-based Tool Definition

🌊 Tool Execution → SSE Streaming 🌊 Tool Execution → SSE Streaming

Tool 실행 결과가 Event Bus를 통해 실시간 스트리밍:
tool_start → tool_progress → tool_result → answer_stream Tool execution results stream in real-time via Event Bus:
tool_start → tool_progress → tool_result → answer_stream

🔴 발견된 문제: Race Condition 🔴 Problem: Race Condition

Optimistic Update(즉시)와 Eventual Consistency(~200ms 지연) 간의 타이밍 불일치로 메시지 소실 발생
• 사용자 메시지 전송 → Optimistic UI 표시 → 이전 메시지 로드 → 서버 미저장 상태에서 덮어쓰기 → 메시지 사라짐 Timing mismatch between Optimistic Update (immediate) and Eventual Consistency (~200ms delay) causes message loss
• Send message → Optimistic UI → Load older messages → Server overwrites before persistence → Message lost

┌─────────────────────────────────────────────────────────────────────────────────┐
│                    Data Consistency Architecture                                 │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  Frontend (React)                          Backend                              │
│  ┌─────────────────────┐                  ┌──────────────────────────────────┐ │
│  │ useState (Optimistic)│ ──POST──────►   │ API → RabbitMQ → LangGraph       │ │
│  │ IndexedDB (Persist)  │                  │         │                        │ │
│  │ Reconcile Algorithm  │ ◄──SSE───────   │         ▼                        │ │
│  └─────────────────────┘                  │  Redis Streams (Fan-out)         │ │
│         │                                  │  ├─ eventrouter → SSE (~10ms)   │ │
│         │ 500ms throttle                   │  └─ persistence → PG (~200ms)   │ │
│         ▼                                  └──────────────────────────────────┘ │
│  ┌─────────────────────┐                                                       │
│  │ IndexedDB            │  TTL: 7일 (일반) / 30초 (synced)                      │
│  │ + Auto Cleanup (1m)  │  Compound Index: [chat_id, timestamp]                │
│  └─────────────────────┘                                                       │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

📊 Reconcile Algorithm

• pending/streaming: 항상 유지 (로컬 우선)
• committed (no server_id): 30초 이내 유지
• 중복 제거: server_id 기준 우선
• 시간순 정렬: timestamp 기반 병합 • pending/streaming: Always keep (local-first)
• committed (no server_id): Keep within 30s
• Dedup: server_id takes priority
• Sort: Timestamp-based merge

🔀 Message Status State Machine

• pending → committed (SSE done 수신)
• pending → failed (timeout/error)
• streaming → committed (완료)
• ID 매핑: client_id(FE UUID) ↔ server_id(DB PK) • pending → committed (SSE done received)
• pending → failed (timeout/error)
• streaming → committed (complete)
• ID mapping: client_id(FE UUID) ↔ server_id(DB PK)

⏱️ 30s Retention Window

• Eventual Consistency 완료 대기 시간
• 동기화된 메시지 30초 후 IndexedDB 삭제
• 서버 데이터가 진실의 원천으로 전환
• Race Condition 방지 안전 구간 • Eventual Consistency completion wait time
• Synced messages deleted from IndexedDB after 30s
• Server data becomes source of truth
• Safety window preventing Race Conditions

💾 IndexedDB Persistence

• 브라우저 새로고침 복구 (Cache-Aside)
• 500ms Throttle: SSE 토큰 스트리밍 시 쓰기 10배 감소
• Compound Index: [chat_id, timestamp]
• 1분 주기 Cleanup: 만료 메시지 자동 삭제 • Browser refresh recovery (Cache-Aside)
• 500ms Throttle: 10x write reduction during SSE streaming
• Compound Index: [chat_id, timestamp]
• 1min Cleanup: Auto-delete expired messages

📈 통합 데이터 플로우 타임라인 📈 Integrated Data Flow Timeline

                    
                    T0: User sends → Optimistic Update (즉시 UI)

                    T1~T6: SSE streaming (vision → answer → done + server_id 할당)

                    T7~T8: PostgreSQL INSERT/COMMIT (~200ms)

                    T9: User scrolls → Reconcile(local + server) → 중복 제거 + 30초 유지
                    
                    T0: User sends → Optimistic Update (immediate UI)

                    T1~T6: SSE streaming (vision → answer → done + server_id assigned)

                    T7~T8: PostgreSQL INSERT/COMMIT (~200ms)

                    T9: User scrolls → Reconcile(local + server) → Dedup + 30s retention

Consumer Group	목적Purpose	지연시간Latency
`eventrouter`	SSE 실시간 전송SSE Real-time delivery	~10ms
`chat-persistence`	PostgreSQL 저장PostgreSQL persistence	~200ms

📚 블로그 원문Blog Post

Optimistic Update (FE) & Eventual Consistency (BE) 통합 트러블슈팅 → Optimistic Update (FE) & Eventual Consistency (BE) Integration Troubleshooting →

┌─────────────────────────────────────────────────────────────────────┐
│              Multi-Model LLM Architecture (Port/Adapter)            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                    LLMClientPort (ABC)                         │  │
│  │  ─────────────────────────────────────────────────────────── │  │
│  │  • generate(prompt, system_prompt) → str                     │  │
│  │  • generate_stream(prompt) → AsyncIterator[str]              │  │
│  │  • generate_structured(prompt, schema) → T                   │  │
│  │  • generate_with_tools(prompt, tools) → AsyncIterator[str]   │  │
│  │  • generate_function_call(prompt, functions) → (name, args)  │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                           │                                         │
│            ┌──────────────┼──────────────┐                          │
│            ▼              ▼              ▼                          │
│  ┌─────────────┐ ┌─────────────┐ ┌──────────────┐                  │
│  │   OpenAI    │ │   Gemini    │ │  LangChain   │                  │
│  │   Adapter   │ │   Adapter   │ │   Adapter    │                  │
│  ├─────────────┤ ├─────────────┤ ├──────────────┤                  │
│  │ gpt-5.2     │ │ gemini-3-   │ │ OpenAI SDK   │                  │
│  │ Agents SDK  │ │ flash/pro   │ │ → Runnable   │                  │
│  │ (Primary)   │ │ Gemini SDK  │ │ → astream()  │                  │
│  │ +Responses  │ │ (FC, Image) │ │ Token capture│                  │
│  │ (Fallback)  │ │             │ │              │                  │
│  └─────────────┘ └─────────────┘ └──────────────┘                  │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                   Model Registry (provider.py)                │  │
│  │  ─────────────────────────────────────────────────────────── │  │
│  │  openai/gpt-5.2              │ 400K ctx │ tools, vision        │  │
│  │  google/gemini-3-flash-preview   │ 1M ctx │ tools, vision     │  │
│  │  google/gemini-3-pro-image         │   -   │ image_gen         │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                     │
│  Runtime Selection: create_llm_client(provider, model)              │
│  Auto-Inference: "gemini-*" → google, "gpt-*" → openai             │
│  LLMPolicyPort: ModelTier(FAST/STANDARD/PREMIUM) × TaskType         │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

LLM Native API 통합 LLM Native API Integration

OpenAI Agents SDK + Responses API (Fallback)

• generate_with_tools() — Agents SDK Primary + Responses Fallback
• generate_structured() — Agent output_type (Pydantic 자동 파싱)
• generate_function_call() — Chat Completions functions
• Image Generation — gpt-image-1.5 tool
• Streaming — response.output_text.delta 이벤트

Gemini SDK (google-genai)

• generate_content() — 텍스트 생성
• generate_content_stream() — 스트리밍
• Structured Output — response_mime_type + schema
• Image Gen — response_modalities=["TEXT","IMAGE"]
• Reference Image — Pro: 최대 14장, Flash: 최대 3장

런타임 모델 전환 (Provider Auto-Inference) Runtime Model Switching (Provider Auto-Inference)

Model ID	Provider	Context	Capabilities
`openai/gpt-5.2`	OpenAI	400K	tools, vision, web_search
`google/gemini-3-flash-preview`	Google	1M	tools, vision
`google/gemini-3-pro-image`	Google	-	image_gen, char_ref

create_llm_client(provider, model) 팩토리에서 모델명 prefix로 Provider 자동 추론.
LLMPolicyPort.select_model(task_type, tier)로 태스크별 최적 모델 선택.
Image Generation: 텍스트 모델에 연결된 이미지 모델 자동 매핑 (gpt-5.2 → gpt-image-1.5). create_llm_client(provider, model) factory auto-infers provider from model name prefix.
LLMPolicyPort.select_model(task_type, tier) selects optimal model per task.
Image Generation: Auto-maps linked image model (gpt-5.2 → gpt-image-1.5).

LangChain Adapter (Token Streaming) LangChain Adapter (Token Streaming)

LangGraph stream_mode="messages"로 토큰 캡처 시:
OpenAI SDK → LangChainOpenAIRunnable(BaseChatModel) → AIMessageChunk
LangGraph가 직접 토큰을 캡처하지 못하는 경우 (web_search 등):
LLMClientPort.generate_with_tools() → notify_token_v2() 직접 발행 For LangGraph stream_mode="messages" token capture:
OpenAI SDK → LangChainOpenAIRunnable(BaseChatModel) → AIMessageChunk
When LangGraph can't capture tokens (web_search etc.):
LLMClientPort.generate_with_tools() → notify_token_v2() direct emit

📂 Source Files

ports/llm/llm_client.py (LLMClientPort ABC)
clients/openai_client.py (Agents SDK Primary + Responses Fallback)
clients/gemini_client.py (Gemini SDK)
domain/models/provider.py (MODEL_REGISTRY)
setup/dependencies.py (create_llm_client Factory)

┌─────────────────────────────────────────────────────────────────────┐
│                  Intent Classification Pipeline                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User Message ──► intent_node ──► dynamic_router ──► subagents     │
│                       │                                             │
│         ┌─────────────┴─────────────┐                               │
│         │    ClassifyIntentCommand    │  (UseCase)                   │
│         │  ┌───────────────────────┐ │                              │
│         │  │ 1. Cache 조회          │ │  SHA256(message) → Redis    │
│         │  │    TTL: 3600s (1시간)  │ │                              │
│         │  ├───────────────────────┤ │                              │
│         │  │ 2. LLM 호출            │ │                              │
│         │  │    intent.txt 프롬프트  │ │                              │
│         │  │    → JSON 응답 파싱     │ │                              │
│         │  ├───────────────────────┤ │                              │
│         │  │ 3. IntentClassifier    │ │  (Service)                   │
│         │  │    Service             │ │                              │
│         │  │  • Keyword Boost +0.2  │ │                              │
│         │  │  • Chain-of-Intent     │ │                              │
│         │  │    Transition +0.15    │ │                              │
│         │  │  • Length Penalty -0.1 │ │                              │
│         │  │  • THRESHOLD: 0.6     │ │                              │
│         │  ├───────────────────────┤ │                              │
│         │  │ 4. Multi-Intent 감지   │ │                              │
│         │  │    "그리고","또","같이" │ │                              │
│         │  │    → Query Decompose   │ │                              │
│         │  ├───────────────────────┤ │                              │
│         │  │ 5. 캐릭터 이름 감지    │ │                              │
│         │  │    "페티","이코" 등     │ │                              │
│         │  │    → image_gen 참조용  │ │                              │
│         │  ├───────────────────────┤ │                              │
│         │  │ 6. Cache 저장          │ │                              │
│         │  └───────────────────────┘ │                              │
│         └────────────────────────────┘                              │
│                                                                     │
│  Output → state:                                                    │
│    intent, confidence, is_complex, has_multi_intent,                │
│    additional_intents, decomposed_queries, intent_history,          │
│    detected_character                                               │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Confidence Scoring

                    final_confidence = llm_confidence + keyword_boost + transition_boost + length_penalty

                    llm_confidence: 0.0~1.0 (LLM JSON 응답)

                    keyword_boost: 0.0~0.2 (키워드 맵 매칭)

                    transition_boost: 0.0~0.15 (Chain-of-Intent, last_conf ≥ 0.7)

                    length_penalty: -0.1~0.0 (짧은 질문)

                    CONFIDENCE_THRESHOLD = 0.6 (미만 → general fallback)

주입 프롬프트 (intent.txt 핵심)

9 Categories

waste, character, location,
bulk_waste, recyclable_price,
collection_point, weather,
image_generation, general

판단 가이드라인

1. 의미 중심 판단 (키워드 X)
2. 대화 맥락 고려
3. 확신 낮으면 general
4. 복합 질문 → 핵심 의도 선택

Multi-turn Context

conversation_history[-6:]
(최근 3턴 참조)
→ LLM 맥락 분류 지원

Output Format

{"intent": "waste",
"confidence": 0.95,
"reasoning": "분리배출 방법"}

Chain-of-Intent Transitions

waste → location: +0.15

waste → collection_point: +0.10

location → waste: +0.10

bulk_waste → location: +0.10

※ last_confidence ≥ 0.7 + intent_history 기반 적용

📂 Source Files

nodes/intent_node.py (Node Adapter)
commands/classify_intent_command.py (UseCase)
services/intent_classifier_service.py (Business Logic)
prompts/classification/intent.txt (System Prompt)

┌─────────────────────────────────────────────────────────────────────┐
│                    waste Intent Pipeline (RAG)                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User Message ──► rag_node ──► answer_node ──► SSE Stream          │
│                      │                                              │
│              ┌───────┴───────┐                                      │
│              │  RetrieverPort │  (Vector DB Search)                  │
│              │  ─────────────│                                      │
│              │  • Query 임베딩                                       │
│              │  • Similarity Search                                  │
│              │  • Top-K 규정 문서 반환                                │
│              └───────────────┘                                      │
│                      │                                              │
│              ┌───────┴───────┐                                      │
│              │ SearchRAGCommand│                                     │
│              │  (UseCase)     │                                      │
│              │  • disposal_rules                                     │
│              │  • classification                                     │
│              │  • situation_tags                                     │
│              └───────────────┘                                      │
│                                                                     │
│  Policy: NodeExecutor (FAIL_OPEN, timeout: 3000ms, retry: 1)       │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Clean Architecture

Node (Adapter)

rag_node.py
LangGraph glue code

Command (UseCase)

SearchRAGCommand
정책/흐름 제어

Service (Logic)

RAGSearcherService
순수 비즈니스 로직

주입 프롬프트 (waste_instruction.txt)

Context 변수: {disposal_rules}, {classification}, {situation_tags}

핵심 지침:
• RAG 검색 결과(규정 문서)를 기반으로 분리배출 방법 안내
• disposal_rules에서 품목별 배출 규정 추출
• classification 결과로 세부 분류 확인
• situation_tags로 상황별 맞춤 안내 (비올 때, 야간 등)

📂 Source Files

nodes/rag_node.py
prompts/local/waste_instruction.txt
commands/search_rag_command.py

┌─────────────────────────────────────────────────────────────────────┐
│                 character Intent Pipeline (gRPC)                     │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User Message ──► character_node ──► answer_node ──► SSE Stream    │
│                        │                                            │
│              ┌─────────┴─────────┐                                  │
│              │ LLM: 카테고리 추출  │  (CategoryExtractorService)     │
│              │ "이코가 누구야?"     │                                 │
│              │  → category: "이코" │                                 │
│              └─────────┬─────────┘                                  │
│                        │                                            │
│              ┌─────────┴─────────┐                                  │
│              │ CharacterClientPort│  (gRPC Call)                     │
│              │ ──────────────────│                                  │
│              │ • GetCharacter()   │                                  │
│              │ • 13 캐릭터 데이터  │                                  │
│              │ • 이미지 에셋 로드  │                                  │
│              └───────────────────┘                                  │
│                                                                     │
│  Policy: NodeExecutor (FAIL_OPEN, timeout: 3000ms, retry: 1)       │
│  ※ 선택적 컨텍스트 - 실패해도 파이프라인 진행                         │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

13 Eco² 캐릭터 (character_names.yaml)

🌍 이코
대표 캐릭터

🧴 페티
무색페트병

🥫 메탈리
금속류

📄 페이피
종이

🧃 플리
플라스틱류

🛍️ 비니
비닐류

📦 폼이
발포합성수지

🫙 글래시
유리병

🥛 팩토리
종이팩

🔋 배리
전지

💡 라이티
조명제품

👕 코튼
의류및원단

📱 일렉
전기전자제품

📂 Source Files

nodes/character_node.py
data/character_names.yaml (13종 정의)
character_name_detector.py
prompts/global/eco_character.txt

┌─────────────────────────────────────────────────────────────────────┐
│              location Intent Pipeline (Function Calling)             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User Message ──► kakao_place_node ──► answer_node ──► SSE Stream  │
│                        │                                            │
│              ┌─────────┴──────────┐                                 │
│              │ LLM Function Calling│                                │
│              │ ───────────────────│                                 │
│              │ search_kakao_place: │                                 │
│              │  • query: "재활용센터"│                                │
│              │  • search_type: keyword│                              │
│              │  • radius: 5000 (m)  │                                │
│              └─────────┬──────────┘                                 │
│                        │                                            │
│              ┌─────────┴──────────┐                                 │
│              │ SearchKakaoPlace    │                                 │
│              │ Command (UseCase)   │                                 │
│              │ ───────────────────│                                 │
│              │ KakaoLocalClientPort│  → REST API                    │
│              │ • Keyword Search    │                                 │
│              │ • Category Search   │                                 │
│              └────────────────────┘                                 │
│                                                                     │
│  HITL: user_location 없으면 → notify_needs_input("location")       │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Function Definition

                    search_kakao_place({

                      query: string,  // "재활용센터", "제로웨이스트샵"

                      search_type: "keyword" | "category",

                      category_code?: string,  // CE7(카페) 등

                      radius?: integer = 5000  // 미터

                    })

                    // function_call: {"name": "search_kakao_place"} (강제 호출)

📂 Source Files

nodes/kakao_place_node.py
prompts/local/location_instruction.txt
commands/search_kakao_place_command.py

┌─────────────────────────────────────────────────────────────────────┐
│            bulk_waste Intent Pipeline (LLM Agent + MOIS API)        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User ──► bulk_waste_agent_node ──► answer_node ──► SSE Stream     │
│                    │                                                │
│         ┌──────────┴──────────┐                                     │
│         │    LLM Agent Loop   │  (max 5 iterations)                 │
│         │  ┌───────────────┐  │                                     │
│         │  │ GPT-5.2 Strict│  │  or  │ Gemini 3 │                  │
│         │  └───────┬───────┘  │                                     │
│         │          │          │                                     │
│         │    tool_choice: auto│                                     │
│         │          │          │                                     │
│         │  ┌───────┴───────┐  │                                     │
│         │  │  Tool Registry │  │                                     │
│         │  │───────────────│  │                                     │
│         │  │• get_collection│  │  → 수거 방법/신청 URL/전화번호      │
│         │  │  _info(sigungu)│  │                                     │
│         │  │• search_fee    │  │  → 품목별 수수료 검색               │
│         │  │  (sigungu,item)│  │                                     │
│         │  └───────────────┘  │                                     │
│         └─────────────────────┘                                     │
│                                                                     │
│  병렬 Tool 실행: asyncio.gather(*tool_calls)                       │
│  Provider: OpenAI (Strict Mode) / Gemini (AUTO mode)                │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Tool Definitions (GPT-5.2 Strict Mode)

get_collection_info

                        sigungu: string (필수)

                        → 신청 URL, 전화번호

                        → 수거 방식, 수수료 납부

search_fee

                        sigungu: string (필수)

                        item_name: string (필수)

                        → 품목별 수수료 목록 (max 10)

Agent System Prompt (핵심)

Scope Discipline: get_collection_info, search_fee만 사용
Preambles (GPT-5.2): 도구 호출 전 이유 설명 필수
지역 추출: "강남구 대형폐기물" → sigungu="강남구"
Fallback: 지역명 없으면 도구 호출 X, 사용자에게 요청

📂 Source Files

nodes/bulk_waste_agent_node.py (733 lines)

┌─────────────────────────────────────────────────────────────────────┐
│        recyclable_price Intent Pipeline (LLM Agent + KECO API)      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User ──► recyclable_price_agent_node ──► answer_node ──► SSE      │
│                    │                                                │
│         ┌──────────┴──────────┐                                     │
│         │    LLM Agent Loop   │  (max 5 iterations)                 │
│         │  ┌───────────────┐  │                                     │
│         │  │ GPT-5.2/Gemini│  │                                     │
│         │  └───────┬───────┘  │                                     │
│         │          │          │                                     │
│         │  ┌───────┴───────┐  │                                     │
│         │  │  Tool Registry │  │                                     │
│         │  │───────────────│  │                                     │
│         │  │• search_price  │  │  → 품목명 검색 (부분 매칭)          │
│         │  │  (item_name,   │  │                                     │
│         │  │   region?)     │  │                                     │
│         │  │• get_category  │  │  → 카테고리별 전체 조회             │
│         │  │  _prices       │  │                                     │
│         │  │  (category,    │  │                                     │
│         │  │   region?)     │  │                                     │
│         │  └───────────────┘  │                                     │
│         └─────────────────────┘                                     │
│                                                                     │
│  8 Regions: capital, gangwon, chungbuk, chungnam,                   │
│             jeonbuk, jeonnam, gyeongbuk, gyeongnam + national       │
│  5 Categories: paper, plastic, glass, metal, tire                   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Tool Definitions

search_price

                        item_name: string (필수)

                        region?: enum (8개 권역)

                        → kg당 가격, 조사일, 형태별

get_category_prices

                        category: enum (필수)

                        region?: enum

                        → 카테고리 전체 품목 가격

📂 Source Files

nodes/recyclable_price_agent_node.py (799 lines)

┌─────────────────────────────────────────────────────────────────────┐
│       collection_point Intent Pipeline (LLM Agent + KECO API)       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User ──► collection_point_agent_node ──► answer_node ──► SSE      │
│                    │                                                │
│         ┌──────────┴──────────┐                                     │
│         │    LLM Agent Loop   │  (max 5 iterations)                 │
│         │  ┌───────────────┐  │                                     │
│         │  │ GPT-5.2/Gemini│  │                                     │
│         │  └───────┬───────┘  │                                     │
│         │          │          │                                     │
│         │  ┌───────┴──────────────┐                                 │
│         │  │     Tool Registry    │                                 │
│         │  │─────────────────────│                                  │
│         │  │• search_collection   │  → 주소/상호명 검색              │
│         │  │  _points             │                                 │
│         │  │• get_nearby          │  → 좌표 기반 주변 검색           │
│         │  │  _collection_points  │                                 │
│         │  │• geocode             │  → 장소명→좌표 변환 (Kakao)     │
│         │  └──────────────────────┘                                 │
│         └─────────────────────────┘                                 │
│                                                                     │
│  Multi-Step Pattern:                                                │
│  "[지역명] 근처 수거함" → geocode(지역) → get_nearby(lat,lon)       │
│                                                                     │
│  수거 품목: 폐전자제품, 폐건전지, 폐형광등                          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Tool Definitions (3 Tools)

search_collection_points

                        address_keyword?

                        name_keyword?

                        → 수거함 목록

get_nearby_collection_points

                        latitude: number

                        longitude: number

                        radius_km?: 2.0

geocode

                        place_name: string

                        → lat, lon 반환

                        (KakaoAPI 위임)

📂 Source Files

nodes/collection_point_agent_node.py (857 lines)

┌─────────────────────────────────────────────────────────────────────────────┐
│          web_search Intent Pipeline (Agents SDK + Fallback)                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  AS-IS (Responses API raw)              TO-BE (Agents SDK Primary)          │
│  ─────────────────────────              ──────────────────────────          │
│  Intent → /v1/responses (timeout빈발)  Intent → Agents SDK Runner          │
│       → FC 2회 호출 (재확인)                 → WebSearchTool (재시도 내장)  │
│       → 검색결과 미반영 (FAIL_OPEN)          → 1회 호출 (Router 신뢰)      │
│                                                                             │
│  ┌─ Primary: Agents SDK ─────────────────────────────────────────────────┐ │
│  │                                                                       │ │
│  │  agent = Agent(                                                       │ │
│  │      model=OpenAIResponsesModel(model="gpt-5.2"),                     │ │
│  │      tools=[WebSearchTool()],          # SDK 내장 재시도               │ │
│  │      output_type=response_schema,      # Pydantic 자동 파싱           │ │
│  │  )                                                                    │ │
│  │  result = await Runner.run(agent, input=prompt)                       │ │
│  │  → result.final_output  # 이미 구조화된 응답                          │ │
│  │                                                                       │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                         │ Exception 발생 시                                 │
│  ┌─ Fallback: Responses API ─────────────────────────────────────────────┐ │
│  │  generate_with_responses_api(tools=["web_search"])                    │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                                                             │
│  ※ DuckDuckGo web_search_node.py는 DEPRECATED (Feedback Fallback용 잔존) │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

전환 비교Migration Comparison

기능Feature	AS-IS (Responses API raw)	TO-BE (Agents SDK)
Web Search	Responses API raw 호출 (타임아웃 빈발)Responses API raw call (frequent timeout)	Agents SDK Primary + Fallback
구조화 출력Structured Output	json_schema 수동 파싱Manual json_schema parsing	Agent output_type 자동화 (Pydantic)Agent output_type auto (Pydantic)
결정 프로세스Decision Process	FC로 재확인 (2회 호출)FC re-confirmation (2 calls)	Router 신뢰 (1회 호출, ~500ms↓)Router trust (1 call, ~500ms↓)
에러 핸들링Error Handling	미적용 (FAIL_OPEN 동작)Not applied (FAIL_OPEN behavior)	SDK 내장 재시도 + Fallback 체인SDK built-in retry + Fallback chain
검증Verification	—	760 passed, 5 skipped

🔧 Primary + Fallback 패턴🔧 Primary + Fallback Pattern

🚀 배포 현황🚀 Deployment Status

• 9-10 Pod (HPA), ArgoCD Synced, Health: Healthy • 9-10 Pods (HPA), ArgoCD Synced, Health: Healthy
• openai>=2.9.0, openai-agents>=0.7.0, langchain-openai>=0.3.0, langgraph>=0.2.0

📂 Source Files

clients/openai_client.py (Agents SDK Primary + Responses API Fallback)
nodes/answer_node.py (web_search 경로)
Blog: OpenAI Agents SDK 전환 분석 →

┌─────────────────────────────────────────────────────────────────────┐
│       image_generation Intent Pipeline (Gemini SDK Native)          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User Message ──► image_generation_node ──► answer_node ──► SSE    │
│                         │                                           │
│              ┌──────────┴──────────┐                                │
│              │ Character Reference  │  (CharacterAssetPort)          │
│              │ ────────────────────│                                │
│              │ • 13 캐릭터 에셋 로드│                                │
│              │ • 참조 이미지 base64 │                                │
│              └──────────┬──────────┘                                │
│                         │                                           │
│              ┌──────────┴──────────┐                                │
│              │ Gemini SDK Native    │                                │
│              │ ────────────────────│                                │
│              │ • gemini-3-pro-image │                                │
│              │ • Native 이미지 생성  │                                │
│              │ • response_modalities│                                │
│              │   : ["TEXT","IMAGE"] │                                │
│              └──────────┬──────────┘                                │
│                         │                                           │
│              ┌──────────┴──────────┐                                │
│              │ ImageStoragePort     │  (gRPC → Images API)           │
│              │ ────────────────────│                                │
│              │ • base64 → bytes     │                                │
│              │ • gRPC UploadImage() │                                │
│              │ • CDN URL 반환        │                                │
│              └─────────────────────┘                                │
│                                                                     │
│  Timeout: 30000ms | System Prompt: Character Fidelity 원칙          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

System Prompt (Character Fidelity)

NON-NEGOTIABLE Rules:
• 참조 이미지와 EXACT 동일 디자인 복사
• Same colors, proportions, style 유지
• 캐릭터가 즉시 인식 가능해야 함

Allowed: 포즈, 배경, 조명, 소품 변경
Forbidden: 캐릭터 재해석, 스타일 변경, 색상/비율 변경

📂 Source Files

nodes/image_generation_node.py
prompts/image_generation/system.txt

┌─────────────────────────────────────────────────────────────────────┐
│             general Intent Pipeline (Answer Generation)             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  User Message ──► answer_node ──► SSE Token Stream                 │
│                       │                                             │
│              ┌────────┴────────┐                                    │
│              │ Context Assembly │                                   │
│              │ ────────────────│                                    │
│              │ • disposal_rules │  (waste 결과)                     │
│              │ • character_ctx  │  (character 결과)                  │
│              │ • location_ctx   │  (location 결과)                   │
│              │ • price_ctx      │  (recyclable_price 결과)           │
│              │ • bulk_waste_ctx │  (bulk_waste 결과)                 │
│              │ • weather_ctx    │  (weather 결과)                    │
│              │ • collection_ctx │  (collection_point 결과)           │
│              │ • image_gen_ctx  │  (image_generation 결과)           │
│              │ • web_search_ctx │  (web_search 결과)                 │
│              └────────┬────────┘                                    │
│                       │                                             │
│              ┌────────┴────────┐                                    │
│              │ PromptBuilder    │                                   │
│              │ ────────────────│                                    │
│              │ • System Prompt  │  (eco_character.txt)               │
│              │ • Intent Prompt  │  (general_instruction.txt)         │
│              │ • Context Inject │  (위 모든 컨텍스트)                │
│              │ • History (10)   │  (Multi-turn 대화)                 │
│              └────────┬────────┘                                    │
│                       │                                             │
│              ┌────────┴────────┐                                    │
│              │ LLM Streaming   │                                    │
│              │ ────────────────│                                    │
│              │ • LangChain astream() │                               │
│              │ • or generate_stream()│                               │
│              │ → notify_token_v2()   │  → Redis → SSE              │
│              └─────────────────┘                                    │
│                                                                     │
│  Lamport Clock: cleanup_sequence(job_id) at pipeline end            │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Token Streaming Architecture

LangChain Path

get_langchain_llm()
→ SystemMessage + HumanMessage
→ astream() 토큰 발행
→ notify_token_v2()

Native Path (Gemini)

generate_stream()
→ prompt + system_prompt
→ 네이티브 스트리밍
→ notify_token_v2()

주입 프롬프트 (general_instruction.txt)

역할: 환경/분리배출 관련 일반 대화 응대
성격: 친근하고 밝은 에코 캐릭터 페르소나
Fallback: 다른 Intent로 분류되지 않은 모든 질문
Multi-turn: 최근 10개 메시지 히스토리 유지

📂 Source Files

nodes/answer_node.py
commands/generate_answer_command.py
prompts/local/general_instruction.txt
prompts/global/eco_character.txt

┌─────────────────────────────────────────────────────────────────────────────────┐ │ Multi-LLM Provider Architecture │ ├─────────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Client Request │ │ │ │ POST /api/v1/chat/{chat_id}/messages │ │ │ │ {"message": "...", "model": "gemini-3-pro-preview"} │ │ │ └────────────────────────────────┬────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Chat API │ │ │ │ (chat/presentation/http) │ │ │ │ │ │ │ │ • 요청 검증 (Request Validation) │ │ │ │ • Worker에 작업 제출 (model 파라미터 포함) │ │ │ └────────────────────────────────┬────────────────────────────────────────┘ │ │ │ RabbitMQ │ │ ▼ │ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │ │ Chat Worker │ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ │ │ dependencies.py │ │ │ │ │ │ │ │ │ │ │ │ 1. model 파라미터 확인 │ │ │ │ │ │ 2. model명에서 provider 자동 추론 (Auto-Inference) │ │ │ │ │ │ ┌────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ gpt-* → openai │ gemini-* → google │ │ │ │ │ │ │ │ o1-* → openai │ models/* → google │ │ │ │ │ │ │ └────────────────────────────────────────────────────────┘ │ │ │ │ │ │ 3. provider에 따라 LLM 클라이언트 생성 │ │ │ │ │ │ • openai → LangChainLLMAdapter (LangChain 래퍼) │ │ │ │ │ │ • google → GeminiLLMClient (네이티브 SDK) │ │ │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ │ │ │ │ LangGraph Pipeline │ │ │ │ │ │ intent_node → router → subagents → aggregator → answer_node │ │ │ │ │ └───────────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ │ └─────────────────────────────────────────────────────────────────────────────────┘

🔍 Provider 자동 추론 🔍 Provider Auto-Inference

시나리오 1: 기본Scenario 1: Default

model: 미지정
→ 환경변수 LLM_MODEL 사용 → Use env LLM_MODEL
Default: gpt-5.2

시나리오 2: OpenAIScenario 2: OpenAI

model: "gpt-5.2"
→ provider 자동 추론: openai → Auto-infer: openai
LangChainLLMAdapter

시나리오 3: GeminiScenario 3: Gemini

model: "gemini-3-pro-preview"
→ provider 자동 추론: google → Auto-infer: google
GeminiLLMClient (Native)

🔧 LLM Client 구조 🔧 LLM Client Architecture

OpenAI (LangChain Wrapper)

LangChainLLMAdapter

• get_langchain_llm() 메서드 제공method provided

• langchain_llm.astream() 스트리밍streaming

• LangGraph 네이티브 통합native integration

Google Gemini (Native SDK)

GeminiLLMClient

• generate_stream() 직접 스트리밍direct streaming

• google-generativeai SDK

• hasattr 분기로 호환성 확보branching for compatibility

✅ answer_node 호환성 처리 ✅ answer_node Compatibility

🧪 E2E 테스트 결과 🧪 E2E Test Results

시나리오Scenario	요청Request	Provider	LLM Client	결과Result
1	`model: 미지정`	openai (env)	LangChainLLMAdapter	✅
2	`model: "gpt-5.2"`	openai (auto)	LangChainLLMAdapter	✅
3	`model: "gemini-3-pro-preview"`	google (auto)	GeminiLLMClient	✅

🎯 핵심 이점 🎯 Key Benefits

• 확장성 — 새 Provider 추가 시 MODEL_REGISTRY + 클라이언트 구현만 추가
• 호환성 — hasattr 체크로 LangChain 래퍼/네이티브 모두 지원
• 추적성 — 로깅으로 선택된 Provider/Model 명확히 확인
• 유연성 — 클라이언트가 요청별로 LLM 선택 가능 • Extensibility — Add new Provider via MODEL_REGISTRY + client impl
• Compatibility — hasattr check supports both LangChain wrapper/native
• Traceability — Logging clearly shows selected Provider/Model
• Flexibility — Client can select LLM per request

📚 참고Reference

LLM 모델 선택 기능 E2E 검증 완료 → LLM Model Selection E2E Verification →

📋 파이프라인 단계 📋 Pipeline Stages

1️⃣ 이미지 입력 — 사용자가 폐기물 사진 업로드
2️⃣ Vision LLM 분석 — Gemini Vision으로 이미지 내 객체 식별
3️⃣ 폐기물 분류 — 식별된 객체를 폐기물 카테고리로 매핑
4️⃣ RAG 검색 — 해당 폐기물의 분리배출 방법 검색
5️⃣ 응답 생성 — 사용자 친화적인 분리배출 가이드 생성 1️⃣ Image Input — User uploads waste photo
2️⃣ Vision LLM Analysis — Gemini Vision identifies objects
3️⃣ Waste Classification — Map objects to waste categories
4️⃣ RAG Search — Retrieve disposal methods
5️⃣ Response Generation — Generate user-friendly disposal guide

📊 성능 지표 (k6 부하 테스트) 📊 Performance Metrics (k6 Load Test)

VU	완료율Success	RPM	E2E p95	Submit p95	Snapshot
500	100%	367.9	83.3s	232ms	→
600	99.7%	358.6	108.3s	360ms	→
700	99.2%	329.1	122.3s	444ms	Live
800	99.7%	367.3	144.6s	734ms	→
900	99.7%	405.5	149.6s	635ms	→
1000	97.8%	373.4	173.3s	787ms	→

테스트 환경: OpenAI Tier 4 (TPM 4M), Worker min=2/max=5 Test env: OpenAI Tier 4 (TPM 4M), Worker min=2/max=5

Control Plane

API Nodes

Workers

Data Layer

Observability

Network

👑 Control Plane

k8s-master

t3.xlarge · 16GB · 80GB

API Server, etcd, Prometheus

🔌 API Nodes (Business Logic)

api-auth

t3.small · 2GB

api-users

t3.small · 2GB

api-scan ⭐

t3.medium · 4GB

api-character

t3.small · 2GB

api-location

t3.small · 2GB

api-image

t3.small · 2GB

api-chat ⭐

t3.medium · 4GB

api-info

t3.small · 2GB

⚙️ Worker Nodes (Async Processing)

worker-storage

t3.medium · 4GB · 40GB

I/O Bound · Eventlet

worker-storage-2

t3.medium · 4GB · 40GB

I/O Bound · Eventlet

worker-ai ⭐

t3.medium · 4GB · 40GB

LLM · Prefork Pool

worker-ai-2 ⭐

t3.medium · 4GB · 40GB

LLM · Prefork Pool

💾 Data Layer (Persistence & Cache)

postgresql ⭐

t3.large · 8GB · 80GB

7 Domain DBs

redis-auth

t3.medium · 4GB

Blacklist, OAuth

redis-streams

t3.small · 2GB

Event Bus

redis-cache

t3.small · 2GB

Domain Cache

redis-pubsub

t3.small · 2GB

Real-time Events

rabbitmq

t3.medium · 4GB · 40GB

Message Queue

📊 Observability (Monitoring & Logging)

monitoring

t3.large · 8GB · 60GB

Prometheus, Grafana, Jaeger

logging ⭐

t3.xlarge · 16GB · 100GB

Elasticsearch, Kibana, Fluent Bit

🌐 Network Layer (Gateway & Events)

ingress-gateway ⭐

t3.medium · 4GB

Istio Ingress, ext-authz

sse-gateway

t3.small · 2GB

SSE Fan-out, Pub/Sub Subscribe

event-router

t3.small · 2GB

Streams → Pub/Sub Publisher

📋 리소스 요약 📋 Resource Summary

Category	Nodes	Instance Types	Total vCPU	Total Memory	Total Storage
Control Plane	1	t3.xlarge	4	16GB	80GB
API Nodes	8	t3.small(6), t3.medium(2)	16	20GB	180GB
Workers	4	t3.medium	8	16GB	160GB
Data Layer	6	t3.small(4), t3.medium(2), t3.large(1)	12	22GB	180GB
Observability	2	t3.large, t3.xlarge	6	24GB	160GB
Network	3	t3.small(2), t3.medium(1)	6	8GB	60GB
Total	24	-	52 vCPU	106GB	820GB

🔗 외부 링크External Links

→ Kiali - Service Mesh Topology → Grafana - Metrics Dashboard → Jaeger - Distributed Tracing

📁 Source: terraform/main.tf — IaC로 관리되는 AWS EC2 기반 Kubernetes 클러스터

🔄 NodePolicy - 장애 처리 정책 🔄 NodePolicy - Fault Handling Policies

⚡ Circuit Breaker Pattern ⚡ Circuit Breaker Pattern

┌─────────────────────────────────────────────────────────────────────────────┐ │ Circuit Breaker State Machine │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────┐ 5회 연속 실패 ┌──────────┐ 60초 후 ┌──────────┐│ │ │ CLOSED │ ─────────────────→ │ OPEN │ ───────────→ │HALF_OPEN ││ │ │ (정상) │ │ (차단) │ │ (시험) ││ │ └────┬─────┘ └────┬─────┘ └────┬─────┘│ │ │ │ │ │ │ │ ← 성공 │ 즉시 실패 반환 │ │ │ │ ↓ │ │ │ │ ┌──────────┐ │ │ │ └────────────── 성공 ←──── │ 1회 시험 │ ── 실패 → OPEN ────┘ │ │ │ 요청 │ │ │ └──────────┘ │ ├─────────────────────────────────────────────────────────────────────────────┤ │ Config: failure_threshold=5, recovery_timeout=60s, half_open_max=1 │ └─────────────────────────────────────────────────────────────────────────────┘

🤖 Agent별 Resilience 설정 🤖 Per-Agent Resilience Config

Agent	NodePolicy	Circuit Breaker	Timeout	Rationale
`intent_classifier`	FAIL_CLOSE	5회/60s	10s	Intent 없이 라우팅 불가
`weather_agent`	FALLBACK	5회/60s	15s	외부 API 의존, 대체 응답 가능
`eco_guide_agent`	FALLBACK	5회/60s	20s	RAG + Web Search 체인
`summarize`	FAIL_OPEN	3회/30s	10s	선택적 압축, 원본 전달 가능
`aggregator`	FAIL_CLOSE	-	30s	모든 결과 수집 필수

📚 참고Reference

Production 환경에서 LLM API의 불안정성에 대응하기 위한 전략. Circuit Breaker는 연쇄 장애 방지. Strategies for handling LLM API instability in production. Circuit Breaker prevents cascading failures.

📐 4-Dimension RAG 평가 (LangSmith Feedback) 📐 4-Dimension RAG Evaluation (LangSmith Feedback)

┌─────────────────────────────────────────────────────────────────────────────┐ │ 4-Dimension RAG Quality Metrics │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ RELEVANCE │ │ GROUNDEDNESS│ │ COHERENCE │ │ HELPFULNESS │ │ │ │ 관련성 │ │ 근거성 │ │ 일관성 │ │ 유용성 │ │ │ ├─────────────┤ ├─────────────┤ ├─────────────┤ ├─────────────┤ │ │ │ Query와 │ │ Retrieved │ │ 답변의 │ │ 사용자 │ │ │ │ Context │ │ Context에 │ │ 논리적 │ │ 질문에 │ │ │ │ 매칭 점수 │ │ 기반한 답변 │ │ 흐름 │ │ 도움 정도 │ │ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ │ │ │ │ │ │ │ └────────────────┴─────────┬───────┴─────────────────┘ │ │ ↓ │ │ ┌──────────────────────┐ │ │ │ Composite Score │ │ │ │ (0.0 ~ 1.0) │ │ │ │ threshold: 0.7 │ │ │ └──────────────────────┘ │ │ │ │ │ < 0.7 ──────────┴────────── ≥ 0.7 │ │ ↓ ↓ │ │ ┌──────────────┐ ┌──────────────┐ │ │ │ Fallback │ │ Use Response │ │ │ │ Chain 실행 │ │ 답변 사용 │ │ │ └──────────────┘ └──────────────┘ │ └─────────────────────────────────────────────────────────────────────────────┘

🔗 Fallback Chain 구조 🔗 Fallback Chain Structure

┌─────────────────────────────────────────────────────────────────────────────┐ │ RAG → Web Search → General LLM │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ ┌────────────────┐ Score < 0.7 ┌────────────────┐ │ │ │ 1️⃣ RAG │ ───────────────→ │ 2️⃣ Web Search │ │ │ │ (Vector DB) │ │ (Tavily API) │ │ │ │ │ │ │ │ │ │ eco_guide │ │ 실시간 정보 │ │ │ │ docs 검색 │ │ 최신 뉴스 │ │ │ └────────────────┘ └───────┬────────┘ │ │ │ │ │ Score < 0.7 │ │ ↓ │ │ ┌────────────────────┐ │ │ │ 3️⃣ General LLM │ │ │ │ (Fallback 응답) │ │ │ │ │ │ │ │ "죄송합니다, │ │ │ │ 정확한 정보를 │ │ │ │ 찾지 못했습니다" │ │ │ └────────────────────┘ │ │ │ ├─────────────────────────────────────────────────────────────────────────────┤ │ 📊 Metrics Tracked: │ │ • rag_hit_rate: RAG에서 직접 응답한 비율 │ │ • fallback_rate: Fallback 체인 진입 비율 │ │ • avg_response_quality: 평균 품질 점수 │ └─────────────────────────────────────────────────────────────────────────────┘

apps/chat_worker/application/agents/eco_guide_agent.py

# Fallback Chain Implementation
async def run_with_fallback(self, query: str) -> AgentResponse:
    # 1. RAG 시도
    rag_response = await self.rag_retriever.query(query)
    if self._evaluate_quality(rag_response) >= 0.7:
        return rag_response
    
    # 2. Web Search 시도
    web_response = await self.web_search.search(query)
    if self._evaluate_quality(web_response) >= 0.7:
        return web_response
    
    # 3. General LLM Fallback
    return await self.general_llm.generate_fallback(query)

📚 참고Reference

LangSmith의 Feedback 기능을 활용한 RAG 품질 평가. 낮은 품질 시 자동으로 대안 소스 탐색. RAG quality evaluation using LangSmith Feedback. Automatically explores alternative sources on low quality.

🔌 TelemetryConfigPort 추상화 🔌 TelemetryConfigPort Abstraction

┌─────────────────────────────────────────────────────────────────────────────┐ │ TelemetryConfigPort Interface │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ class TelemetryConfigPort(Protocol): │ │ """텔레메트리 설정 추상화 - 환경별 구현 교체 가능""" │ │ │ │ @property │ │ def langsmith_enabled(self) -> bool: ... │ │ @property │ │ def otel_enabled(self) -> bool: ... │ │ @property │ │ def project_name(self) -> str: ... │ │ @property │ │ def sampling_rate(self) -> float: ... │ │ │ ├─────────────────────────────────────────────────────────────────────────────┤ │ Implementations: │ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ │ │ DevTelemetryConf │ │ ProdTelemetryConf│ │ TestTelemetryConf│ │ │ │ │ │ │ │ │ │ │ │ langsmith: ✅ │ │ langsmith: ✅ │ │ langsmith: ❌ │ │ │ │ otel: ❌ │ │ otel: ✅ │ │ otel: ❌ │ │ │ │ sampling: 1.0 │ │ sampling: 0.1 │ │ sampling: 0.0 │ │ │ └──────────────────┘ └──────────────────┘ └──────────────────┘ │ └─────────────────────────────────────────────────────────────────────────────┘

📍 Feature 단위 Run 추적 (LangSmith) 📍 Feature-Level Run Tracking (LangSmith)

┌─────────────────────────────────────────────────────────────────────────────┐ │ LangSmith Run Hierarchy │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ 🏃 Project: eco2-chat-worker │ │ │ │ │ ├── 📁 Session: user_123_session_456 │ │ │ │ │ │ │ ├── 🔷 Run: intent_classification │ │ │ │ ├── input: "오늘 날씨 어때?" │ │ │ │ ├── output: ["WEATHER"] │ │ │ │ ├── latency: 234ms │ │ │ │ └── tokens: { input: 12, output: 8 } │ │ │ │ │ │ │ ├── 🔷 Run: weather_agent │ │ │ │ ├── input: { query: "오늘 날씨", location: "서울" } │ │ │ │ ├── output: "서울 맑음, 24°C..." │ │ │ │ ├── latency: 1,234ms │ │ │ │ └── external_api: weather_api │ │ │ │ │ │ │ └── 🔷 Run: summarize │ │ │ ├── input: { messages: [...] } │ │ │ ├── output: "압축된 컨텍스트..." │ │ │ └── compression_ratio: 0.65 │ │ │ │ │ └── 📊 Metrics: │ │ ├── total_runs: 156 │ │ ├── avg_latency: 892ms │ │ └── error_rate: 0.02% │ └─────────────────────────────────────────────────────────────────────────────┘

🔗 OTEL Span 통합 (Jaeger) 🔗 OTEL Span Integration (Jaeger)

┌─────────────────────────────────────────────────────────────────────────────┐ │ OTEL + LangSmith Correlation │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ Jaeger Trace (Distributed Tracing) │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ trace_id: abc123def456 │ │ │ │ │ │ │ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ │ │ │ chat-api │→│ event-router │→│ chat-worker │→│ sse-gateway │ │ │ │ │ │ 50ms │ │ 12ms │ │ 1,200ms │ │ 8ms │ │ │ │ │ └──────────────┘ └──────────────┘ └──────┬───────┘ └──────────────┘ │ │ │ │ │ │ │ │ │ langsmith_run_id │ │ │ │ ↓ │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ │ LangSmith Run (LLM Observability) │ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │ │ run_id: xyz789 │ │ │ │ parent_trace_id: abc123def456 ← OTEL 연결 │ │ │ │ │ │ │ │ • LLM Call Details (prompt, completion, tokens) │ │ │ │ • RAG Retrieval (query, documents, scores) │ │ │ │ • Feedback Evaluation (relevance, groundedness) │ │ │ └─────────────────────────────────────────────────────────────────────┘ │ │ │ ├─────────────────────────────────────────────────────────────────────────────┤ │ 📍 Correlation Key: span.attributes["langsmith.run_id"] │ └─────────────────────────────────────────────────────────────────────────────┘

🔗 외부 링크External Links

→ Jaeger - Distributed Tracing → LangSmith - LLM Observability

❌ 문제: 토큰 0 / $0.00 ❌ Problem: Token 0 / $0.00

LangSmith 대시보드에서 39번의 LLM 호출이 있었음에도 "토큰: 0 / $0.00"으로 표시되는 문제 발생 LangSmith dashboard showed "Tokens: 0 / $0.00" despite 39 LLM calls being made

✅ 해결: 11개 LLM 호출 경로별 토큰 추적 ✅ Solution: Token Tracking for 11 LLM Call Paths

┌─────────────────────────────────────────────────────────────────────────────┐ │ 11 LLM Call Paths Token Tracking │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ OpenAI (4가지) Gemini (7가지) │ │ ┌─────────────────────────────┐ ┌─────────────────────────────┐ │ │ │ 1. Non-Streaming │ │ 5. Non-Streaming │ │ │ │ 2. Streaming │ │ 6. Streaming │ │ │ │ 3. Function Calling │ │ 7. Function Calling │ │ │ │ 4. Agents SDK │ │ 8. Image Analysis │ │ │ │ │ │ 9. Image Generation │ │ │ │ │ │ 10. Grounding Search │ │ │ │ │ │ 11. Native Tool Use │ │ │ └─────────────────────────────┘ └─────────────────────────────┘ │ │ │ ├─────────────────────────────────────────────────────────────────────────────┤ │ 🔑 Key: LangChain usage_metadata 필드로 표준화 │ │ │ │ response.usage_metadata = { │ │ "input_tokens": prompt_tokens, │ │ "output_tokens": completion_tokens, │ │ "total_tokens": total_tokens │ │ } │ └─────────────────────────────────────────────────────────────────────────────┘

📊 결과: 48,721 토큰 추적 성공 📊 Result: 48,721 Tokens Successfully Tracked

48,721

총 토큰 Total Tokens

$0.68

비용 추적 Cost Tracked

LLM 경로 LLM Paths

🎬 데모 영상 🎬 Demo Video

🔗 외부 링크External Links

→ 블로그 포스트 - LangSmith Token Tracking 개선 Blog Post - LangSmith Token Tracking Improvement → LangSmith Dashboard

이코에코(Eco²) Backend Portfolio Backend Portfolio

Service Service

Agent-Driven Development Agent-Driven Development

프로젝트 카테고리 Project Categories

Multi-Agent Chat

Kubernetes Cluster + GitOps + Service Mesh

Auth Offloading (ext-authz)

Observability

Message Queue

Event Streams & Scaling

Eventual Consistency

Clean Architecture Migration

시스템 아키텍처 System Architecture

🏗️ 전체 시스템 아키텍처 🏗️ System Architecture Overview

📊 데이터 흐름 요약 📊 Data Flow Summary

🤖 LLM Pipeline + Async SSE 🤖 LLM Pipeline + Async SSE

기술 스택 Tech Stack

Infrastructure

Backend

Data & Messaging

Observability

LLM

Scaling

프로젝트 타임라인 Project Timeline

🔐 Auth Domain - OAuth 2.0 + PKCE

Auth Relay Fallback Outbox

Google OAuth

Kakao OAuth

Naver OAuth

🌍 Character Domain - Dual Interface + Local Cache

HTTP REST

gRPC (CharacterService)

🗄️ 2-Tier Cache Architecture 🗄️ 2-Tier Cache Architecture

🏗️ Implementation Patterns 🏗️ Implementation Patterns

💬 Chat Agent

🎯 9-Class Intent Classification

Intent 분류 체계

Confidence Scoring Formula

🔗 Chain-of-Intent Transitions

🧠 Multi-Intent Detection

⚡ Intent Cache

📊 Classification Example

⚡ Send API Dynamic Router

LangGraph Send API Pipeline

Dynamic Router 구현

🌤️ ENRICHMENT_RULES

🗺️ INTENT_TO_NODE

📊 Multi-Intent Example

🚌 Chat Event Bus Architecture

Event Bus: Redis Streams + Pub/Sub

Token Streaming v2: Recoverable Stream

📝 Stream Key Structure

📸 State Snapshot

🔄 Catch-up Mechanism

📊 Event Types

KEDA Autoscaling Configuration

Chat Worker

SSE Gateway

Event Router

🔄 Taskiq: asyncio-native Worker 🔄 Taskiq: asyncio-native Worker

⚡ Celery vs Taskiq 비교 ⚡ Celery vs Taskiq Comparison

📊 성능 지표 📊 Performance Metrics

🐇 RabbitMQ: Job Execution Path 🐇 RabbitMQ: Job Execution Path

⚙️ RabbitMQ Topology 구성 ⚙️ RabbitMQ Topology Configuration

🔄 Message Flow 🔄 Message Flow

Submit

Route

Consume

Execute

⚡ KEDA Autoscaling ⚡ KEDA Autoscaling

📺 SSE Gateway: Token v2 Streaming 📺 SSE Gateway: Token v2 Streaming

🔄 Token v2 Recovery Protocol 🔄 Token v2 Recovery Protocol

📊 Consumer Groups (Two-Path) 📊 Consumer Groups (Two-Path)

🚀 eventrouter

💾 chat-persistence

⚡ KEDA Autoscaling ⚡ KEDA Autoscaling

🔀 Multi-Agent Chat LangGraph Workflow

🎯 Intent Node

🔀 Dynamic Router (Send API)

📦 Aggregator

이코에코(Eco²)
Backend Portfolio Backend Portfolio