AI engineer experienced in Multi-Agent orchestration and non-deterministic harness development.
After hands-on experience with Rakuten's petabyte-scale distributed storage, I now design, develop, and operate LangGraph multi-agent pipelines on a 24-Node Kubernetes cluster. To address the high latency of LLM API n-hop pipelines (10-20s/req), I designed and built an Event Bus Layer, reaching the saturation point at VU 1000 with 97.8% completion · 373 RPM · p95 173s.
Recently focused on LLM-as-Judge based Evaluation Pipelines, designing and implementing a Swiss Cheese 3-Tier evaluation framework (Code Grader + LLM Judge + Calibration Monitor).
Development Timeline
2017.03 - 2023.08 | Pusan National University — B.S. in Computer Science (military service: 2018.02-2019.12)
2022.02 - 2022.11 | Ethereum Auction Platform — Seafood auction built on Solidity, Web3.js, and IPFS
2024.01 - 2024.06 | UE5 PVE Game — Game engine development and Unreal Engine 5 project
2024.06 - 2024.11 | KakaoTech Bootcamp — LLM-Cloud based web/app server and infrastructure development
2024.12 - 2025.08 | Rakuten Symphony Korea — Cloud BU, CNP Storage development
2025.10 - Present | Eco2 Project — 2025 AI SeSACTHON Excellence Award (TOP 4), Backend/Infra enhancement
Professional Experience
Rakuten Symphony Korea - Jr. Storage Developer (Server-side)
Dec 2024 - Aug 2025
Cloud BU | Full-time | Petabyte-scale production distributed storage development; English-based collaboration with a global team (India/Japan/Korea)
CNP Storage (C/RPC): Distributed storage backing a 20M-user telecom network — reproduced and analyzed a multi-threaded concurrency mutex_lock crash; converted SEGMENT_LOCK from down_write to down_read to enable read parallelism while preserving production data integrity
Object Storage (C): Ensured token/key cache consistency in an eventually consistent environment via CM→Gateway broadcast; proposed, designed, and implemented install-time root-user creation; fixed a bug where access keys were lost on user deactivation
Tech Sharing & Learning: English-based code reviews, architecture design discussions, and incident analysis across a global team (India/Japan/Korea); continuous learning through English technical presentations to the team
Awards
2025 AI SeSACTHON Excellence Award — Finals 4th of 32 teams (Prelims: 181 teams) | Host: Seoul Metropolitan Government, Operator: DACON (Dec 2025)
Multi-Agent Backend Platform — solo design, development, and operations | 24-Node K8s | Oct MVP, Dec deploy, Jan VU 1000 | 2025 AI SeSACTHON Excellence Award (TOP 4 of 181 teams)
Key Metrics: Scan API (LLM×2) at VU 1000: 97.8% completion · 373 RPM · p95 173s; Connection Pool 212→33 (84%↓); ext-authz 1,477 RPS, p99 275ms @ VU 2,500
LangGraph StateGraph Pipeline: Intent Classification → Dynamic Router (Send API) → parallel execution of 10 sub-agents → Aggregator (result collection, required-context validation) → Dynamic Summarization (OpenCode-style compression when the token threshold is exceeded) → Answer Node (token streaming); compound queries are processed in parallel within a single request
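The fan-out/aggregate shape of this pipeline can be sketched without LangGraph using plain asyncio; the node names and sub-agent behavior below are illustrative stand-ins, not the production code:

```python
import asyncio

# Hypothetical sub-agent: each receives the query and returns a partial result.
async def run_subagent(name: str, query: str) -> dict:
    await asyncio.sleep(0)  # stands in for an LLM call (10-20s in production)
    return {"agent": name, "answer": f"{name} handled: {query}"}

async def scan_pipeline(query: str, intents: list[str]) -> str:
    # Dynamic Router: one task per classified intent (Send-style fan-out)
    tasks = [asyncio.create_task(run_subagent(i, query)) for i in intents]
    # Aggregator: collect results and check the required context is present
    results = await asyncio.gather(*tasks)
    assert all("answer" in r for r in results), "missing required context"
    # Answer Node: merge results (token streaming omitted in this sketch)
    return " | ".join(r["answer"] for r in results)

print(asyncio.run(scan_pipeline("how do I recycle a battery?", ["rules", "tips"])))
```

The key property mirrored here is that a compound query spawns its sub-agents concurrently and the aggregator blocks only once, on the slowest branch.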
Tag-Based Contextual Retrieval: 2-tier RAG following Anthropic's Contextual Retrieval pattern — YAML lookup over item_class_list (167 items) + situation_tags (80 situations) → matching against 18 category-regulation JSON files; Classification→Keyword fallback strategy; chunk prioritization via relevance scoring
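A minimal sketch of the two tiers and the classification→keyword fallback, with toy stand-ins for the YAML tag tables and regulation chunks (the real data has 167 items, 80 situation tags, and 18 regulation files):

```python
# Tier 1 data: item -> tag; tag -> regulation chunk (toy examples)
ITEM_TAGS = {"battery": "hazardous", "bottle": "recyclable"}
CHUNKS = {
    "hazardous": "Take batteries to a designated collection box.",
    "recyclable": "Rinse bottles and place them in the recycling bin.",
}

def retrieve(query: str) -> str:
    # Tier 1: classification — exact item-tag match wins
    for item, tag in ITEM_TAGS.items():
        if item in query:
            return CHUNKS[tag]
    # Tier 2: keyword fallback — rank chunks by term overlap (relevance score)
    terms = set(query.lower().split())
    return max(CHUNKS.values(), key=lambda c: len(terms & set(c.lower().split())))

print(retrieve("how to throw away a battery"))
```

The fallback tier guarantees a best-effort chunk even when classification finds no tag, which is the point of the Classification→Keyword strategy.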
Global/Local Prompt Optimization: separation of Global (character definition, tone, rules) and Local (per-intent instructions) prompts; structured response template (core answer → detailed method → improvement tips); prompt-load optimization via LRU caching
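The prompt-load caching can be sketched with `functools.lru_cache`; the template contents and the in-memory dict below are illustrative (in production this would read template files):

```python
from functools import lru_cache

# Global and per-intent (local) prompts are loaded once, then served from cache.
@lru_cache(maxsize=64)
def load_prompt(scope: str, intent: str = "") -> str:
    templates = {  # stand-in for reading template files from disk
        ("global", ""): "You are Eco, a friendly recycling guide.",
        ("local", "scan"): "Answer with: core answer -> detailed method -> tips.",
    }
    return templates[(scope, intent)]

prompt = load_prompt("global") + "\n" + load_prompt("local", "scan")
assert load_prompt.cache_info().hits == 0  # first calls are cache misses
load_prompt("global")                      # repeat call is served from cache
assert load_prompt.cache_info().hits == 1
```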
Redis Event Bus 3-Tier: separation into Streams (persistence) + Pub/Sub (realtime fan-out) + State KV (recovery); Lua-script idempotency + Lamport-clock seq-based event ordering → Scan Pipeline (LLM×2) at VU 1000: 97.8% · 373 RPM · p95 173s
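The consumer side of seq-based ordering can be sketched as follows. This is a simplified model: events may arrive out of order or twice via Pub/Sub, so the consumer buffers by seq and emits a contiguous prefix; in the described system the seq itself is assigned atomically by a Redis Lua script, which this sketch does not show:

```python
class OrderedConsumer:
    """Buffers out-of-order events per job and emits them in seq order."""

    def __init__(self):
        self.next_seq = 0
        self.buffer: dict[int, str] = {}
        self.emitted: list[str] = []

    def on_event(self, seq: int, payload: str) -> None:
        if seq < self.next_seq or seq in self.buffer:
            return  # duplicate delivery: drop (idempotent consumption)
        self.buffer[seq] = payload
        while self.next_seq in self.buffer:  # drain the contiguous prefix
            self.emitted.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

c = OrderedConsumer()
for seq, payload in [(1, "b"), (0, "a"), (1, "b"), (2, "c")]:  # reordered + dup
    c.on_event(seq, payload)
print(c.emitted)  # -> ['a', 'b', 'c']
```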
Pub/Sub Shard Optimization: replaced per-job_id channels with shard channels (hash % 4); the SSE Gateway subscribes to the 4 shards and routes internally by job_id → Pub/Sub connections O(N)→O(4), a 99.6% connection reduction at 1,000 VU
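The shard mapping itself is a one-liner; the sketch below assumes CRC32 as the stable hash (the actual hash function is not specified in the text — Python's built-in `hash()` would not work here because it is salted per process):

```python
import zlib

NUM_SHARDS = 4

def shard_channel(job_id: str) -> str:
    """Map any job_id onto one of 4 shard channels via a stable hash."""
    return f"events:shard:{zlib.crc32(job_id.encode()) % NUM_SHARDS}"

# 1,000 concurrent jobs still collapse onto at most 4 Pub/Sub subscriptions...
channels = {shard_channel(f"job-{n}") for n in range(1000)}
assert len(channels) <= NUM_SHARDS  # O(N) -> O(4)

# ...and the gateway fans events out within a shard by the job_id in the payload.
def route(event: dict, listeners: dict[str, list]) -> list:
    return listeners.get(event["job_id"], [])
```

Publishers and the SSE Gateway only need to agree on the hash function for the routing to stay consistent.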
Concurrency: with LLM calls at 10-20s/req, the prefork model capped concurrency at the process count at high cost → moved I/O-bound sections to Gevent (cooperative scheduling) to expand in-flight requests; the LangGraph (asyncio) pipeline was split out into a Taskiq worker to preserve its event loop
Multi-LLM Resilience: LLMClientPort abstraction for OpenAI/Gemini provider auto-inference and runtime switching; NodePolicy (FAIL_OPEN/CLOSE/FALLBACK) + Circuit Breaker prevent graph interruption on node failure; Stateless Reducer + checkpointing recover from intermediate failures without re-calling the LLM
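A minimal sketch of the per-node failure policy: the enum names mirror the description, but the threshold and the `run_node` helper are illustrative assumptions. FAIL_OPEN degrades gracefully, FALLBACK swaps to a backup provider, FAIL_CLOSE propagates the error and stops the graph:

```python
from enum import Enum

class NodePolicy(Enum):
    FAIL_OPEN = "open"
    FAIL_CLOSE = "close"
    FALLBACK = "fallback"

class CircuitBreaker:
    def __init__(self, threshold: int = 3):  # threshold is illustrative
        self.failures, self.threshold = 0, threshold

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

def run_node(call, policy: NodePolicy, breaker: CircuitBreaker, fallback=None):
    if breaker.open and policy is NodePolicy.FALLBACK and fallback:
        return fallback()          # circuit open: skip straight to the fallback
    try:
        return call()
    except Exception:
        breaker.failures += 1
        if policy is NodePolicy.FAIL_OPEN:
            return None            # degrade: the graph keeps running
        if policy is NodePolicy.FALLBACK and fallback:
            return fallback()
        raise                      # FAIL_CLOSE: stop the graph

def flaky():
    raise TimeoutError("provider timeout")

assert run_node(flaky, NodePolicy.FAIL_OPEN, CircuitBreaker()) is None
assert run_node(flaky, NodePolicy.FALLBACK, CircuitBreaker(), lambda: "gemini") == "gemini"
```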
Auth Offloading: every request passed through ext-authz→Redis, and Redis contention caused a 42 RPS bottleneck at P99 80ms → switched to an Istio EnvoyFilter ext-authz (Go) with a local sync.Map cache, syncing the blacklist across Pods via RabbitMQ fanout broadcast → 48→1,500 RPS (31×), P50 57→7.5ms (87%↓), cache hit >99%
Persistence Offloading: the API handled DB writes directly, inflating latency and connection counts → the API now only publishes events via Redis Streams XADD; a worker batch-processes them at 5-second intervals; deterministic-UUID idempotency keys prevent duplicates and an Outbox fallback guarantees 99.5% delivery → API response 50ms→5ms (90%↓), with independent API/Worker scaling
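The write-offload path and the deterministic-UUID idempotency key can be sketched as follows. The namespace string and event fields are illustrative, and an in-memory list/dict stands in for the Redis Stream and DB table; the point is that `uuid5` over stable event fields makes redelivery collapse onto the same key:

```python
import uuid

NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "eco2.events")  # illustrative namespace
stream: list[dict] = []   # stands in for a Redis Stream (XADD/XREAD)
db: dict[str, dict] = {}  # stands in for the DB table, keyed by idempotency key

def publish(user_id: str, job_id: str, amount: int) -> None:
    # API side: append-only event publish; the key is deterministic per job
    key = str(uuid.uuid5(NAMESPACE, f"{user_id}:{job_id}"))
    stream.append({"key": key, "user_id": user_id, "amount": amount})

def drain_batch() -> int:
    # Worker side: runs every 5s in production; duplicates collapse on the key
    written = 0
    while stream:
        event = stream.pop(0)
        if event["key"] not in db:
            db[event["key"]] = event
            written += 1
    return written

publish("u1", "job-42", 10)
publish("u1", "job-42", 10)   # retry / duplicate delivery
assert drain_batch() == 1     # only one row is persisted
```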
Eventual Consistency: applied everywhere except tightly coupled, immediate-response logic (OAuth, etc.) — Scan Reward: immediate SSE response at the Judge phase, then fire-and-forget dispatch with a deterministic-UUID idempotency key for concurrent persistence → 500→100ms (80%↓). Chat: FE optimistic update + BE eventual write + 30s retention window for logical consistency. Auth: local cache + RabbitMQ fanout broadcast for cross-Pod sync
Agent-Driven Development: accumulated ADRs, incident logs, and domain rules as a Knowledge Base → self-built RAG + 27 custom Skills injected into sub-agents to accelerate agent onboarding; 7,500+ Task-tool invocations observed across 8,251 sessions, including parallel multi-part execution; also leveraging official vendor Skills (Anthropic, Vercel) and operating a design-implement-test-observe-verify feedback loop
DREAM - Interactive Story Generation Service | Backend & Infrastructure
Sep 2024
LLM & RAG based storytelling service helping seniors realize their dreams | KakaoTech Hackathon — a pre-Eco² project that prototyped the distributed architecture in a single-node environment
Single-Node Microservices: container-level separation of the Spring Backend, FastAPI AI Server, and Persistence Layer (Elasticsearch, MySQL) on a single EC2 instance via a Docker network — the prototype of the Eco² microservices architecture
Nginx Reverse Proxy: path-based routing from a single entry point (/api → Backend, /ai → AI Server) to distribute traffic between services — the predecessor to ALB + Bridge Ingress Controller + Istio virtual routing
Legacy CI/CD: lightweight deployment pipeline with GitHub Actions + S3 + CodeDeploy, cutting deployment time by 80% — the pre-ArgoCD GitOps stage
LLM + RAG: story/image generation from seniors' dreams using Elasticsearch vector search + LangChain RAG — an early experiment toward the Eco² Scan API pipeline
Resource Optimization: EC2 stabilization via swap memory and Docker resource limits (CPU 89%→40%) — performance tuning in a constrained environment
Web3.0-based decentralized auction system | Ensuring transparency and reliability in trading and distribution processes
Smart Contract Backend: ERC-1155 based integration of product/auction/currency (MMT) multi-tokens in a single contract; on-chain implementation of auction business logic (bid/award/settlement/cancel) plus governance permission management
Supply Chain Traceability: distribution-history tracking chain; on-chain recording of cold-chain environment data (temperature/humidity); delivery-status FSM (Empty→Prepare→Delivery)