AI engineer experienced in Multi-Agent orchestration and non-deterministic harness development.
After hands-on experience with Rakuten's petabyte-scale distributed storage, I now design, develop, and operate LangGraph multi-agent pipelines on a 24-Node Kubernetes cluster. To address the high latency of LLM API n-hop pipelines (10-20s/req), I designed and built an Event Bus Layer, reaching the saturation point at VU 1000 with 97.8% completion · 373 RPM · p95 173s.
Recently focused on LLM-as-Judge based Evaluation Pipelines, designing and implementing a Swiss Cheese 3-Tier evaluation framework (Code Grader + LLM Judge + Calibration Monitor).
Development Timeline
2017.03 - 2023.08 | Pusan National University — B.S. in Computer Science (military service: 2018.02-2019.12)
2022.02 - 2022.11 | Ethereum Auction Platform — Seafood auction built on Solidity, Web3.js, and IPFS
2024.01 - 2024.06 | UE5 PVE Game — Game engine development and Unreal Engine 5 project
2024.06 - 2024.11 | KakaoTech Bootcamp — LLM-Cloud based web/app server and infrastructure development
2024.12 - 2025.08 | Rakuten Symphony Korea — Cloud BU, CNP Storage development
2025.10 - Present | Eco2 Project — 2025 AI SeSACTHON Excellence Award (TOP 4), Backend/Infra enhancement
Professional Experience
Rakuten Symphony Korea - Jr. Storage Developer (Server-side)
Dec 2024 - Aug 2025
Cloud BU | Full-time | Petabyte-scale production distributed storage development; English-based collaboration with a global team (India/Japan/Korea)
CNP Storage (C/RPC): Distributed storage backing a 20M-user telecom network — reproduced and analyzed a multi-threaded concurrency mutex_lock crash; converted SEGMENT_LOCK from down_write to down_read to enable read parallelism while preserving production data integrity
Object Storage (C): Ensured token/key cache consistency in an eventually consistent environment via CM→Gateway broadcast; proposed, designed, and implemented install-time root-user creation; fixed a bug where access keys were lost on user deactivation
Tech Sharing & Learning: English-based code reviews, architecture design discussions, and incident analysis across a global team (India/Japan/Korea); continuous learning through English technical presentations to the team
Awards
2025 AI SeSACTHON Excellence Award — Finals 4th of 32 teams (Prelims: 181 teams) | Host: Seoul Metropolitan Government, Operator: DACON (Dec 2025)
Multi-Agent Backend Platform — solo design, development, and operations | 24-Node K8s | Oct MVP, Dec deploy, Jan VU 1000 | 2025 AI SeSACTHON Excellence Award (TOP 4 of 181 teams)
Key Metrics: Scan API (LLM×2) at VU 1000: 97.8% completion · 373 RPM · p95 173s; Connection Pool 212→33 (84%↓); ext-authz 1,477 RPS, p99 275ms @ VU 2,500
LangGraph StateGraph Pipeline: Intent Classification → Dynamic Router (Send API) → parallel execution of 10 sub-agents → Aggregator (result collection, required-context validation) → Dynamic Summarization (OpenCode-style compression when the token threshold is exceeded) → Answer Node (token streaming); compound queries are processed in parallel within a single request
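The fan-out/aggregate shape of this pipeline can be sketched without LangGraph using plain asyncio; the node names and sub-agent behavior below are illustrative stand-ins, not the production code:

```python
import asyncio

# Hypothetical sub-agent: each receives the query and returns a partial result.
async def run_subagent(name: str, query: str) -> dict:
    await asyncio.sleep(0)  # stands in for an LLM call (10-20s in production)
    return {"agent": name, "answer": f"{name} handled: {query}"}

async def scan_pipeline(query: str, intents: list[str]) -> str:
    # Dynamic Router: one task per classified intent (Send-style fan-out)
    tasks = [asyncio.create_task(run_subagent(i, query)) for i in intents]
    # Aggregator: collect results and check the required context is present
    results = await asyncio.gather(*tasks)
    assert all("answer" in r for r in results), "missing required context"
    # Answer Node: merge results (token streaming omitted in this sketch)
    return " | ".join(r["answer"] for r in results)

print(asyncio.run(scan_pipeline("how do I recycle a battery?", ["rules", "tips"])))
```

The key property mirrored here is that a compound query spawns its sub-agents concurrently and the aggregator blocks only once, on the slowest branch.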
Tag-Based Contextual Retrieval: 2-tier RAG following Anthropic's Contextual Retrieval pattern — YAML lookup over item_class_list (167 items) + situation_tags (80 situations) → matching against 18 category-regulation JSON files; Classification→Keyword fallback strategy; chunk prioritization via relevance scoring
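A minimal sketch of the two tiers and the classification→keyword fallback, with toy stand-ins for the YAML tag tables and regulation chunks (the real data has 167 items, 80 situation tags, and 18 regulation files):

```python
# Tier 1 data: item -> tag; tag -> regulation chunk (toy examples)
ITEM_TAGS = {"battery": "hazardous", "bottle": "recyclable"}
CHUNKS = {
    "hazardous": "Take batteries to a designated collection box.",
    "recyclable": "Rinse bottles and place them in the recycling bin.",
}

def retrieve(query: str) -> str:
    # Tier 1: classification — exact item-tag match wins
    for item, tag in ITEM_TAGS.items():
        if item in query:
            return CHUNKS[tag]
    # Tier 2: keyword fallback — rank chunks by term overlap (relevance score)
    terms = set(query.lower().split())
    return max(CHUNKS.values(), key=lambda c: len(terms & set(c.lower().split())))

print(retrieve("how to throw away a battery"))
```

The fallback tier guarantees a best-effort chunk even when classification finds no tag, which is the point of the Classification→Keyword strategy.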
Global/Local Prompt Optimization: separation of Global (character definition, tone, rules) and Local (per-intent instructions) prompts; structured response template (core answer → detailed method → improvement tips); prompt-load optimization via LRU caching
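The prompt-load caching can be sketched with `functools.lru_cache`; the template contents and the in-memory dict below are illustrative (in production this would read template files):

```python
from functools import lru_cache

# Global and per-intent (local) prompts are loaded once, then served from cache.
@lru_cache(maxsize=64)
def load_prompt(scope: str, intent: str = "") -> str:
    templates = {  # stand-in for reading template files from disk
        ("global", ""): "You are Eco, a friendly recycling guide.",
        ("local", "scan"): "Answer with: core answer -> detailed method -> tips.",
    }
    return templates[(scope, intent)]

prompt = load_prompt("global") + "\n" + load_prompt("local", "scan")
assert load_prompt.cache_info().hits == 0  # first calls are cache misses
load_prompt("global")                      # repeat call is served from cache
assert load_prompt.cache_info().hits == 1
```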
Redis Event Bus 3-Tier: separation into Streams (persistence) + Pub/Sub (realtime fan-out) + State KV (recovery); Lua-script idempotency + Lamport-clock seq-based event ordering → Scan Pipeline (LLM×2) at VU 1000: 97.8% · 373 RPM · p95 173s
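The consumer side of seq-based ordering can be sketched as follows. This is a simplified model: events may arrive out of order or twice via Pub/Sub, so the consumer buffers by seq and emits a contiguous prefix; in the described system the seq itself is assigned atomically by a Redis Lua script, which this sketch does not show:

```python
class OrderedConsumer:
    """Buffers out-of-order events per job and emits them in seq order."""

    def __init__(self):
        self.next_seq = 0
        self.buffer: dict[int, str] = {}
        self.emitted: list[str] = []

    def on_event(self, seq: int, payload: str) -> None:
        if seq < self.next_seq or seq in self.buffer:
            return  # duplicate delivery: drop (idempotent consumption)
        self.buffer[seq] = payload
        while self.next_seq in self.buffer:  # drain the contiguous prefix
            self.emitted.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

c = OrderedConsumer()
for seq, payload in [(1, "b"), (0, "a"), (1, "b"), (2, "c")]:  # reordered + dup
    c.on_event(seq, payload)
print(c.emitted)  # -> ['a', 'b', 'c']
```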
Pub/Sub Shard Optimization: replaced per-job_id channels with shard channels (hash % 4); the SSE Gateway subscribes to the 4 shards and routes internally by job_id → Pub/Sub connections O(N)→O(4), a 99.6% connection reduction at 1,000 VU
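The shard mapping itself is a one-liner; the sketch below assumes CRC32 as the stable hash (the actual hash function is not specified in the text — Python's built-in `hash()` would not work here because it is salted per process):

```python
import zlib

NUM_SHARDS = 4

def shard_channel(job_id: str) -> str:
    """Map any job_id onto one of 4 shard channels via a stable hash."""
    return f"events:shard:{zlib.crc32(job_id.encode()) % NUM_SHARDS}"

# 1,000 concurrent jobs still collapse onto at most 4 Pub/Sub subscriptions...
channels = {shard_channel(f"job-{n}") for n in range(1000)}
assert len(channels) <= NUM_SHARDS  # O(N) -> O(4)

# ...and the gateway fans events out within a shard by the job_id in the payload.
def route(event: dict, listeners: dict[str, list]) -> list:
    return listeners.get(event["job_id"], [])
```

Publishers and the SSE Gateway only need to agree on the hash function for the routing to stay consistent.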
Concurrency: with LLM calls at 10-20s/req, the prefork model capped concurrency at the process count at high cost → moved I/O-bound sections to Gevent (cooperative scheduling) to expand in-flight requests; the LangGraph (asyncio) pipeline was split out into a Taskiq worker to preserve its event loop
Multi-LLM Resilience: LLMClientPort abstraction for OpenAI/Gemini provider auto-inference and runtime switching; NodePolicy (FAIL_OPEN/CLOSE/FALLBACK) + Circuit Breaker prevent graph interruption on node failure; Stateless Reducer + checkpointing recover from intermediate failures without re-calling the LLM
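A minimal sketch of the per-node failure policy: the enum names mirror the description, but the threshold and the `run_node` helper are illustrative assumptions. FAIL_OPEN degrades gracefully, FALLBACK swaps to a backup provider, FAIL_CLOSE propagates the error and stops the graph:

```python
from enum import Enum

class NodePolicy(Enum):
    FAIL_OPEN = "open"
    FAIL_CLOSE = "close"
    FALLBACK = "fallback"

class CircuitBreaker:
    def __init__(self, threshold: int = 3):  # threshold is illustrative
        self.failures, self.threshold = 0, threshold

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

def run_node(call, policy: NodePolicy, breaker: CircuitBreaker, fallback=None):
    if breaker.open and policy is NodePolicy.FALLBACK and fallback:
        return fallback()          # circuit open: skip straight to the fallback
    try:
        return call()
    except Exception:
        breaker.failures += 1
        if policy is NodePolicy.FAIL_OPEN:
            return None            # degrade: the graph keeps running
        if policy is NodePolicy.FALLBACK and fallback:
            return fallback()
        raise                      # FAIL_CLOSE: stop the graph

def flaky():
    raise TimeoutError("provider timeout")

assert run_node(flaky, NodePolicy.FAIL_OPEN, CircuitBreaker()) is None
assert run_node(flaky, NodePolicy.FALLBACK, CircuitBreaker(), lambda: "gemini") == "gemini"
```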
Auth Offloading: every request passed through ext-authz→Redis, and Redis contention caused a 42 RPS bottleneck at P99 80ms → switched to an Istio EnvoyFilter ext-authz (Go) with a local sync.Map cache, syncing the blacklist across Pods via RabbitMQ fanout broadcast → 48→1,500 RPS (31×), P50 57→7.5ms (87%↓), cache hit >99%
Persistence Offloading: the API handled DB writes directly, inflating latency and connection counts → the API now only publishes events via Redis Streams XADD; a worker batch-processes them at 5-second intervals; deterministic-UUID idempotency keys prevent duplicates and an Outbox fallback guarantees 99.5% delivery → API response 50ms→5ms (90%↓), with independent API/Worker scaling
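The write-offload path and the deterministic-UUID idempotency key can be sketched as follows. The namespace string and event fields are illustrative, and an in-memory list/dict stands in for the Redis Stream and DB table; the point is that `uuid5` over stable event fields makes redelivery collapse onto the same key:

```python
import uuid

NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "eco2.events")  # illustrative namespace
stream: list[dict] = []   # stands in for a Redis Stream (XADD/XREAD)
db: dict[str, dict] = {}  # stands in for the DB table, keyed by idempotency key

def publish(user_id: str, job_id: str, amount: int) -> None:
    # API side: append-only event publish; the key is deterministic per job
    key = str(uuid.uuid5(NAMESPACE, f"{user_id}:{job_id}"))
    stream.append({"key": key, "user_id": user_id, "amount": amount})

def drain_batch() -> int:
    # Worker side: runs every 5s in production; duplicates collapse on the key
    written = 0
    while stream:
        event = stream.pop(0)
        if event["key"] not in db:
            db[event["key"]] = event
            written += 1
    return written

publish("u1", "job-42", 10)
publish("u1", "job-42", 10)   # retry / duplicate delivery
assert drain_batch() == 1     # only one row is persisted
```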
Eventual Consistency: applied everywhere except tightly coupled, immediate-response logic (OAuth, etc.) — Scan Reward: immediate SSE response at the Judge phase, then fire-and-forget dispatch with a deterministic-UUID idempotency key for concurrent persistence → 500→100ms (80%↓). Chat: FE optimistic update + BE eventual write + 30s retention window for logical consistency. Auth: local cache + RabbitMQ fanout broadcast for cross-Pod sync
Agent-Driven Development: accumulated ADRs, incident logs, and domain rules as a Knowledge Base → self-built RAG + 27 custom Skills injected into sub-agents to accelerate agent onboarding; 7,500+ Task-tool invocations observed across 8,251 sessions, including parallel multi-part execution; also leveraging official vendor Skills (Anthropic, Vercel) and operating a design-implement-test-observe-verify feedback loop
DREAM - Interactive Story Generation Service | Backend & Infrastructure
Sep 2024
LLM & RAG based storytelling service helping seniors realize their dreams | KakaoTech Hackathon — a pre-Eco² project that prototyped the distributed architecture in a single-node environment
Single-Node Microservices: container-level separation of the Spring Backend, FastAPI AI Server, and Persistence Layer (Elasticsearch, MySQL) on a single EC2 instance via a Docker network — the prototype of the Eco² microservices architecture
Nginx Reverse Proxy: path-based routing from a single entry point (/api → Backend, /ai → AI Server) to distribute traffic between services — the predecessor to ALB + Bridge Ingress Controller + Istio virtual routing
Legacy CI/CD: lightweight deployment pipeline with GitHub Actions + S3 + CodeDeploy, cutting deployment time by 80% — the pre-ArgoCD GitOps stage
LLM + RAG: story/image generation from seniors' dreams using Elasticsearch vector search + LangChain RAG — an early experiment toward the Eco² Scan API pipeline
Resource Optimization: EC2 stabilization via swap memory and Docker resource limits (CPU 89%→40%) — performance tuning in a constrained environment
Web3.0-based decentralized auction system | Ensuring transparency and reliability in trading and distribution processes
Smart Contract Backend: ERC-1155 based integration of product/auction/currency (MMT) multi-tokens in a single contract; on-chain implementation of auction business logic (bid/award/settlement/cancel) plus governance permission management
Supply Chain Traceability: distribution-history tracking chain; on-chain recording of cold-chain environment data (temperature/humidity); delivery-status FSM (Empty→Prepare→Delivery)