About
What I Do
I design and build real-time communication systems and AI infrastructure. My work spans WebRTC, voice AI pipelines, media server architecture, and cloud-native platforms operating at serious scale.
I’m drawn to the hardest problems in this space: the ones where latency budgets are measured in milliseconds, where a single architecture decision determines whether a system can handle 10 users or 10,000, and where AI needs to work in real time, not in batch.
What I’ve Built
- Architected media infrastructure handling thousands of concurrent real-time sessions across multiple regions
- Built voice AI pipelines achieving sub-300ms time-to-first-audio-byte — from microphone to intelligent response
- Designed SFU cascading architecture for global real-time communication with sub-150ms intra-region latency
- Reduced P99 media latency by 40% through edge deployment and transport optimization
- Scaled WebRTC infrastructure across AWS, GCP, and Azure with drain-based auto-scaling that never drops an active call
Areas of Depth
Real-Time Communication
WebRTC internals, SFU/MCU architecture, media pipeline optimization, low-latency protocol design, NAT traversal, simulcast, and scaling real-time systems to thousands of concurrent sessions.
AI Infrastructure
Voice AI and speech pipelines (STT/TTS), LLMs in production, Retrieval-Augmented Generation (RAG), AI agents, and integrating intelligence into real-time communication flows with strict latency requirements.
Cloud & Distributed Systems
AWS, GCP, and Azure at scale. Kubernetes for media workloads, edge computing for latency-sensitive applications, and designing distributed systems that stay reliable under pressure.
Background
I’m a Founding Member at VideoSDK, where I’ve been building real-time communication infrastructure from the ground up. The experience of architecting systems that handle live audio, video, and AI processing simultaneously — with zero tolerance for lag — is what shapes everything I write about here.
What I’m Interested In
I enjoy working on problems at the intersection of real-time communication and AI, particularly voice AI latency optimization, media server architecture at scale, and cloud infrastructure for latency-sensitive workloads. If you’re working on something in this space, I’d like to hear about it.