About

What I Do

I design and build real-time communication systems and AI infrastructure. My work spans WebRTC, voice AI pipelines, media server architecture, and cloud-native platforms operating at serious scale.

I’m drawn to the hardest problems in this space: the ones where latency budgets are measured in milliseconds, where a single architecture decision determines whether a system can handle 10 users or 10,000, and where AI needs to work in real time, not in batch.

What I’ve Built

Architected media infrastructure handling thousands of concurrent real-time sessions across multiple regions
Built voice AI pipelines achieving sub-300ms time-to-first-audio-byte — from microphone to intelligent response
Designed a cascaded SFU architecture for global real-time communication with sub-150ms intra-region latency
Reduced P99 media latency by 40% through edge deployment and transport optimization
Scaled WebRTC infrastructure across AWS, GCP, and Azure with drain-based auto-scaling that never drops an active call

Areas of Depth

Real-Time Communication

WebRTC internals, SFU/MCU architecture, media pipeline optimization, low-latency protocol design, NAT traversal, simulcast, and scaling real-time systems to thousands of concurrent sessions.

AI Infrastructure

Voice AI and speech pipelines (STT/TTS), LLMs in production, Retrieval-Augmented Generation (RAG), AI agents, and integrating intelligence into real-time communication flows with strict latency requirements.

Cloud & Distributed Systems

AWS, GCP, and Azure at scale. Kubernetes for media workloads, edge computing for latency-sensitive applications, and designing distributed systems that stay reliable under pressure.

Background

I’m a Founding Member at VideoSDK, where I’ve been building real-time communication infrastructure from the ground up. The experience of architecting systems that handle live audio, video, and AI processing simultaneously — with zero tolerance for lag — is what shapes everything I write about here.

What I’m Interested In

I enjoy working on problems at the intersection of real-time communication and AI, particularly voice AI latency optimization, media server architecture at scale, and cloud infrastructure for latency-sensitive workloads. If you’re working on something in this space, I’d like to hear about it.