
Cloud telephony platform

Live in production

Primary backend engineer · Superfone · 2021 – present

The call-control backbone of a virtual business-phone platform — SIP at the edge, call-flow orchestration, billing, and the APIs that tie thousands of businesses' phone numbers together.

Telephony · SIP / VoIP · Node.js · Go · Kubernetes · Billing


Before the AI, there’s the phone system — and a phone system is a surprising amount of software. Someone has to provision the number, route the call through carriers, ring the right people in the right order, record it, bill for it, and never, ever drop it. I’ve been the primary engineer on that backbone since its early days.

The shape of it

A call touches several layers on its way through: a session-border layer that terminates carrier and client SIP at the edge, a call-flow controller that drives the actual telephony (ringing, menus, bridges, hold music), an event broker that turns raw call events into clean records, and a core API that serves the product — numbers, teams, subscriptions, messaging, and the configuration behind every IVR and ring group. Most of it runs on Kubernetes; the parts that need raw network sockets run close to the metal.

What I worked on

  • The call-control stack, end to end — SIP edge configuration, an ARI-driven call-flow controller in Go, a call-record event broker, and the core API. I’ve written across every one of these layers.

  • International expansion. Led a phased rollout that added a second carrier and region-aware number provisioning — a cross-cutting change that touched billing, wallet, subscriptions, and number assignment at the same time, and closed a long tail of currency bugs in production.

  • Billing that has to be exactly right. Mandate lifecycles, partial refunds with credit notes, tax-invariant enforcement, automated recovery of stuck payment orders — the unglamorous machinery where a rounding error is a support ticket.

  • Reliability work on stateful services. Zero-query-drop rolling deploys via connection-pooler lifecycle hooks, pod spreading across failure domains, and autoscaling a real-time workload on concurrent calls rather than CPU, because CPU utilization is a poor proxy for how many calls a pod is actually carrying.

  • Cutting external dependencies. Migrated inter-service messaging off a cloud queue onto a self-hosted, Redis-backed queue — lower latency, fewer moving parts, no cross-cloud bill.
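The autoscaling point above is easiest to see as arithmetic. The sketch below, in Go, sizes a deployment from the number of concurrent calls instead of CPU. It follows the general shape of the Kubernetes Horizontal Pod Autoscaler rule for an average-value target (ceiling of total metric over per-pod target, clamped to the replica bounds), and it ignores tolerance and stabilization windows. The function name, the per-pod target of 120 calls, and the replica bounds are assumptions for illustration, not the platform's real settings.

```go
package main

import "fmt"

// desiredReplicas returns how many pods are needed so that each pod carries
// at most targetCallsPerPod concurrent calls, clamped to [minReplicas, maxReplicas].
// All parameter values used in main are hypothetical.
func desiredReplicas(concurrentCalls, targetCallsPerPod, minReplicas, maxReplicas int) int {
	if targetCallsPerPod <= 0 {
		return minReplicas // guard against a misconfigured target
	}
	// Integer ceiling division: ceil(concurrentCalls / targetCallsPerPod).
	n := (concurrentCalls + targetCallsPerPod - 1) / targetCallsPerPod
	if n < minReplicas {
		n = minReplicas
	}
	if n > maxReplicas {
		n = maxReplicas
	}
	return n
}

func main() {
	// 430 live calls at a target of 120 per pod needs 4 pods.
	fmt.Println(desiredReplicas(430, 120, 2, 20)) // prints 4
}
```

The design choice is that the scaling signal is the same quantity that capacity is measured in, so the replica count follows call volume directly.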

What I like about it

It’s the least flashy and most demanding kind of engineering: the system is judged entirely by whether a stranger’s phone call connects. The fun is in the seams — the race condition between two call legs, the deploy that drops a query, the throttle event nobody thought to measure.