Architecture Deep Dive

CVS Outbound Dialer

An automated outbound calling platform built on LiveKit Agents, deployed entirely on-prem.

$ cat agenda.md

01

What Is It?

Components & design decisions

02

LiveKit Agents

Framework, AgentServer, JobContext

03

Infrastructure

On-prem RKE cluster & services

04

Call Flow & SIP

End-to-end sequence + SIP trunk

05

Conversations

One-way, two-way & AMD

06

Ops

Recording, audit, deploy, TURN

01

The Problem

Automated outbound calls for CVS Health campaigns

One-way notifications (prescription reminders) and two-way interactive conversations (appointment scheduling), with answering machine detection (AMD), recording, and full audit trails.

# Key Design Decisions

# Core Components

API

outbound-dialer

FastAPI service. Receives call requests, creates rooms, dispatches agents, initiates SIP.

WKR

outbound-dialer-worker

LiveKit Agent process. Handles real-time voice AI logic via AgentSession.

LK

LiveKit Server

Self-hosted WebRTC server. Routes audio, manages rooms, bridges SIP, runs egress.

SIP

Twilio SIP Trunk

External PSTN gateway. Terminates outbound SIP calls to phone numbers.
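The API component's responsibilities (create room, dispatch agent, initiate SIP) can be sketched as one ordered function. This is a minimal illustration, not the service's actual code: `CallBackend`, `RecordingBackend`, `place_call`, the `call-` room prefix, and the metadata key are all hypothetical names, and the in-memory backend stands in for the real LiveKit server APIs.

```python
from dataclasses import dataclass, field
from typing import Protocol


class CallBackend(Protocol):
    """Hypothetical interface standing in for the LiveKit server APIs."""

    def create_room(self, name: str) -> None: ...
    def dispatch_agent(self, room: str, metadata: dict) -> None: ...
    def dial_sip(self, room: str, phone_number: str) -> None: ...


@dataclass
class RecordingBackend:
    """In-memory fake that records each call in order, for illustration."""

    calls: list = field(default_factory=list)

    def create_room(self, name: str) -> None:
        self.calls.append(("create_room", name))

    def dispatch_agent(self, room: str, metadata: dict) -> None:
        self.calls.append(("dispatch_agent", room, metadata))

    def dial_sip(self, room: str, phone_number: str) -> None:
        self.calls.append(("dial_sip", room, phone_number))


def place_call(backend: CallBackend, session_id: str,
               phone_number: str, campaign_type: str) -> str:
    """Run the three setup steps in the order the API component lists them."""
    room = f"call-{session_id}"  # assumed naming scheme
    backend.create_room(room)
    backend.dispatch_agent(room, {"campaign_type": campaign_type})
    backend.dial_sip(room, phone_number)
    return room
```

Keeping the backend behind an interface lets the ordering be unit-tested without a LiveKit server or a SIP trunk.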

02

# LiveKit Agents SDK

The agent worker is built on the LiveKit Agents SDK — a Python framework for building real-time voice AI agents that participate in LiveKit rooms as first-class participants.

AgentServer

The top-level process that connects to the self-hosted LiveKit server over WebSocket. It listens for agent dispatch requests and spawns agent sessions.

src/outbound_dialer/agent/entrypoint.py

JobContext

Passed to the entrypoint when the agent is dispatched. It gives access to the LiveKit room, participant events, and room metadata. The entrypoint calls ctx.connect() to join the room (with TURN).
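As a hedged sketch of how an entrypoint might read room metadata, the snippet below parses a JSON string into a typed job description. The JSON shape, the `CallJob` class, and the field names `campaign_type` and `session_id` are assumptions made for illustration; the slides do not show the real metadata schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CallJob:
    """Hypothetical typed view of the metadata attached to a dispatch."""

    campaign_type: str
    session_id: str


def parse_job_metadata(raw: str) -> CallJob:
    """Parse a metadata string, rejecting unknown campaign types early."""
    data = json.loads(raw) if raw else {}
    campaign_type = data.get("campaign_type", "")
    if campaign_type not in ("one_way", "two_way"):
        raise ValueError(f"unsupported campaign_type: {campaign_type!r}")
    return CallJob(campaign_type=campaign_type,
                   session_id=str(data.get("session_id", "")))
```

Validating at the start of the job means a malformed dispatch fails before any audio is played.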

# Agent Architecture

AgentServer
JobContext
AgentSession
OutboundDialerAgent
03

# System Architecture

RKE Kubernetes — CVS On-Prem
API Server
FastAPI :21121
Agent Worker
LIVEKIT=true :8081
LiveKit Server
wss://…livekit.corp
Egress
→ GCS MP3
Internal CVS Services
CVS ASR
wss://…/listen
CVS TTS
wss://…/speak
Campaign Svc
Event Pub → Kafka
External — Twilio
SIP Trunk
TLS + SRTP
PSTN Gateway
TURN Server
:443 TCP

# Internal Service Map

Service | Protocol | URL
LiveKit Server | WebSocket/gRPC | wss://devservices-colo-west-livekit.corp…
CVS ASR | WebSocket | wss://…speak-colo-west…/v1/listen
CVS TTS | WebSocket | wss://…speak-colo-west…/v1/speak
Campaign Service | HTTPS | https://devservices-colo-west…/campaigns/v1
Event Publisher | HTTPS | https://digital-retail-rx-qa…/kafka/send-message
04

# SIP Call Sequence

[Diagram: end-to-end SIP call flow sequence showing Client, API Server, LiveKit, Agent Worker, and Twilio SIP interactions]

# Call Flow — Setup

# Call Flow — Connection

Source: src/outbound_dialer/services/call_service.py, sip_service.py

# SIP Telephony Architecture

On-Prem LiveKit Server

LiveKit Room

Agent (worker) audio
SIP Participant audio
Mixed audio (recording)

SIP Bridge

Trunk: cvs-retail-outbound
Address: cvsretailoutboundterminate.pstn.ashburn.twilio.com
From: +12134147078
Transport: TLS
Media: SRTP encrypted

LiveKit SIP Bridge
SIP INVITE (TLS)
Twilio PSTN Gateway
Phone ☎

Credentials stored in Vault · SIPServiceSingleton ensures trunk exists at startup
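The "ensure trunk exists at startup" behavior is essentially an idempotent get-or-create. The sketch below shows that pattern against an in-memory stand-in; `TrunkConfig`, `InMemoryTrunkApi`, and `ensure_trunk` are hypothetical names, and matching on trunk name is an assumption about how the real SIPServiceSingleton identifies an existing trunk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TrunkConfig:
    name: str
    address: str
    from_number: str
    transport: str = "tls"


@dataclass
class InMemoryTrunkApi:
    """Stand-in for the server's trunk API (hypothetical interface)."""

    trunks: dict = field(default_factory=dict)  # trunk_id -> TrunkConfig

    def list_trunks(self) -> dict:
        return dict(self.trunks)

    def create_trunk(self, config: TrunkConfig) -> str:
        trunk_id = f"trunk-{len(self.trunks) + 1}"
        self.trunks[trunk_id] = config
        return trunk_id


def ensure_trunk(api: InMemoryTrunkApi, config: TrunkConfig) -> str:
    """Return the ID of the trunk named config.name, creating it if absent."""
    for trunk_id, existing in api.list_trunks().items():
        if existing.name == config.name:
            return trunk_id
    return api.create_trunk(config)
```

Because the function is idempotent, it is safe to call on every process start, including restarts and multiple replicas that start one after another.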

05

# Conversation Routing

SIP Participant Joins Room
Start Egress Recording
campaign.type?
one_way
Play message → hang up

session.say() with
interruptions disabled

two_way
Full voice pipeline

STT → AMD → LLM → TTS
interactive loop
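The routing decision above can be sketched as a small planning function. It only returns step labels rather than driving a real session; `CampaignType`, `plan_call`, and the step strings are illustrative names, not identifiers from the codebase.

```python
from enum import Enum


class CampaignType(Enum):
    ONE_WAY = "one_way"
    TWO_WAY = "two_way"


def plan_call(campaign_type: str) -> list[str]:
    """Return the ordered steps taken once the SIP participant has joined."""
    kind = CampaignType(campaign_type)  # raises ValueError on unknown types
    steps = ["start_egress_recording"]  # recording starts on both branches
    if kind is CampaignType.ONE_WAY:
        # One-way: speak the message with interruptions disabled, then hang up.
        steps += ["say_message(interruptions_disabled)", "hang_up"]
    else:
        # Two-way: hand over to the interactive STT -> AMD -> LLM -> TTS loop.
        steps += ["run_voice_pipeline"]
    return steps
```

For example, `plan_call("one_way")` yields the recording step, the uninterruptible message, and the hang-up, in that order.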

# Two-Way Voice Pipeline

User Speech
STT
CVS ASR
AMD Check
LLM Node
External API
TTS
CVS TTS
Speaker
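One conversational turn through this pipeline can be modeled as a composition of four stages. The sketch below is a simplification under stated assumptions: each stage is a plain callable supplied by the caller, whereas the real worker runs these stages through AgentSession with streaming audio. `handle_turn` and its parameter names are hypothetical.

```python
from typing import Callable, Optional


def handle_turn(
    audio: bytes,
    *,
    stt: Callable[[bytes], str],
    looks_like_voicemail: Callable[[str], bool],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
    first_turn: bool,
) -> Optional[bytes]:
    """Run one turn: speech -> text -> AMD check -> reply text -> speech.

    Returns synthesized reply audio, or None when the first utterance looks
    like a voicemail greeting and the conversation should not continue.
    """
    transcript = stt(audio)
    # The AMD check only applies to the first user utterance.
    if first_turn and looks_like_voicemail(transcript):
        return None
    reply_text = llm(transcript)
    return tts(reply_text)
```

Passing the stages in as callables keeps the turn logic testable with trivial stubs in place of the ASR, LLM, and TTS services.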

# Answering Machine Detection

First user utterance (via STT)
Contains voicemail pattern?
No
Continue conversation
Yes → AMD enabled?
HANGUP
Publish amd_hangup
→ End call
LEAVE_MESSAGE
Play voicemail msg
→ Delayed hangup
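The decision tree above can be written as a pure function over the first transcript. This is a sketch under assumptions: the regular expressions are invented examples (the real voicemail patterns are not shown in the slides), `AmdAction` and `decide_amd` are hypothetical names, and treating "AMD not enabled" as "continue the conversation" is a guess at the unlabeled branch.

```python
import re
from enum import Enum


class AmdAction(Enum):
    CONTINUE = "continue"
    HANGUP = "hangup"                # publish amd_hangup, then end the call
    LEAVE_MESSAGE = "leave_message"  # play voicemail message, delayed hangup


# Illustrative patterns only; the production list is not known.
VOICEMAIL_PATTERNS = [
    re.compile(r"leave (me |us )?a message", re.IGNORECASE),
    re.compile(r"after the (tone|beep)", re.IGNORECASE),
    re.compile(r"(is|am|are) not available", re.IGNORECASE),
]


def decide_amd(first_utterance: str, amd_enabled: bool,
               on_voicemail: AmdAction) -> AmdAction:
    """Choose what to do after the first user utterance arrives from STT."""
    if not any(p.search(first_utterance) for p in VOICEMAIL_PATTERNS):
        return AmdAction.CONTINUE
    if not amd_enabled:
        return AmdAction.CONTINUE  # assumption: disabled AMD never interrupts
    return on_voicemail
```

Here `on_voicemail` would come from campaign configuration and be either `AmdAction.HANGUP` or `AmdAction.LEAVE_MESSAGE`.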
06

# Egress Recording Architecture

Self-Hosted LiveKit Server

LiveKit Room

Agent (worker) audio track
SIP Participant (phone) audio track

subscribe all audio →

Egress Service

RoomComposite
audio_only = True

MP3 Encode

Mixed audio stream

GCS Upload

Bucket: livekit-recordings-nonprod
Path: recordings/{campaign}/{session}/recording.mp3

Auto-stops on participant departure (30s timeout) · Manual stop via stop_and_wait_for_egress()
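The recording location follows directly from the bucket and path template above. A minimal helper might look like the following; the function names and the rejection of `/` inside path segments are assumptions added for illustration.

```python
BUCKET = "livekit-recordings-nonprod"


def recording_object_path(campaign: str, session: str) -> str:
    """Build the object path recordings/{campaign}/{session}/recording.mp3."""
    for label, value in (("campaign", campaign), ("session", session)):
        # Reject empty or slash-containing segments so paths cannot nest.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"recordings/{campaign}/{session}/recording.mp3"


def recording_uri(campaign: str, session: str, bucket: str = BUCKET) -> str:
    """Return the full gs:// URI for a session's mixed-audio recording."""
    return f"gs://{bucket}/{recording_object_path(campaign, session)}"
```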

# Event Audit System

API Server
session events
Agent Worker
conversation events
Event Publisher
HTTPS
Kafka Topics
Event | When Published
conversation_started | Greeting delivered (two-way)
notification_delivered | One-way message played
amd_hangup | Voicemail detected, call hung up
user_hangup | User disconnected early
error_http / error_exception | Agent API or runtime error
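The event names in the table map naturally onto an enumeration, with a small builder producing the body sent to the Event Publisher. The payload fields (`event`, `session_id`, `timestamp`) and the `build_event` helper are assumptions for illustration; the real message schema is not shown in the slides.

```python
import json
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class CallEvent(str, Enum):
    CONVERSATION_STARTED = "conversation_started"
    NOTIFICATION_DELIVERED = "notification_delivered"
    AMD_HANGUP = "amd_hangup"
    USER_HANGUP = "user_hangup"
    ERROR_HTTP = "error_http"
    ERROR_EXCEPTION = "error_exception"


def build_event(event: CallEvent, session_id: str,
                now: Optional[datetime] = None) -> bytes:
    """Serialize one audit event as UTF-8 JSON (hypothetical schema)."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    payload = {
        "event": event.value,
        "session_id": session_id,
        "timestamp": timestamp,
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")
```

Using an enum keeps event names in one place, so a typo such as `amd_hang_up` fails at import time instead of producing an unrecognized Kafka message.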

# TURN Server Configuration

TURN relays are required because the on-prem LiveKit server must route media to and from external Twilio SIP endpoints, and the corporate firewall blocks the direct path.

Corporate Network

LiveKit Server

on-prem

Agent Worker

on-prem

NAT / Firewall
direct path blocked
Twilio TURN
global.turn.twilio.com:443
TCP · relay-only transport
External — Twilio

SIP Gateway

PSTN Bridge

TRANSPORT_RELAY forced · Credentials fetched from Twilio API, cached 24h · Stored in Vault
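The 24-hour credential cache can be sketched as a small time-to-live wrapper around a fetch function. `CachedCredentials` is a hypothetical class, and the fetcher and clock are injected so the example stays independent of the Twilio API and of wall-clock time.

```python
import time
from typing import Callable, Optional

TTL_SECONDS = 24 * 60 * 60  # credentials are cached for 24 hours


class CachedCredentials:
    """Cache the result of `fetch` and refresh it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], dict],
                 ttl_seconds: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[dict] = None
        self._expires_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Cache miss or expired entry: fetch fresh credentials.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

A monotonic clock is used by default so that system time adjustments cannot extend or shorten the cache lifetime.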

# Dual-Mode Deployment

API Server

Port: 21121
Health: /health
Env: LIVEKIT not set
Entry: scripts/start.py → FastAPI
Route: /microservices/outbound-dialer/

Agent Worker

Port: 8081
Health: /
Env: LIVEKIT=true
Entry: entrypoint.py → AgentServer
Route: /microservices/outbound-dialer-worker/

Single Docker image · docker_entrypoint.py checks LIVEKIT env var to select mode
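The mode switch can be sketched in a few lines. This is an illustration of the idea rather than the contents of docker_entrypoint.py: `select_mode` and the returned labels are hypothetical, and treating the value case-insensitively is an assumption.

```python
import os
from typing import Mapping


def select_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "worker" when LIVEKIT=true, otherwise "api"."""
    flag = env.get("LIVEKIT", "").strip().lower()
    return "worker" if flag == "true" else "api"


if __name__ == "__main__":
    # A real entrypoint would start AgentServer or FastAPI here.
    print(f"starting in {select_mode()} mode")
```

Defaulting to the API mode when the variable is unset matches the "LIVEKIT not set" row for the API server.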

# Kubernetes Operations

# CI/CD Pipeline

CI — Continuous Integration

Push to branch
Python 3.13 Build
pytest
Docker Build
Push → JFrog

CD — Continuous Deployment

deploy-configs/** change
Detect env
Validate TPS
Helm Deploy
RKE Cluster

API server and worker have separate CD pipelines triggered by different deploy-configs/ paths

Key Takeaways

$ summary