Call server

The Call Server is the core real-time backend for Frontier, implemented as an orchestrated set of Cloudflare Workers and Durable Objects. It handles live call audio processing, transcription, AI-driven detection, and the delivery of coaching signals to the agent’s desktop application. This system is designed for high performance at the edge, leveraging Cloudflare’s global network and serverless primitives.

flowchart TB
subgraph client["Client"]
  hud["Desktop app + HUD"]
end
subgraph cs["call-agent Worker — single deploy unit (Cloudflare)"]
  router["Router<br/>Agents SDK routeAgentRequest + Hono"]
  subgraph dos["Per-call Durable Objects"]
    callagent["CallAgent<br/>orchestrator · per call"]
    relay["WebSocketRelayAgent<br/>inbound relay"]
    quick["QuickAnswerAgent<br/>fast answers · deprecated"]
    chat["CallChatAgent<br/>per thread"]
    organs["OrgAnswerAgent<br/>org-scoped"]
    callans["CallAnswerAgent<br/>call-scoped"]
  end
end
subgraph companions["Companion workers · service bindings"]
  torch["transcript-orchestration"]
  qdet["question-detection"]
  faq["faq-detection"]
  scriptw["script-completion"]
  logw["logging"]
end
subgraph data["Cloudflare data bindings"]
  d1[("D1<br/>transcripts")]
  ais[("AI Search + R2<br/>knowledge")]
  kv[("KV<br/>cache")]
  vec[("Vectorize")]
end
subgraph ext["External"]
  dg["Deepgram<br/>speech-to-text"]
  gw["AI Gateway 'frontier-ai'<br/>OpenAI + Workers AI · cost log"]
  orouter["OpenRouter<br/>embeddings fallback"]
end
hud -->|"WebSocket · JWT (signals)"| router
hud -.->|"audio (Desktop SDK / direct)"| dg
router --> callagent
relay -->|"HTTP POST"| callagent
dg -->|"transcripts"| torch
callagent --> quick
callagent --> chat
callagent --> organs
callagent --> callans
callagent -->|"dispatch"| torch
torch --> qdet
torch --> faq
torch --> scriptw
callagent --> logw
callagent --> d1
torch --> d1
quick --> ais
organs --> ais
callans --> ais
callagent --> kv
callagent -.-> vec
callagent -->|"LLM"| gw
callagent -.->|"fallback"| orouter
classDef worker fill:#e3f2fd,stroke:#1565c0,color:#000
classDef do fill:#fce4ec,stroke:#c2185b,color:#000
classDef comp fill:#ede7f6,stroke:#5e35b1,color:#000
classDef store fill:#e8f5e9,stroke:#2e7d32,color:#000
classDef extn fill:#fff8e1,stroke:#f9a825,color:#000
class router worker
class callagent,relay,quick,chat,organs,callans do
class torch,qdet,faq,scriptw,logw comp
class d1,ais,kv,vec store
class dg,gw,orouter extn

The Call Server is not a single Cloudflare Worker, but an orchestrated set deployed together as a single unit. The main call-agent Worker hosts the per-call Durable Objects and handles primary routing. It is supported by five companion Workers, which are bound via Cloudflare service bindings for efficient internal communication:

  • script-completion worker: Detects progress against predefined sales scripts.
  • faq-detection worker: Identifies when a prospect’s statement matches a known FAQ.
  • question-detection worker: Uses Large Language Models (LLMs) to detect questions asked by the prospect.
  • transcript-orchestration worker: Manages the flow of transcript events, persists final transcripts to Cloudflare D1, and dispatches to other detection workers.
  • logging worker: Centralized logging for the Call Server. Durable Objects and other long-lived paths cannot ship logs the usual way, so they send log entries to this worker as fire-and-forget RPC calls, and it forwards them to Axiom. This gives real-time visibility into live calls; without it, Durable Object logs would be dropped.
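The flow described for the transcript-orchestration and logging workers can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `TranscriptEvent`, `Detector`, `Logger`, `logAsync`, and `handleTranscript` are hypothetical names, the detectors and logger stand in for companion Workers reached through service bindings, and dispatching every event (rather than only finals) is an assumption.

```typescript
// Hypothetical shapes: none of these names come from the Frontier codebase.
interface TranscriptEvent {
  callId: string;
  text: string;
  isFinal: boolean;
}

// Stand-in for a detection worker reached through a service binding.
interface Detector {
  name: string;
  detect(event: TranscriptEvent): Promise<void>;
}

// Stand-in for the logging worker's RPC surface.
interface Logger {
  log(message: string): Promise<void>;
}

// Fire-and-forget: start the log call, swallow failures, never await it.
function logAsync(logger: Logger, message: string): void {
  void logger.log(message).catch(() => undefined);
}

// Persist final transcripts, then fan out to every detector.
// Each detector's failure is captured on its own, so one failing
// detector does not prevent the others from running.
async function handleTranscript(
  event: TranscriptEvent,
  persist: (event: TranscriptEvent) => Promise<void>,
  detectors: Detector[],
  logger: Logger,
): Promise<string[]> {
  if (event.isFinal) {
    await persist(event);
  }
  const outcomes = await Promise.all(
    detectors.map((d) => d.detect(event).then(() => null, () => d.name)),
  );
  const failed = outcomes.filter((name): name is string => name !== null);
  for (const name of failed) {
    logAsync(logger, `detector ${name} failed for call ${event.callId}`);
  }
  return failed; // names of detectors that rejected
}
```

Returning the list of failed detector names keeps the sketch easy to observe; a real orchestrator might instead retry or record metrics.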

All these Workers, including the Durable Objects, are deployed as a single Cloudflare Worker script named call-agent (with environment-specific suffixes like -stg, -demo, -rc), meaning they constitute a single deployment and rollback unit.

The Call Server hosts a collection of Durable Objects (DOs), each providing stateful, real-time services for active calls. Each DO instance acts as a unit of statefulness and concurrency, typically keyed by a callId or threadId.
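As a rough illustration of this keying, the sketch below maps a Durable Object class to the instance name it would be addressed by. The keys follow the component table on this page (callId, threadId, and org_<orgId>); the `instanceName` helper and the `CallContext` shape are hypothetical and exist only for this example.

```typescript
// Class names come from this page; the helper itself is illustrative.
type AgentClass =
  | "CallAgent"
  | "CallChatAgent"
  | "QuickAnswerAgent"
  | "WebSocketRelayAgent"
  | "OrgAnswerAgent"
  | "CallAnswerAgent";

// Hypothetical bundle of identifiers available when addressing a DO.
interface CallContext {
  callId: string;
  orgId: string;
  threadId?: string;
}

// Resolve the instance name (the key) for a given Durable Object class.
function instanceName(agent: AgentClass, ctx: CallContext): string {
  switch (agent) {
    case "CallChatAgent":
      // Chat threads are keyed by threadId rather than callId.
      if (!ctx.threadId) {
        throw new Error("CallChatAgent requires a threadId");
      }
      return ctx.threadId;
    case "OrgAnswerAgent":
      // Org-scoped knowledge is keyed as org_<orgId>.
      return `org_${ctx.orgId}`;
    default:
      // Every other bound DO on this page is keyed by callId.
      return ctx.callId;
  }
}
```

Because each key resolves to exactly one instance, all events for a given call or thread are handled by the same Durable Object.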

The primary Durable Objects are:

  • CallAgent (the orchestrator): The central DO for a live call, coordinating call state, managing script progress, handling question/FAQ detections, and broadcasting coaching signals to the HUD. It exposes approximately 22 callable RPC methods for client interaction.
  • CallChatAgent: A Durable Object dedicated to handling chat threads, keyed by threadId rather than callId. It supports progressive quick answers and follow-up threading.
  • QuickAnswerAgent (deprecated, superseded by CallAnswerAgent): The older fast inline-answer DO. It is still bound but is being phased out; the current fast-answer path is CallAnswerAgent, a warm, per-call DO.
  • WebSocketRelayAgent: An inbound relay that terminates external WebSocket traffic (such as from Recall.ai) and forwards events to the CallAgent via internal HTTP POSTs. Its hibernation is explicitly disabled to keep the inbound WebSocket connections active.
  • OrgAnswerAgent: Provides organization-scoped knowledge and answer retrieval.
  • CallAnswerAgent: Provides call-scoped answer retrieval, leveraging call-specific context.
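The relay step performed by WebSocketRelayAgent (inbound WebSocket message in, internal HTTP POST to the CallAgent out) might look something like the sketch below. The `ForwardRequest` shape, the `toForwardRequest` helper, and the `/internal/calls/.../events` path are all assumptions made for illustration; the real route and payload format are not documented here.

```typescript
// Hypothetical description of the POST that would be sent to the CallAgent.
interface ForwardRequest {
  method: "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

// Turn one raw inbound WebSocket message into a forwardable POST,
// or return null when the message is not a JSON object.
function toForwardRequest(callId: string, rawMessage: string): ForwardRequest | null {
  let payload: unknown;
  try {
    payload = JSON.parse(rawMessage);
  } catch {
    return null; // drop frames that are not valid JSON
  }
  if (typeof payload !== "object" || payload === null) {
    return null; // drop bare strings, numbers, and null
  }
  return {
    method: "POST",
    // Assumed internal route; the actual path is not specified on this page.
    path: `/internal/calls/${encodeURIComponent(callId)}/events`,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ callId, event: payload }),
  };
}
```

Keeping the translation in a pure function makes it testable without a live WebSocket; the relay would then issue the described request against the CallAgent instance for that call.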

Two further Durable Object classes are exported but not bound via wrangler.jsonc today:

  • TranscriptStreamAgent: Part of the migration to direct dual-audio transcription. It accepts streaming audio from the desktop client for STT processing.
  • VoiceAgent: A full voice pipeline built on Cloudflare’s @cloudflare/voice (Flux STT + Aura TTS) that powers the onboarding / demo experience. It is not part of the live call path.

Each Durable Object also has its own Agent storage: SQLite-backed, per-DO storage (enabled through the new_sqlite_classes migration that the Agents SDK requires) that keeps call state right next to the compute that uses it.

Requests to the Call Server are routed through a combination of Hono for standard HTTP endpoints and the Cloudflare Agents SDK for Durable Object interactions. The Agents SDK’s routeAgentRequest function serves as the fallback for /agents/* paths, handling WebSocket upgrades and RPC calls and directing each to the correct Durable Object instance. Frontier does not use PartyServer for this routing. Durable Object connections are authenticated with a JWT passed in the query string, and orgId is validated in onConnect to enforce tenant isolation.
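The connection check described above can be sketched as a small function. This is an illustration under stated assumptions: the query parameter name `token`, the `Claims` shape, the 401/403 status codes, and the `authorizeConnect` helper are hypothetical, and `verifyJwt` is an injected placeholder for real signature verification, which this sketch does not implement.

```typescript
// Hypothetical claim set; the real token contents are not documented here.
interface Claims {
  orgId: string;
  sub: string;
}

// Placeholder for real JWT verification (signature, expiry, issuer).
// Returns the verified claims, or null if the token is invalid.
type VerifyJwt = (token: string) => Promise<Claims | null>;

type AuthResult =
  | { ok: true; claims: Claims }
  | { ok: false; status: 401 | 403 };

// Decide whether a connection may attach to a Durable Object that
// belongs to `expectedOrgId`.
async function authorizeConnect(
  requestUrl: string,
  expectedOrgId: string,
  verifyJwt: VerifyJwt,
): Promise<AuthResult> {
  // Assumed parameter name; the page only says the JWT is in the query string.
  const token = new URL(requestUrl).searchParams.get("token");
  if (!token) {
    return { ok: false, status: 401 };
  }
  const claims = await verifyJwt(token);
  if (!claims) {
    return { ok: false, status: 401 };
  }
  // Tenant isolation: a valid token for another org is still rejected.
  if (claims.orgId !== expectedOrgId) {
    return { ok: false, status: 403 };
  }
  return { ok: true, claims };
}
```

In an Agents SDK class this kind of check would typically run inside onConnect and close the connection when the result is not ok.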

The Call Server Worker leverages Cloudflare-native data stores, bound directly into the Worker environment for high-performance, low-latency access:

  • Cloudflare D1 (DB binding): A SQLite database used for persisting final call transcripts and other structured call data.
  • Cloudflare Vectorize (VECTORIZE binding): A vector database for embeddings, used in knowledge retrieval and detection processes.
  • Cloudflare Workers AI (AI binding): Provides access to models hosted on Cloudflare’s inference platform, for tasks such as embeddings or specialized detection.
  • Cloudflare AI Search (AI_SEARCH binding): Used in conjunction with an R2 knowledge bucket (AI_SEARCH_KNOWLEDGE_BUCKET) for organization-scoped knowledge retrieval.
  • Cloudflare KV (SCRIPT_CACHE binding): A key-value store primarily used for caching script-related data and pre-warmed knowledge.
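One common way to use a KV binding such as SCRIPT_CACHE is a read-through cache, sketched below. Whether the Call Server uses exactly this pattern is an assumption; the `script:` key prefix, the TTL, the `getScript` helper, and the `load` callback are hypothetical. `KVLike` declares only the two KV methods the sketch relies on (`get` and `put` with `expirationTtl`), so it can run without the Workers type definitions.

```typescript
// Minimal subset of the Workers KV API used by this sketch.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Read-through cache: return the cached script if present, otherwise
// load it from the slower source and store it for later calls.
async function getScript(
  cache: KVLike,
  scriptId: string,
  load: (scriptId: string) => Promise<string>,
  ttlSeconds = 300,
): Promise<string> {
  const key = `script:${scriptId}`; // hypothetical key format
  const cached = await cache.get(key);
  if (cached !== null) {
    return cached;
  }
  const fresh = await load(scriptId);
  await cache.put(key, fresh, { expirationTtl: ttlSeconds });
  return fresh;
}
```

In a Worker, `cache` would be the SCRIPT_CACHE binding from the environment, and `load` could be a lookup against the slower configuration store.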

Configuration, background job data, and application/account data are stored in Supabase (Postgres). The Call Server accesses it less frequently, for specific configuration lookups.

For Large Language Model (LLM) inference, the Call Server primarily uses the internal AI Gateway, ‘frontier-ai’, which routes requests to OpenAI and Cloudflare Workers AI. Other providers may bypass this gateway for specific use cases. OpenRouter is used as a fallback for embedding generation. Frontier maintains a multi-provider strategy via the Vercel AI SDK, with dependencies for Anthropic, Google, OpenAI, and TogetherAI.
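The embeddings fallback mentioned above follows a familiar primary-then-fallback pattern, sketched here. The `Embedder` type and `embedWithFallback` helper are hypothetical, and which provider acts as the primary for embeddings is an assumption; in practice the two functions would wrap the gateway-routed provider and OpenRouter respectively.

```typescript
// Hypothetical provider abstraction: text in, embedding vector out.
type Embedder = (text: string) => Promise<number[]>;

interface EmbeddingResult {
  vector: number[];
  provider: "primary" | "fallback";
}

// Try the primary provider first; if it rejects, use the fallback.
// An error from the fallback propagates to the caller.
async function embedWithFallback(
  text: string,
  primary: Embedder,
  fallback: Embedder,
): Promise<EmbeddingResult> {
  try {
    return { vector: await primary(text), provider: "primary" };
  } catch {
    return { vector: await fallback(text), provider: "fallback" };
  }
}
```

Reporting which provider answered makes it possible to track how often the fallback path is taken.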


| Component | Type | Role | Keying (if DO) |
| --- | --- | --- | --- |
| call-agent Worker | Cloudflare Worker | Main entry point; hosts all Durable Objects and routes requests. | N/A |
| CallAgent | Durable Object | Per-call orchestrator; manages call state, script, questions, broadcasts to HUD. | callId |
| CallChatAgent | Durable Object | Per-thread knowledge chat; handles follow-up questions and answers. | threadId |
| QuickAnswerAgent (deprecated) | Durable Object | Older fast inline-answer DO, superseded by CallAnswerAgent; still bound, being phased out. | callId |
| WebSocketRelayAgent | Durable Object | Relays external WebSocket traffic (e.g., Recall.ai) to the CallAgent DO. | callId |
| OrgAnswerAgent | Durable Object | Provides organization-scoped knowledge retrieval. | org_<orgId> |
| CallAnswerAgent | Durable Object | Provides call-scoped answer retrieval, including call context. | callId |
| TranscriptStreamAgent | Durable Object | Processes direct audio streams for Speech-to-Text (STT) and forwards events. (Unbound in current config.) | callId |
| script-completion worker | Companion Worker | Detects progress on sales scripts. | N/A |
| faq-detection worker | Companion Worker | Detects frequently asked questions using vector similarity. | N/A |
| question-detection worker | Companion Worker | Uses LLMs to detect prospect questions. | N/A |
| transcript-orchestration worker | Companion Worker | Persists transcripts, groups words, throttles partials, dispatches detectors. | N/A |
| logging worker | Companion Worker | Handles centralized logging. | N/A |
| Cloudflare D1 (DB) | Data Store | Primary database for final transcripts and structured call data. | N/A |
| Cloudflare Vectorize (VECTORIZE) | Data Store | Vector database for embeddings in retrieval and detection. | N/A |
| Cloudflare AI Search (AI_SEARCH) | Data Store | Knowledge search engine, backed by Cloudflare R2 (AI_SEARCH_KNOWLEDGE_BUCKET). | N/A |
| Cloudflare KV (SCRIPT_CACHE) | Data Store | Key-value store for script caching and speculative pre-warmed knowledge. | N/A |
| Cloudflare Workers AI (AI) | External Service | LLM inference platform. | N/A |
| OpenAI | External Service | External LLM provider, routed via AI Gateway. | N/A |
| TogetherAI | External Service | External LLM provider available through the Vercel AI SDK; may bypass the AI Gateway. | N/A |
| Google | External Service | External LLM provider available through the Vercel AI SDK; may bypass the AI Gateway. | N/A |
| Anthropic | External Service | External LLM provider available through the Vercel AI SDK; may bypass the AI Gateway. | N/A |
| Deepgram | External Service | Speech-to-Text (STT) provider; the Call Server connects to it directly on the direct-audio path. | N/A |
| OpenRouter | External Service | Embeddings fallback provider. | N/A |