Data

Frontier is built with a commitment to responsible data handling, ensuring that sales call coaching is effective while respecting privacy and maintaining strong security boundaries. This page details what data Frontier collects, where it is stored, and how tenant isolation is achieved across its services.

Frontier collects various types of data to provide real-time coaching and post-call insights. This includes identifying information about users and sales calls, the content of the calls themselves, and derived insights.

| Data Category | Examples | PII? |
| --- | --- | --- |
| User & Organization Metadata | Organization records (name, email, image URL), user profiles (first/last name, email, image URL), call metadata (title, meeting URL, status), call participant details (name, email, platform identifiers), calendar data (event title, meeting URL, attendee list). | Yes (names, emails, image URLs, meeting details, attendee lists). |
| Real-time Call Content | Word-level transcripts (spoken words, speaker ID, start timestamps), detected questions, AI-generated answers and responses, script progress. | Yes (raw transcripts, detected questions, AI answers may contain PII). |
| Knowledge Base Content | FAQs (question, ideal answer), call scripts, structured organization facts, uploaded knowledge documents (original files, filenames, content, MIME types). | Yes (any PII contained within uploaded documents, FAQs, scripts, or facts). |
| Derived Data & Caches | Short-lived retrieval caches holding derived call content or knowledge query results. | Potentially (if derived from PII-containing sources). |
| Observability Data | Error events, structured logs. | Yes (events are enriched with org_id, user_id, call_id tags). |

Frontier utilizes a distributed architecture with specialized data stores for different data types, ensuring scalability and performance for real-time operations.

| Data Type | Primary Store (Provider) | Other Stores/Processors | Region/Residency |
| --- | --- | --- | --- |
| User/Org/Account Configuration, Call Metadata, Participants, FAQs, Scripts, Org Facts, AI Answers, Calendars | Supabase Postgres (Postgres) | Clerk (identity provider for authentication). | :::caution[GAP — founder supplies] Supabase project region. |
| Live Call Transcripts (word-level), Detected Questions, Call Metadata (subset) | Cloudflare D1 (SQLite database) | | :::caution[GAP — founder supplies] Cloudflare D1 region. |
| Knowledge Base Source Documents | Cloudflare R2 (object storage) & Supabase Storage | | :::caution[GAP — founder supplies] Cloudflare R2 and Supabase Storage regions. |
| Vector Embeddings / Knowledge Indexes | Cloudflare AI Search (AutoRAG), Supermemory | Pinecone (vector database, legacy path). Cloudflare Vectorize binding exists but is not used in the live retrieval path. | :::caution[GAP — founder supplies] Cloudflare AI Search, Supermemory, Pinecone regions. |
| Short-lived Caches | Cloudflare KV (Key-Value store) | | :::caution[GAP — founder supplies] Cloudflare KV region. |
| LLM Inference (real-time + post-call) | External LLM providers (Anthropic, Google, OpenAI, Together.ai), Cloudflare Workers AI | | :::caution[GAP — founder supplies] LLM provider regions. |
| Speech-to-Text (live direct-audio transcription) | Deepgram (speech-to-text provider) | | :::caution[GAP — founder supplies] Deepgram region. |
| Error Reporting | Sentry (error reporting) | | :::caution[GAP — founder supplies] Sentry region. |
| Structured Logging | Axiom (structured logging) | | :::caution[GAP — founder supplies] Axiom region. |
| Background Jobs | Inngest (background job orchestration) | | :::caution[GAP — founder supplies] Inngest region. |
| Secrets/Configuration | Doppler (secrets management) | | :::caution[GAP — founder supplies] Doppler region. |
| Call Audio (Desktop Recordings) | Recall Desktop SDK for capture | :::caution[GAP — founder supplies] Specific storage location and retention of raw audio blobs are founder-supplied gaps. | :::caution[GAP — founder supplies] Recall Desktop SDK audio storage region. |

Raw meeting transcripts and word-level data, which include PII, are primarily stored in Cloudflare D1. Participant names and emails are stored in Supabase Postgres. Knowledge document contents can reside in Cloudflare R2 or Supabase Storage.

Frontier implements robust tenant isolation to ensure that each organization’s data is logically segregated and inaccessible to others. This is primarily achieved through org_id identifiers and application-level enforcement.

  • Supabase Postgres: Tenant isolation is enforced using Postgres Row-Level Security (RLS) policies. These policies authorize access to data based on the org_id claim extracted from the JSON Web Token (JWT) issued by Clerk, the identity provider. All sensitive tables have RLS enabled, restricting CRUD operations to data matching the user’s org_id. Tables without an org_id column enforce isolation by joining to a parent calls row and checking its org_id.
  • Cloudflare D1 (SQLite database): D1 does not natively support Row-Level Security. Isolation for data like transcript_words (which lacks its own org_id column) is achieved at the application query layer by linking to the call_id and subsequently to the calls table’s org_id. The calls and questions tables in D1 do carry an org_id directly.
  • Cloudflare AI Search: Tenant isolation for indexed knowledge content in Cloudflare AI Search is implemented via a hard org_id metadata-equality filter applied to every query. As a defense-in-depth measure, a post-filter mechanism logs and drops any results whose org_id metadata does not match the querying organization. Source documents for AI Search live in Cloudflare R2 object storage, where object keys are also org-prefixed (e.g., org/sites/example.com/page-1.html).
  • Supermemory: For the Supermemory knowledge backend (currently an interim solution), tenant isolation is achieved using a container tag built from the environment and the orgId. Retrieval queries are filtered by source_type metadata and, for document drills, by source_id.
  • Pinecone (legacy): The legacy Pinecone vector database also uses a metadata-filter model for tenant isolation, applying an { org_id: orgId } filter to queries.
  • Cloudflare KV (Key-Value store): Used as a short-lived cache for knowledge retrieval results (300s TTL) and for global configuration. It holds derived call content only and is not the source of record.
  • Durable Objects (OrgAnswerAgent, CallAnswerAgent): These warm per-tenant answer agents bind to the org_id from the JWT provided at connection time. Any attempt to reconnect to an already bound Durable Object from a different organization is explicitly rejected with an “Unauthorized org” error.
  • Multi-backend Knowledge Base: Frontier’s knowledge base is designed to be multi-backend and is currently in active evaluation, supporting Cloudflare AI Search, Supermemory, and a legacy Pinecone path. The active backend is resolved at runtime based on per-request overrides, KV configurations, or environment defaults. This means an organization’s knowledge data may reside in multiple vector/storage backends simultaneously.
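
Two of the application-level checks described above, the defense-in-depth post-filter on knowledge results and the org binding on warm answer agents, can be sketched in TypeScript as follows. This is a minimal illustration, not Frontier's implementation: the `KnowledgeHit` shape, the `dropForeignHits` function, and the `OrgBinding` class are hypothetical names introduced here, and the sketch assumes the org_id has already been extracted from a verified JWT.

```typescript
// Hypothetical result shape; the real AI Search result type is not shown in this document.
interface KnowledgeHit {
  id: string;
  text: string;
  metadata: { org_id?: string };
}

// Defense-in-depth post-filter: keep only hits whose org_id metadata matches
// the querying organization, and log every hit that is dropped.
function dropForeignHits(
  hits: KnowledgeHit[],
  orgId: string,
  log: (message: string) => void = console.warn,
): KnowledgeHit[] {
  return hits.filter((hit) => {
    if (hit.metadata.org_id === orgId) return true;
    log(`dropped hit ${hit.id}: org_id mismatch`);
    return false;
  });
}

// Org binding for a warm per-tenant agent (hypothetical helper): the first
// connection binds the agent to an org, and a later connection from any
// other org is rejected.
class OrgBinding {
  private boundOrgId: string | null = null;

  connect(orgIdFromJwt: string): void {
    if (this.boundOrgId === null) {
      this.boundOrgId = orgIdFromJwt;
      return;
    }
    if (this.boundOrgId !== orgIdFromJwt) {
      throw new Error("Unauthorized org");
    }
  }
}

// Usage: a hit tagged with another org is dropped even if the primary
// metadata filter let it through.
const sampleHits: KnowledgeHit[] = [
  { id: "a", text: "pricing FAQ", metadata: { org_id: "org_1" } },
  { id: "b", text: "other tenant doc", metadata: { org_id: "org_2" } },
];
const visibleHits = dropForeignHits(sampleHits, "org_1");
console.log(visibleHits.map((hit) => hit.id));
```

The post-filter treats a hit with missing org_id metadata as foreign, so an untagged document fails closed instead of leaking across tenants.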

Frontier maintains data for the duration necessary to provide its services and meet business requirements.

  • The code defines retention limits for only two components: Cloudflare KV retrieval caches expire after 300 seconds, and detected question events in Cloudflare D1 are cleaned up after 14 days.

  • The system supports cascading deletes (e.g., deleting a calls row in D1 cascades to its transcript_words and questions). However, the overall data retention schedule, including for primary stores like Cloudflare D1 transcripts and Supabase call/participant data, is:

  • Data residency and geographical location of all data stores are also critical aspects:

  • The process for Data Subject Access Requests (DSARs), including data deletion and export, needs to be fully defined:

Frontier engages with various sub-processors to deliver its services. Ensuring appropriate contractual agreements, including Data Processing Agreements (DPAs), is crucial.

  • Sub-processors: Key sub-processors on the data path include Clerk (identity provider), Deepgram (speech-to-text), Anthropic, Google, OpenAI, and Together.ai (LLM providers), Sentry (error reporting), Axiom (structured logging), and Inngest (background jobs). The Recall Desktop SDK is used for audio capture.

  • LLM Provider Commitments:

  • DPA Posture:

  • Supply-Chain Hygiene: Frontier uses Dependabot to automatically track and update dependencies for the Bun ecosystem on a weekly schedule, contributing to overall supply-chain security.

  • Encryption and Key Management: