Data
Frontier is built with a commitment to responsible data handling, ensuring that sales call coaching is effective while respecting privacy and maintaining strong security boundaries. This page details what data Frontier collects, where it is stored, and how tenant isolation is achieved across its services.
What We Collect
Frontier collects various types of data to provide real-time coaching and post-call insights. This includes identifying information about users and sales calls, the content of the calls themselves, and derived insights.
| Data Category | Examples | PII? |
|---|---|---|
| User & Organization Metadata | Organization records (name, email, image URL), user profiles (first/last name, email, image URL), call metadata (title, meeting URL, status), call participant details (name, email, platform identifiers), calendar data (event title, meeting URL, attendee list). | Yes (names, emails, image URLs, meeting details, attendee lists). |
| Real-time Call Content | Word-level transcripts (spoken words, speaker ID, start timestamps), detected questions, AI-generated answers and responses, script progress. | Yes (raw transcripts, detected questions, AI answers may contain PII). |
| Knowledge Base Content | FAQs (question, ideal answer), call scripts, structured organization facts, uploaded knowledge documents (original files, filenames, content, MIME types). | Yes (any PII contained within uploaded documents, FAQs, scripts, or facts). |
| Derived Data & Caches | Short-lived retrieval caches holding derived call content or knowledge query results. | Potentially (if derived from PII-containing sources). |
| Observability Data | Error events, structured logs. | Yes (events are enriched with `org_id`, `user_id`, `call_id` tags). |
Where It Lives
Frontier utilizes a distributed architecture with specialized data stores for different data types, ensuring scalability and performance for real-time operations.
| Data Type | Primary Store (Provider) | Other Stores/Processors | Region/Residency |
|---|---|---|---|
| User/Org/Account Configuration, Call Metadata, Participants, FAQs, Scripts, Org Facts, AI Answers, Calendars | Supabase Postgres (Postgres) | Clerk (identity provider for authentication). | :::caution[GAP — founder supplies] Supabase project region. |
| Live Call Transcripts (word-level), Detected Questions, Call Metadata (subset) | Cloudflare D1 (SQLite database) | | :::caution[GAP — founder supplies] Cloudflare D1 region. |
| Knowledge Base Source Documents | Cloudflare R2 (object storage) & Supabase Storage | | :::caution[GAP — founder supplies] Cloudflare R2 and Supabase Storage regions. |
| Vector Embeddings / Knowledge Indexes | Cloudflare AI Search (AutoRAG), Supermemory | Pinecone (vector database, legacy path). Cloudflare Vectorize binding exists but is not used in the live retrieval path. | :::caution[GAP — founder supplies] Cloudflare AI Search, Supermemory, Pinecone regions. |
| Short-lived Caches | Cloudflare KV (Key-Value store) | | :::caution[GAP — founder supplies] Cloudflare KV region. |
| LLM Inference (real-time + post-call) | External LLM providers (Anthropic, Google, OpenAI, Together.ai), Cloudflare Workers AI | | :::caution[GAP — founder supplies] LLM provider regions. |
| Speech-to-Text (live direct-audio transcription) | Deepgram (speech-to-text provider) | | :::caution[GAP — founder supplies] Deepgram region. |
| Error Reporting | Sentry (error reporting) | | :::caution[GAP — founder supplies] Sentry region. |
| Structured Logging | Axiom (structured logging) | | :::caution[GAP — founder supplies] Axiom region. |
| Background Jobs | Inngest (background job orchestration) | | :::caution[GAP — founder supplies] Inngest region. |
| Secrets/Configuration | Doppler (secrets management) | | :::caution[GAP — founder supplies] Doppler region. |
| Call Audio (Desktop Recordings) | Recall Desktop SDK for capture | :::caution[GAP — founder supplies] Specific storage location and retention of raw audio blobs are founder-supplied gaps. | :::caution[GAP — founder supplies] Recall Desktop SDK audio storage region. |
Raw meeting transcripts and word-level data, which include PII, are primarily stored in Cloudflare D1. Participant names and emails are stored in Supabase Postgres. Knowledge document contents can reside in Cloudflare R2 or Supabase Storage.
Tenant Isolation
Frontier isolates tenants so that each organization's data is logically segregated and inaccessible to other organizations. This is primarily achieved through `org_id` identifiers and application-level enforcement.
- Supabase Postgres: Tenant isolation is enforced using Postgres Row-Level Security (RLS) policies. These policies authorize access to data based on the `org_id` claim extracted from the JSON Web Token (JWT) issued by Clerk, the identity provider. All sensitive tables have RLS enabled, restricting CRUD operations to data matching the user's `org_id`. Tables without an `org_id` column enforce isolation by joining to a parent `calls` row and checking its `org_id`.
- Cloudflare D1 (SQLite database): D1 does not natively support Row-Level Security. Isolation for data like `transcript_words` (which lacks its own `org_id` column) is achieved at the application query layer by linking to the `call_id` and subsequently to the `calls` table's `org_id`. The `calls` and `questions` tables in D1 do carry an `org_id` directly.
- Cloudflare AI Search: Tenant isolation for indexed knowledge content in Cloudflare AI Search is implemented via a hard `org_id` metadata-equality filter applied to every query. As a defense-in-depth measure, a post-filter mechanism logs and drops any results whose `org_id` metadata does not match the querying organization. Source documents for AI Search live in Cloudflare R2 object storage, where object keys are also `org`-prefixed (e.g., `org/sites/example.com/page-1.html`).
- Supermemory: For the Supermemory knowledge backend (currently an interim solution), tenant isolation is achieved using a container tag built from the environment and the `orgId`. Retrieval queries are filtered by `source_type` metadata and, for document drills, by `source_id`.
- Pinecone (legacy): The legacy Pinecone vector database also uses a metadata-filter model for tenant isolation, applying an `{ org_id: orgId }` filter to queries.
- Cloudflare KV (Key-Value store): Used as a short-lived cache for knowledge retrieval results (300s TTL) and global configuration. It holds derived call content and is not the source of record.
- Durable Objects (OrgAnswerAgent, CallAnswerAgent): Warm per-tenant answer agents (`OrgAnswerAgent` and `CallAnswerAgent`) enforce `org_id` binding from the JWT provided at connection time. Any attempt to reconnect from a different organization to an already bound Durable Object is explicitly rejected with an “Unauthorized org” error.
- Multi-backend Knowledge Base: Frontier's knowledge base is designed to be multi-backend and is currently in active evaluation, supporting Cloudflare AI Search, Supermemory, and a legacy Pinecone path. The active backend is resolved at runtime based on per-request overrides, KV configurations, or environment defaults. This means an organization's knowledge data may reside in multiple vector/storage backends simultaneously.
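The defense-in-depth post-filter described for Cloudflare AI Search above can be illustrated with a minimal sketch. This is not Frontier's actual implementation: the `SearchHit` shape, the `postFilterByOrg` function name, and the log message format are all assumptions made for illustration. The sketch only shows the general idea of dropping and logging any retrieval result whose `org_id` metadata does not match the querying organization, even after the backend has applied its own equality filter.

```typescript
// Hypothetical shape of a retrieval hit; field names are assumptions for illustration.
interface SearchHit {
  id: string;
  text: string;
  metadata: { org_id?: string };
}

interface FilterResult {
  kept: SearchHit[];
  dropped: SearchHit[];
}

// Defense-in-depth: keep only hits tagged with the caller's org_id.
// Hits with a different or missing org_id are dropped and logged.
function postFilterByOrg(
  hits: SearchHit[],
  orgId: string,
  log: (message: string) => void = console.warn,
): FilterResult {
  if (orgId.trim() === "") {
    // Refuse to run an unscoped filter rather than silently returning everything.
    throw new Error("orgId is required for tenant-scoped retrieval");
  }
  const kept: SearchHit[] = [];
  const dropped: SearchHit[] = [];
  for (const hit of hits) {
    if (hit.metadata.org_id === orgId) {
      kept.push(hit);
    } else {
      dropped.push(hit);
      log(`dropped cross-tenant hit ${hit.id} (expected org ${orgId})`);
    }
  }
  return { kept, dropped };
}
```

In this sketch, a hit with no `org_id` metadata at all is treated the same as a mismatched one, which errs on the side of returning fewer results rather than risking cross-tenant leakage.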
Data Retention, Residency, and Deletion
Frontier maintains data for the duration necessary to provide its services and meet business requirements.
- Limited retention periods were found in code for specific components: Cloudflare KV retrieval caches expire after 300 seconds, and detected question events in Cloudflare D1 are cleaned up after 14 days.
- The system supports cascading deletes (e.g., deleting a `calls` row in D1 cascades to its `transcript_words` and `questions`). However, the overall data retention schedule, including for primary stores like Cloudflare D1 transcripts and Supabase call/participant data, is:
- Data residency and geographical location of all data stores are also critical aspects:
- The process for Data Subject Access Requests (DSARs), including data deletion and export, needs to be fully defined:
Data Processing Agreements
Frontier engages with various sub-processors to deliver its services. Ensuring appropriate contractual agreements, including Data Processing Agreements (DPAs), is crucial.
- Sub-processors: Key sub-processors on the data path include Clerk (identity provider), Deepgram (speech-to-text), Anthropic, Google, OpenAI, Together.ai (LLM providers), Sentry (error reporting), Axiom (structured logging), and Inngest (background jobs). Recall Desktop SDK is used for audio capture.
- LLM Provider Commitments:
- DPA Posture:
- Supply-Chain Hygiene: Frontier uses `Dependabot` to automatically track and update dependencies for the Bun ecosystem on a weekly schedule, contributing to overall supply-chain security.
- Encryption and Key Management: