Concierge Documentation

How it works.

Concierge runs as a single Cloudflare Worker that handles four messaging channels through one unified pipeline. Every inbound event (a WhatsApp message, an Instagram DM, an email arriving at your catch‑all, or a lead form submission) is normalized into the same shape and processed by the same steps.

The pipeline · runs identically for every channel

Inbound

  • WhatsApp: webhook
  • Instagram: webhook
  • Email: email() handler
  • Lead form: POST /lead/…

Common pipeline

  01 Normalize · InboundMessage
  02 Tenant + credit check
  03 Reply rules · keyword + embedding
  04 Persona prompt + action dispatch
  05 Log metadata · D1

Actions

  • Canned reply: static text · no AI · free
  • AI reply: Workers AI · llama‑4‑scout · persona + rule prompt
  • Forward → Discord: embed + Reply / Approve / Drop
  • Forward email: reverse‑alias for replies
  • Drop / spam reject: silent or NDR

The unified pipeline

Regardless of which channel a message arrives on, it is normalized into an InboundMessage struct (channel, sender, recipient, tenant, metadata) and processed through the same steps. This is the spine of the codebase. Channel handlers exist only to translate webhooks into InboundMessage and to dispatch outbound replies.

  1. Channel handler receives the raw event

    WhatsApp and Instagram POST signed webhook payloads to /webhook/*. Email arrives via Cloudflare’s send_to_worker action, invoking the Worker’s email() entrypoint. Lead forms POST to /lead/{slug}.

  2. Normalize into InboundMessage

    The struct carries the channel, sender, recipient, tenant ID, and any channel‑specific metadata. From here, every code path is identical.

  3. Log metadata to messages

    An append‑only row in D1: channel, direction, sender ID, recipient ID, tenant, timestamp. Message content is never persisted: only the fact that something happened, with whom.

  4. Reply rules evaluate in order

    Each channel carries an ordered ReplyConfig. Rules pair a matcher (case‑insensitive keyword substring or BGE‑embedding cosine similarity over a user‑written intent description) with a response (canned text or an AI prompt). First match wins; the mandatory default rule fires if nothing else does.

  5. Action dispatches · canned text or LLM call

    Canned responses send verbatim, no credit charge. Prompt responses concatenate the tenant’s persona prompt with the rule’s prompt and run the main reply model; one credit is deducted before the call (optimistic) and restored if generation or send fails. AI replies are blocked unless the persona’s asynchronous safety check has approved the current prompt.
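The rule walk in steps 4 and 5 can be sketched as follows. This is a minimal illustration, not the actual Concierge source: the type and field names (Matcher, Rule, threshold, select_response) are assumptions, and the similarity threshold in particular is hypothetical since the text does not say how a cosine score becomes a match.

```rust
/// How a rule decides whether it applies (names are illustrative).
pub enum Matcher {
    /// Case-insensitive keyword substring.
    Keyword(String),
    /// Precomputed embedding of the intent description, plus an
    /// assumed similarity cutoff.
    Intent { embedding: Vec<f32>, threshold: f32 },
}

#[derive(Debug, Clone, PartialEq)]
pub enum Response {
    Canned(String), // sent verbatim, no credit charge
    Prompt(String), // combined with the persona prompt, costs one credit
}

pub struct Rule {
    pub matcher: Matcher,
    pub response: Response,
}

pub struct ReplyConfig {
    pub rules: Vec<Rule>,
    /// The mandatory default, used when no rule matches.
    pub default: Response,
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Walks the rules in order; the first match wins, otherwise the default fires.
/// `body_embedding` is the inbound message embedded once by the caller.
pub fn select_response<'a>(
    cfg: &'a ReplyConfig,
    body: &str,
    body_embedding: Option<&[f32]>,
) -> &'a Response {
    let lowered = body.to_lowercase();
    for rule in &cfg.rules {
        let hit = match &rule.matcher {
            Matcher::Keyword(k) => lowered.contains(&k.to_lowercase()),
            Matcher::Intent { embedding, threshold } => body_embedding
                .map(|e| cosine(e, embedding) >= *threshold)
                .unwrap_or(false),
        };
        if hit {
            return &rule.response;
        }
    }
    &cfg.default
}
```

Returning a reference to the chosen Response keeps the decision separate from the side effects (credit deduction, LLM call, send), which can then be handled in one place per response type.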

Privacy note

The unified messages table stores only metadata: channel, direction, sender ID, recipient ID, tenant, timestamp. No subjects, no bodies, no attachments. AI replies are generated synchronously from in‑memory data and discarded.
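One way to make the metadata-only guarantee structural is to give the log row a type that has no field for content. The sketch below is an assumption about how that could look; Inbound is a trimmed stand-in for the real InboundMessage, and MessageLogRow, Direction, and log_row are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Channel { WhatsApp, Instagram, Email, Discord }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Direction { Inbound, Outbound }

/// Trimmed stand-in for the normalized message (illustrative only).
pub struct Inbound {
    pub channel: Channel,
    pub sender: String,
    pub recipient: String,
    pub body: String, // in-memory only
    pub tenant_id: String,
}

/// The only shape that reaches the database: no subject, body, or attachments.
#[derive(Debug, PartialEq)]
pub struct MessageLogRow {
    pub channel: Channel,
    pub direction: Direction,
    pub sender_id: String,
    pub recipient_id: String,
    pub tenant_id: String,
    pub timestamp: u64, // Unix seconds, supplied by the caller
}

/// Projects a message onto its metadata; the body is deliberately not copied.
pub fn log_row(msg: &Inbound, direction: Direction, timestamp: u64) -> MessageLogRow {
    MessageLogRow {
        channel: msg.channel,
        direction,
        sender_id: msg.sender.clone(),
        recipient_id: msg.recipient.clone(),
        tenant_id: msg.tenant_id.clone(),
        timestamp,
    }
}
```

Because the row type cannot hold a body, a later change that tried to persist content would have to alter the type itself, which is easy to spot in review.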

WhatsApp / Instagram auto‑reply

Both Meta channels run through the same reply pipeline:

  1. Meta delivers the inbound message to POST /webhook/whatsapp or POST /webhook/instagram.
  2. Concierge looks up the channel account (phone number ID for WhatsApp, page ID for Instagram) and its ReplyConfig.
  3. The body is truncated to 1,000 characters and run past a fast prompt‑injection scanner; injection attempts are dropped.
  4. If any rule has a Prompt matcher, the inbound message is embedded once and compared via cosine similarity to each rule’s precomputed embedding.
  5. Rules are walked in order; the first match wins. Otherwise the mandatory default rule fires.
  6. Canned responses send verbatim with no credit charge. Prompt responses combine persona + rule prompt + a context block, deduct one credit, and run the main LLM. AI replies require the tenant’s persona to be safety‑Approved.
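The truncation in step 3 needs some care in Rust, because slicing a String at an arbitrary byte offset panics if the offset falls inside a multi-byte character. A minimal sketch, assuming the limit counts Unicode scalar values rather than bytes (the text does not say which); truncate_chars is a hypothetical helper name.

```rust
/// Returns at most `max_chars` characters of `body`, cutting on a
/// character boundary so the slice is always valid UTF-8.
pub fn truncate_chars(body: &str, max_chars: usize) -> &str {
    match body.char_indices().nth(max_chars) {
        // Byte index where character number `max_chars` starts: cut there.
        Some((byte_idx, _)) => &body[..byte_idx],
        // Fewer than or exactly `max_chars` characters: nothing to cut.
        None => body,
    }
}
```

The function borrows from the input instead of allocating, so the truncated view can be passed straight to the scanner and the embedding call.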

Persona & safety check

  1. The tenant picks a curated preset, fills in the builder (tone / catch‑phrases / off‑topic boundaries), or writes a raw custom prompt at /admin/persona.
  2. On save, the active prompt is hashed; if the hash differs from the last‑vetted hash, status flips to Pending and a SafetyJob is enqueued onto Cloudflare Queue concierge-safety.
  3. The queue consumer reads the job, re‑checks the hash (drops stale jobs), and runs the safety classifier with Calculon Tech’s content policy.
  4. The result lands back in KV as Approved or Rejected with a vague user‑facing reason.
  5. While Pending or Rejected, AI replies are blocked tenant‑wide; canned default replies still send.
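The hash-gated flow above can be sketched as a small state machine. Everything here is illustrative: the names (PersonaState, Vetted, save_prompt, apply_verdict) are assumptions, std's DefaultHasher stands in for whatever hash the real code uses (a production system would more likely use a stable cryptographic hash such as SHA-256), and restoring the earlier verdict when a prompt is reverted to its last-vetted text is a design assumption rather than documented behavior.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SafetyStatus { Pending, Approved, Rejected }

/// Outcome of the most recent completed safety check.
#[derive(Debug, Clone, Copy)]
pub struct Vetted { pub hash: u64, pub approved: bool }

pub struct PersonaState {
    pub prompt: String,
    pub vetted: Option<Vetted>,
    pub status: SafetyStatus,
}

/// Payload enqueued for the classifier.
pub struct SafetyJob { pub tenant_id: String, pub prompt_hash: u64 }

pub fn hash_prompt(prompt: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prompt.hash(&mut h);
    h.finish()
}

fn status_of(v: Vetted) -> SafetyStatus {
    if v.approved { SafetyStatus::Approved } else { SafetyStatus::Rejected }
}

/// On save: if the hash differs from the last-vetted hash, flip to Pending
/// and return a job to enqueue; otherwise keep the existing verdict.
pub fn save_prompt(state: &mut PersonaState, tenant_id: &str, prompt: &str) -> Option<SafetyJob> {
    state.prompt = prompt.to_string();
    let hash = hash_prompt(prompt);
    match state.vetted {
        Some(v) if v.hash == hash => {
            state.status = status_of(v);
            None
        }
        _ => {
            state.status = SafetyStatus::Pending;
            Some(SafetyJob { tenant_id: tenant_id.to_string(), prompt_hash: hash })
        }
    }
}

/// Consumer side: re-check the hash and drop stale jobs. Returns true if
/// the verdict was applied.
pub fn apply_verdict(state: &mut PersonaState, job: &SafetyJob, approved: bool) -> bool {
    if hash_prompt(&state.prompt) != job.prompt_hash {
        return false; // prompt changed since the job was enqueued
    }
    let v = Vetted { hash: job.prompt_hash, approved };
    state.vetted = Some(v);
    state.status = status_of(v);
    true
}

/// AI replies are allowed only while the current prompt is Approved.
pub fn ai_replies_allowed(state: &PersonaState) -> bool {
    state.status == SafetyStatus::Approved
}
```

The stale-job check matters because the queue is asynchronous: a verdict for an older prompt must never approve a newer, unvetted one.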

Email routing

  1. An email arrives at your catch‑all domain (configured via Cloudflare Email Routing).
  2. Cloudflare triggers the Worker’s email() handler.
  3. Concierge extracts the domain, looks up the tenant, and parses the MIME message.
  4. Routing rules are evaluated in priority order using glob‑pattern matching on from, to, subject, body, and has_attachment.
  5. The matched rule’s action executes: drop, spam reject, forward email, forward to Discord, or AI reply with approval.
  6. For email forwarding, a reverse‑alias address is generated so replies route back through Concierge to the original sender.

Glob semantics · last match wins

  *        any sequence of characters, zero or more
  ?        exactly one character
  case     matching is case‑insensitive
  combine  all non‑None criteria are AND‑ed (from + to + subject + body + has_attachment)
  order    rules are sorted by ascending priority; the highest‑priority match wins
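
The glob semantics can be sketched as a small matcher plus a rule walk. This is an illustration under assumptions, not the Concierge implementation: Criteria, Email, RoutingRule, Action, and route are hypothetical names, and ties between equal priorities are resolved here by keeping the original rule order (stable sort), which the text does not specify.

```rust
/// Case-insensitive glob: `*` matches any run of characters (including
/// none), `?` matches exactly one character.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let t: Vec<char> = text.to_lowercase().chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position after the last `*` seen, and the text index it resumes from.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action { Drop, SpamReject, ForwardEmail, ForwardDiscord, AiReply }

/// Every criterion that is Some must match (logical AND).
#[derive(Default)]
pub struct Criteria {
    pub from: Option<String>,
    pub to: Option<String>,
    pub subject: Option<String>,
    pub body: Option<String>,
    pub has_attachment: Option<bool>,
}

pub struct Email<'a> {
    pub from: &'a str,
    pub to: &'a str,
    pub subject: &'a str,
    pub body: &'a str,
    pub has_attachment: bool,
}

impl Criteria {
    pub fn matches(&self, m: &Email<'_>) -> bool {
        let glob_ok = |pat: &Option<String>, val: &str| {
            pat.as_deref().map_or(true, |p| glob_match(p, val))
        };
        glob_ok(&self.from, m.from)
            && glob_ok(&self.to, m.to)
            && glob_ok(&self.subject, m.subject)
            && glob_ok(&self.body, m.body)
            && self.has_attachment.map_or(true, |want| want == m.has_attachment)
    }
}

pub struct RoutingRule {
    pub priority: i32,
    pub criteria: Criteria,
    pub action: Action,
}

/// Sorts by ascending priority and returns the action of the last match,
/// i.e. the highest-priority rule whose criteria all hold.
pub fn route(rules: &[RoutingRule], mail: &Email<'_>) -> Option<Action> {
    let mut sorted: Vec<&RoutingRule> = rules.iter().collect();
    sorted.sort_by_key(|r| r.priority);
    sorted
        .into_iter()
        .filter(|r| r.criteria.matches(mail))
        .last()
        .map(|r| r.action)
}
```

With this shape, a low-priority catch-all rule (pattern `*`) acts as the fallback, and more specific rules at higher priority override it.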

Discord relay

  1. When a message from any channel is forwarded to Discord (via email routing rules or future direct integrations), it arrives as an embed with Reply, Approve, and Drop buttons.
  2. A ConversationContext is saved in KV, linking the Discord message to the original channel, sender, and reply metadata.
  3. When someone clicks Reply, a modal opens for composing a response.
  4. The reply is sent back through the originating channel using the stored context.
  5. For AI‑generated drafts, Approve sends the draft and Drop discards it.

Lead capture forms

  1. You create a lead form in the admin and embed it on your website via iframe.
  2. A visitor enters their phone number and submits.
  3. Concierge generates a message (canned or AI‑prompt) and sends it via WhatsApp.
  4. The submission metadata is logged to the database.

Billing

Each AI‑mode reply (rule with a Prompt response) deducts one credit from the tenant’s balance. Canned replies, embedding lookups, intent classification, and persona safety checks are free. Credits are deducted before the AI call (optimistic deduction) and restored if generation or send fails. When credits reach zero, AI replies stop; canned defaults still send. Credits can be granted by management or purchased via Razorpay.
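
The optimistic deduction described above can be sketched as a wrapper around the generate and send steps. This is a simplified illustration: Balance, ReplyOutcome, and ai_reply_with_credit are hypothetical names, and an in-memory counter stands in for the real billing state, so it says nothing about how concurrent deductions are handled in the actual store.

```rust
#[derive(Debug, PartialEq)]
pub enum ReplyOutcome {
    Sent,
    /// Balance was zero: the AI reply is skipped (a canned default may still send).
    NoCredits,
    /// Generation or send failed; the credit has been restored.
    Failed(String),
}

pub struct Balance {
    pub credits: u32,
}

/// Deducts one credit before the AI call and restores it if either
/// generation or sending fails.
pub fn ai_reply_with_credit<G, S>(balance: &mut Balance, generate: G, send: S) -> ReplyOutcome
where
    G: FnOnce() -> Result<String, String>,
    S: FnOnce(&str) -> Result<(), String>,
{
    if balance.credits == 0 {
        return ReplyOutcome::NoCredits;
    }
    balance.credits -= 1; // optimistic deduction

    match generate().and_then(|text| send(&text)) {
        Ok(()) => ReplyOutcome::Sent,
        Err(reason) => {
            balance.credits += 1; // restore on failure
            ReplyOutcome::Failed(reason)
        }
    }
}
```

Deducting first means a tenant with one credit left cannot trigger two AI calls in flight for the price of one; the restore path keeps failed calls free.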

Platform model

Per-channel architecture · How each channel attaches to a tenant

  • WhatsApp
      Model: Shared WABA. You own one WABA; customers add numbers via Meta Embedded Signup.
      Token storage: Single platform token (WHATSAPP_ACCESS_TOKEN).
  • Instagram
      Model: Per-account OAuth. Facebook Login, page tokens per customer.
      Token storage: Encrypted in KV, rotated daily by cron.
  • Email
      Model: Per-domain. Each tenant registers domains and creates rules.
      Token storage: No tokens; Cloudflare Email Routing dispatches.
  • Discord
      Model: Guild → tenant. Each Discord server is linked to one tenant.
      Token storage: Shared bot token (DISCORD_BOT_TOKEN env secret).

Architecture

  • Cloudflare Worker: Rust compiled to WebAssembly. Handles all HTTP routes, webhooks, and email events.
  • Cloudflare KV: tenant configs, account configs, tokens, sessions, routing rules, billing state, conversation contexts, persona.
  • Cloudflare D1: SQLite for message metadata, email metrics, lead form submissions, credit packs, payments, audit logs.
  • Cloudflare Workers AI: reply generation, prompt‑injection scanning, persona safety classification, BGE embeddings.
  • Cloudflare Queues: persona safety classifier (concierge-safety + concierge-safety-dlq).
  • Cloudflare Email Routing: triggers the Worker’s email handler for inbound emails.
  • Discord Interactions API: slash commands, button interactions, modal submissions via POST /discord/interactions.
  • Razorpay: payment processing for credit pack purchases.

```rust
// src/types.rs

/// The normalized form every channel produces. Channel handlers
/// exist only to translate webhooks into this struct.
pub struct InboundMessage {
    pub id:                 String,
    pub channel:            Channel,        // WhatsApp | Instagram | Email | Discord
    pub sender:             String,
    pub sender_name:        Option<String>,
    pub recipient:          String,
    pub body:               String,         // in-memory only, never persisted
    pub tenant_id:          String,
    pub channel_account_id: String,
    pub raw_metadata:       Value,
}
```