Architecture

How it works.

Concierge runs as a single Cloudflare Worker that handles four messaging channels through one unified pipeline. Every inbound event (a WhatsApp message, an Instagram DM, an email arriving at your catch‑all, or a lead form submission) is normalized into the same shape and processed by the same steps.

Figure: the pipeline, which runs identically for every channel.

Inbound: WhatsApp (webhook) · Instagram (webhook) · Email (email() handler) · Lead form (POST /lead/…)

Common pipeline: 01 Normalize → InboundMessage · 02 Tenant + credit check · 03 Reply rules (keyword + embedding) · 04 Persona prompt + action dispatch · 05 Log metadata (D1)

Actions: Canned reply (static text, no AI, free) · AI reply (Workers AI, llama‑4‑scout, persona + rule prompt) · Forward → Discord (embed + Reply / Approve / Drop) · Forward email (reverse‑alias for replies) · Drop / spam reject (silent or NDR)

The unified pipeline

Regardless of the channel a message arrives on, it is normalized into an InboundMessage struct (channel, sender, recipient, tenant, metadata) and processed through the same steps. This is the spine of the codebase. Channel handlers exist only to translate webhooks into InboundMessage and to dispatch outbound replies.

Channel handler receives the raw event

WhatsApp and Instagram POST signed webhook payloads to /webhook/*. Email arrives via Cloudflare’s send_to_worker action, invoking the Worker’s email() entrypoint. Lead forms POST to /lead/{slug}.

Normalize into InboundMessage

Channel, sender, recipient, tenant ID, and any channel‑specific metadata. From here, every code path is identical.

Log metadata to messages

An append‑only row in D1: channel, direction, sender ID, recipient ID, tenant, timestamp. Message content is never persisted: only the fact that something happened, with whom.

Reply rules evaluate in order

Each channel carries an ordered ReplyConfig. Rules pair a matcher (case‑insensitive keyword substring or BGE‑embedding cosine similarity over a user‑written intent description) with a response (canned text or an AI prompt). First match wins; the mandatory default rule fires if nothing else does.

Action dispatches · canned text or LLM call

Canned responses send verbatim, no credit charge. Prompt responses concatenate the tenant’s persona prompt with the rule’s prompt and run the main reply model; one credit is deducted before the call (optimistic) and restored if generation or send fails. AI replies are blocked unless the persona’s asynchronous safety check has approved the current prompt.
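The optimistic deduction described above can be sketched as follows. This is a minimal illustration and not Concierge's actual code: `Credits`, `ReplyError`, and `ai_reply_with_credit` are hypothetical names, and the balance is a plain in-memory counter, whereas the real balance lives in platform storage.

```rust
/// Hypothetical in-memory stand-in for a tenant's credit balance.
#[derive(Debug, PartialEq)]
pub struct Credits {
    pub balance: u32,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReplyError {
    NoCredits,
    GenerationFailed(String),
}

/// Deduct one credit up front, run `generate_and_send`, and restore the
/// credit if that step reports a failure.
pub fn ai_reply_with_credit<F>(
    credits: &mut Credits,
    generate_and_send: F,
) -> Result<String, ReplyError>
where
    F: FnOnce() -> Result<String, String>,
{
    if credits.balance == 0 {
        return Err(ReplyError::NoCredits);
    }
    credits.balance -= 1; // optimistic deduction before the model call
    match generate_and_send() {
        Ok(reply) => Ok(reply),
        Err(reason) => {
            credits.balance += 1; // restore the credit on failure
            Err(ReplyError::GenerationFailed(reason))
        }
    }
}
```

Deducting first means a failed call costs the tenant nothing. In this sketch, a zero balance fails fast without invoking the model at all.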

Privacy note

The unified messages table stores only metadata: channel, direction, sender ID, recipient ID, tenant, timestamp. No subjects, no bodies, no attachments. AI replies are generated synchronously from in‑memory data, and that data is discarded afterwards.

WhatsApp / Instagram auto‑reply

Both Meta channels run through the same reply pipeline:

Meta delivers the inbound message to POST /webhook/whatsapp or POST /webhook/instagram.
Concierge looks up the channel account (phone number ID for WhatsApp, page ID for Instagram) and its ReplyConfig.
The body is truncated to 1,000 characters and passed through a fast prompt‑injection scanner; messages flagged as injection attempts are dropped.
If any rule has a Prompt matcher, the inbound message is embedded once and compared via cosine similarity to each rule’s precomputed embedding.
Rules are walked in order and the first match wins; if no rule matches, the mandatory default rule fires.
Canned responses send verbatim with no credit charge. Prompt responses combine persona + rule prompt + a context block, deduct one credit, and run the main LLM. AI replies require the tenant’s persona to be safety‑Approved.
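The rule walk in the steps above might look roughly like the sketch below. All type and field names here (`Matcher`, `Rule`, `pick_rule`) are illustrative, not taken from the codebase. The per-rule similarity `threshold` is an assumption, since the source does not say how a cosine score becomes a match. The message embedding is passed in already computed, standing in for the single BGE embedding call.

```rust
/// Hypothetical matcher: keyword substring or intent embedding.
pub enum Matcher {
    /// Case-insensitive keyword substring.
    Keyword(String),
    /// Precomputed embedding of the intent description.
    /// `threshold` is an assumed cutoff, not a documented setting.
    Prompt { embedding: Vec<f32>, threshold: f32 },
}

pub struct Rule {
    pub name: String,
    pub matcher: Matcher,
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Walk the rules in order and return the first match's name,
/// falling back to `default_rule` when nothing matches.
pub fn pick_rule<'a>(
    rules: &'a [Rule],
    default_rule: &'a str,
    body: &str,
    body_embedding: &[f32],
) -> &'a str {
    let lowered = body.to_lowercase();
    for rule in rules {
        let hit = match &rule.matcher {
            Matcher::Keyword(k) => lowered.contains(k.to_lowercase().as_str()),
            Matcher::Prompt { embedding, threshold } => {
                cosine_similarity(embedding, body_embedding) >= *threshold
            }
        };
        if hit {
            return rule.name.as_str();
        }
    }
    default_rule
}
```

Because the message is embedded once and every rule's embedding is precomputed, the walk costs only dot products, whatever the number of Prompt rules.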

Persona & safety check

The tenant picks a curated preset, fills in the builder (tone / catch‑phrases / off‑topic boundaries), or writes a raw custom prompt at /admin/persona.
On save, the active prompt is hashed; if the hash differs from the last‑vetted hash, status flips to Pending and a SafetyJob is enqueued onto Cloudflare Queue concierge-safety.
The queue consumer reads the job, re‑checks the hash (drops stale jobs), and runs the safety classifier with Calculon Tech’s content policy.
The result lands back in KV as Approved or Rejected with a vague user‑facing reason.
While Pending or Rejected, AI replies are blocked tenant‑wide; canned default replies still send.
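A minimal sketch of the hash-gated flow above, under several assumptions: `Persona`, `save_prompt`, and `consume` are hypothetical names, `DefaultHasher` stands in for whatever hash the real code uses, and the classifier verdict is passed in as a boolean. Restoring the earlier verdict when a tenant reverts to an already-vetted prompt is a choice made for this sketch, not something the source specifies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SafetyStatus {
    Pending,
    Approved,
    Rejected,
}

/// Hypothetical persona record for this sketch.
pub struct Persona {
    pub prompt: String,
    /// Hash and verdict of the last prompt the classifier vetted.
    pub last_vetted: Option<(u64, SafetyStatus)>,
    pub status: SafetyStatus,
}

pub struct SafetyJob {
    pub prompt_hash: u64,
}

/// Stand-in hash; the real hash function is not documented.
fn hash_prompt(prompt: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    hasher.finish()
}

/// On save: return a job to enqueue only if the prompt differs from the
/// last-vetted one; otherwise keep (or restore) the earlier verdict.
pub fn save_prompt(persona: &mut Persona, new_prompt: &str) -> Option<SafetyJob> {
    persona.prompt = new_prompt.to_string();
    let hash = hash_prompt(new_prompt);
    if let Some((vetted_hash, verdict)) = persona.last_vetted {
        if vetted_hash == hash {
            persona.status = verdict;
            return None;
        }
    }
    persona.status = SafetyStatus::Pending;
    Some(SafetyJob { prompt_hash: hash })
}

/// Queue consumer: drop stale jobs, otherwise record the verdict.
/// Returns false when the job was stale and ignored.
pub fn consume(persona: &mut Persona, job: &SafetyJob, approved: bool) -> bool {
    if hash_prompt(&persona.prompt) != job.prompt_hash {
        return false; // prompt changed since the job was enqueued
    }
    let verdict = if approved { SafetyStatus::Approved } else { SafetyStatus::Rejected };
    persona.last_vetted = Some((job.prompt_hash, verdict));
    persona.status = verdict;
    true
}

/// AI replies are allowed only while the current prompt is Approved.
pub fn ai_replies_allowed(persona: &Persona) -> bool {
    persona.status == SafetyStatus::Approved
}
```

Re-checking the hash in the consumer matters because a queued job can be overtaken by a newer save. Without that check, a verdict for an old prompt could approve a prompt nobody vetted.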

Email routing

An email arrives at your catch‑all domain (configured via Cloudflare Email Routing).
Cloudflare triggers the Worker’s email() handler.
Concierge extracts the domain, looks up the tenant, and parses the MIME message.
Routing rules are evaluated in priority order using glob‑pattern matching on from, to, subject, body, and has_attachment.
The matched rule’s action executes: drop, spam reject, forward email, forward to Discord, or AI reply with approval.
For email forwarding, a reverse‑alias address is generated so replies route back through Concierge to the original sender.

Glob semantics · last match wins

*: any sequence of characters, zero or more
?: exactly one character
case: matching is case‑insensitive
combine: all non‑None criteria are AND‑ed (from + to + subject + body + has_attachment)
order: rules are sorted by ascending priority and the last match wins, so the matching rule with the highest priority value applies
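The glob semantics above can be illustrated with a small sketch. The names `glob_match`, `EmailRule`, `Email`, and `select_action` are hypothetical, and the criteria are trimmed to from, subject, and has_attachment for brevity. It assumes that "last match wins" over ascending priority means the matching rule with the largest priority value is selected.

```rust
/// Case-insensitive glob match: `*` = any run of characters, `?` = exactly one.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let t: Vec<char> = text.to_lowercase().chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position to resume from after the most recent `*`: (pattern idx, text idx).
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Hypothetical rule shape; `None` means "criterion not set".
pub struct EmailRule {
    pub priority: i32,
    pub from: Option<String>,
    pub subject: Option<String>,
    pub has_attachment: Option<bool>,
    pub action: String,
}

pub struct Email {
    pub from: String,
    pub subject: String,
    pub has_attachment: bool,
}

/// All criteria that are set must match (logical AND).
fn rule_matches(rule: &EmailRule, email: &Email) -> bool {
    let glob_ok =
        |pat: &Option<String>, value: &str| pat.as_ref().map_or(true, |p| glob_match(p, value));
    glob_ok(&rule.from, &email.from)
        && glob_ok(&rule.subject, &email.subject)
        && rule.has_attachment.map_or(true, |want| want == email.has_attachment)
}

/// Sort by ascending priority, then take the last matching rule's action.
pub fn select_action<'a>(rules: &'a [EmailRule], email: &Email) -> Option<&'a str> {
    let mut sorted: Vec<&EmailRule> = rules.iter().collect();
    sorted.sort_by_key(|r| r.priority);
    sorted
        .into_iter()
        .filter(|r| rule_matches(r, email))
        .last()
        .map(|r| r.action.as_str())
}
```

With this ordering, a low-priority rule with no criteria set acts as a fallback, and more specific rules override it by carrying a higher priority value.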

Discord relay

When a message from any channel is forwarded to Discord (via email routing rules or future direct integrations), it arrives as an embed with Reply, Approve, and Drop buttons.
A ConversationContext is saved in KV, linking the Discord message to the original channel, sender, and reply metadata.
When someone clicks Reply, a modal opens for composing a response.
The reply is sent back through the originating channel using the stored context.
For AI‑generated drafts, Approve sends the draft and Drop discards it.
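As a rough illustration of how the stored context lets a Discord reply find its way back, the sketch below uses a `HashMap` as a stand-in for KV. The fields of `ConversationContext` and the `ContextStore` type are assumptions made for this example; the source only says the context links the Discord message to the original channel, sender, and reply metadata.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Channel {
    WhatsApp,
    Instagram,
    Email,
}

/// Hypothetical shape of the stored context.
#[derive(Debug, Clone, PartialEq)]
pub struct ConversationContext {
    pub channel: Channel,
    pub sender: String, // original sender, i.e. who a reply should go to
    pub tenant_id: String,
}

/// In-memory stand-in for the KV namespace, keyed by Discord message ID.
#[derive(Default)]
pub struct ContextStore {
    by_discord_message: HashMap<String, ConversationContext>,
}

impl ContextStore {
    /// Called when a message is forwarded to Discord.
    pub fn save(&mut self, discord_message_id: &str, ctx: ConversationContext) {
        self.by_discord_message
            .insert(discord_message_id.to_string(), ctx);
    }

    /// Called when the Reply modal is submitted: resolve which channel and
    /// recipient the text should go to. Returns None if no context exists.
    pub fn route_reply(
        &self,
        discord_message_id: &str,
        text: &str,
    ) -> Option<(Channel, String, String)> {
        let ctx = self.by_discord_message.get(discord_message_id)?;
        Some((ctx.channel.clone(), ctx.sender.clone(), text.to_string()))
    }
}
```

Keying the context by Discord message ID keeps the button handler stateless: everything it needs to send the reply is looked up from the interaction itself.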

Lead capture forms

You create a lead form in the admin and embed it on your website via iframe.
A visitor enters their phone number and submits.
Concierge generates a message (canned or AI‑prompt) and sends it via WhatsApp.
The submission metadata is logged to the database.

Billing

Each AI‑mode reply (rule with a Prompt response) deducts one credit from the tenant’s balance. Canned replies, embedding lookups, intent classification, and persona safety checks are free. Credits are deducted before the AI call (optimistic deduction) and restored if generation or send fails. When credits reach zero, AI replies stop; canned defaults still send. Credits can be granted by management or purchased via Razorpay.

Platform model

Per-channel architecture: how each channel attaches to a tenant

Channel	Model	Token storage
WhatsApp	Shared WABA: you own one WABA, customers add numbers via Meta Embedded Signup	Single platform token `WHATSAPP_ACCESS_TOKEN`
Instagram	Per-account OAuth: Facebook Login, page tokens per customer	Encrypted in KV, rotated daily by cron
Email	Per-domain: each tenant registers domains and creates rules	No tokens; Cloudflare Email Routing dispatches
Discord	Guild → tenant: each Discord server is linked to one tenant	Shared bot token (`DISCORD_BOT_TOKEN` env secret)

Architecture

Cloudflare Worker: Rust compiled to WebAssembly. Handles all HTTP routes, webhooks, and email events.
Cloudflare KV: tenant configs, account configs, tokens, sessions, routing rules, billing state, conversation contexts, persona.
Cloudflare D1: SQLite for message metadata, email metrics, lead form submissions, credit packs, payments, audit logs.
Cloudflare Workers AI: reply generation, prompt‑injection scanning, persona safety classification, BGE embeddings.
Cloudflare Queues: persona safety classifier (concierge-safety + concierge-safety-dlq).
Cloudflare Email Routing: triggers the Worker’s email handler for inbound emails.
Discord Interactions API: slash commands, button interactions, modal submissions via POST /discord/interactions.
Razorpay: payment processing for credit pack purchases.

src/types.rs (Rust)

/// The normalized form every channel produces. Channel handlers
/// exist only to translate webhooks into this struct.
pub struct InboundMessage {
    pub id:                 String,
    pub channel:            Channel,        // WhatsApp | Instagram | Email | Discord
    pub sender:             String,
    pub sender_name:        Option<String>,
    pub recipient:          String,
    pub body:               String,         // in‑memory only, never persisted
    pub tenant_id:          String,
    pub channel_account_id: String,
    pub raw_metadata:       Value,
}
