Reference
Architecture
Concierge is a Cloudflare Worker (Rust → WebAssembly). All persistent state lives in Cloudflare D1 (metadata, payments) and KV (configs, sessions, in-flight buffers). No message content is stored at rest.
Inbound channels
WhatsApp Business
- Transport: Meta Cloud API webhooks at POST /webhook/whatsapp.
- Auth: WhatsApp Embedded Signup OAuth → tenant exchanges code for system-user token, scoped to one phone number ID.
- Tenant lookup: reverse index wa_phone:{phone_number_id} → WhatsApp account id → tenant id.
- Outbound: Meta Graph API POST /{phone_number_id}/messages with the system-user token.
- Limits: no message-history endpoint, so post-hoc batching can't reconstruct text from older messages; see Reply Buffer below.
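The two-hop tenant lookup above can be sketched with an in-memory map standing in for KV. This is a minimal illustration, not the worker's code: the Kv alias, the function name, and the wa_account_tenant: key prefix for the second hop are all assumptions; only the wa_phone:{phone_number_id} key is documented.

```rust
use std::collections::HashMap;

/// Stand-in for the KV namespace (assumption: the real worker uses its KV binding).
type Kv = HashMap<String, String>;

/// Resolve a webhook's phone number ID to a tenant ID in two hops:
/// phone number ID -> WhatsApp account id -> tenant id.
/// The `wa_account_tenant:` prefix is hypothetical; only `wa_phone:` is documented.
fn tenant_for_phone(kv: &Kv, phone_number_id: &str) -> Option<String> {
    let account_id = kv.get(&format!("wa_phone:{phone_number_id}"))?;
    kv.get(&format!("wa_account_tenant:{account_id}")).cloned()
}
```

A miss at either hop yields None, which a webhook handler would typically treat as "unknown sender, ignore the event".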
Instagram DMs
- Transport: Meta webhook events, same handler family as WhatsApp.
- Auth: Facebook Login → finds the user's Pages → finds the IG business account on each page → stores per-page access token (AES-256-GCM encrypted with ENCRYPTION_KEY).
- Tenant lookup: ig_page:{page_id} reverse index → IG account → tenant.
- Outbound: Graph API POST /me/messages with the page token.
Discord
- Install: OAuth2 scope=bot+applications.commands, permission bitfield 76928 (SEND_MESSAGES | VIEW_CHANNEL | READ_MESSAGE_HISTORY | ADD_REACTIONS | MANAGE_MESSAGES). Callback at /auth/discord/callback records guild_id → tenant_id in KV.
- Inbound transport: Application Webhook Events at POST /discord/events. MESSAGE_CREATE events drive AI auto-reply.
- Triggers: per-tenant flags on DiscordConfig: inbound_mentions (reply when @-mentioned), inbound_channel_ids[] (reply to every message in these channels). DMs unsupported with the shared bot.
- Interactions: POST /discord/interactions handles slash commands (/status, /domains list, /rules list) and buttons (Reply, Approve, Reject, Drop).
- Signature verification: Ed25519 over timestamp + body using DISCORD_PUBLIC_KEY; same scheme for both endpoints.
- Outbound: shared bot token (DISCORD_BOT_TOKEN env secret), POST to /channels/{id}/messages via the botrelay crate.
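The trigger flags above reduce to a small predicate. The sketch below assumes simplified shapes: the DiscordConfig here carries only the two documented flags, and InboundMessage with its field names is a hypothetical view of a MESSAGE_CREATE event.

```rust
/// Hypothetical subset of `DiscordConfig`; only the two inbound flags come from the text.
struct DiscordConfig {
    inbound_mentions: bool,
    inbound_channel_ids: Vec<String>,
}

/// Illustrative view of a MESSAGE_CREATE event (field names are assumptions).
struct InboundMessage<'a> {
    guild_id: Option<&'a str>,
    channel_id: &'a str,
    mentions_bot: bool,
}

/// Decide whether an inbound Discord message should trigger an AI auto-reply.
fn should_auto_reply(cfg: &DiscordConfig, msg: &InboundMessage) -> bool {
    // DMs carry no guild_id, so they cannot be attributed to a tenant.
    if msg.guild_id.is_none() {
        return false;
    }
    // Reply to every message in an opted-in channel...
    let channel_opted_in = cfg
        .inbound_channel_ids
        .iter()
        .any(|id| id.as_str() == msg.channel_id);
    // ...or to @-mentions anywhere in the guild when that flag is set.
    channel_opted_in || (cfg.inbound_mentions && msg.mentions_bot)
}
```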
Email
- Transport: Cloudflare Email Routing. Every *.cncg.email subdomain gets MX records pointed at the worker. Inbound mail invokes the worker's email event handler with the raw RFC 2822 bytes.
- Tenant lookup: email_domain:{domain} KV reverse index.
- Routing rules: per-domain ordered list in KV at email_rules:{tenant}:{domain}. Each rule has MatchCriteria (from, to, subject, body globs + has_attachment) and an EmailAction (drop, spam, forward_email, forward_discord, ai_reply).
- Outbound: Cloudflare Email Service via the EMAIL binding's structured-message API. Sender domain must be onboarded in the Email Service dashboard.
- Reverse aliases: when forwarding, the From header is rewritten to a generated address on the tenant's domain so replies route back through Concierge. Mapping stored in email_reverse:* with 30-day TTL.
- Loop detection: outbound messages carry X-EmailProxy-Forwarded; inbound messages with that header are rejected.
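Loop detection and rule evaluation can be sketched together as follows. Several points are assumptions rather than documented behaviour: that the first matching rule wins (the text only says the list is ordered), that unset criteria match anything, and that globs support only a case-insensitive `*` wildcard. The InboundEmail and Outcome types and all function names are hypothetical, and action payloads such as forward targets are omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum EmailAction { Drop, Spam, ForwardEmail, ForwardDiscord, AiReply }

#[derive(Default)]
struct MatchCriteria {
    from: Option<String>,
    to: Option<String>,
    subject: Option<String>,
    body: Option<String>,
    has_attachment: Option<bool>,
}

struct EmailRule { criteria: MatchCriteria, action: EmailAction }

#[derive(Default)]
struct InboundEmail<'a> {
    from: &'a str,
    to: &'a str,
    subject: &'a str,
    body: &'a str,
    has_attachment: bool,
    /// True when the X-EmailProxy-Forwarded header is present.
    forwarded_header: bool,
}

#[derive(Debug, PartialEq)]
enum Outcome { RejectedLoop, Action(EmailAction), NoMatch }

/// Case-insensitive glob match where `*` matches any run of characters.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let t: Vec<char> = text.to_lowercase().chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // (pattern index just after the last '*', text index that '*' has consumed up to)
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last '*' absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Text is exhausted; only trailing '*' may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

fn field_ok(pattern: &Option<String>, value: &str) -> bool {
    pattern.as_deref().map_or(true, |p| glob_match(p, value))
}

fn criteria_match(c: &MatchCriteria, m: &InboundEmail) -> bool {
    field_ok(&c.from, m.from)
        && field_ok(&c.to, m.to)
        && field_ok(&c.subject, m.subject)
        && field_ok(&c.body, m.body)
        && c.has_attachment.map_or(true, |want| want == m.has_attachment)
}

/// Reject looped mail first, then walk the ordered rules (assumed first-match-wins).
fn route(rules: &[EmailRule], mail: &InboundEmail) -> Outcome {
    if mail.forwarded_header {
        return Outcome::RejectedLoop;
    }
    rules
        .iter()
        .find(|r| criteria_match(&r.criteria, mail))
        .map_or(Outcome::NoMatch, |r| Outcome::Action(r.action))
}
```

Under these assumptions, a rule with every criterion unset acts as a catch-all, so placing it last gives the list a default action.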
AI reply pipeline
- Inference binding: Cloudflare Workers AI via the AI binding. Default models: llama-4-scout-17b-16e-instruct for replies, llama-3.1-8b-instruct-fast for prompt-injection scanning and persona safety classification, @cf/baai/bge-base-en-v1.5 for embeddings. Reply and fast models are configurable via AI_MODEL / AI_FAST_MODEL env vars; the embedding model id is centralized in ai::EMBEDDING_MODEL.
- Persona prompt: tenant-wide. Lives in PersonaConfig.source as one of three variants: Preset(PersonaPreset), Builder(PersonaBuilder), or Custom(String); never a mix. PersonaConfig::active_prompt() resolves the chosen variant on demand (preset constant, generated from builder fields, or the raw custom string).
- Reply rules: per-channel ReplyConfig { enabled, rules: Vec<ReplyRule>, default_rule, wait_seconds }. The pipeline walks rules in order; first match wins; otherwise the mandatory default_rule fires. Each rule has a matcher (StaticText { keywords } for case-insensitive substring or Prompt { description, embedding, threshold } for cosine-similarity intent matching) and a response (Canned { text } sent verbatim, or Prompt { text } appended to the persona prompt and run through the LLM).
- Embedding step: if any Prompt rule exists, the inbound message is embedded once per delivery and compared via ai::cosine to each rule's pre-computed embedding (computed at rule-save time, stored in the rule alongside the model id). Default threshold is 0.72; tunable per rule.
- Persona safety gate: AI replies (ReplyResponse::Prompt) are blocked unless the tenant's persona is Approved and its hash hasn't drifted since the last vetting. Canned responses are unaffected. See "Persona safety queue" below.
- Final prompt: the system prompt sent to the reply model is persona.active_prompt() + "\n\n" + rule_prompt. The user message wraps the inbound text and sender name as a "Context: ... Generate an appropriate response." block.
- Injection scan: incoming bodies are truncated to 1000 chars, then a fast classifier checks for instruction-override patterns. Rejected messages skip the entire pipeline (no rule matching, no reply).
- Billing: only the AI reply step deducts a credit; static Canned responses are free, and embeddings/intent matching/safety classification are free. Deduction happens before the AI call (optimistic) and is restored on any failure path. Free monthly grant of 100 credits per tenant.
- Pricing: flat $0.02 / ₹2 per AI reply, no tiers. UNIT_PRICE_PAISE = 200, UNIT_PRICE_CENTS = 2 in src/billing/mod.rs.
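Rule selection and prompt assembly, as described in the list above, might look roughly like this. It is a sketch under assumptions: the Matcher type name, the select_rule and system_prompt helpers, and the use of >= for the threshold comparison are not taken from the source, and the message embedding is passed in rather than computed.

```rust
const DEFAULT_THRESHOLD: f32 = 0.72;

/// Hypothetical name for the matcher enum; the variant shapes follow the text.
enum Matcher {
    StaticText { keywords: Vec<String> },
    Prompt { description: String, embedding: Vec<f32>, threshold: f32 },
}

enum ReplyResponse {
    Canned { text: String },
    Prompt { text: String },
}

struct ReplyRule { matcher: Matcher, response: ReplyResponse }

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Walk the rules in order; the first match wins, otherwise the default rule fires.
/// `message_embedding` is computed once per delivery, and only if a Prompt rule exists.
fn select_rule<'a>(
    rules: &'a [ReplyRule],
    default_rule: &'a ReplyRule,
    text: &str,
    message_embedding: Option<&[f32]>,
) -> &'a ReplyRule {
    let lower = text.to_lowercase();
    rules
        .iter()
        .find(|rule| match &rule.matcher {
            // Case-insensitive substring match on any keyword.
            Matcher::StaticText { keywords } => {
                keywords.iter().any(|k| lower.contains(&k.to_lowercase()))
            }
            // Intent match: similarity at or above the rule's threshold (assumed >=).
            Matcher::Prompt { embedding, threshold, .. } => {
                message_embedding.map_or(false, |e| cosine(e, embedding) >= *threshold)
            }
        })
        .unwrap_or(default_rule)
}

/// System prompt for a Prompt response: persona prompt, blank line, rule prompt.
fn system_prompt(persona_prompt: &str, rule_prompt: &str) -> String {
    format!("{persona_prompt}\n\n{rule_prompt}")
}
```

Because the rule embeddings are stored at save time, matching a message costs one embedding call plus one dot product per Prompt rule.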
Persona safety queue
- Trigger: the admin persona handler (POST /admin/persona) computes sha256(active_prompt()) on save; if it differs from safety.checked_prompt_hash, it sets safety.status = Pending and sends a SafetyJob { tenant_id, prompt_hash } onto the SAFETY_QUEUE producer binding. Saves that don't change the active prompt skip enqueue.
- Consumer: #[event(queue)] in src/lib.rs dispatches to safety_queue::handle_batch. Each job re-reads the persona, drops the job if the prompt hash has drifted (a newer save has already enqueued), runs safety::classify_persona against the fast model, and writes Approved or Rejected { vague_reason } back to KV with checked_prompt_hash and checked_at.
- Classifier: system prompt enumerates Calculon Tech's content policy (no incitement, harassment, discrimination, sexualization of minors, self-harm, illegal-activity promotion, unconsented impersonation). The model returns strict JSON {"verdict":"approve"|"reject","category":"..."}. Categories are logged for abuse review but never echoed; the user-facing rejection text comes from a fixed mapping in safety::vague_reason_for so users can't iterate prompts against the classifier.
- Failure mode: classifier or KV failures call message.retry(); the queue's DLQ policy (3 retries, then concierge-safety-dlq) takes over. While the persona stays Pending, AI replies are blocked but canned default rules still send.
- Bindings: producer + consumer for concierge-safety, DLQ concierge-safety-dlq. Both queues must exist before deploy; see Deploy.
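The save, consume, and gate steps above form a small state machine keyed on the prompt hash. The sketch below models it without KV, queues, or SHA-256: hashes are passed in as strings, the classifier verdict is a boolean parameter, and the Safety struct, function names, and rejection text are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
enum SafetyStatus {
    Pending,
    Approved,
    Rejected { vague_reason: String },
}

#[derive(Debug, Clone, PartialEq)]
struct SafetyJob { tenant_id: String, prompt_hash: String }

/// Hypothetical in-memory stand-in for the safety fields stored with the persona.
struct Safety { status: SafetyStatus, checked_prompt_hash: Option<String> }

/// On save: enqueue a job only when the active prompt's hash has changed.
fn on_persona_save(safety: &mut Safety, tenant_id: &str, new_hash: &str) -> Option<SafetyJob> {
    if safety.checked_prompt_hash.as_deref() == Some(new_hash) {
        return None; // prompt unchanged since the last vetting, skip enqueue
    }
    safety.status = SafetyStatus::Pending;
    Some(SafetyJob { tenant_id: tenant_id.to_string(), prompt_hash: new_hash.to_string() })
}

/// Consumer: drop stale jobs, otherwise record the verdict. Returns true if applied.
/// `approve` stands in for the classifier's verdict.
fn handle_job(safety: &mut Safety, job: &SafetyJob, current_hash: &str, approve: bool) -> bool {
    if job.prompt_hash != current_hash {
        return false; // hash drifted: a newer save has already enqueued its own job
    }
    safety.status = if approve {
        SafetyStatus::Approved
    } else {
        // Placeholder text; the real mapping lives in safety::vague_reason_for.
        SafetyStatus::Rejected { vague_reason: "This persona can't be used.".to_string() }
    };
    safety.checked_prompt_hash = Some(current_hash.to_string());
    true
}

/// Gate for AI replies: approved, and vetted against the prompt currently active.
fn ai_reply_allowed(safety: &Safety, current_hash: &str) -> bool {
    safety.status == SafetyStatus::Approved
        && safety.checked_prompt_hash.as_deref() == Some(current_hash)
}
```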
Localization
- Locale model: every tenant carries a BCP-47 tag (Tenant.locale, e.g. en-IN, en-US) and an optional independent currency override. src/locale.rs::Locale bundles the two into a single value carried through templates and handlers, replacing the previous tangle of if currency == "INR" branches.
- Resolution chain (first hit wins): tenant-stored locale → Accept-Language header (parsed via the accept-language crate, intersected with the supported set) → cf-ipcountry mapping (IN → en-IN, default en-US) → hardcoded en-IN. Set once at signup; admin-overrideable from /admin/settings/currency.
- Number / currency formatting: helpers::format_count and helpers::format_money use icu::decimal::FixedDecimalFormatter (icu crate, compiled_data feature). en-IN renders 1,00,000 (lakh / crore grouping); en-US renders 100,000. INR shows whole rupees with the ₹ symbol; USD shows two decimals with $.
- Translation: fluent-bundle with FTL files at assets/locales/{tag}/messages.ftl, baked in at build time via include_str!. src/i18n.rs exposes a OnceLock-backed Translator and t(locale, key) sugar. Lookup falls back to the canonical en-IN bundle, then to the literal key (so a missed key is loud in the rendered HTML and caught by template tests).
- Adding a locale: drop a new FTL file under assets/locales/{tag}/, add the tag to Translator::new and Locale::from_request's match arms, and register it in locale::parse_supported. CLDR data for the new locale is shipped automatically via the compiled_data feature.
- Out of scope: AI-generated reply content stays English. Per-language persona prompts and a classifier model that handles target languages well are deferred; see the persona safety queue notes above.
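The resolution chain and the en-IN grouping rule can be illustrated with standard-library code only. This is not how the project does it: the text says parsing uses the accept-language crate and formatting uses icu's FixedDecimalFormatter, whereas the sketch below hand-rolls both. The function names are hypothetical, the Accept-Language parsing takes header order and ignores q-values, and the supported set is assumed to be just the two tags the text mentions.

```rust
/// Assumed supported set; the text only names these two tags.
const SUPPORTED: &[&str] = &["en-IN", "en-US"];

/// First supported tag in header order (simplification: q-values are ignored).
fn from_accept_language(header: &str) -> Option<&'static str> {
    header.split(',').find_map(|part| {
        let tag = part.split(';').next()?.trim();
        SUPPORTED.iter().copied().find(|s| s.eq_ignore_ascii_case(tag))
    })
}

/// cf-ipcountry mapping: IN -> en-IN, everything else -> en-US.
fn from_country(country: &str) -> &'static str {
    if country.eq_ignore_ascii_case("IN") { "en-IN" } else { "en-US" }
}

/// Resolution chain, first hit wins: stored -> Accept-Language -> country -> en-IN.
fn resolve_locale(
    stored: Option<&str>,
    accept_language: Option<&str>,
    cf_ipcountry: Option<&str>,
) -> String {
    if let Some(tag) = stored {
        return tag.to_string();
    }
    if let Some(tag) = accept_language.and_then(from_accept_language) {
        return tag.to_string();
    }
    if let Some(country) = cf_ipcountry {
        return from_country(country).to_string();
    }
    "en-IN".to_string()
}

/// Lakh / crore grouping: the last three digits, then groups of two.
fn group_en_in(n: u64) -> String {
    let digits = n.to_string();
    if digits.len() <= 3 {
        return digits;
    }
    let (head, tail) = digits.split_at(digits.len() - 3);
    let mut groups: Vec<&str> = Vec::new();
    let mut end = head.len();
    while end > 0 {
        let start = end.saturating_sub(2);
        groups.push(&head[start..end]);
        end = start;
    }
    groups.reverse();
    format!("{},{}", groups.join(","), tail)
}
```

For example, group_en_in(100000) yields 1,00,000 and group_en_in(12345678) yields 1,23,45,678, matching the en-IN rendering described above.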
Reply buffer (Durable Object)
- Class: ReplyBufferDO in src/durable_objects/reply_buffer.rs; binding REPLY_BUFFER.
- Keying: one DO instance per {tenant_id}:{channel}:{sender} conversation.
- Sliding window: each push appends to a pending list and resets the alarm to now + wait_seconds. Bursts collapse into one alarm fire.
now + wait_seconds. Bursts collapse into one alarm fire. - Drop-after-send: the alarm handler clears DO storage before calling the LLM. Bodies live in DO state for ≤ wait_seconds (5s default), then gone.
- Bypass: wait_seconds = 0 on the channel's ReplyConfig skips the buffer for instant replies.
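The sliding-window and drop-after-send behaviour can be modelled with a plain struct and an explicit clock. This is a simulation sketch, not the Durable Object: the ReplyBuffer type and its methods are hypothetical, time is passed in as seconds, and the alarm is a stored timestamp rather than a real DO alarm.

```rust
/// In-memory model of one conversation's buffer (one instance per
/// {tenant_id}:{channel}:{sender} in the real system).
struct ReplyBuffer {
    wait_seconds: u64,
    pending: Vec<String>,
    alarm_at: Option<u64>,
}

impl ReplyBuffer {
    fn new(wait_seconds: u64) -> Self {
        ReplyBuffer { wait_seconds, pending: Vec::new(), alarm_at: None }
    }

    /// Append a message body and push the alarm out to now + wait_seconds.
    fn push(&mut self, now: u64, body: &str) {
        self.pending.push(body.to_string());
        self.alarm_at = Some(now + self.wait_seconds);
    }

    /// Alarm handler: if the alarm is due, clear storage and hand back the batch.
    /// Storage is emptied before the caller would invoke the LLM (drop-after-send).
    fn fire_if_due(&mut self, now: u64) -> Option<Vec<String>> {
        match self.alarm_at {
            Some(at) if now >= at => {
                self.alarm_at = None;
                Some(std::mem::take(&mut self.pending))
            }
            _ => None,
        }
    }
}
```

With wait_seconds = 5, messages pushed at t=0 and t=3 produce a single batch at t=8, because the second push resets the alarm.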
Approval relay
- Discord: AI drafts post to the tenant's approval channel as embeds with Approve/Reject buttons. Button click triggers /discord/interactions → component handler → outbound send via the originating channel adapter.
- Conversation context: stored in KV at conv:{id} with 7-day TTL, holds the Discord message id and origin channel/sender so the reply routes back correctly.
- Email: approval-by-email digest sent at the tenant's configured cadence (default 15 min); links contain signed tokens for one-click approve/reject.
Lead capture forms
- Storage: LeadCaptureForm in KV at lead_form:{id}, indexed by tenant.
- Rendering: GET /lead/{id}/{slug} serves an iframe-friendly HTML form; CSP and allowed_origins restrict where it embeds.
- Submission: POST to the same path validates the phone number and triggers a WhatsApp message via the configured account, then logs to lead_form_submissions in D1.
Storage layout
D1 tables
- tenants: id, email (UNIQUE), facebook_id, plan, currency.
- messages: unified inbound/outbound metadata (channel, direction, sender, recipient, action_taken). No body content.
- whatsapp_messages, instagram_messages, email_messages, email_metrics, lead_form_submissions: channel-specific logs.
- tenant_billing: credit ledger as JSON (entries with optional expiry).
- payments: Razorpay event log for compliance.
- audit_log: management-action history.
KV keys
- session:*, csrf:*: auth cookies (TTL 7d).
- whatsapp:{id}, instagram:{id}, lead_form:{id}: per-resource configs. Channel records embed their own ReplyConfig (rules + default rule + wait_seconds).
- tenant:{tenant}:whatsapp:{id} etc.: per-tenant indexes (empty values; existence is the index).
- wa_phone:*, ig_page:*, email_domain:*: webhook → tenant reverse indexes.
- email_domains:{tenant}, email_rules:{tenant}:{domain}, email_reverse:*: email config + alias mapping.
- discord_guild:{guild_id}, discord_config:{tenant}: guild ↔ tenant.
- onboarding:{tenant}: wizard state. Holds the PersonaConfig (source variant + safety status) and default_wait_seconds applied to newly connected channels.
- conv:{id}: approval-relay conversation context (TTL 7d).
Auth
- Login: Google OAuth (/auth/callback) and Facebook Login (/auth/facebook/callback). Same tenant gets linked to both providers if their email matches.
- Session: 7-day HttpOnly cookie; CSRF via double-submit cookie checked on every POST/PUT/DELETE under /admin.
- Management panel: /manage/* protected by Cloudflare Access (verifies the Cf-Access-Jwt-Assertion header against the team's JWKS).
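A double-submit check of the kind described for /admin can be sketched as below. The function names are hypothetical, and where the submitted token comes from (a request header or a form field) is not stated in the text, so it is passed in as a plain parameter. The comparison is written to avoid early exit on the first differing byte.

```rust
/// Compare two byte strings without short-circuiting on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Double-submit check: state-changing requests under /admin must carry a
/// submitted token equal to the CSRF cookie. Other requests pass through.
fn csrf_ok(
    method: &str,
    path: &str,
    cookie_token: Option<&str>,
    submitted_token: Option<&str>,
) -> bool {
    let guarded = matches!(method, "POST" | "PUT" | "DELETE") && path.starts_with("/admin");
    if !guarded {
        return true;
    }
    match (cookie_token, submitted_token) {
        // An empty cookie value never validates.
        (Some(c), Some(s)) if !c.is_empty() => constant_time_eq(c.as_bytes(), s.as_bytes()),
        _ => false,
    }
}
```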
Outbound APIs Concierge calls
- Meta Graph API for WhatsApp + Instagram + Facebook Login.
- Discord REST API (discord.com/api/v10) for messages, channels, guild lookup.
- Razorpay API for orders, subscriptions, payment verification.
- Cloudflare Workers AI binding (no HTTP: direct binding call). Used for reply generation, prompt-injection scan, persona safety classification, and BGE embeddings.
- Cloudflare Queues binding (SAFETY_QUEUE) for fanning persona safety jobs to the queue consumer.
Limits and known constraints
- Discord DM auto-reply is unsupported with the shared bot: incoming DMs hit the events endpoint with no guild_id, so we can't attribute them to a tenant.
- WhatsApp has no message-history API; the reply buffer relies on its own DO state to reconstruct bursts.
- Cloudflare Email Service requires sender domains to be onboarded in the dashboard before sends from them succeed; new tenant subdomains may need manual onboarding until that step is automated.
- No per-message body storage. If you need a conversation web-view, that's a future feature requiring a schema change and ToS update.