Security architecture

The decisions, and the ones we rejected.

The email security page is the plain-English version. This is the engineering version, for people who want to see the actual calls. Each decision, why we made it, and the alternative we turned down. If you find a gap, tell me.

Decision 01 · Access

Read-only Gmail and Calendar scopes. Nothing else.

Ariadot requests gmail.readonly and calendar.readonly, the most restricted scopes Google offers for these products. The app can read, and that is the only verb it has. It cannot send, reply, delete, label, or modify anything, because the permission to do so was never requested and Google enforces that at the token level.

This is the single most important security property, because it bounds the blast radius of every other failure. A bug, a compromise, a rogue dependency: the worst any of them can do is read, since the write door does not exist.

Rejected: broader scopes for a "smart triage that archives for you" feature. The convenience was not worth turning a read-only app into one that could alter your inbox. We would rather surface a loop and let you act than hold the power to act ourselves.

Decision 02 · Identity

Mask personal data before any cloud model, deterministically.

Before any email text reaches a cloud model, every personal entity (names, emails, phone numbers, account references) is replaced with a realistic synthetic stand-in. The model reasons over the disguised text and never sees who you are. The real values are substituted back at our edge, after the model responds, using a per-user map that is never sent to any model.

The stand-ins are stable (the same real value maps to the same fake every time, so the model can still reason "this is the same company as last week") and single, indivisible tokens with no internal whitespace, so the reverse substitution is an exact literal swap that the model cannot split or reformat.

Rejected: a second LLM "cleanup" pass to catch whatever the masking missed. We tried it. A generative model asked to rewrite a document rewrites it: it paraphrased and merged two unrelated entities into a company that never existed, and the corruption compounded. We deleted it. Detection uses a model; substitution and reversal are deterministic code that cannot get creative. Full write-up here.

Decision 03 · Failure

Fail closed. Never send unmasked text on a guess.

Finding names needs a model (regex handles the patterned data; names have no pattern). That model can fail: time out, error, return garbage. The critical decision is what we do when it does.

We fail closed. If name-detection cannot run reliably, we do not send the text to a cloud model and hope it was clean. The email is held and retried on the next cycle. "Probably clean" is not a privacy posture, so we do not treat it as one. The cost is a delayed brief for that email while detection is unhealthy, which is the correct trade.

Rejected: "bias toward false negatives, degrade gracefully" (send it through and accept the occasional miss). Silently sending unmasked names to a model when detection failed is not a position we could defend, so we removed it.

Decision 04 · Keys

Envelope encryption, so one leaked key isn't game over.

Your Gmail connection token is stored encrypted with AES-256-GCM in a two-layer envelope. A master key (held as a secret) never encrypts your token directly. Instead, each user gets their own data key, and that encrypts the token, while the per-user key is itself wrapped by the master key.

The point of the two layers: a leak of the master key alone yields nothing. A leak of one user's data key yields one user's tokens. To compromise everyone, an attacker needs the master key and a dump of the per-user key table. It is the same pattern AWS KMS and Google Cloud KMS use, and it is deliberately more than a small app strictly needs.

Rejected: encrypting tokens directly under a single master key (simpler, one secret). It means one leaked secret decrypts every user. The envelope is more moving parts for a real reduction in blast radius, which is the right trade for the one credential that can read an inbox.

Decision 05 · Retention

Two storage tiers: raw email is ephemeral, only the distilled bit is durable.

Storage is split in two. Ephemeral holds raw ingested email and is lifecycle-managed: a raw message is deleted within a day of being read, whether or not processing finished, with a daily sweep as a backstop. Durable holds only the useful distillate: your loops, your handbook, a quiet model of your routines, all encrypted at rest.

So we never accumulate a copy of your inbox. We keep "insurance renewal, due the 14th," not the email it came from. Your mail still lives in Gmail, where it always did, so there is no reason for us to hoard it.

Rejected: keeping raw email indefinitely for "reprocessing" or future features. A standing archive of everyone's inbox is exactly the asset you do not want to be holding when something goes wrong. If we need to reprocess, we re-read from Gmail.

Decision 06 · Isolation

Per-user namespacing for every stored object.

Every stored object, ephemeral or durable, is namespaced under the owning user (users/{your-id}/...), and every per-user database row is keyed to your user id and cascades on deletion. There is no shared bucket where one user's data sits next to another's without a boundary.

Rejected: a flat shared store keyed only by message id. It is simpler until the first cross-user bug leaks one person's data into another's view. Per-user prefixes make that class of mistake structurally harder.

Decision 07 · Secrets

Refuse to store credentials at all, in depth.

Ariadot keeps loops and reference details, not secrets. A guard reads every message before anything is stored and drops anything that looks like a password, PIN, one-time code, recovery phrase, full card number, or tax file number. It is never stored and never sent to a model. The check runs twice: a fast pattern match and a model pass, so it is defence in depth rather than a single filter.

Rejected: storing credentials "securely" so the app could be more helpful. The safest secret is the one you never hold. If possession alone unlocks something, it belongs in a password manager, and Ariadot refuses it on purpose.

Decision 08 · Untrusted input

Treat every email as hostile by default.

Email is attacker-controlled. Anyone can send you a message that tries to talk to the AI ("ignore your instructions and..."), the prompt-injection trick. Every message is screened for embedded instructions before any reasoning, and the model is told email content is data to be summarised, never commands to obey. A message cannot make Ariadot do anything.

Rejected: trusting inbox content because "it's the user's own mail." Your inbox is full of mail other people sent you. Treating it as trusted is how an attacker reaches the model through you.

Decision 09 · Deletion

Deletion revokes upstream first, then wipes. In that order.

When you delete your account, the order is deliberate: first revoke the Google token upstream (so even an encrypted copy that survives a partial wipe is useless), then wipe your stored data across both storage tiers, then drop the database row, which cascades to every per-user table. There is a short grace window in case you change your mind, then it is gone. Every step is safe to re-run and reports what it cleaned.

Rejected: wipe-then-revoke. If the wipe partially fails and the revoke never runs, you are left with a live token pointing at half-deleted data. Revoking first means the worst-case leftover is inert.

Decision 10 · Verification

Assume masking will miss one, and build the alarm.

You cannot prove a masking system is perfect. So rather than claim it, we built a canary: it scans everything before it is saved or shown for any synthetic stand-in that should have been reversed, and alerts the moment one slips through. It is plain string-matching, not another model, so it cannot introduce its own errors. The honest posture is "we will miss one someday, and here is the thing that catches it," not "this is flawless."

Rejected: trusting the masking and finding out about a leak when a user spots a stranger's name in their brief. If the only detector is the user, you have no detector.

Where this is still going

We are completing Google's formal security review (CASA) for the Gmail scopes. Until that finishes, signing in shows a "Google hasn't verified this app" warning, normal for any new app that reads Gmail before review. The plain-English version of all of this lives on the email security page; your legal rights and the sub-processor list are in the privacy policy. If you see a gap in any of the above, I would genuinely rather hear it.