How do you build privacy-first AI for sensitive data?

Most AI demos use data that can't hurt anyone: public text, marketing copy, code. But the moment the input becomes personal — medical complaints, legal files, a diary, a dream — the question changes. It's no longer "what can AI do", but "what is AI allowed to do, and what must it deliberately not do".

I built a product that sits exactly on that edge: Dreamalizing, an AI that lets people explore their dreams. Dream content is about the most personal input there is. That's precisely why it's a sharp test case for a question every organization putting AI on sensitive data runs into: how do you keep AI useful and safe at the same time?

The problem

With sensitive data, two requirements pull against each other:

  • AI has to be useful — otherwise it's just an empty text field
  • AI must never pretend to know the truth — no false certainty, no diagnosis, no claims it can't back up

A naive product picks one side: either an assertive "oracle" that fakes certainty, or a harmless storage field with no value. Both are wrong. And then there's the hardest requirement: the data itself must not leak — not to third parties, not to a large AI platform, not into an audit log where it doesn't belong.

The solution: treat AI as a system, not a feature

Privacy-first AI doesn't come from one setting, but from a series of deliberate choices. These are the six I made in Dreamalizing — and they apply just as well to a healthcare, legal or HR context:

  1. Guided exploration instead of black-box output. The AI doesn't interpret; it asks targeted questions. The user stays the authority. That's not a UX choice but a safety choice: a model that asks questions has fewer chances to present nonsense as truth.

  2. Bounded AI with limits in the system. No therapy, no diagnosis, no medical claims. Those limits don't live in a disclaimer at the bottom, but in the prompts, the guardrails and the evaluation — enforceable, not cosmetic.

  3. Encrypted storage on our own infrastructure. Sensitive input is stored encrypted (AES-256-GCM, per-user key) on our own infrastructure — not at a large cloud vendor by default.

  4. Local inference. The language models run locally, enforced through configuration. So sensitive input doesn't go to an external AI service by default. That's the difference between "we promise privacy" and "technically it can't be otherwise".

  5. Opt-in, not opt-out. Pattern recognition across multiple inputs only happens when the user deliberately chooses it. The default is: as little as possible.

  6. Deletable and traceable. A deletion request actually wipes the data completely (GDPR), and the chain is auditable — you can show afterwards what happened to the data.
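
To make choices 4 and 5 concrete, here is a minimal sketch of what "enforced through configuration" and "off by default" could look like. It is not Dreamalizing's actual code: the PrivacyConfig class, its field names, the example URL and the validate function are all hypothetical, and the check assumes the model server listens on a loopback address.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from urllib.parse import urlparse


@dataclass(frozen=True)
class PrivacyConfig:
    # Hypothetical settings object; field names and defaults are illustrative.
    inference_url: str = "http://127.0.0.1:8080/v1"  # assumed local model server
    cross_entry_patterns: bool = False  # opt-in: stays off until the user enables it


def is_local_endpoint(url: str) -> bool:
    """Return True only if the URL's host is a loopback address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as external.
        return False


def validate(config: PrivacyConfig) -> PrivacyConfig:
    """Fail at startup rather than send sensitive text to a remote service."""
    if not is_local_endpoint(config.inference_url):
        raise ValueError(f"inference endpoint is not local: {config.inference_url}")
    return config
```

Calling validate(PrivacyConfig()) at startup passes, while a config pointing at a remote host raises before any user input is processed. A check like this only covers the application layer; blocking outbound traffic at the network level would be a stronger guarantee.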

Why this goes beyond a dream app

The dream is the extreme example here, but the principles are generic. Context as truth (RAG on your own sources), measurability (quality, cost, latency as metrics), governance (roles, logging, policies) and clear boundaries: that's exactly the approach I use in client work to bring AI to production.

The difference between an AI pilot that stalls and an AI system that runs is rarely in the model. It's in whether you take the sensitive data, the boundaries and the traceability as seriously as the smart answer.

Want to put AI to work on data that really matters?

Get in touch to discuss what this could mean for your organization.