Dictation is the most personal text you produce. It is your unfiltered voice — half-formed thoughts, client names, patient details, passwords read aloud, the email you have not decided to send yet. So it is worth asking a plain question before you let an app listen all day: where does your voice actually go?
The answer is not the same for every app, and it is not always what the marketing says. But it comes down to a single fork.
The one question that decides it
Whether your dictation is private comes down to one thing: where your audio is processed. There are only two answers, and they lead to completely different privacy stories.
Either the app transcribes on your own device, in which case your voice never leaves it, or it sends your audio over the internet to a company’s servers, in which case it does. Everything else (encryption, retention policies, privacy modes) is detail layered on top of that one fact. A cloud app can encrypt your audio beautifully and still be an app that sends your voice to someone else’s computer. The maker of an on-device app does not have to make any promises about handling your audio, because the company never receives it.
So when you evaluate any dictation tool, start there: local or cloud. The rest follows.
What cloud dictation actually sends
Cloud dictation, by definition, sends your audio off your device. But “your audio” is often not the only thing that travels. Here is what can leave your machine when you dictate into a cloud app:
| What gets sent | When | Why it matters |
|---|---|---|
| Your raw audio | Every dictation | Your actual voice leaves the device; may be stored and, in some apps, used to train models |
| The transcript | Every dictation | The text of everything you said, held on a server you do not control |
| Screen context / screenshots | If “context awareness” is on | Content of your active window — emails, code, records — can travel alongside the audio, sometimes to third-party APIs |
| A voiceprint | Depends on the service | Voice can be biometric data that uniquely identifies you, which raises the legal stakes under GDPR |
Privacy researchers describe three points where cloud dictation creates exposure: at capture, in transmission, and in storage. The audio is recorded, sent across the network, and then kept — often for weeks or months, sometimes shared with the cloud infrastructure providers behind the service, and in some cases used to improve the company’s models. None of that is necessarily malicious. It is simply what it means to do the work somewhere other than your own machine.
“Privacy Mode” is not the same as offline
This is the most common source of false comfort, so it is worth being precise. Many cloud dictation apps offer a “Privacy Mode,” and people reasonably assume it means their voice stays on their device. It does not.
In practice, Privacy Mode typically means zero-retention cloud processing: your audio is still sent over the internet to the provider’s servers and transcribed there. It simply is not stored afterward. That is a real, meaningful policy. But it is a policy, not an architecture. Your voice still leaves your machine and passes through someone else’s system, and you are trusting them to delete it as promised. Offline dictation is a different thing entirely: the audio never leaves the device, so there is nothing to retain, delete, or trust.
The distinction matters most exactly when privacy matters most. “We delete it after” is a weaker guarantee than “it never left.”
When it stops being abstract
Privacy is easy to wave off until you see what an app is actually doing in the background. In April 2026, an independent technical investigation documented the behavior of one popular cloud dictation app’s desktop client on macOS, with evidence from binary analysis and runtime logs.
The findings included system-wide keystroke interception, 1,688 app-focus and URL changes logged in 30 hours, accessibility-tree harvesting up to nine levels deep, and a 694 MB local database holding raw audio (198 MB), full transcripts, and the contents of text boxes up to 36,191 characters long. The app’s privacy policy described “audio Inputs” and “Usage Data” but did not disclose system-wide keystroke interception, always-on app and URL tracking, or screen-content reading.
The point is not that one company is uniquely bad. It is that once your dictation tool runs with broad system access and a network connection, the gap between what a policy says and what the software does is invisible to you. The only version of this story that cannot go wrong is the one where the data never leaves your machine in the first place.
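To put the investigation’s figures in proportion, the short calculation below converts them into rates. It uses only the numbers quoted above; the variable names are illustrative, and treating the logged events as evenly spread over the 30 hours is a simplifying assumption.

```python
# Figures quoted from the investigation described above.
focus_and_url_events = 1688   # app-focus and URL changes logged
observation_hours = 30        # length of the observation window
database_mb = 694             # size of the local database
raw_audio_mb = 198            # portion of the database that was raw audio

# Average logging rate, assuming events were spread evenly (a simplification).
events_per_hour = focus_and_url_events / observation_hours
seconds_between_events = observation_hours * 3600 / focus_and_url_events

# Share of the local database taken up by raw audio.
audio_share = raw_audio_mb / database_mb

print(f"{events_per_hour:.1f} events logged per hour")
print(f"one event every {seconds_between_events:.0f} seconds on average")
print(f"raw audio is {audio_share:.1%} of the database")
```

That works out to about 56 logged events per hour, or roughly one a minute, with raw audio making up a little over a quarter of the database.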
The compliance angle: HIPAA and GDPR
If you dictate anything regulated — clinical notes, legal work product, anything covered by privacy law — the architecture question becomes a compliance question.
Under HIPAA, any vendor that processes protected health information on behalf of a healthcare provider must sign a Business Associate Agreement (BAA). Apple and Google do not sign BAAs for their built-in dictation, which is why Siri and Google voice typing are not HIPAA-compliant for patient data out of the box. Cloud dictation vendors that want healthcare customers have to offer a BAA, encryption, access controls, and audit logs, and you have to verify all of it.
Under GDPR, voice can be treated as biometric data when it is processed to identify a person. Biometric data used that way is a special category under Article 9, which puts cloud voice services under stricter conditions for storage and processing.
On-device dictation takes a different route around both. If the audio never leaves the device, no business associate processes it, so there is no BAA to sign; there is no cross-border transfer to assess and no external processor to account for. As Mac-privacy and healthcare-IT writers put it, the simplest path to compliance for sensitive dictation is to not transmit the audio at all. (This is an architectural argument, not a certification — always confirm against your own obligations. If you work in one of these fields, our pages for therapists and for lawyers go deeper.)
What on-device dictation actually changes
On-device dictation removes the entire question of what gets sent, because nothing does. The model that turns your speech into text lives on your own machine, so the audio is captured, transcribed, and turned into text without ever touching a network.
The good versions go a step further in how they handle the audio even locally. In SnailText, for example, the audio buffer stays in memory (RAM) during a recording and is never written to disk — so it is not just kept off the network, it is not persisted at all. There is no keystroke logging, no screenshotting of your active window, and no app-and-URL tracking. There is nothing to retain because there is nothing collected.
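The memory-only pattern described above can be sketched in a few lines. This is an illustration of the idea, not SnailText’s actual code: the class name `InMemoryRecording` and the function `transcribe_locally` are hypothetical, and the transcriber here is a stand-in that only reports how much audio it received.

```python
def transcribe_locally(pcm: bytearray) -> str:
    """Hypothetical stand-in for an on-device speech model."""
    return f"<transcript of {len(pcm)} bytes of audio>"


class InMemoryRecording:
    """Accumulates audio chunks in RAM; nothing here opens a file or a socket."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def add_chunk(self, chunk: bytes) -> None:
        # Append the newly captured audio to the in-memory buffer.
        self._buffer.extend(chunk)

    def buffered_bytes(self) -> int:
        return len(self._buffer)

    def finish(self) -> str:
        try:
            # Hand the buffer straight to the local model; no copy goes to disk.
            return transcribe_locally(self._buffer)
        finally:
            # Drop the audio as soon as it has been turned into text.
            self._buffer.clear()


recording = InMemoryRecording()
recording.add_chunk(b"\x00\x01" * 4)   # pretend microphone data
recording.add_chunk(b"\x02" * 2)
text = recording.finish()
print(text)
print(recording.buffered_bytes())
```

The point of the design is that the only lasting output is the text. Note that `clear()` releases the buffer but does not promise the bytes are overwritten in memory, so a real implementation would need to handle that itself.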
That is the whole privacy story, and it is short by design: your voice goes from your microphone to the text field, and stops there.
How to check any dictation app yourself
You do not have to take any company’s word for it. Run a dictation tool through these questions and the privacy picture becomes clear fast:
- Where is the audio processed — on my device, or on the company’s servers? (If it needs an internet connection to transcribe, it is cloud.)
- Is the audio stored, and for how long? Is it ever used to train models?
- Does it read my screen or capture screenshots through a “context” feature?
- Does it log keystrokes or track which apps and URLs I use?
- If I handle regulated data, does the vendor sign a BAA, or does on-device processing remove the need?
- Is “Privacy Mode” offline, or just zero-retention cloud?
If the honest answer to “where is the audio processed” is “on my device,” most of the rest stops mattering. If it is “the cloud,” every other answer is a promise you are choosing to trust.
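The checklist can also be written down as a small data structure, which makes the fork explicit. This is a sketch under stated assumptions: the `Answers` fields and the `promises_to_trust` function are hypothetical names invented for illustration, and keeping screen reading and input logging on the list for on-device apps is a judgment call, since those behaviors run on your machine in either architecture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answers:
    """One app's answers to the checklist. Field names are illustrative."""
    on_device: bool                   # Where is the audio processed?
    audio_stored: bool = False        # Is the audio stored?
    used_for_training: bool = False   # Is it used to train models?
    reads_screen: bool = False        # Does a "context" feature read the screen?
    logs_input: bool = False          # Are keystrokes, apps, or URLs logged?
    signs_baa: bool = False           # Will the vendor sign a BAA?


def promises_to_trust(a: Answers) -> List[str]:
    """List the vendor claims you are relying on for a given set of answers."""
    promises: List[str] = []
    # These run on your machine in either architecture, so they stay relevant.
    if a.reads_screen:
        promises.append("screen content is handled as described")
    if a.logs_input:
        promises.append("input logs are handled as described")
    if a.on_device:
        return promises  # no audio leaves the device, so nothing else to trust
    # Cloud: every remaining answer is a promise, including the reassuring ones.
    promises.append("audio in transit and on the server is protected")
    if a.audio_stored:
        promises.append("stored audio is deleted on schedule")
    else:
        promises.append("audio really is discarded after transcription")
    if a.used_for_training:
        promises.append("training use is limited to what the policy states")
    else:
        promises.append("audio is not used for training")
    if a.signs_baa:
        promises.append("the BAA is honored")
    return promises


local_app = Answers(on_device=True)
cloud_app = Answers(on_device=False)
print(len(promises_to_trust(local_app)), len(promises_to_trust(cloud_app)))
```

Even a cloud app with the best possible answers still leaves you with a list of claims to take on trust, while a plain on-device app leaves the list empty.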
How SnailText handles this
SnailText is local by design. It runs Whisper and Parakeet on your own Mac or Windows machine, so your voice is transcribed on-device and never sent to a server. The audio buffer lives in RAM and is never written to disk. There is no keystroke logging, no screenshotting, and no app or URL tracking — the things that turn a dictation tool into a surveillance tool simply are not in the product.
That is what lets us give a straight answer to “is my dictation private”: yes, because your voice never leaves your machine. It is free to start, needs no account, and the model downloads once and then works offline — download SnailText and the audio stays on your device from the first word.
The short version
Whether your dictation is private comes down to where your audio is processed. Cloud apps send your voice — and sometimes your screen — to servers where it can be stored, shared, or used for training; “Privacy Mode” reduces retention but your audio still leaves the device. On-device apps process everything locally and send nothing, which also makes regulated work far simpler because data that never leaves your machine needs no agreement to protect. Ask one question of any dictation tool — local or cloud — and the privacy answer follows from there.