SnailText

Voice to text on Mac — dictation in any app, no cloud

Press a hotkey, talk, and the text lands at your cursor. Works in Slack, Notion, VS Code, Mail, anywhere you type. Audio stays on your Mac, and the model runs on Apple Silicon with Metal.

The short version

Voice to text on Mac means a macOS app that converts speech to text in any application, with the speech recognition model running locally on Apple Silicon. macOS ships with Apple Dictation built in — useful for short bursts inside Apple apps, but it auto-stops after 30 seconds of silence and integrates inconsistently with third-party apps. SnailText runs the Whisper speech model locally with Metal GPU acceleration — no silence cutoff, works in every text field via a global hotkey, audio never leaves your device.

Apple Dictation vs SnailText, structurally

macOS ships with built-in dictation. For short, casual use inside Notes or Messages it is fine. For sustained work it has structural limits.

| Feature | Apple Dictation | SnailText |
| --- | --- | --- |
| Recording length | Auto-stops after 30 seconds of silence (per Apple support docs) | Unlimited: runs until you press the hotkey again |
| Where it works | Native Apple apps and a subset of third-party apps that opt in | Any text field in any app: Slack, VS Code, Cursor, Telegram, terminals, web inputs |
| Model size | Compact Apple model, not user-selectable | Whisper Tiny through Large v3 (Parakeet TDT on Pro); pick your tradeoff |
| Custom vocabulary | Not user-editable beyond what Apple already knows | Dictionary for proper nouns plus snippets for boilerplate (Pro) |
| Hotkey | Fn-Fn or a single modifier by default; activation conflicts in many apps | Global Cmd+Shift+Space (configurable); no focus steal |
| Offline guarantee | "Enhanced Dictation" downloads offline model; default varies by macOS version | Always offline by design: no cloud option, no opt-out toggle to forget |

Apple's offering is best understood as a system convenience. SnailText is the tool you reach for when dictation is part of how you actually work.

Why Apple Dictation is not enough for daily voice to text

Apple Dictation works. It runs on-device on any Mac with an M1 chip or newer, the transcription is acceptable for short bursts, and it costs nothing. For a quick text message or a one-line search query, it does the job.

It stops being enough the moment you try to use it for real work. The first limit is the silence cutoff. Apple's own documentation states that Dictation has no hard duration timeout on Apple Silicon, but it auto-stops after 30 seconds of detected silence, and the natural pauses you take while thinking count toward that. Re-activating the hotkey two or three times in a single email becomes routine.

The second limit is accuracy on technical content. Apple Dictation handles clear, general speech well and is visibly worse on code, jargon, accented English, and domain-specific vocabulary. Third-party tools running Whisper-class models are typically more accurate on that material.

The third is the integration boundary. Apple Dictation works inside Apple apps and most native macOS text fields. It does not have a consistent flow across web apps, Electron apps, or terminals. You end up giving up on it in half the places you want to use it.

Apple Silicon dictation: why Whisper runs fast on M-series

The whisper.cpp engine, which powers most modern Mac dictation apps including ours, compiles with Apple Metal GPU acceleration by default on Apple Silicon. Metal is Apple's GPU API, and on M-series chips it sits directly on top of the unified memory pool. The model weights and the audio buffer live in the same physical memory as your application code — no memory copy between CPU and GPU.

That architectural detail is a large part of why M-series Macs run larger Whisper models faster than equivalent Intel hardware, often in real time or better. On Windows, the same model class typically requires a discrete NVIDIA GPU to reach comparable latency.

For per-chip latency numbers across M1 through M4 with Whisper Small / Medium / Large v3, see our dictation for Mac deep-dive — it cites third-party Metal benchmarks from Voicci, PromptQuorum, and DEV Community testing. SnailText also streams inference on closed phrases as you speak, so end-to-end wait at the cursor feels shorter than raw model-pass timing suggests.
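To make "closed phrases" concrete, here is a minimal Python sketch of one way to cut a live audio stream into phrases at pauses, so each phrase can go to the model while you keep talking. It illustrates the general idea and is not SnailText's actual segmenter. The function names, the RMS threshold, and the three-frame pause rule are all assumptions chosen for the example.

```python
from typing import Iterable, Iterator, List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (samples in -1.0..1.0)."""
    if not frame:
        return 0.0
    return (sum(s * s for s in frame) / len(frame)) ** 0.5


def closed_phrases(
    frames: Iterable[Sequence[float]],
    quiet_rms: float = 0.01,          # assumed threshold for "silence"
    quiet_frames_to_close: int = 3,   # assumed pause length that ends a phrase
) -> Iterator[List[Sequence[float]]]:
    """Yield each phrase as soon as a long enough pause closes it."""
    phrase: List[Sequence[float]] = []
    quiet_run = 0
    for frame in frames:
        if rms(frame) >= quiet_rms:
            phrase.append(frame)      # speech: extend the open phrase
            quiet_run = 0
        elif phrase:
            quiet_run += 1            # quiet frames are counted, not stored
            if quiet_run >= quiet_frames_to_close:
                yield phrase          # pause long enough: phrase is closed
                phrase = []
                quiet_run = 0
    if phrase:
        yield phrase                  # flush whatever is open at end of input
```

Each yielded phrase could be handed to the recognizer immediately, which is why the wait at the cursor can feel shorter than the time for one pass over the full recording.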

Voice to text on Mac for code, docs, and clinical work

The hotkey is the same in every app: Cmd+Shift+Space (configurable). Press once and recording starts. Press again and the transcribed text lands at your cursor. No menu, no toolbar, no focus change. See how it works for the full pipeline.
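The press-once, press-again behavior is a two-state toggle. The Python sketch below shows that logic in isolation. It is an illustration under stated assumptions and not SnailText's code: `DictationToggle`, `transcribe`, and `insert_text` are hypothetical names, and the real hotkey registration, audio capture, and text insertion are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DictationToggle:
    """Hypothetical toggle: first hotkey press records, second one types."""
    transcribe: Callable[[List[bytes]], str]   # assumed: audio chunks -> text
    insert_text: Callable[[str], None]         # assumed: types text at the cursor
    recording: bool = False
    _chunks: List[bytes] = field(default_factory=list)

    def on_hotkey(self) -> None:
        if not self.recording:
            # First press: start a fresh in-memory buffer.
            self._chunks = []
            self.recording = True
            return
        # Second press: stop, transcribe, then drop the audio.
        self.recording = False
        text = self.transcribe(self._chunks)
        self._chunks = []
        if text:
            self.insert_text(text)

    def on_audio(self, chunk: bytes) -> None:
        # Audio arriving while idle is ignored.
        if self.recording:
            self._chunks.append(chunk)
```

Keeping the buffer as a plain in-memory list and clearing it right after transcription mirrors the "audio stays in RAM and is discarded" behavior described on this page.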

Custom dictionary (Pro) handles the words Whisper does not know yet — your stack names, your colleagues' names, jurisdiction-specific legal terms, DSM codes for clinicians. Add a term once and SnailText replaces the misheard version on the way to the text field. For audience-specific framing see developers, lawyers, and therapists.
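As a rough picture of what "replaces the misheard version" can mean, here is a small Python sketch of a post-transcription replacement pass. It is an assumption about how such a step might look and not SnailText's implementation. The function name `apply_dictionary` and the sample entries are invented for the example.

```python
import re
from typing import Dict


def apply_dictionary(text: str, dictionary: Dict[str, str]) -> str:
    """Replace misheard phrases with the preferred spelling.

    Matching is case-insensitive and respects word boundaries, so an
    entry for "cube control" does not fire inside "cube controls".
    """
    if not dictionary:
        return text
    # Longest entries first, so multi-word phrases win over shorter ones.
    keys = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in dictionary.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)


# Hypothetical entries: what the model tends to hear -> what you meant.
terms = {"cube control": "kubectl", "post gress": "Postgres"}
print(apply_dictionary("Run cube control against Post Gress", terms))
```

Running the replacement on the transcript rather than inside the model keeps the step simple: the speech model stays untouched, and a term added once applies to every later dictation.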

Audio never leaves your Mac. The buffer stays in RAM during recording and is discarded the moment the text is ready. Verifiable in Little Snitch or Lulu — no outbound traffic during dictation. For the architectural argument see offline dictation. On Windows? See voice to text on Windows.

Frequently asked questions

Does this work on Intel Macs?

Technically yes, in degraded form. The whisper.cpp engine works on Intel CPUs, but inference without Metal acceleration is significantly slower. Real-time dictation with Whisper Small is borderline acceptable on a high-end Intel iMac from 2019 or 2020. We recommend Apple Silicon (M1 or later) for the actual experience.

How is this different from Apple Dictation?

Apple Dictation is built into macOS, runs on-device on Apple Silicon, and is free. Apple's docs state there is no hard duration timeout, but Dictation auto-stops after 30 seconds of silence, and pauses for thought count. There is also little extensibility: no user-editable vocabulary and no snippets. SnailText runs larger Whisper-class models, has no silence cutoff, supports custom vocabulary and snippets (Pro), and works through a unified hotkey across all apps including Slack, browser-based tools, and terminals.

Do you upload my audio anywhere?

No. Local Whisper runs in our app on your Mac. The audio buffer stays in RAM during a recording session and is not written to disk. We do not upload audio to any server in any mode, free or paid.

Does it use the Neural Engine?

No, and that's fine. whisper.cpp runs on Metal, Apple's GPU graphics and compute API. The Neural Engine is a separate accelerator that apps cannot program directly; they reach it only through Apple's Core ML framework, which decides what runs there. There is no public ggml backend for it as of 2026. The Metal path on M-series is fast enough that the absence of an ANE backend does not matter for dictation latency.

What about HIPAA, GDPR, regulated industries?

The simplest path to compliance for voice dictation is to not transmit the audio anywhere. Local Whisper does exactly that. Because no third party receives the audio, the dictation step itself involves no vendor to sign a Business Associate Agreement or Data Processing Agreement with, and no cross-border transfer to assess. Data that never leaves your device is the easiest data to keep compliant.

How accurate is voice to text on Mac with SnailText?

Accuracy depends on which Whisper model you pick, your microphone quality, and your speech pattern. The free tier ships with Whisper Base, which handles everyday English dictation cleanly. Pro adds Whisper Medium and Large v3 for better technical jargon, accented English, and noisy environments. We are publishing per-model methodology separately.

Voice to text on Mac. Local. Free to start.

Download for macOS 12 or later. Apple Silicon recommended. No account needed.