Download Web Installer (659 KB), Windows 10/11 64-bit, downloads ~1.7 GB of dependencies during setup

Whisper Dictation by Firsh

Press a hotkey, speak, and your words appear at the cursor. Free and offline after setup.

What it does

Why

I started out with Wispr Flow, then tried the then-free alternative, WhisperTyping. However, these tools became paid, and I was really fed up with them failing and losing some dictations, and with the fact that my voice was sent to the cloud and transcribed there. Even if inference is faster with a provider, the latency and the privacy concerns made me want to write a local version without any fancy graphical interface. Basically, I wanted the most minimal and performant thing that just gets the job done. I have used it extensively over a month with Claude Code, writing this tool with itself and polishing it with this mindset. It became my daily driver because it's really simple but effective.

System requirements

Installation

  1. Extract the ZIP to any folder (this is portable)
  2. Right-click install.ps1 and choose Run with PowerShell (the executable also runs this for you)
  3. The installer detects your GPU, installs aria2c for fast downloads, procures the right Whisper.cpp build, FFmpeg, and the AI model, asks for your primary and secondary language (ISO codes, e.g. en, hu, de), detects your microphone, and writes the config file for you
  4. Double-click firsh-whisper-dictation.exe to start (it is a compiled AutoHotkey script)
  5. If you like it, set Run at Startup from the tray icon (copies a shortcut to your shell:startup folder)

How to use

  1. Press AppsKey (the context-menu key, between right Alt and right Ctrl)
  2. Speak: the tray icon shows Listening... while recording
  3. Press AppsKey again to stop
  4. The transcription is pasted at your cursor and also copied to the clipboard

Hotkeys

All of these are toggles, not push-to-talk:

Key            Action
AppsKey        Record in primary language / stop
Ctrl+AppsKey   Record in secondary language / stop
Win+AppsKey    Retry last recording with auto-detect language
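
To make the "toggle, not push-to-talk" behavior concrete, here is a minimal Python sketch of the idea. It is not the tool's actual AutoHotkey code: the class name, the return strings, and the choice that either key stops a running recording in the language it was started with are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DictationToggle:
    """Hypothetical model of the hotkey toggle; not the real implementation."""
    primary: str = "en"      # example ISO code, as chosen during setup
    secondary: str = "hu"    # example ISO code, as chosen during setup
    recording_lang: Optional[str] = None  # None means idle

    def press(self, use_secondary: bool = False) -> str:
        # Idle: the key press starts a recording in the requested language.
        if self.recording_lang is None:
            self.recording_lang = self.secondary if use_secondary else self.primary
            return f"start:{self.recording_lang}"
        # Recording: any press stops it (assumption), keeping the start language.
        lang, self.recording_lang = self.recording_lang, None
        return f"stop:{lang}"


toggle = DictationToggle()
print(toggle.press())                    # AppsKey: start in primary language
print(toggle.press())                    # AppsKey again: stop
print(toggle.press(use_secondary=True))  # Ctrl+AppsKey: start in secondary language
```

The point of the sketch is that there is no key-release event involved: one press starts, the next press stops.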

Note:

Batch transcription

  1. Right-click the tray icon and choose Show Transcribe Window
  2. Drag and drop audio files (MP3, WAV, M4A, OGG, FLAC) onto it
  3. Wait a bit. Each file is transcribed and saved as a .txt next to the original.

Note: Text is automatically split into paragraphs deterministically, but additional processing by an LLM may be required for best results.
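
If you want to find the transcripts from a script of your own, the "saved as a .txt next to the original" rule from step 3 can be sketched like this in Python. The function name is hypothetical, and both the assumption that only the extension changes and the assumption that other extensions are rejected are mine, not confirmed behavior of the tool.

```python
from pathlib import PureWindowsPath

# Formats listed in the drag-and-drop step above.
SUPPORTED = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}


def transcript_path(audio_file: str) -> str:
    """Return the assumed .txt path next to the given audio file."""
    path = PureWindowsPath(audio_file)
    if path.suffix.lower() not in SUPPORTED:
        raise ValueError(f"unsupported audio format: {path.suffix}")
    # Same folder, same base name, only the extension is swapped.
    return str(path.with_suffix(".txt"))


print(transcript_path(r"C:\audio\memo.m4a"))
```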

Configuration

Editing config.ini is optional, as it is auto-created on first run:

Use Reload Script from the tray menu after making changes.

Archive recordings

The tray menu's Archive Recordings converts old WAV recordings to 32 kbps Opus (16 kHz mono). This saves roughly 90% of the disk space, making the recordings very cheap to store. Files are packed into monthly archives (RAR with recovery record, or ZIP as fallback). Transcription text files are kept as-is. You can also delete the recordings at any time; the Open Recordings Folder tray menu option takes you there.
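
As a rough sanity check on the "roughly 90%" figure, the Python sketch below compares bitrates. It assumes the original recordings are uncompressed 16-bit PCM WAV at 16 kHz mono (the README does not state the WAV format) and ignores container overhead. The FFmpeg argument list uses standard FFmpeg options, but it is only an illustration of an equivalent conversion, not the command the tool actually runs; the function names are hypothetical.

```python
def pcm_wav_kbps(sample_rate_hz: int = 16_000, bit_depth: int = 16, channels: int = 1) -> float:
    """Bitrate of uncompressed PCM audio in kbit/s (file header ignored)."""
    return sample_rate_hz * bit_depth * channels / 1000


def space_saved(opus_kbps: float = 32.0) -> float:
    """Fraction of disk space saved versus the assumed 16-bit/16 kHz mono WAV."""
    return 1 - opus_kbps / pcm_wav_kbps()


def ffmpeg_opus_args(src: str, dst: str) -> list[str]:
    """Illustrative FFmpeg arguments for a 32 kbps, 16 kHz, mono Opus encode."""
    return ["ffmpeg", "-i", src,
            "-c:a", "libopus",  # Opus encoder
            "-b:a", "32k",      # target audio bitrate
            "-ar", "16000",     # sample rate in Hz
            "-ac", "1",         # mono
            dst]


print(f"{pcm_wav_kbps():.0f} kbps WAV -> 32 kbps Opus saves {space_saved():.1%}")
# prints "256 kbps WAV -> 32 kbps Opus saves 87.5%"
```

Under these assumptions the saving is 87.5%, which is in line with "roughly 90%". If the source WAVs had a higher sample rate or were stereo, the saving would be larger.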

Troubleshooting

Support

Feedback? Chat with me on bsky: @firsh.dev

Enjoying my Whisper Dictation tool? If you find it useful, buy me a tea 🍵 via Stripe or support directly using crypto. Completely optional, but appreciated!

Credits

This project stands on the shoulders of excellent open-source work: