VoiceToTex

Local, offline voice-to-text for Linux — powered by Whisper Large V3.

Press a hotkey, speak, release. Your words appear as text instantly — no cloud, no latency, no subscription.

Features

Fully offline — Whisper runs locally on your GPU. Nothing leaves your machine.
Global hotkey — Hold Ctrl+Super+Space to talk, release to transcribe. Or toggle mode.
20+ languages — Auto-detect or lock to Vietnamese, English, Japanese, and more.
Dark & Light themes — Switch instantly in Settings → Appearance.
Real-time waveform — Live audio visualization while recording.
Transcript history — Search, edit, tag, and export all your transcriptions.
Audio ducking — Automatically mutes other apps while you record.
Sound feedback — Chirps on record start/stop, chime on transcript ready.
Floating overlay — Compact always-on-top indicator during recording.
Smart text injection — Paste, type (xdotool), or copy-to-clipboard output modes.
Dashboard & gamification — Track usage stats and earn milestones.
Session analysis — AI-powered insights on your transcription patterns.

Screenshots

Dark Theme

Light Theme

Install

Pre-built .deb (Ubuntu/Debian)

Download the latest .deb from Releases, then:

sudo dpkg -i voicetotex_1.1.0_amd64.deb

From source

git clone https://github.com/riftzen-bit/voicetotex.git
cd voicetotex
bash scripts/setup.sh
bash scripts/start.sh

Manual setup

python3 -m venv venv && source venv/bin/activate
pip install -r backend/requirements.txt
npm install
npm start

Usage

Shortcut	Action
`Ctrl+Super+Space`	Hold to record, release to transcribe
Mic button	Click or hold (follows hotkey mode setting)

Output modes: Paste (clipboard → Ctrl+V), Type (xdotool keystrokes), or Copy-only.

Configuration

All settings are in the Settings tab:

Section	Options
Appearance	Dark / Light theme
Recognition	Model size, language, beam size, initial prompt
Input	Audio device, VAD sensitivity, noise reduction
Output	Output mode (paste/type/copy), hotkey, hotkey mode
Advanced	WebSocket port, max recording duration

Config is stored at ~/.config/voicetotex/config.json.

Tech Stack

Layer	Technology
Desktop shell	Electron 35
Transcription	faster-whisper (CTranslate2) on CUDA/CPU
Backend	Python 3.11+ WebSocket server
IPC	WebSocket (real-time audio levels + state)
Audio	PulseAudio/PipeWire via sounddevice

Architecture

┌─────────────────────────────────────────────┐
│  Electron Main Process                      │
│  ├── BrowserWindow (renderer)               │
│  ├── System Tray                            │
│  └── Floating Overlay                       │
└──────────────┬──────────────────────────────┘
               │ spawns
┌──────────────▼──────────────────────────────┐
│  Python Backend (WebSocket server)          │
│  ├── AudioCapture (sounddevice)             │
│  ├── Transcriber (faster-whisper)           │
│  ├── AudioDucker (pactl)                    │
│  ├── TextInjector (xdotool/xclip/xsel)     │
│  ├── HotkeyListener (pynput)               │
│  └── TranscriptionHistory (JSON)            │
└─────────────────────────────────────────────┘

Project Structure

voicetotex/
├── electron/          # Main process, preload, tray, overlay
├── src/
│   ├── css/           # Styles (dark/light themes)
│   ├── js/            # App logic, waveform, sounds, websocket
│   └── index.html     # Main renderer
├── backend/           # Python server, transcriber, audio
├── scripts/           # setup.sh, start.sh
├── build/             # Packaging scripts
├── docs/screenshots/  # App screenshots
└── package.json

Requirements

OS: Linux (Ubuntu 22.04+ recommended)
GPU: NVIDIA with CUDA support (recommended for real-time transcription)
CPU: Works on CPU too (set device to cpu in settings, slower)
Python: 3.11+
Node.js: 18+

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github		.github
assets		assets
backend		backend
build		build
docs		docs
electron		electron
scripts		scripts
src		src
.gitignore		.gitignore
.nvmrc		.nvmrc
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
electron-builder.yml		electron-builder.yml
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VoiceToTex

Features

Screenshots

Dark Theme

Light Theme

Install

Pre-built .deb (Ubuntu/Debian)

From source

Manual setup

Usage

Configuration

Tech Stack

Architecture

Project Structure

Requirements

License

Acknowledgments

About

Uh oh!

Releases 6

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

VoiceToTex

Features

Screenshots

Dark Theme

Light Theme

Install

Pre-built .deb (Ubuntu/Debian)

From source

Manual setup

Usage

Configuration

Tech Stack

Architecture

Project Structure

Requirements

License

Acknowledgments

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 6

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages