Overview

The cactus transcribe command provides speech-to-text using on-device models. It supports both live microphone input and audio file transcription.

Syntax

cactus transcribe [model] [flags]

Arguments

  • [model] - Optional transcription model name. Default: parakeet-1.1b

Flags

--file

Transcribe an audio file instead of live microphone input:
cactus transcribe --file <path/to/audio.wav>
Supported formats: WAV, MP3, FLAC, OGG
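Because only the four formats above are listed as supported, it can help to check a file before handing it to cactus. The following is a minimal sketch, not part of cactus itself: the helper names is_supported_audio and transcribe_file are hypothetical, and the check only looks at the file extension, not the actual audio encoding.

```shell
# Hypothetical helper: succeeds only if the file extension is one of the
# formats listed above (WAV, MP3, FLAC, OGG), compared case-insensitively.
is_supported_audio() {
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    wav|mp3|flac|ogg) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: verify the path exists and has a listed extension,
# then run the documented command on it.
transcribe_file() {
  if [ ! -f "$1" ]; then
    echo "error: no such file: $1" >&2
    return 1
  fi
  if ! is_supported_audio "$1"; then
    echo "error: unsupported format (expected WAV, MP3, FLAC, or OGG): $1" >&2
    return 1
  fi
  cactus transcribe --file "$1"
}
```

With these functions defined in your shell, transcribe_file interview.flac would run the command, while transcribe_file notes.txt would stop with an error before cactus is started.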

--precision

Set the quantization precision level:
cactus transcribe --precision INT4|INT8|FP16
Default: INT4
Options:
  • INT4 - 4-bit quantization (smallest size, fastest)
  • INT8 - 8-bit quantization (balanced)
  • FP16 - 16-bit floating point (highest quality)
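In scripts, the precision value often comes from a variable or user input. The sketch below normalizes it to one of the three documented spellings before use. The function name normalize_precision is hypothetical, and whether cactus itself accepts lowercase values is not stated here, so the sketch upper-cases the value to be safe.

```shell
# Hypothetical helper: print the precision in its documented spelling
# (INT4, INT8, or FP16). An empty or missing argument falls back to the
# documented default, INT4. Anything else is rejected.
normalize_precision() {
  p=$(printf '%s' "${1:-INT4}" | tr '[:lower:]' '[:upper:]')
  case "$p" in
    INT4|INT8|FP16)
      printf '%s\n' "$p"
      ;;
    *)
      echo "error: precision must be INT4, INT8, or FP16 (got '$1')" >&2
      return 1
      ;;
  esac
}

# Usage (not run here):
#   precision=$(normalize_precision fp16) && cactus transcribe --precision "$precision"
```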

--token

Provide a HuggingFace API token for gated models:
cactus transcribe <model> --token <your-hf-token>
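To avoid typing the token into every command (and into your shell history), one option is to keep it in an environment variable and pass it through. This is a sketch under assumptions: the wrapper name transcribe_gated is hypothetical, and HF_TOKEN is simply the variable name chosen for this example. Nothing here implies that cactus reads that variable on its own, so the wrapper passes it explicitly with --token.

```shell
# Hypothetical wrapper: run a model with the token taken from the
# HF_TOKEN environment variable (a name assumed for this example).
transcribe_gated() {
  model=${1:-}
  if [ -z "$model" ]; then
    echo "usage: transcribe_gated <model>" >&2
    return 2
  fi
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "error: HF_TOKEN is not set" >&2
    return 1
  fi
  cactus transcribe "$model" --token "$HF_TOKEN"
}

# Usage (not run here):
#   export HF_TOKEN=<your-hf-token>
#   transcribe_gated <model>
```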

--reconvert

Force reconversion of the model from source weights:
cactus transcribe --reconvert

Examples

# Start live transcription with default model
cactus transcribe

Live Microphone Mode

When run without --file, transcription starts from your default microphone:
┌─────────────────────────────────────────────┐
│ Cactus Transcribe - parakeet-1.1b           │
│ Listening... (press Ctrl+C to stop)        │
└─────────────────────────────────────────────┘

[00:00:03] Hello, this is a test of the
           transcription system.

[00:00:08] It's working really well and
           capturing my speech accurately.

File Transcription Mode

With --file, cactus processes the audio file and then prints the transcript:
cactus transcribe --file podcast-episode.mp3
┌─────────────────────────────────────────────┐
│ Transcribing: podcast-episode.mp3           │
│ Duration: 45:32                             │
└─────────────────────────────────────────────┘

Processing... ████████████████████ 100%

[Transcript]

Welcome to today's episode. We're going to
discuss the latest developments in...
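File mode handles one file per invocation as documented, so transcribing a whole folder means looping in the shell. The following is a minimal sketch: the function name transcribe_dir is hypothetical, and it selects files purely by the lowercase extensions of the supported formats.

```shell
# Hypothetical helper: transcribe every WAV, MP3, FLAC, and OGG file in a
# directory (default: current directory), one cactus run per file.
# Returns non-zero if any run failed. Variables are global, as in plain sh.
transcribe_dir() {
  dir=${1:-.}
  status=0
  for f in "$dir"/*.wav "$dir"/*.mp3 "$dir"/*.flac "$dir"/*.ogg; do
    # An unmatched glob stays as literal text; skip anything that is not a file.
    [ -f "$f" ] || continue
    echo "== $f"
    if ! cactus transcribe --file "$f"; then
      echo "warning: transcription failed for $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling transcribe_dir ./recordings would then process each matching file in turn and keep going if one of them fails.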

Default Model

The default transcription model is parakeet-1.1b, which provides:
  • Fast real-time transcription
  • Good accuracy for general speech
  • Low memory footprint (~300 MB at INT4 precision)

Available Models

Supported speech-to-text models:
  • parakeet-1.1b (default)
  • moonshine-base
  • moonshine-tiny
See the Model Library for full specifications.

See Also

  • Download Command - Pre-download transcription models
  • Test Command - Run STT benchmarks