Overview
The cactus run command opens an interactive playground for any supported model. If the model isn’t already downloaded, Cactus will automatically fetch it from HuggingFace.
Syntax
Arguments
<model> - Model name (e.g., qwen-2.5-1.5b, llama-3.2-1b, phi-4)
Flags
--precision
Set the quantization precision level (for example, INT4).
Options:
- INT4 - 4-bit quantization (smallest size, fastest)
- INT8 - 8-bit quantization (balanced)
- FP16 - 16-bit floating point (highest quality)
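To make the size trade-off between these options concrete, here is a rough back-of-the-envelope sketch. It assumes that weight storage is simply the parameter count times the bits per weight, and it ignores quantization metadata, tokenizer files, and any layers kept at higher precision, so real files on disk will differ. The function name and the nominal 1.5 billion parameter count are illustrative and not part of Cactus.

```python
# Rough estimate of raw weight storage per precision level.
# Assumption: size = parameters * bits per weight; real model files also
# carry metadata and may keep some tensors at higher precision.
BITS_PER_WEIGHT = {"INT4": 4, "INT8": 8, "FP16": 16}


def estimated_weight_size_gb(num_params: float, precision: str) -> float:
    """Return the approximate weight size in decimal gigabytes."""
    bits = BITS_PER_WEIGHT[precision]  # KeyError for unknown precision names
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    # Nominal parameter count for a "1.5b" model (illustrative only).
    for precision in BITS_PER_WEIGHT:
        size = estimated_weight_size_gb(1.5e9, precision)
        print(f"{precision}: ~{size:.2f} GB")
```

Under these assumptions each step up in precision doubles the storage needed, which is why INT4 is listed as the smallest option and FP16 as the largest.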
--token
Provide a HuggingFace API token for gated models.
--reconvert
Force reconversion of the model from source weights.
Examples
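The flags described above can be combined in one invocation. The sketch below assembles such command lines from Python without executing them. It assumes the invocation form `cactus run <model> [flags]`, with flag values passed as separate arguments after the model name. That form is inferred from the flag descriptions and is not confirmed here. The helper `build_run_command` and the placeholder token `hf_xxx` are hypothetical.

```python
import shlex
from typing import List, Optional

# Precision names as listed under --precision.
VALID_PRECISIONS = ("INT4", "INT8", "FP16")


def build_run_command(
    model: str,
    precision: Optional[str] = None,
    token: Optional[str] = None,
    reconvert: bool = False,
) -> List[str]:
    """Build an argument list for `cactus run` (assumed layout, not executed)."""
    argv = ["cactus", "run", model]
    if precision is not None:
        if precision not in VALID_PRECISIONS:
            raise ValueError(f"unknown precision: {precision!r}")
        argv += ["--precision", precision]
    if token is not None:
        argv += ["--token", token]
    if reconvert:
        argv.append("--reconvert")
    return argv


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(shlex.join(build_run_command("qwen-2.5-1.5b", precision="INT8")))
    print(shlex.join(build_run_command("llama-3.2-1b", token="hf_xxx", reconvert=True)))
```

Building an argument list rather than a single string keeps values such as tokens from being split or interpreted by the shell if the list is later passed to `subprocess.run`.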
Interactive Playground
Once the model loads, you’ll enter an interactive chat interface.
Model Auto-Download
If the model isn’t cached locally, cactus run will:
- Download the model from HuggingFace
- Convert it to Cactus format with the specified precision
- Cache it in ./weights for future use
- Launch the interactive playground
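The steps above amount to a cache check followed by an optional download-and-convert stage. The following sketch only models that decision and returns the planned steps as text. It is not how Cactus is implemented. In particular, the function `plan_run` and the one-directory-per-model layout under the weights directory are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def plan_run(model: str, weights_dir: Path, precision: str) -> List[str]:
    """Return the steps a run would take, based on whether the model is cached.

    Assumption: a cached model appears as weights_dir/<model>; the actual
    on-disk layout used by Cactus may differ.
    """
    steps: List[str] = []
    if not (weights_dir / model).exists():
        steps.append(f"download {model} from HuggingFace")
        steps.append(f"convert to Cactus format at {precision}")
        steps.append(f"cache in {weights_dir}")
    steps.append("launch interactive playground")
    return steps


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        weights = Path(tmp) / "weights"
        weights.mkdir()
        # First run: nothing cached, so all four steps are planned.
        print(plan_run("phi-4", weights, "INT4"))
        # Simulate a cached model, then plan again: only the launch remains.
        (weights / "phi-4").mkdir()
        print(plan_run("phi-4", weights, "INT4"))
```

The point of the sketch is that the download and conversion cost is paid once per model; later runs that find the cached copy go straight to the playground.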
See Also
Download Command
Pre-download models without running them
Model Library
Browse all supported models