cactus download

Overview

The cactus download command fetches models from HuggingFace and converts them to Cactus format. Models are cached in ./weights for offline use.

Syntax

cactus download <model> [flags]

Arguments

<model> - Model name or HuggingFace repository

Model Naming Conventions

Cactus supports several model name formats:

Short Names

cactus download qwen-2.5-1.5b
cactus download llama-3.2-1b
cactus download phi-4

HuggingFace Repository Format

cactus download Qwen/Qwen2.5-1.5B-Instruct
cactus download meta-llama/Llama-3.2-1B-Instruct

Flags

--precision

Set the quantization precision level:

cactus download <model> --precision INT4|INT8|FP16

Default: INT4

Options:

INT4 - 4-bit quantization (smallest size, ~1-2GB per model)
INT8 - 8-bit quantization (medium size, ~2-4GB per model)
FP16 - 16-bit floating point (largest size, ~4-8GB per model)

--token

Provide a HuggingFace API token for authentication:

cactus download <model> --token <your-hf-token>

Required for:

Gated models (Llama, Gemma)
Private repositories
Rate-limited downloads
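To keep the token out of shell history, it can be read from an environment variable and passed through --token. The sketch below is a hypothetical wrapper: the function name download_gated and the variable name HF_TOKEN are assumptions for illustration, and only the --token flag itself comes from this page.

```shell
# Hypothetical helper: pass a HuggingFace token from the environment
# instead of typing it on the command line.
# HF_TOKEN is an assumed variable name; cactus is only documented to take --token.
download_gated() {
    model="$1"
    if [ -z "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is not set; export it before downloading gated models" >&2
        return 1
    fi
    cactus download "$model" --token "$HF_TOKEN"
}
```

Usage, assuming the token was exported earlier in the session: download_gated meta-llama/Llama-3.2-1B-Instruct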

--reconvert

Force reconversion from source weights:

cactus download <model> --reconvert

Useful when:

Model format has been updated
Previous conversion was incomplete
Switching between precision levels

Examples

# Download Qwen with default INT4 precision
cactus download qwen-2.5-1.5b
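When several models are needed at the same precision, the download can be scripted. This is a minimal sketch using only the documented --precision flag; the function name download_all is hypothetical, and stopping at the first failure is a design choice of the sketch, not documented cactus behavior.

```shell
# Hypothetical helper: download several models at one precision.
# First argument is the precision (INT4, INT8, or FP16); the rest are model names.
download_all() {
    precision="$1"
    shift
    for model in "$@"; do
        # Stop at the first failed download so the error is not buried.
        cactus download "$model" --precision "$precision" || return 1
    done
}
```

Usage: download_all INT8 qwen-2.5-1.5b llama-3.2-1b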

Download Progress

The command shows real-time download and conversion progress:

┌─────────────────────────────────────────────┐
│ Downloading: qwen-2.5-1.5b                  │
│ Precision: INT4                             │
└─────────────────────────────────────────────┘

Fetching from HuggingFace...
model.safetensors ████████████████ 100% 1.2GB
tokenizer.json    ████████████████ 100% 2.1MB
config.json       ████████████████ 100% 1.8KB

Converting to Cactus format...
Quantizing to INT4 ████████████████ 100%

✓ Model downloaded to ./weights/qwen-2.5-1.5b-int4

Cache Location

All downloaded models are stored in:

./weights/
├── qwen-2.5-1.5b-int4/
├── llama-3.2-1b-fp16/
├── phi-4-int8/
└── parakeet-1.1b-int4/

Each model directory contains:

Quantized weights
Tokenizer files
Model configuration
Metadata

Disk Space Requirements

Typical sizes by precision:

Precision   1B Model   3B Model   7B Model
INT4        ~800MB     ~2GB       ~4GB
INT8        ~1.5GB     ~3.5GB     ~7GB
FP16        ~3GB       ~7GB       ~14GB
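For model sizes not listed in the table, the raw weight payload can be approximated as parameter count times bits per weight, divided by 8. The sketch below is a rough estimate only, and the function name estimate_weights_mb is hypothetical. The table's figures for smaller models run higher than this arithmetic, so treat the result as a floor rather than the size cactus will actually write.

```shell
# Rough lower bound on weight size in MB (1 MB = 10^6 bytes).
# Arguments: parameter count in millions, then precision (INT4, INT8, or FP16).
estimate_weights_mb() {
    params_millions="$1"
    case "$2" in
        INT4) bits=4 ;;
        INT8) bits=8 ;;
        FP16) bits=16 ;;
        *) echo "unknown precision: $2" >&2; return 1 ;;
    esac
    # millions of parameters * bits per weight / 8 bits per byte = megabytes
    echo $(( params_millions * bits / 8 ))
}

estimate_weights_mb 7000 FP16   # prints 14000, in line with the ~14GB table entry
```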

Offline Usage

Once downloaded, models can be used without internet:

# Download while online
cactus download qwen-2.5-1.5b

# Use offline later
cactus run qwen-2.5-1.5b  # Uses cached version
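Before going offline, it can help to confirm that a model is actually in the cache. The sketch below assumes the directory naming shown under Cache Location (./weights/<short-name>-<precision in lowercase>). That pattern is inferred from the listing, not from a documented rule, and it may not hold for models downloaded by HuggingFace repository name. The function names is_cached and run_offline are hypothetical.

```shell
# Check for a cached model directory, assuming the
# ./weights/<name>-<precision> layout shown under Cache Location.
is_cached() {
    suffix=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ -d "./weights/$1-$suffix" ]
}

# Run a model only if its cache directory exists; otherwise say what to do.
run_offline() {
    if is_cached "$1" "$2"; then
        cactus run "$1"
    else
        echo "$1 ($2) is not cached; run 'cactus download $1 --precision $2' while online" >&2
        return 1
    fi
}
```

Usage: run_offline qwen-2.5-1.5b INT4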

See Also

Run Command - Run downloaded models interactively
Convert Command - Convert models with custom settings
