Overview
The cactus download command fetches models from HuggingFace and converts them to Cactus format. Models are cached in ./weights for offline use.
Syntax
Arguments
- <model>: Model name or HuggingFace repository
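As a minimal sketch of the basic form, the snippet below builds a download command from a model name and runs it only when the cactus CLI is on the PATH. The model name is a hypothetical placeholder, not a real identifier; substitute one that Cactus supports.

```shell
# Hypothetical placeholder; replace with a model name Cactus recognizes.
MODEL="your-org/your-model"

# Basic form: cactus download <model>
CMD="cactus download $MODEL"
echo "$CMD"

# Run the download only when the cactus CLI is actually installed.
if command -v cactus >/dev/null 2>&1; then
    cactus download "$MODEL"
else
    echo "cactus not found on PATH; skipping download" >&2
fi
```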
Model Naming Conventions
Cactus supports several model name formats:
Short Names
HuggingFace Repository Format
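The helper below is an illustrative sketch for telling the two formats apart in a script. It assumes repository IDs follow the usual HuggingFace owner/name shape and treats anything without a slash as a short name. The function name model_kind is hypothetical, and this is not necessarily how Cactus itself resolves names.

```shell
# Classify a model argument as a HuggingFace repository ID or a short name.
# Assumption: repository IDs use the "owner/name" shape with exactly one slash.
model_kind() {
    case "$1" in
        */*/*|/*|*/|"") echo "invalid" ;;
        */*)            echo "repo" ;;
        *)              echo "short" ;;
    esac
}

model_kind "your-org/your-model"   # prints: repo
model_kind "your-model"            # prints: short
```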
Flags
--precision
Set the quantization precision level (default: INT4).
Options:
- INT4: 4-bit quantization (smallest size, ~1-2GB per model)
- INT8: 8-bit quantization (medium size, ~2-4GB per model)
- FP16: 16-bit floating point (largest size, ~4-8GB per model)
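A hedged example of choosing a precision from a script: the value is checked against the three documented options before the CLI is called. The model name is a hypothetical placeholder, and placing the flag after the model argument is an assumption about argument order.

```shell
MODEL="your-org/your-model"   # hypothetical placeholder
PRECISION="INT8"              # one of: INT4, INT8, FP16

# Reject values outside the documented options before calling the CLI.
case "$PRECISION" in
    INT4|INT8|FP16) ;;
    *) echo "unsupported precision: $PRECISION" >&2; exit 1 ;;
esac

CMD="cactus download $MODEL --precision $PRECISION"
echo "$CMD"

# Run the download only when the cactus CLI is installed.
if command -v cactus >/dev/null 2>&1; then
    cactus download "$MODEL" --precision "$PRECISION"
fi
```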
--token
Provide a HuggingFace API token for authentication. A token is needed for:
- Gated models (Llama, Gemma)
- Private repositories
- Rate-limited downloads
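One way to pass a token without typing it inline is sketched below. Keeping it in an environment variable named HF_TOKEN is an assumption of this example, not something the command is documented to require. The model name and the have_token helper are hypothetical.

```shell
MODEL="your-org/your-gated-model"   # hypothetical placeholder

# Assumption: the token lives in an environment variable named HF_TOKEN,
# which keeps the secret out of the command line you type.
have_token() {
    [ -n "${HF_TOKEN:-}" ]
}

if ! have_token; then
    echo "HF_TOKEN is not set; export it before downloading gated models" >&2
elif command -v cactus >/dev/null 2>&1; then
    cactus download "$MODEL" --token "$HF_TOKEN"
fi
```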
--reconvert
Force reconversion from source weights. Useful when:
- Model format has been updated
- Previous conversion was incomplete
- Switching between precision levels
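The last case, switching precision, might look like the sketch below. Combining --reconvert with --precision in a single call is an assumption, and the model name is a hypothetical placeholder.

```shell
MODEL="your-org/your-model"   # hypothetical placeholder

# Re-run the conversion at a different precision, for example after an
# earlier INT4 download. Using both flags together is an assumption.
CMD="cactus download $MODEL --precision FP16 --reconvert"
echo "$CMD"

# Run the download only when the cactus CLI is installed.
if command -v cactus >/dev/null 2>&1; then
    cactus download "$MODEL" --precision FP16 --reconvert
fi
```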
Examples
Download Progress
The command shows real-time download and conversion progress.
Cache Location
All downloaded models are stored in ./weights. Each cached model includes:
- Quantized weights
- Tokenizer files
- Model configuration
- Metadata
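To see what is already cached, a short check of the ./weights directory named in the overview can help. This sketch assumes the command is run from the directory where the download happened and that each cached model appears as its own entry; the exact layout inside the cache is not specified here.

```shell
CACHE_DIR="./weights"   # cache location named in the overview

if [ -d "$CACHE_DIR" ]; then
    # One entry per cached model is an assumption about the layout.
    ls -1 "$CACHE_DIR"
    # Total disk usage of the whole cache.
    du -sh "$CACHE_DIR"
else
    echo "no cache at $CACHE_DIR yet"
fi
```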
Disk Space Requirements
Typical sizes by precision:

| Precision | 1B Model | 3B Model | 7B Model |
|---|---|---|---|
| INT4 | ~800MB | ~2GB | ~4GB |
| INT8 | ~1.5GB | ~3.5GB | ~7GB |
| FP16 | ~3GB | ~7GB | ~14GB |
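For model sizes not listed in the table, a rough lower bound can be computed as parameter count times bits per weight divided by 8. The estimate_mb helper below is a hypothetical illustration of that arithmetic, not part of Cactus. It ignores tokenizer files, configuration, and metadata, so real downloads will be at or above these figures.

```shell
# Rough lower bound: size ~= parameters * bits per weight / 8.
# Arguments: parameter count in millions, bits per weight.
# Prints the estimate in megabytes (1 MB = 10^6 bytes).
estimate_mb() {
    echo $(( $1 * $2 / 8 ))
}

estimate_mb 7000 4    # 7B at INT4 -> 3500
estimate_mb 7000 8    # 7B at INT8 -> 7000
estimate_mb 7000 16   # 7B at FP16 -> 14000
```

The 7B results (about 3.5GB, 7GB, and 14GB) sit at or slightly below the 7B column of the table.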
Offline Usage
Once downloaded, models can be used without an internet connection.
See Also
- Run Command: Run downloaded models interactively
- Convert Command: Convert models with custom settings