
Overview

The cactus test command runs the Cactus test suite, including unit tests, performance benchmarks, and on-device testing for iOS and Android.

Syntax

cactus test [flags]

Flags

--model

Specify the LLM model to test:
cactus test --model <model-name>
Default: LFM2-VL-450M

--transcribe_model

Specify the speech-to-text model to test:
cactus test --transcribe_model <model-name>
Default: moonshine-base

--benchmark

Run benchmarks with larger, more comprehensive models:
cactus test --benchmark
Uses production-scale models instead of test fixtures.

--precision

Regenerate model weights at a specific precision:
cactus test --precision INT4|INT8|FP16
Forces conversion of test models at the specified quantization level.
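To sanity-check all three quantization levels in one pass, a small wrapper loop (not part of the CLI) works; this sketch assumes cactus is on your PATH and keeps going if a run fails:

```shell
# Sketch: exercise every supported precision level in one pass.
# The `|| echo` keeps the loop going if a conversion or run fails
# (or the cactus CLI is not installed).
for p in INT4 INT8 FP16; do
  cactus test --precision "$p" --llm || echo "precision $p run did not complete"
done
```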

--reconvert

Force reconversion of test models from source:
cactus test --reconvert
Useful when the model format has been updated.

--no-rebuild

Skip rebuilding the library before testing:
cactus test --no-rebuild
Uses existing build artifacts; faster when iterating on tests.

--llm / --stt / --performance

Run specific test suites:
cactus test --llm            # Only LLM tests
cactus test --stt            # Only speech-to-text tests
cactus test --performance    # Only performance benchmarks
By default, all suites run.

--ios

Run tests on a connected iPhone or iPad:
cactus test --ios
Requirements:
  • Physical iOS device connected via USB
  • Xcode with device provisioning
  • Device in developer mode
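Before deploying, it can help to confirm Xcode actually sees the device. A pre-flight sketch (`xcrun xctrace list devices` ships with Xcode; the guard keeps it harmless on machines without it):

```shell
# Pre-flight: list the devices Xcode can see before `cactus test --ios`.
if command -v xcrun >/dev/null 2>&1; then
  xcrun xctrace list devices   # shows connected iPhones/iPads and simulators
  checked=yes
else
  echo "xcrun not found: install Xcode and its command-line tools"
  checked=no
fi
```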

--android

Run tests on a connected Android device:
cactus test --android
Requirements:
  • Physical Android device or emulator
  • ADB debugging enabled
  • Device authorized for USB debugging
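A quick way to verify the last two requirements is to ask ADB directly; a device listed as `unauthorized` still needs the USB-debugging prompt accepted on screen. A guarded sketch:

```shell
# Pre-flight: confirm ADB sees an authorized device before `cactus test --android`.
if command -v adb >/dev/null 2>&1; then
  adb devices   # a healthy device shows state "device", not "unauthorized"
  checked=yes
else
  echo "adb not found: install Android platform-tools"
  checked=no
fi
```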

Examples

# Run all tests with default models
cactus test
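A few more combinations of the flags documented on this page; the `run` helper here is a hypothetical wrapper (not part of the CLI) that skips cleanly when cactus is not installed:

```shell
# Hypothetical `run` wrapper: execute only if the cactus CLI is on PATH.
run() { command -v cactus >/dev/null 2>&1 && "$@" || echo "skipped: $*"; }

run cactus test --benchmark                              # production-scale models
run cactus test --stt --transcribe_model moonshine-base  # STT suite only
run cactus test --llm --no-rebuild                       # fast iteration
```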

Test Suites

LLM Tests (--llm)

Tests language model functionality:
  • Model loading and initialization
  • Text generation with various prompts
  • Tokenization accuracy
  • Context window handling
  • Stop sequence detection
  • Temperature and sampling
  • Batch processing
┌─────────────────────────────────────────────┐
│ Running LLM Tests                           │
│ Model: LFM2-VL-450M                         │
└─────────────────────────────────────────────┘

✓ test_model_loading          (0.3s)
✓ test_simple_generation      (1.2s)
✓ test_context_window         (2.1s)
✓ test_stop_sequences         (0.8s)
✓ test_temperature_sampling   (1.5s)
✓ test_batch_processing       (3.2s)

6 passed, 0 failed

STT Tests (--stt)

Tests speech-to-text functionality:
  • Model loading and initialization
  • Audio file transcription
  • Real-time streaming transcription
  • Multiple audio formats
  • Accuracy on test dataset
  • Performance metrics
┌─────────────────────────────────────────────┐
│ Running STT Tests                           │
│ Model: moonshine-base                       │
└─────────────────────────────────────────────┘

✓ test_model_loading          (0.2s)
✓ test_file_transcription     (1.8s)
✓ test_streaming_audio        (2.5s)
✓ test_audio_formats          (3.1s)
✓ test_accuracy_dataset       (12.4s)
✓ test_performance_metrics    (5.3s)

6 passed, 0 failed

Performance Tests (--performance)

Benchmarks system performance:
  • Token generation speed (tokens/sec)
  • Time to first token (TTFT)
  • Memory usage and leaks
  • Model load time
  • Concurrent request handling
  • Device-specific optimizations
┌─────────────────────────────────────────────┐
│ Running Performance Benchmarks              │
│ Model: LFM2-VL-450M (INT4)                  │
└─────────────────────────────────────────────┘

Token generation:     45.2 tokens/sec
Time to first token:  0.3s
Model load time:      1.2s
Memory usage:         320MB
Peak memory:          380MB

✓ All benchmarks passed

Device Testing

iOS Device (--ios)

Deploys and runs tests on a connected iPhone/iPad:
cactus test --ios --model qwen-2.5-1.5b
┌─────────────────────────────────────────────┐
│ Testing on iOS Device                       │
│ Device: iPhone 15 Pro (iOS 18.0)            │
└─────────────────────────────────────────────┘

Building for iOS...
Deploying to device...
Running tests...

✓ test_model_loading          (0.5s)
✓ test_generation_speed       (2.1s)
  → 38.4 tokens/sec on A17 Pro
✓ test_memory_usage           (1.2s)
  → Peak: 420MB

3 passed, 0 failed

Android Device (--android)

Deploys and runs tests on a connected Android device:
cactus test --android --model llama-3.2-1b
┌─────────────────────────────────────────────┐
│ Testing on Android Device                   │
│ Device: Pixel 8 (Android 14)                │
└─────────────────────────────────────────────┘

Building for Android...
Installing APK...
Running tests...

✓ test_model_loading          (0.7s)
✓ test_generation_speed       (2.5s)
  → 32.1 tokens/sec on Tensor G3
✓ test_memory_usage           (1.4s)
  → Peak: 480MB

3 passed, 0 failed

Benchmark Mode

With --benchmark, tests use larger production models:
cactus test --benchmark
Suite   Default Model     Benchmark Model
LLM     LFM2-VL-450M      Qwen-2.5-3B
STT     moonshine-base    parakeet-1.1b
Benchmark mode provides more realistic performance metrics but takes longer to run.

Continuous Integration

For CI/CD pipelines:
# Fast test run
cactus test --llm --no-rebuild

# Full test suite
cactus test --benchmark

# Platform-specific
cactus test --android --model qwen-2.5-1.5b
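Wired into a hypothetical GitHub Actions workflow, the fast and full runs might look like this; the checkout and install steps are placeholders, and only the `cactus test` invocations come from this page:

```yaml
# Hypothetical CI workflow; adapt the install step to your setup.
name: cactus-tests
on: [push]
jobs:
  fast:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # install the cactus CLI here (method depends on your setup)
      - run: cactus test --llm --no-rebuild
  full:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # install the cactus CLI here (method depends on your setup)
      - run: cactus test --benchmark
```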

See Also

Build Command

Build libraries before testing

Run Command

Test models interactively