Spark Control
DGX Spark · 192.168.1.42
GPU 23% · MEM 41.2 / 128 GB · TEMP 52 °C · PWR 68 W
Model                    Maker · Family                            Size          Quantization  Memory Fit  Status
INTELLECT-3-MoE          Prime Intellect · 100B+ MoE               103B          FP4           52 GB       Loaded
Hermes-3-70B             Nous Research · Llama 3.1                 70B           FP8           72 GB       Cached
NousCoder-14B            Nous Research · Qwen3-14B RL              14B           FP4           8.2 GB      Cached
Llama-3.3-70B-Instruct   Meta · Llama 3.3                          70B           Q4_K_M        38 GB       On disk
GPT-OSS-120B             NVIDIA · Nemotron                         120B          FP4           62 GB       On disk
DeepSeek-R1-70B          DeepSeek · Reasoning                      70B           FP8           72 GB       On disk
Hermes-Agent-7B          Nous Research · Agentic                   7B            BF16          14 GB       Cached
NuminaMath-QwQ-CoT-5M    Prime Intellect · Reasoning Traces        5M rows       Parquet       28 GB       On disk
Atropos-RL-24K           Nous Research · Competitive Programming   24K problems  JSONL         4.1 GB      On disk
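As a quick sanity check on the table above, the models (dataset rows excluded) can be screened against the GB10's 128 GB of unified memory. This is a hypothetical sketch: the model names and weight sizes are copied from the table, and the 0.85 usable-memory cap is an assumption borrowed from the serve config's gpu_memory_utilization, not the dashboard's actual fit logic.

```python
# Hypothetical fit check against 128 GB of unified memory.
# Sizes are the "Memory Fit" column from the table above; the
# utilization cap is an assumption (mirrors gpu_memory_utilization).

MEMORY_GB = 128
UTILIZATION_CAP = 0.85  # assumed usable fraction of unified memory

models = {
    "INTELLECT-3-MoE": 52,
    "Hermes-3-70B": 72,
    "NousCoder-14B": 8.2,
    "Llama-3.3-70B-Instruct": 38,
    "GPT-OSS-120B": 62,
    "DeepSeek-R1-70B": 72,
    "Hermes-Agent-7B": 14,
}

def fits(weight_gb: float, budget_gb: float = MEMORY_GB * UTILIZATION_CAP) -> bool:
    """True if the quantized weights fit under the usable-memory budget."""
    return weight_gb <= budget_gb

for name, gb in models.items():
    print(f"{name:24s} {gb:6.1f} GB  {'fits' if fits(gb) else 'too big'}")
```

Every model in the library fits under the 108.8 GB budget individually, which is consistent with all of them being servable on a single Spark.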
INTELLECT-3-MoE
Prime Intellect · 100B+ MoE

100B+ parameter Mixture-of-Experts reasoning model. State-of-the-art on math, code, science, and reasoning. Trained via decentralized RL on PRIME-RL.

Tags: MoE · NVFP4 · reasoning · code · PRIME-RL
Architecture
  Parameters     103B
  Active Params  14B
  Experts        64 total / 8 active
  Context        128K
  Vocab Size     152K
  Layers         64
Memory Fit · GB10
  Inference, model + KV cache (8K ctx)   52 / 128 GB · 40.6%
  Fine-tune (LoRA r=16, bs=4)            89 / 128 GB · 69.5%
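The inference figure above can be checked by back-of-envelope arithmetic: at FP4, each of the 103B parameters takes half a byte, so the weights alone account for ~51.5 GB, with the 8K-context KV cache making up the small remainder. A sketch of the arithmetic, not the dashboard's actual accounting:

```python
# Back-of-envelope check of the "52 / 128 GB" inference figure.
PARAMS = 103e9          # total parameters (spec: 103B)
BYTES_PER_PARAM = 0.5   # FP4: 4 bits per weight

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"FP4 weights: {weights_gb:.1f} GB")    # 51.5 GB, vs. 52 GB with KV cache
print(f"fraction of 128 GB: {52 / 128:.1%}")  # 40.6%, matching the fit bar
```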
Spark Compatibility
  FP4 Tensor Cores   ✓ native
  Memory Capacity    ✓ 52 / 128 GB
  Bandwidth Bound    ⚠ 273 GB/s
  2× Spark Cluster   ✓ FP8 capable
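The "Bandwidth Bound" warning is the key constraint for single-stream decoding: each generated token must stream the active expert weights from memory, so the 273 GB/s bandwidth caps throughput regardless of compute. A rough upper bound, under the simplifying assumption that only the 14B active parameters are read per token:

```python
# Rough bandwidth-bound decode ceiling at batch size 1 (an estimate,
# assuming only active-expert weights are streamed per token).
BANDWIDTH_GBPS = 273     # GB10 memory bandwidth (compatibility panel)
ACTIVE_PARAMS = 14e9     # active params per token (spec: 14B)
BYTES_PER_PARAM = 0.5    # FP4

bytes_per_token_gb = ACTIVE_PARAMS * BYTES_PER_PARAM / 1e9  # 7.0 GB per token
tokens_per_sec = BANDWIDTH_GBPS / bytes_per_token_gb
print(f"~{tokens_per_sec:.0f} tok/s upper bound")           # ~39 tok/s
```

Real throughput will be lower (attention, KV-cache reads, routing overhead), but the estimate shows why the sparse 14B-active MoE decodes far faster than a dense 103B model would on the same memory bus.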
Benchmarks
  MATH-500      91.2
  LiveCode v6   74.1
  GPQA          68.4
  MMLU-Pro      82.7
  ARC-C         95.3
Quick Serve Config

# spark-control auto-generated
backend: sglang
model: /models/intellect-3-moe-fp4
quantization: fp4
tensor_parallel: 1
max_model_len: 8192
gpu_memory_utilization: 0.85
api_compat: openai
port: 8000
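Since the config sets api_compat: openai, the server on :8000 can be driven by any OpenAI-compatible client. A minimal stdlib sketch of the request shape follows; the served model id ("intellect-3-moe-fp4", inferred from the model path) is an assumption, and the send is left commented out since it requires the live server:

```python
# Minimal chat-completions request against the Spark Control endpoint.
# The model id is assumed from the config's model path; check your
# server's /v1/models listing for the actual served name.
import json
import urllib.request

URL = "http://192.168.1.42:8000/v1/chat/completions"

payload = {
    "model": "intellect-3-moe-fp4",  # assumed served-model id
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    "max_tokens": 512,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```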
INTELLECT-3-MoE loaded · /v1/chat/completions on :8000
14 models · 8 datasets · 23 checkpoints · SSD 1.8 / 4.0 TB
Spark Control v0.1.0