| Model | Source · Family | Quant / Format | Status |
|---|---|---|---|
| INTELLECT-3-MoE | Prime Intellect · 100B+ MoE | FP4 | Loaded |
| Hermes-3-70B | Nous Research · Llama 3.1 | FP8 | Cached |
| NousCoder-14B | Nous Research · Qwen3-14B RL | FP4 | Cached |
| Llama-3.3-70B-Instruct | Meta · Llama 3.3 | Q4_K_M | On disk |
| GPT-OSS-120B | OpenAI · MoE | FP4 | On disk |
| DeepSeek-R1-70B | DeepSeek · Reasoning | FP8 | On disk |
| Hermes-Agent-7B | Nous Research · Agentic | BF16 | Cached |
| NuminaMath-QwQ-CoT-5M | Prime Intellect · Reasoning Traces | Parquet | On disk |
| Atropos-RL-24K | Nous Research · Competitive Programming | JSONL | On disk |
## INTELLECT-3-MoE (Prime Intellect)

A 100B+ parameter Mixture-of-Experts reasoning model, state-of-the-art on math, code, science, and reasoning. Trained via decentralized RL on PRIME-RL.
### Architecture

- Parameters: 103B total
- Active parameters: 14B per token
- Experts: 64 total, 8 active per token
- Context length: 128K tokens
- Vocabulary size: 152K
- Layers: 64
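The 64/8 expert split means a learned gate routes each token to 8 of 64 expert FFNs, which is why only ~14B of the 103B parameters are touched per token. Below is a minimal sketch of top-k gating in Python/NumPy; the dimensions and names are illustrative, not INTELLECT-3's actual implementation.

```python
import numpy as np

def top_k_gate(hidden, gate_weights, k=8):
    """Route one token to k of n_experts via a learned linear gate.

    hidden:       (d_model,) token hidden state
    gate_weights: (d_model, n_experts) gating matrix
    Returns the chosen expert indices and their normalized mixing weights.
    """
    logits = hidden @ gate_weights            # (n_experts,) gate scores
    top_idx = np.argsort(logits)[-k:]         # indices of the k largest scores
    top_logits = logits[top_idx]
    # Softmax over the selected experts only (standard top-k MoE gating)
    weights = np.exp(top_logits - top_logits.max())
    weights /= weights.sum()
    return top_idx, weights

# Illustrative dimensions: 64 experts, 8 active, as in the spec list above
rng = np.random.default_rng(0)
d_model, n_experts = 4096, 64
token = rng.standard_normal(d_model)
gate = rng.standard_normal((d_model, n_experts)) * 0.02
experts, weights = top_k_gate(token, gate, k=8)
print(experts, weights.round(3))  # 8 expert ids, mixing weights summing to 1
```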
### Memory Fit · GB10

- Model + KV cache (8K ctx): 52 / 128 GB (40.6%)
- Fine-tune (LoRA r=16, bs=4): 89 / 128 GB (69.5%)
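The inference figure is dominated by the FP4 weights: 103B parameters at 4 bits is about 51.5 GB, and an 8K-token KV cache adds only around a gigabyte at this scale. A back-of-the-envelope check in Python; the KV head count, head dimension, and KV precision are assumptions, not published specs:

```python
# Rough memory estimate behind the "52 / 128 GB" figure above.
# Known from the spec list: 103B params, FP4 weights, 64 layers, 8K context.
# Assumed for illustration: 8 KV heads, head_dim 128, FP8 (1-byte) KV cache.

params = 103e9
weight_gb = params * 4 / 8 / 1e9          # FP4 = 4 bits/param -> ~51.5 GB

layers, ctx = 64, 8192
kv_heads, head_dim, kv_bytes = 8, 128, 1  # assumptions
kv_gb = 2 * layers * ctx * kv_heads * head_dim * kv_bytes / 1e9  # K + V -> ~1.1 GB

print(f"weights ~{weight_gb:.1f} GB, KV cache ~{kv_gb:.1f} GB, "
      f"total ~{weight_gb + kv_gb:.1f} of 128 GB")
```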
### Spark Compatibility

- FP4 Tensor Cores: ✓ native
- Memory capacity: ✓ 52 / 128 GB
- Bandwidth bound: ⚠ 273 GB/s
- 2× Spark cluster: ✓ FP8 capable
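The ⚠ on bandwidth is the practical ceiling: single-stream decode must stream the active weights from memory for every generated token, so throughput is roughly memory bandwidth divided by active bytes per token. With 14B active parameters at FP4 (~7 GB) against 273 GB/s, that caps decode near 40 tokens/s before any overhead; a rough estimate, ignoring KV-cache reads and kernel efficiency:

```python
# Rough upper bound on single-stream decode speed for a bandwidth-bound MoE.
# Each generated token must read the active expert weights from memory.

bandwidth_gbps = 273   # GB10 memory bandwidth, from the compatibility list
active_params = 14e9   # active parameters per token, from the spec list
bytes_per_param = 0.5  # FP4 = 4 bits

active_gb_per_token = active_params * bytes_per_param / 1e9  # ~7 GB/token
tokens_per_s = bandwidth_gbps / active_gb_per_token          # ~39 tok/s ceiling
print(f"~{tokens_per_s:.0f} tokens/s upper bound (ignores KV reads and overhead)")
```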
### Benchmarks

| Benchmark | Score |
|---|---|
| MATH-500 | 91.2 |
| LiveCode v6 | 74.1 |
| GPQA | 68.4 |
| MMLU-Pro | 82.7 |
| ARC-C | 95.3 |
### Quick Serve Config

```yaml
# spark-control auto-generated
backend: sglang
model: /models/intellect-3-moe-fp4
quantization: fp4
tensor_parallel: 1
max_model_len: 8192
gpu_memory_utilization: 0.85
api_compat: openai
port: 8000
```
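Since the config declares `api_compat: openai` on port 8000, any OpenAI-compatible client should be able to talk to the served model. A minimal sketch using the `openai` Python package; the endpoint path and model id assume the usual SGLang-style OpenAI-compatible server, so adjust them to whatever your deployment actually exposes:

```python
from openai import OpenAI

# Point the client at the local server instead of api.openai.com.
# The api_key is a placeholder; local OpenAI-compatible servers usually ignore it.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="intellect-3-moe-fp4",  # assumed: served model id mirrors the config path
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```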