SQLite · local ~/.dgxmode/experiments/

llama-finetune · 7 runs
Filters: All · Complete · Best Loss < 0.8 · lr = 1e-4
[Chart: Training Loss, step vs loss, smoothing 0.6. Legend: run-7 (lr=1e-4), run-5 (lr=3e-4), run-3 (lr=5e-5), run-1 (lr=1e-3, ✗ OOM)]
Learning Rate: 1e-4 · Grad Norm: 0.42 · GPU Util: 94%
| Run | Status | Best Loss | Eval Acc | Steps | Loss Curve | Duration | Config | Artifacts |
|-----|--------|-----------|----------|-------|------------|----------|--------|-----------|
| run-7 (f8a2c1d) | complete | 0.312 @ step 1847 | 91.4% | 2000 | ↓ 0.31 | 2h 14m | lr 1e-4 · bs 8 · r 16 | 3 ckpt |
| run-6 (b3e91f0) | complete | 0.387 @ step 1920 | 89.2% | 2000 | ↓ 0.39 | 2h 18m | lr 2e-4 · bs 8 · r 16 | 2 ckpt |
| run-5 (91d4e7a) | complete | 0.402 @ step 1956 | 88.7% | 2000 | ↓ 0.40 | 2h 21m | lr 3e-4 · bs 4 · r 32 | 2 ckpt |
| run-4 (c7f20b8) | running | 0.614 @ step 823 | - | 823 / 2000 (41%) | - | 57m… | lr 1e-4 · bs 16 · r 8 | - |
| run-3 (e2a48c3) | complete | 0.428 @ step 1890 | 87.1% | 2000 | ↓ 0.43 | 2h 08m | lr 5e-5 · bs 8 · r 16 | 2 ckpt |
| run-2 (a9d17f6) | complete | 0.511 @ step 1780 | 84.3% | 2000 | ↓ 0.51 | 1h 52m | lr 5e-4 · bs 4 · r 16 | 1 ckpt |
| run-1 (d4c38e1) | OOM | 1.247 @ step 312 | - | 312 / 2000 | - | 18m | lr 1e-3 · bs 32 · r 64 | - |
llama-finetune · 7 runs · 3 selected for comparison · best: 0.312 (run-7 · f8a2c1d)
Parquet: ~/.dgxmode/experiments/llama-finetune/runs/*/metrics.parquet · DuckDB query: 4.2ms
experiment-logger v0.1.0 · dgxmode.com
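The footer points at two stores: run metadata in the local SQLite database and per-step metrics in Parquet files queried through DuckDB. A minimal sketch of querying the metadata side with Python's stdlib `sqlite3`, assuming a hypothetical `runs` table layout (the actual schema isn't shown in this view) and using sample rows copied from the table above:

```python
import sqlite3

# Hypothetical schema: the header says runs live in a local SQLite store
# under ~/.dgxmode/experiments/, but the real table layout isn't shown here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (name TEXT, commit_hash TEXT, status TEXT, "
    "best_loss REAL, eval_acc REAL)"
)

# Sample rows taken from the run table above (a subset, for illustration).
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
    [
        ("run-7", "f8a2c1d", "complete", 0.312, 91.4),
        ("run-6", "b3e91f0", "complete", 0.387, 89.2),
        ("run-4", "c7f20b8", "running", 0.614, None),
        ("run-1", "d4c38e1", "OOM", 1.247, None),
    ],
)

# Best completed run, mirroring the footer's "best: 0.312 (run-7 · f8a2c1d)".
best = conn.execute(
    "SELECT name, commit_hash, best_loss FROM runs "
    "WHERE status = 'complete' ORDER BY best_loss LIMIT 1"
).fetchone()
print(best)  # ('run-7', 'f8a2c1d', 0.312)
```

The per-step curves themselves would come from the Parquet files via DuckDB rather than this metadata table.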