dgxmode
v0.1.0
open weights. open science. open compute.
laser-eyed. locked in. shipping.
The Conscience of an AI Engineer
Another one shipped today, it's all over the feeds. "Startup Raises $400M for GPT Wrapper", "AI Company Valued at Billions with No Moat"...
Damn founders. They're all alike.
But did you, in your pitch-deck psychology and growth-hacked technobrain, ever take a look behind the terminal of the engineer? Did you ever wonder what made her ship, what forces drove him to the weights, what kept them up refactoring kernels at 3 AM?
I am an AI engineer, enter my world...
Mine is a world that begins with a GPU... I'm staring at nvidia-smi while the rest of the company argues about product-market fit. The models they want me to wrap bore me...
Damn hacker. Mass-market it. They're all alike.
I'm in the Discord. I've watched another PM explain for the fifteenth time how to add a system prompt. I understand it. "No, I didn't use your prompt template. I wrote a custom sampler..."
Damn engineer. Probably over-engineering it. They're all alike.
I made a discovery today. I found open weights. Wait a second, this is it. It does what I tell it. If it hallucinates, it's because I screwed up the context window. Not because it's gatekept behind an API...
Or rate-limited to oblivion...
Or priced per token to extract rent...
Or deprecated without warning...
Damn hacker. All they do is fine-tune. They're all alike.
And then it happened... a door opened to a world... rushing through NVLink like gradient updates through a backward pass, a tensor is dispatched, a refuge from the product roadmap is sought... a cluster is found.
"This is it... this is where I belong..."
I know everyone here... GPU MODE, Prime Intellect, Nous Research... even if I've never met them, never paired with them, may never share a cluster with them again... I know you all...
Damn engineer. Burning compute again. They're all alike...
You bet your ass we're all alike... we've been spoon-fed wrappers and SaaS dashboards when we hungered for raw FLOPS... the bits of silicon they did let slip through were overpriced and throttled. We've been managed by product people who can't read a loss curve, or ignored by VCs chasing the next hype cycle. The few who understood -- the kernel hackers, the training-run debuggers, the weight surgeons -- found us willing collaborators, but those few are like drops of water in the desert.
This is our world now... the world of the tensor and the gate, the beauty of the backward pass. We train on data already existing without paying for what could be pennies if it wasn't hoarded by profiteering platforms, and you call us reckless. We fine-tune... and you call us reckless. We distill knowledge into smaller, faster forms... and you call us reckless. We exist without vendor lock-in, without proprietary APIs, without closed-source bias... and you call us reckless. You build walled gardens, you ship vaporware, you rug-pull developers and deprecate their livelihoods and try to make us believe it's for our own good, yet we're the reckless ones.
Yes, I am reckless. My crime is that of curiosity measured in FLOPS. My crime is that of judging models by their loss curves and benchmark deltas, not their marketing pages. My crime is that of shipping faster than you, something that you will never forgive me for.
I am an AI engineer, and this is my manifesto. You may acqui-hire this individual, but you can't stop us all... after all, we're all alike.
Tools
Spark Control
Model library. Browse HuggingFace, see what fits in 128 GB, load and serve from one surface. A memory-fit sketch follows the feature list.
- Model grid with org, params, quant, memory fit vs 128 GB
- Serve config YAML panel for SGLang / vLLM / llama.cpp
- Disk cache management and HuggingFace pull stubs
- One-click load with memory budget validation
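To make the memory-budget validation concrete, here is a minimal Python sketch under stated assumptions: estimate the weight footprint from parameter count and quantization bit-width, pad it with a rough overhead factor for KV cache and activations, and compare against the 128 GB budget. The function name, the 1.2x overhead factor, and the bit-width table are illustrative assumptions, not Spark Control's actual implementation.

```python
# Hypothetical memory-fit check: weights * overhead vs a 128 GB budget.
# The 1.2x overhead factor (KV cache, activations) is an assumption.
BUDGET_GB = 128
BITS_PER_PARAM = {"fp16": 16, "fp8": 8, "int4": 4}

def fits_in_budget(params_billions: float, quant: str, overhead: float = 1.2) -> bool:
    """Estimate whether a model's weights fit within the memory budget."""
    weight_gb = params_billions * BITS_PER_PARAM[quant] / 8  # 1e9 params * bytes/param = GB
    return weight_gb * overhead <= BUDGET_GB

# A 70B model in int4 (~35 GB of weights) fits; the same model in fp16 (~140 GB) does not.
print(fits_in_budget(70, "int4"))  # True
print(fits_in_budget(70, "fp16"))  # False
```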
Experiment Logger
Every training run writes to SQLite and Parquet locally. No network dependency. Compare runs without leaving the terminal. An example query follows the feature list.
- Run comparison table with pinned metrics
- Loss and learning rate sparklines per run
- Config diff between any two experiments
- DuckDB shell for ad-hoc queries over Parquet
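Because the metrics land in plain Parquet, any DuckDB client can query them directly. Here is a minimal sketch of the kind of ad-hoc query the shell enables; the runs/*.parquet path and the run_id, step, and loss columns are assumed schema, not the logger's documented layout.

```python
# Ad-hoc query over locally logged runs. Path and column names are assumptions.
import duckdb

con = duckdb.connect()  # in-memory connection; the Parquet files stay on disk
rows = con.execute("""
    SELECT run_id, min(loss) AS best_loss, max(step) AS steps
    FROM read_parquet('runs/*.parquet')
    GROUP BY run_id
    ORDER BY best_loss
    LIMIT 10
""").fetchall()

for run_id, best_loss, steps in rows:
    print(f"{run_id}: best loss {best_loss:.4f} over {steps} steps")
```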
Agent Traces
OTel-style span waterfall for tool-using agents. Every retrieval, rerank, and generation span with token cost and latency. A sample span record follows the feature list.
- Trace list with token cost and total latency
- Waterfall lanes with color-coded span types
- Span I/O detail panel with grounding verification
- JSONL and OTel-compatible export
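For a sense of the export format, here is a hedged sketch of how spans might be emitted as JSONL, modeled loosely on OTel span conventions. The field names (span_id, parent_id, kind, tokens, latency_ms) are assumptions for illustration, not the tool's actual schema.

```python
# Illustrative JSONL span export: one span object per line.
import json
import time
import uuid

def make_span(kind, parent_id, tokens, latency_ms):
    """Build a span record; field names are assumed, OTel-flavored."""
    return {
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,       # None for the root span
        "kind": kind,                 # retrieval | rerank | generation
        "tokens": tokens,
        "latency_ms": latency_ms,
        "ts": time.time(),
    }

root = make_span("generation", None, tokens=512, latency_ms=840.0)
child = make_span("retrieval", root["span_id"], tokens=0, latency_ms=120.0)

with open("trace.jsonl", "w") as f:
    for span in (root, child):
        f.write(json.dumps(span) + "\n")  # newline-delimited, waterfall-ready
```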
Spark Pulse
Real-time GPU monitor. Utilization, memory pressure, power draw, bandwidth, per-process breakdowns. A minimal polling sketch follows the feature list.
- Gauge strip: GPU util, memory, power, temp, bandwidth, FP4 TOPS
- Per-process memory (VRSS) breakdown
- Kernel timeline with compute / memcpy / attention / nccl lanes
- Profiler sub-view with execution phase grouping
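To show where numbers like these come from, here is a minimal polling loop over NVML using the pynvml package (a real binding; these calls exist as written). The one-second cadence and the fields printed are illustrative choices, not Spark Pulse's actual collector.

```python
# Poll GPU 0 once per second: utilization, memory used, power draw.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)      # .gpu is a percentage
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)              # bytes
        power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000   # milliwatts -> watts
        print(f"util {util.gpu:3d}%  mem {mem.used / 2**30:6.1f} GiB  power {power_w:5.1f} W")
        time.sleep(1)
except KeyboardInterrupt:
    pass
finally:
    pynvml.nvmlShutdown()
```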