
George Hotz's $12K Tinybox Red vs. Building Your Own: The DIY Math

By Chloe Smith

Some links on this page may be affiliate links. We disclose it because you deserve to know, not because it changes anything. Every recommendation here comes from benchmarks, not budgets.

TL;DR

The Tinybox Red v2 packs 4x RX 9070 XT into a single box for $12,000. We priced the same configuration as a DIY build: $6,700-$8,900 depending on where you buy the CPU. You're paying a $3,000-$5,300 convenience premium. If you can build a PC and configure ROCm, build it yourself. If you want plug-and-play with support, the Tinybox earns its markup — but only if you actually need four GPUs.


Tinybox Red v2 vs DIY — Specs at a Glance

Spec           Tinybox Red v2                    CraftRigs DIY Build
GPUs           4x AMD RX 9070 XT (64 GB VRAM)    4x AMD RX 9070 XT (64 GB VRAM)
CPU            AMD EPYC (128 PCIe lanes)         AMD TR PRO 7965WX (24C/48T)
Memory         128 GB DDR5                       128 GB DDR5
Storage        2 TB NVMe                         2 TB NVMe
PSU            2x 1600W                          2x Seasonic PRIME PX-1600
Chassis        Rack-mountable                    4U server or open frame
OS             Ubuntu 24.04, pre-configured      Your choice
Price          $12,000                           $6,700-$8,900
Availability   Out of stock (March 2026)         Parts available now

The hardware is nearly identical. The Tinybox uses an AMD EPYC instead of Threadripper PRO — both provide 128 PCIe lanes, both are server-grade AMD chips. For inference workloads, the CPU difference is negligible. The GPUs, VRAM, and memory bandwidth are the same.


The $4,000 Question — What Does the Tinybox Premium Buy?

The Tinybox Red v2 charges $3,000-$5,300 over DIY component cost for integration, testing, and a software stack. Whether that's worth it depends entirely on how you value your time and tolerance for multi-GPU troubleshooting.

At $12,000, you're getting a tested, assembled system from tinygrad.org that George Hotz and his team have validated for multi-GPU inference. At $6,700-$8,900, you're getting the same silicon but handling assembly, BIOS configuration, PSU wiring, and software setup yourself.

The premium breaks down roughly like this: $500-$800 for assembly and testing, $200-$400 for the rack-mountable chassis engineering, and the rest is margin plus the tinygrad software integration. That's a legitimate business model — but it's not a hardware advantage.
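
As rough arithmetic, using only the estimates above (tinycorp doesn't publish its actual cost structure, so treat this as a sketch):

    # Where the Tinybox premium goes, per the rough estimates above
    tinybox = 12_000
    diy_low, diy_high = 6_700, 8_900                # our DIY range (CPU sourcing dependent)
    premium_low = tinybox - diy_high                # $3,100
    premium_high = tinybox - diy_low                # $5,300

    assembly_low, assembly_high = 500, 800          # assembly and multi-GPU testing
    chassis_low, chassis_high = 200, 400            # rack-mount chassis engineering

    # Whatever remains is margin plus the tinygrad software integration
    rest_low = premium_low - assembly_high - chassis_high    # $1,900
    rest_high = premium_high - assembly_low - chassis_low    # $4,600
    print(f"premium ${premium_low:,}-${premium_high:,}; margin+software ${rest_low:,}-${rest_high:,}")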

Pre-Built Integration and Multi-GPU Testing

Four-GPU builds are not like single-GPU builds. Every additional GPU multiplies the potential failure points: PCIe lane allocation, power delivery across multiple PCIe power connectors, thermal management with four 304W heat sources in close proximity, and BIOS settings that need manual adjustment for multi-GPU operation.

The Tinybox ships tested. Every unit runs a multi-GPU stress test before leaving the factory. If you've never built a multi-GPU system, the first build takes 8-12 hours including troubleshooting — not the 2-3 hours a single-GPU build takes.

That said, if you've built PCs before, multi-GPU isn't magic. It's the same process with more cables and a BIOS setting for PCIe bifurcation. The Tinybox solves a real problem, but it's not a problem that requires $4,000 to solve.

Tinygrad Software Stack — Is It Worth the Lock-In?

Tinygrad is open source. You don't need a Tinybox to use it, and most local AI users won't use tinygrad anyway.

The tinygrad framework is George Hotz's neural network library — it's a real project with real contributors. But the local AI ecosystem has standardized around Ollama, llama.cpp, and vLLM. These run on any AMD GPU with ROCm support.

If you buy a Tinybox, you'll likely end up installing Ollama on it anyway. The tinygrad integration is a nice bonus for ML researchers, but it's not what most buyers are paying for. You're paying for the assembled hardware.

Tip

The tinygrad software stack is fully open source on GitHub. You can install it on any DIY build with AMD GPUs — no Tinybox required.
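
A minimal smoke test, assuming pip install tinygrad on a machine with working ROCm drivers (Tensor and Device are tinygrad's documented top-level exports; check the repo if the API has moved since this was written):

    # tinygrad sanity check: confirm the default device is a GPU backend, then run a kernel
    from tinygrad import Tensor, Device

    print(Device.DEFAULT)        # should name a GPU backend (e.g. AMD), not CPU
    x = Tensor.rand(1024, 1024)
    y = (x @ x.T).sum()
    print(y.item())              # .item() forces the computation to actually execute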


CraftRigs DIY Build — Same 4x RX 9070 XT for ~$7,000

We priced a complete DIY build that matches the Tinybox Red v2 spec for spec. At Micro Center CPU pricing, the total lands at $6,700-$7,100 as of March 2026.

This isn't a theoretical build — every part is currently available for purchase. The only variable is the Threadripper PRO CPU price, which swings by nearly $2,000 depending on where you buy it.

CPU and Motherboard — PCIe Lanes for 4 GPUs

You need a Threadripper PRO 7000 WX for 128 PCIe 5.0 lanes. Regular Threadripper 7000 only has 48 lanes — not enough to run four GPUs at full x16 bandwidth.

This is where most DIY builders trip up. A standard desktop platform (AM5, LGA1700) maxes out at 24-28 CPU PCIe lanes. That's fine for one or two GPUs, but four GPUs at x16 each requires 64 lanes minimum — and you want headroom for NVMe and chipset links.
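
Here's the lane budget as a back-of-the-envelope check (the NVMe and chipset figures are typical allocations, not exact; slot maps vary by board):

    # PCIe lane budget for a four-GPU build (typical allocations; check your board manual)
    gpu_lanes = 4 * 16      # four GPUs at x16          -> 64 lanes
    nvme_lanes = 4          # one NVMe drive at x4      ->  4 lanes
    chipset_lanes = 8       # chipset downlink + extras ->  8 lanes (estimate)

    needed = gpu_lanes + nvme_lanes + chipset_lanes   # 76 lanes
    platforms = {"AM5 desktop": 28, "Threadripper 7000": 48, "TR PRO 7000 WX": 128}
    for name, lanes in platforms.items():
        print(f"{name}: {lanes} lanes -> {'OK' if lanes >= needed else 'not enough'}")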

The AMD TR PRO 7965WX (24-core, 48-thread) provides 128 PCIe 5.0 lanes. Retail price runs ~$3,000+, but Micro Center has historically offered open-box and clearance deals as low as ~$1,200. Check current pricing before buying — this CPU is the single biggest price variable in the build.

For motherboards, two options stand out:

  • Gigabyte MH53-G40: $876 as of March 2026. Seven PCIe 5.0 x16 slots. This is our pick — it's $400 cheaper than the ASUS alternative.
  • ASUS Pro WS WRX90E-SAGE SE: $1,291 as of March 2026. Six PCIe 5.0 x16 slots. Premium build quality, but the Gigabyte covers the same lane requirements for less.

Both boards support the full 128 lanes. We recommend the Gigabyte unless you need a specific ASUS feature.

PSU and Cooling — The Parts Most Builders Undersize

Four RX 9070 XTs draw ~1,200W sustained for the GPUs alone. Add CPU and system components and you're at ~1,700W sustained with transient spikes up to 2,300W. A single 1600W PSU is not sufficient.

This is the part of the build where the Tinybox actually provides elegant engineering. Their dual 1600W PSU setup on separate circuits handles the power draw cleanly. You need to match it.

Your options:

  • Dual PSU setup: 2x Seasonic PRIME PX-1600 at $500 each (as of March 2026). This mirrors the Tinybox approach. Requires a dual-PSU adapter or manual wiring.
  • Single high-wattage unit: A 2000W+ ATX PSU, if you can find one. These are rare and expensive in consumer form factors.

We recommend the dual Seasonic approach. It's proven, it's what the Tinybox uses, and it gives you redundancy.
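
To size the PSUs, run the numbers. A sketch with rough figures (the platform draw and the 1.5x transient multiplier are our assumptions; verify TBP against your specific cards):

    # Power budget for 4x RX 9070 XT + TR PRO 7965WX (rough figures, verify per part)
    gpu_tbp = 304            # W, RX 9070 XT total board power
    cpu_tdp = 350            # W, TR PRO 7965WX
    platform = 150           # W, motherboard, RAM, NVMe, fans (estimate)

    sustained = 4 * gpu_tbp + cpu_tdp + platform            # 1,716 W
    spikes = int(4 * gpu_tbp * 1.5) + cpu_tdp + platform    # ~2,324 W with GPU transients

    capacity = 2 * 1600                                     # dual Seasonic PRIME PX-1600
    print(f"sustained ~{sustained} W, transient ~{spikes} W, PSU capacity {capacity} W")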

Warning

Dual PSUs on separate circuits means your build needs two wall outlets on different breakers. A single 15A circuit provides ~1,800W max, and only ~1,440W for continuous loads under the 80% rule — not enough for the full system at peak load. Plan your electrical setup before buying parts.

For cooling, four GPUs in an enclosed case need serious airflow. A 4U server chassis with front-to-back airflow works best. Open-frame test benches are cheaper ($200 vs $400) but louder and collect more dust. Budget $200-$400 for the case.

Complete Parts List with Prices (March 2026)

Component                          Price (as of March 2026)
4x ASRock Challenger RX 9070 XT    4 x $600-$730 = $2,400-$2,920
AMD TR PRO 7965WX                  ~$3,000 retail / ~$1,200 Micro Center open-box (check availability)
Gigabyte MH53-G40                  $876
128 GB DDR5                        ~$350
2 TB NVMe SSD                      ~$150
2x Seasonic PRIME PX-1600          2 x $500 = $1,000
4U chassis or open frame           $200-$400
DIY total (Micro Center CPU)       $6,176-$7,196
DIY total (retail CPU)             $7,976-$8,996
Tinybox Red v2                     $12,000
Your savings vs. Tinybox           $3,004-$5,824

The ASRock Challenger is the lowest-priced RX 9070 XT AIB card, with street prices ranging $600-$730 as of March 2026 depending on sales. MSRP is $729, but Newegg and other retailers regularly discount to the $600 range. Other AIB models run $729-$849. We don't recommend used GPUs for a build you're running 24/7.


Inference Performance — Does the Tinybox Optimize Better?

Same GPUs, same VRAM, same memory bandwidth — same inference performance. The Tinybox has no hardware-level optimization advantage over a DIY build with identical components.

There's a persistent myth that pre-built AI systems have some secret sauce that makes the same silicon perform better. They don't. Four RX 9070 XTs in a Tinybox produce the same tokens per second as four RX 9070 XTs in your DIY build.

The only variable is software. Tinygrad's inference engine may be slightly faster or slower than llama.cpp depending on the model and quantization format. In practice, the difference is marginal — within 5-10% in most benchmarks.

Model Capacity — What 64 GB of VRAM Actually Runs

64 GB of total VRAM across four GPUs runs 70B-class models at Q4 with plenty of headroom, and puts 120B-class models at ~4-bit quantization within reach. This is serious capacity.

Here's what fits, roughly: 70B-class models at Q4 run with room to spare, models under 30B at Q4 fit on a single 16 GB card, and 120B-class models at ~4-bit quantization sit right at the limit. Full FP16 weights are only practical up to roughly 25B parameters, even with the model split across all four cards.

For most local AI users, 64 GB of VRAM is more than enough to run any open-weight model available today. The question isn't whether you have enough VRAM — it's whether tensor parallelism across four GPUs introduces enough inter-GPU communication overhead to matter.
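
For any other model, the fit check is simple arithmetic: parameters times bytes per parameter, plus overhead for KV cache and runtime buffers. A rough estimator (the bytes-per-parameter figures approximate common GGUF quant formats; real files vary by a few percent):

    # Rough VRAM footprint estimator (approximate bytes/param for common quant formats)
    BYTES_PER_PARAM = {"FP16": 2.0, "Q8": 1.07, "Q6": 0.80, "Q4": 0.57}

    def weights_gb(params_billion: float, quant: str) -> float:
        return params_billion * BYTES_PER_PARAM[quant]

    for params, quant in [(70, "FP16"), (70, "Q4"), (120, "Q4")]:
        print(f"{params}B @ {quant}: ~{weights_gb(params, quant):.0f} GB of weights vs 64 GB pool")
    # 70B @ FP16: ~140 GB -> does not fit
    # 70B @ Q4:   ~40 GB  -> fits with room for KV cache
    # 120B @ Q4:  ~68 GB  -> at the limit; depends on the exact quant and context length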

With PCIe 5.0 x16 per GPU (which both the Tinybox and our DIY build provide), the overhead is minimal. You'll see 80-90% scaling efficiency across four cards for inference workloads.
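
As a concrete illustration of what that efficiency means for throughput (the single-card rate below is a made-up placeholder, not a benchmark):

    # What 80-90% scaling efficiency means in tokens per second (hypothetical base rate)
    single_gpu_tps = 25.0                      # tokens/s on one card for some model (placeholder)
    ideal = 4 * single_gpu_tps                 # 100 tokens/s if scaling were perfect
    for eff in (0.80, 0.90):
        print(f"4 GPUs at {eff:.0%}: ~{ideal * eff:.0f} tokens/s")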

ROCm vs CUDA — The Software Reality for AMD Builds

ROCm works. It's not as polished as CUDA, but for inference via Ollama and llama.cpp, it's production-ready on Linux as of early 2026.

The RX 9070 XT uses AMD's RDNA 4 architecture. ROCm support for RDNA 4 shipped with ROCm 6.4.2 (via Radeon Software for Linux), and ROCm 7.2 expanded consumer GPU support further. Here's where things stand:

  • Ollama: Native ROCm support. Works out of the box on Ubuntu 24.04. This is how most Tinybox owners will run inference.
  • llama.cpp: ROCm backend is stable. Build with the HIP backend enabled (cmake -DGGML_HIP=ON on current releases; older builds used the LLAMA_HIPBLAS=1 make flag). Performance is within 5-10% of the CUDA equivalent.
  • vLLM: ROCm support exists but is less mature. If you need serving infrastructure, expect some rough edges.
  • PyTorch: ROCm builds available. Training workloads work but may require occasional workarounds.

The honest assessment: if you're comfortable on Linux and primarily running inference, ROCm is fine. If you need bleeding-edge training features or Windows support, NVIDIA is still easier. The Tinybox ships with Ubuntu 24.04 and ROCm pre-configured — but so can your DIY build with 30 minutes of setup.
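
If you go DIY, a quick sanity check confirms the stack sees all four cards. This assumes a ROCm build of PyTorch, which exposes HIP devices through the familiar cuda API:

    # Verify a ROCm build of PyTorch sees all four GPUs
    import torch

    print(torch.version.hip)             # HIP/ROCm version string on ROCm builds (None on CUDA builds)
    print(torch.cuda.is_available())     # True on ROCm too; HIP devices ride the cuda API
    print(torch.cuda.device_count())     # should report 4
    for i in range(torch.cuda.device_count()):
        print(torch.cuda.get_device_name(i))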

Related: RTX 3090 vs RX 9060 XT for Local LLMs — our full AMD vs NVIDIA breakdown.


Who Should Buy the Tinybox Red (And Who Shouldn't)

The Tinybox is a good product at a premium price. Whether it's right for you depends on one question: is $3,000-$5,300 worth 10-15 hours of your time?

Buy It If You Value Time Over Money

The Tinybox makes sense if you're:

  • A professional or team that needs a working multi-GPU box this week, not next month. Your hourly rate makes DIY savings irrelevant.
  • A researcher or ML engineer who wants to focus on models, not hardware. The pre-configured ROCm + tinygrad stack saves real setup time.
  • Someone who has never built a PC. A four-GPU build is not a beginner project. If your first build is a $7,000 multi-GPU system, the risk of an expensive mistake is real.

For these buyers, $12,000 for a tested, supported system is reasonable. The Tinybox earns its markup through time savings and reduced risk.

Build It If You Want Control and Savings

The DIY build makes sense if you're:

  • An experienced builder who's comfortable with BIOS configuration, PCIe lane allocation, and Linux setup. The assembly is straightforward if you've done it before.
  • Budget-conscious. $3,000-$5,300 in savings buys another GPU, a year of electricity, or a significant chunk of your next upgrade.
  • A power user who wants customization. Choose your own case, cooling solution, storage configuration, and OS. The Tinybox ships with fixed specs.
  • A gamer who also runs local AI. Four RX 9070 XTs can do double duty. The Tinybox's rack-mount form factor isn't gaming-friendly.

For these buyers, the DIY path is clearly better. You get the same performance, more flexibility, and thousands in savings.

Not sure where to start? See our Local LLM Hardware Upgrade Ladder — the step-by-step path from your first GPU to a full multi-GPU build.

Skip Both If You Don't Need 4 GPUs

Most local AI users don't need four GPUs. If you're running models under 30B parameters, a single RX 9070 XT ($600-$730 as of March 2026) or RTX 3090 handles it fine. Two GPUs cover 70B models at Q4 comfortably.

Four GPUs are for:

  • Running 70B-class models at higher-precision quantizations (Q6, or Q8 on larger VRAM pools)
  • Serving multiple users simultaneously
  • Running 120B+ parameter models locally
  • Research workloads that need maximum VRAM

If none of those describe you, spending $7,000-$12,000 on a quad-GPU build is overkill. Start with one or two GPUs and scale up when you hit VRAM limits.

Need just one GPU? Read RTX 5060 Ti 8GB vs 16GB for Local LLMs before buying.


FAQ — Tinybox Red vs DIY Multi-GPU Builds (2026)

Can I use regular Threadripper instead of Threadripper PRO for four GPUs?

No. Regular Threadripper 7000 provides 48 PCIe lanes. Four GPUs at x16 each need 64 lanes minimum, plus you need lanes for NVMe and chipset. Threadripper PRO 7000 WX with 128 PCIe 5.0 lanes is the minimum platform for a proper four-GPU build.

Is the Tinybox Red v2 currently available?

As of March 2026, the Tinybox Red v2 is listed as out of stock on the tinycorp shop. There's no publicly announced restock date. If you need a multi-GPU system now, DIY is your only option regardless of budget preference.

Can I use NVIDIA GPUs instead of AMD for a similar DIY build?

Yes, but the math changes significantly. Four RTX 4090s (24 GB each, 96 GB total) would cost $7,200-$8,000 for the GPUs alone (as of March 2026), putting the total build at $11,000-$13,000. You'd get CUDA instead of ROCm and 50% more VRAM, but lose the cost advantage over the Tinybox.

Does the Tinybox come with a warranty?

Tinycorp offers support for Tinybox purchases through their website. Specific warranty terms vary — check tinygrad.org for current policy. A DIY build relies on individual component warranties (typically three to five years for GPUs and PSUs from major manufacturers).


Verdict — Build Your Own Unless Time Is Your Bottleneck

We recommend the DIY build for most readers. The $3,000-$5,300 savings is real money, the hardware is identical, and the assembly is manageable for anyone who's built a PC before.

The Tinybox Red v2 is a legitimate product — it's not overpriced for what it is, and the engineering is solid. But the convenience premium only makes sense if your time is worth more than the savings, or if you genuinely can't build a PC.

Our pick for the optimal DIY configuration: the Gigabyte MH53-G40 motherboard, TR PRO 7965WX from Micro Center, 4x ASRock Challenger RX 9070 XT, and dual Seasonic PRIME PX-1600 PSUs. Total: ~$6,900 as of March 2026. That's $5,100 less than the Tinybox for the same inference performance.

Save the difference. Put it toward electricity costs, a fifth GPU down the road, or the next generation of hardware.

Want to see how dual-GPU stacks compare? Read our Dual-GPU Local LLM Stack Guide.
