CraftRigs

Georgia Thomas

Setup Guides · Troubleshooting · Workflows · Benchmarks
Austin, TX

Most local AI setup guides assume things just work. They don't. ROCm refuses to detect your GPU, Ollama throws cryptic CUDA errors, and the GitHub issue thread is 200 comments deep with no resolution.

Georgia writes step-by-step guides that end the debugging spiral. She tests on the same used RTX 3090s and budget builds her readers run, because a guide that only works on a $6,000 workstation isn't a guide.

Editorial disclosure: Georgia is an editorial persona of the CraftRigs AI-assisted editorial team — a consistent beat and methodology, not an individual human reviewer. How our research and sourcing works: How CraftRigs Works.
Articles published: 259
Setup guides: 219
Member since: Nov 2025

Latest from Georgia

Guide

70B CPU Inference: 2–7 tok/s Without Buying GPUs

Your Threadripper already owns the 70B path—CPU-only hits 2.1–7.1 tok/s on DDR5 bandwidth, beats dual RTX 3090 cost for batch jobs, and leaves GPU free. Honest benchmarks, no GPU required.

May 22, 2026
Guide

ROCm 7.2 GPU Matrix 2026: Windows vs Linux

Wrong driver kills AMD ROCm on Windows—ROCDXG not Adrenalin, clean install required. Full consumer GPU matrix with Linux native, WSL2 grades, and honest tok/s numbers. Install right, skip the forum archaeology.

May 22, 2026
Guide

Mac Studio Axed, M5 Ultra Delayed: Buy M4 Now?

Mac Studio 128GB killed in May 2026, M5 Ultra pushed to Q4—M4 Max 96GB is your only 70B Q8_0 option until 2027. We map every tier, benchmark AMD's Strix Halo rival, and tell you whether to buy or wait.

May 22, 2026
Guide

RTX 5060–5090 Street Prices Exposed: Buy or Wait?

NVIDIA cut RTX 50-series production 40%—street prices run 18–67% over MSRP. RTX 5060 Ti 16GB at $485 beats the stack, but used RTX 3090 at $500 undercuts everything. Match card to budget before Computex hype resets the board.

May 22, 2026
Guide

Arc B580 Stuck at 13 tok/s? IPEX-LLM Unlocks 70

Your $250 Arc B580 runs Mistral 7B at 13 tok/s via Vulkan—leaving XMX acceleration idle. IPEX-LLM Docker hits 70 tok/s with full setup steps for Linux and Windows. Stop running at 20% speed.

May 22, 2026
Guide

LM Studio 0 GPUs: Fix for CUDA, ROCm, WSL2 (2026)

LM Studio shows 0 GPUs detected? NVIDIA CUDA 12.2, AMD ROCDXG May 2026 driver, and WSL2 passthrough each have distinct fixes—most are driver mismatches, not hardware failure. Run the 60-second pre-flight, then follow your platform path.

May 22, 2026
Guide

Open WebUI Pipelines: Wire Any LLM Backend (2026)

Native Ollama locks you to one machine—Pipelines unlocks llama.cpp, remote Ollama, and custom APIs in the same chat UI. New April 2026 Desktop App auto-recovers GPU crashes. Wire your backend once, switch models freely.

May 22, 2026
Guide

VRAM Shortage 2026: Buy Used, Not New

HBM and GDDR7 shortages keep RTX 50-series prices 40% above MSRP. Used RTX 3090 24GB at $500 beats new cards on $/GB-VRAM. Here's how to navigate the chaos and buy smart.

May 22, 2026
Guide

GMKtec EVO-X2 Memory Bandwidth: 256 GB/s, Qwen 3.6 Speed

EVO-X2: 256 GB/s unified memory (273 GB/s observed). Delivers 14–18 tok/s on Qwen 3.6 35B Q4_K_M. Compare Mac Mini M4: 120 GB/s. See specs, thermals, BIOS quirks, $1,500–$2,200 pricing, and when to buy.

May 8, 2026
Guide

Phi-4 14B Q4_K_M: VRAM & GPU Fit Guide

Phi-4 14B at Q4_K_M won't fit an 8GB card with usable context, but 12GB handles 8K and 16GB+ handles 32K. Exact VRAM per quant tier, GPU fit table, and decode benchmarks.

May 7, 2026
Guide

Best GGUF Coding Model 2026

Gave up on local coding? Qwen 3.6 27B Q5 hits 94% accuracy, Reddit consensus May 2026. DeepSeek V4: cost-collapse. Phi-4 14B: 12GB fit. Match model to hardware, not guesses.

May 6, 2026
Guide

Used RTX 3090 scams: 4 red flags + eBay escape plan

Burned-out mining cards and relabeled 3080s cost buyers $300–$450—eBay resolves only ~75% of GPU disputes in 10–30 days. Demand GPU-Z sensors, timestamped cuda-memtest video, and 30-min llama.cpp burn-in before you buy.

May 6, 2026
Guide

Qwen3.6 MoE on 16 GB: --n-cpu-moe Fixes OOM

16 GB GPU chokes on Qwen3.6 MoE—OOM or 2 tok/s without flags. --n-cpu-moe 20 --split-mode row hits 18–28 tok/s at 11.2 GB VRAM, but --fit on alone degrades 40–60%. Pin experts first, or don't run it.

May 6, 2026
Guide

8GB VRAM Too Small? Qwen 3.5 9B Hits 12.4 tok/s—Here's How

Stuck with 8GB VRAM? Qwen 3.5 9B at Q4_K_M runs 12.4 tok/s on RTX 4060, 8.7 tok/s on RTX 3060—verified benchmarks, exact quants, and copy-paste configs for code, chat, and agents. Stop waiting for a GPU upgrade and start running local LLMs today.

May 5, 2026
Guide

CUDA vs ROCm OOM: Why AMD crashes differently

AMD GPU OOM errors hide 3.2GB of dark matter VRAM that CUDA doesn't. Learn the 4 symptom signatures, 8 diagnostic commands, and ROCm-specific fixes that NVIDIA guides never include.

May 5, 2026