Ellie Garcia

Hardware Reviews · Benchmarks · Boston, MA

Manufacturer specs look impressive until you load a 70B model, hit an OOM error, and realize the memory bandwidth was the bottleneck nobody mentioned.

Ellie tests GPUs under real local AI workloads, the same tasks you're actually running. Her reviews specifically probe the failure points that don't appear in official spec sheets: thermal throttling under sustained inference, memory bandwidth saturation, and quantization trade-offs.

Editorial disclosure: Ellie is an editorial persona of the CraftRigs AI-assisted editorial team — a consistent beat and methodology, not an individual human reviewer. How our research and sourcing works: How CraftRigs Works.

42 Articles Published

40 Reviews

Member Since Feb 2026

Latest from Ellie

Review

GPU Under $1000: RTX 3090 vs Arc Pro B70 vs RX 9070 XT for Local LLMs

RTX 3090 used, Arc Pro B70 ($949), and RX 9070 XT ($649) compared: real tok/s benchmarks, cost-per-token math, and ecosystem support for local LLM inference.

Apr 26, 2026

Review

Used RTX 3090 Buyer's Checklist 2026 — Inspection, Red Flags, eBay/Jawa Price Tracker

Stop buying dead mining cards. Our $480 used RTX 3090 checklist includes a 10-minute VRAM stress test that catches most thermal damage before you pay.

Apr 23, 2026

Benchmark

Apple Silicon M-Series LLM Benchmark: M1 Through M5 tok/s Comparison

Your M2 Air chokes at 0.4 tok/s while M4 Max hits 18.4 tok/s — 46x gap. We sourced 340+ benchmarks to name the exact tier you need.

Apr 18, 2026

Benchmark

Intel Arc B-Series for Local LLMs: Real Benchmarks vs NVIDIA Budget Cards

RTX 4060 8 GB hits OOM at 13B models — Arc B580's 12 GB runs them native at 38 tok/s, but vLLM XPU needs Linux. Real MLPerf numbers inside.

Apr 18, 2026

Review

RTX 5060 Ti 16GB LLM Verdict: What April 16 Reviews Tell Local AI Users [2026]

16GB GDDR7 sounds perfect — but street price is $549 and the RTX 3090 beats it on bandwidth. Here's which LLMs actually fit and whether to buy now.

Apr 16, 2026

Review

RTX 5060 Ti 16GB Local LLM Review: Real Inference Results [2026]

16GB GDDR7 at $549 street — but a used RTX 3090 has double the bandwidth. Here's which LLMs actually fit and whether the math works for budget builders.

Apr 15, 2026

Review

RTX 5060 Ti 8GB Honest Review: Real VRAM Limits for Local LLMs

Benchmark RTX 5060 Ti 8GB on 13B-70B models. See why 8GB hits the ceiling for Llama, Qwen, and Mistral at Q4 quantization. Driver story included.

Apr 4, 2026

Review

RTX 5070 Review: The Dual-Purpose GPU for LLM + Gaming

RTX 5070 12GB GDDR7 review. Real tok/s on 34B models, DLSS 5 gaming FPS, and whether one GPU can handle both local LLM and 4K gaming.

Apr 4, 2026

[Image: RX 9060 XT 16GB local LLM performance comparison: Phi-2 94.5 tok/s, Mistral 65.1 tok/s, Llama 3 53.2 tok/s, best value verdict vs RTX 5060 Ti]

Review

RX 9060 XT 16GB Review: Budget AMD GPU for Local LLMs [2026]

RX 9060 XT runs Llama 14B at 53 tok/s on AMD's cheapest 16GB card. ROCm 7.0.2+ required, $80 cheaper than RTX 5060 Ti. When to buy, when to skip.

Apr 3, 2026

[Image: RX 9070 XT 16GB local LLM performance: 58–72 tok/s on 13B models, 28–32 tok/s on 70B, 5–10% slower than RTX 5070 Ti with $100–150 savings, verdict 7/10]

Review

RX 9070 XT Review: Can AMD Finally Beat NVIDIA at Local LLMs?

RX 9070 XT with ROCm 7 runs Llama 3.1 70B via llama.cpp, but Ollama support lags NVIDIA. Real benchmarks, honest verdict on whether AMD's $719 card matches RTX 5070 Ti's $749 performance.

Apr 3, 2026

Review

Ryzen 7 9800X3D Local LLM CPU Review: 3D V-Cache Changes Hybrid Inference

Ryzen 7 9800X3D with 96MB 3D V-Cache handles 70B model layer offload 40% faster than older CPUs. Perfect for budget builders stuck on mid-tier GPUs. $429-449 as of April 2026.

Apr 3, 2026

Review

Ryzen 9 9950X for Local LLM Builders: Raw Cores vs. Smart Cache

16-core CPU for fine-tuning and quantization — but is it worth $650 when a used 7950X3D costs $400 and the 9950X3D launches in 3 weeks? Real-world breakdown.

Apr 3, 2026

Review

The $2,000 Local LLM Build: RTX 5060 Ti 16GB + Ryzen 7 9800X3D

Want a usable local AI rig for under $2,000? We run the RTX 5060 Ti 16GB + 9800X3D combo on real models. Here's what it actually runs well, and where budget builds hit their limits.

Apr 2, 2026

Review

$4,500 Dual-GPU AI Workstation for 70B Inference (Realistic Build 2026)

Build a $4,500 dual-GPU workstation that runs Llama 3.1 70B at high quality. Complete parts list, real benchmarks with vLLM/Ollama, and honest assessment of quantization tradeoffs.

Apr 2, 2026

Review

AnythingLLM for Local LLM: Building Production RAG Without Vendor Lock-In

AnythingLLM combines document retrieval + local model control in one platform. Self-hosted RAG with Ollama, offline-first, no cloud dependency. 2026 review.

Apr 2, 2026

Review

ASUS NUC 14 Pro 64GB Review: Silent Local LLMs Without a GPU Tower

Want silent local AI without a tower? ASUS NUC 14 Pro with 64GB DDR5 and Intel Arc runs 7B–8B models quietly—reviewed with real-world inference tests. Compact, privacy-first alternative to cloud APIs.

Apr 2, 2026

Review

DDR5 RAM for Local LLM Builds: Bandwidth Over Hype

Stop buying RAM by MHz. Bandwidth — not clock speed — moves inference. Here's which DDR5 kits matter for AI and which are marketing hype.

Apr 2, 2026

Review

Best NVMe SSD for AI Model Storage in 2026 — Load 70B Fast

Slow GGUF loading? Benchmarks reveal which NVMe SSDs actually speed up model loads—skip overpriced drives without losing performance. PCIe 4.0 vs 5.0 tested.

Apr 2, 2026

Review

GMKtec EVO X2 Review: 96GB Unified Memory for Local LLMs [Honest Verdict]

EVO X2 with Ryzen AI Max+ 395 runs 70B models locally at $1,799, but only at 3–13 tokens/sec. Silence and flexibility beat raw speed. Here's whether it's worth it vs RTX 4080 SUPER.

Apr 2, 2026

Review

Intel Arc B580 12GB Local LLM Review: Budget GPU, Real Performance [2026]

Shopping for a sub-$300 local LLM GPU? Arc B580 gives 12GB for $249 — real tok/s benchmarks on Llama 7B, Qwen models vs RTX 4060. Honest take on Vulkan quirks included.

Apr 2, 2026

Review

Intel Arc Pro B70 Review: 32GB GPU for Professional Local LLM Inference

Arc Pro B70 delivers 32GB VRAM at $949 for professional inference workloads. It is Intel's first challenge to NVIDIA's pro GPU monopoly. Intel's oneAPI stability concerns vs. the proven CUDA ecosystem: verdict inside.

Apr 2, 2026

Review

Intel Core Ultra 9 285K for Local LLM Builds: Hybrid Inference CPU Tested

Intel Core Ultra 9 285K delivers solid CPU inference at $475–$535. Tested against Ryzen 9 9950X on 8B/13B models. Worth the upgrade? Real benchmarks inside.

Apr 2, 2026

Review

Jan.ai: Open-Source Privacy-First LLM Frontend for Local Builders

Jan.ai is a free, open-source desktop frontend for running local LLMs with zero cloud dependency. Privacy-first architecture, clean UI, and minimal overhead—the simplest way to own your AI conversations in 2026.

Apr 2, 2026

Review

LM Studio Review: Best GUI for Local LLMs in 2026 [Tested]

Terminal-free local LLM setup with model browser and one-click download — but 15–20% slower than Ollama. Worth it only if you hate the command line.

Apr 2, 2026

Review

Mac Studio M4 Max: Silent Unified Memory for Local AI (Up to 32B Models)

Mac Studio M4 Max with 128GB unified memory runs 30B+ models silently. Slower than RTX 5090 on 70B inference, but no external GPUs needed. Unified memory deep dive, real benchmarks, and the honest verdict on price.

Apr 2, 2026

Review

Mac Mini M4 16GB: Silent Local AI Machine, Honest Limits Included

Mac Mini M4 runs Llama 8B at 30 tok/s for just $599 all-in. Silent, no setup, Apple ecosystem. But 13B models get slow, and 70B needs M4 Pro. Here's what you actually get.

Apr 2, 2026

Review

Minisforum MS-A1 Mini PC for Local AI: Quiet, Compact, But Not Fast

Need a silent, compact local AI PC under $900? Minisforum MS-A1 runs 7B–13B models on integrated GPU. Real benchmarks versus Intel NUC — honest verdict on whether form factor justifies the speed trade-off.

Apr 2, 2026

Review

Ollama Review 2026: Still the Best Way to Run Local Models?

Free, simple, and fast—but is Ollama still the right choice in 2026? Real pros and cons vs LM Studio and vLLM, plus when to use each.

Apr 2, 2026

Review

Open WebUI 2026: Free ChatGPT Interface for Your Local Stack

Want ChatGPT's interface without cloud lock-in? Open WebUI runs on your hardware, free, with vision, RAG, and multimodal support. Setup in 5 minutes. Honest review + verdict inside.

Apr 2, 2026

Review

RTX 3090 for Local LLMs in 2026: Is $900 Used Really a Deal?

RTX 3090 used GPU review: 24GB VRAM for 70B models but now $800–1,000 on the secondhand market. Real tok/s with CPU offload, comparison to RTX 5070 Ti and RTX 4090 used. Should you buy in 2026?

Apr 2, 2026

Review

RTX 4090 Used Market: 24GB GPU for 70B Models in 2026

Shopping used RTX 4090 for local 70B inference? Specs, real-world benchmarks at Q4_K_M quantization, and verification tips to avoid bad buys — April 2026 update.

Apr 2, 2026

Review

RTX 5060 Ti 16GB Review: Best Budget LLM GPU at $429 [2026]

Most $400 GPUs cap at 8GB — the 5060 Ti 16GB doubles that at the same price. Runs 14B daily at 95 tok/s. Is it better than a used RTX 3090?

Apr 2, 2026

Review

RTX 5060 Ti 8GB: Budget Local LLMs, Hard Reality Check

RTX 5060 Ti 8GB runs 7B models fast but hits the wall hard at 13B. $379 entry point is tempting—just know your ceiling before buying.

Apr 2, 2026

Review

RTX 5070 Local LLM Review: The Budget Pick With Real Speed Limits

RTX 5070 delivers 12GB GDDR7 at $549 — faster than RTX 4070 Ti for inference. But can it really run 70B? Here's the reality. Spoiler: 27B is the sweet spot.

Apr 2, 2026

Review

RTX 5070 Ti Review: The 16GB GPU That Finally Closes the VRAM Gap [2026 Tested]

RTX 5070 Ti 16GB review for local LLMs. Runs 70B models at 40+ tok/s, costs $250 less than RTX 5080, and delivers best value for power users. Benchmarked vs RTX 5080 and RTX 5060 Ti.

Apr 2, 2026

Review

RTX 5080 Local LLM Review: 30B at 30 tok/s, 70B Won't Fit

RTX 5080 hits 25–30 tok/s on 30B models—but 70B Q4 won't fit in 16GB VRAM. Is the $250 premium over RTX 5070 Ti worth it? Here's the answer.

Apr 2, 2026

Review

RTX 5090 for Local LLMs: Is 32GB VRAM Worth $2,000?

RTX 5090 dominates 70B model inference with 32GB GDDR7. Real benchmarks vs 5080, honest verdict on whether flagship VRAM is necessity or luxury for local AI.

Apr 2, 2026

Review

GMKtec EVO-X2 Review: 128GB Ryzen AI Max Mini PC for Local LLMs

Full review of the GMKtec EVO-X2 with Ryzen AI Max+ 395 and 128GB LPDDR5x. Real-world LLM performance, Linux setup, GTT memory allocation, and value verdict.

Mar 12, 2026

Review

Intel Arc B580 for Local LLMs: Best 12GB Card Under $300, With Caveats

Arc B580 offers 12GB VRAM at $249 — nothing else comes close at that price. Real benchmarks show 20–30% SYCL overhead vs CUDA. Here's who should buy it and who should skip it.

Mar 12, 2026

Review

Best Monitors for AI Development Workflows

Ultrawide monitors make local AI development significantly more productive. Here's what to look for and which models to buy in 2026.

Mar 8, 2026

Review

Liquid Cooling for AI Workstations: Worth the Hassle?

Do you need a custom loop or AIO cooler for a local LLM rig? The honest answer depends on one thing: how many GPUs you're running.

Mar 8, 2026

Review

5 Budget GPUs Under $300 That Can Actually Run Local LLMs

Under $300, your options are limited but workable. The Arc B580 wins at this tier — nothing else gives you 12GB VRAM at competitive bandwidth for less.

Feb 25, 2026