
Build a Local LLM PC for Under $500: What You Can Actually Run

By Georgia Thomas · 5 min read

Some links on this page may be affiliate links. We disclose it because you deserve to know, not because it changes anything. Every recommendation here comes from benchmarks, not budgets.

Quick Summary

  • Target budget: $480-520 total, using used GPU + new mid-range components
  • Best GPU option: Used RTX 3060 12GB ($180) for CUDA compatibility; RX 6700 XT 12GB ($200) as AMD alternative
  • Honest ceiling: Llama 3.1 8B at ~35-40 t/s, Phi-4 14B (slower), no 70B — this is a single-user interactive rig, not a batch workstation

If you want to run local LLMs and your total budget is $500, you can build a capable rig — not a fast one, but a functional one. The used GPU market makes this possible in 2026. A used RTX 3060 12GB costs $150-180 on eBay. That's 12GB of CUDA VRAM at a price that leaves room for a complete system.

Here's what the build looks like, what it runs, and where it hits the wall.

The GPU: Your Most Important Decision

The GPU dominates both cost and capability in a local LLM build. At the $500 total budget, you have roughly $150-200 to spend on the GPU. Two cards stand out:

Used RTX 3060 12GB — $150-180

The RTX 3060 12GB is the recommended pick. Here's why it stands out in the budget tier:

12GB VRAM in a mainstream card. NVIDIA made an unusual decision with the 3060: it got 12GB, while the more expensive RTX 3060 Ti got only 8GB. For local LLM use, that means the 3060 12GB beats the 3060 Ti on model capacity despite costing less.

CUDA support, zero friction. llama.cpp, Ollama, LM Studio, ComfyUI — everything works. No driver wrestling, no backend configuration, no compatibility surprises.
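For a sense of how little glue code that means in practice, here's a minimal sketch using Ollama's Python client. It assumes pip install ollama, a running Ollama server, and a model already pulled; the model tag and prompt are just examples:

    # Minimal sketch: one chat turn against a local model via Ollama's Python client.
    # Assumes the Ollama server is running and llama3.1:8b has been pulled.
    import ollama

    response = ollama.chat(
        model="llama3.1:8b",  # any pulled model tag works here
        messages=[{"role": "user", "content": "Explain KV cache in two sentences."}],
    )
    print(response["message"]["content"])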

Reasonable bandwidth. Token generation is largely memory-bandwidth-bound, and at 360 GB/s the 3060 isn't fast by 2026 standards, but it's adequate for 8B and 13B models.

Benchmark: Llama 3.1 8B Q4_K_M at ~35-40 tokens/second. Interactive, usable, not frustrating.
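That number is easy to verify on your own card. Here's a rough timing harness in llama-cpp-python, assuming pip install llama-cpp-python built with CUDA support and a GGUF file on disk (the path is a placeholder):

    # Rough tokens/second measurement with llama-cpp-python.
    # The model path is a placeholder; n_gpu_layers=-1 offloads every layer to the GPU.
    import time
    from llama_cpp import Llama

    llm = Llama(model_path="./llama-3.1-8b-q4_k_m.gguf", n_gpu_layers=-1, verbose=False)

    start = time.perf_counter()
    out = llm("Write a haiku about VRAM.", max_tokens=128)
    elapsed = time.perf_counter() - start

    n_tokens = out["usage"]["completion_tokens"]
    print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} t/s")

As a sanity check, decode speed is roughly bounded by memory bandwidth divided by model size: 360 GB/s over a ~4.9GB Q4_K_M file gives a theoretical ceiling near 73 t/s, so 35-40 t/s in the real world is plausible.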

The risk with used 3060s: check for mining history. Many 3060s were used for Ethereum mining before the merge. GPU-Z can reveal power limit modifications and usage patterns. Buy from sellers with return policies and check eBay seller ratings carefully.

RX 6700 XT 12GB — $180-200

AMD's 12GB option at similar pricing, with slightly higher memory bandwidth than the RTX 3060. The catch: ROCm, AMD's CUDA equivalent, is effectively Linux-only. On Linux with ROCm properly configured, llama.cpp performance is competitive. On Windows, you're limited to llama.cpp's Vulkan backend or CPU fallback, both noticeably slower.

If you're building on Linux and comfortable with ROCm setup, the 6700 XT is a viable alternative. If you're on Windows or want simplicity, stick with the RTX 3060.

Full Build Parts List

  • GPU — used RTX 3060 12GB: ~$180
  • CPU — AMD Ryzen 5 5600X: ~$100
  • Motherboard — B550: ~$80
  • RAM — 32GB (2x16GB) DDR4-3200: ~$60
  • SSD — 500GB NVMe: ~$50
  • Case: ~$50
  • PSU: ~$40
  • Total: ~$560

You can trim this below $500 by:

  • Starting with 16GB RAM and upgrading later — saves $30
  • Buying the CPU used — a used Ryzen 5 5600X runs ~$75-80
  • Choosing a smaller SSD — 256GB covers the OS plus a few models

Why Ryzen 5 5600X? It's the performance-per-dollar leader on the AM4 platform in 2026. AM4 is a mature platform with a huge used parts ecosystem. The 5600X pairs well with any B550 board and DDR4 — and DDR4 is the budget choice right now. DDR5 systems command a significant price premium that doesn't make sense at this budget tier.

32GB RAM matters. When a model doesn't fit entirely in VRAM, llama.cpp keeps the leftover layers in system RAM and runs them on the CPU. 32GB gives you room to spill a few layers of a 13B or 22B model rather than failing to load, and it keeps the OS from swapping under normal workloads.
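Here's what that partial offload looks like as a minimal llama-cpp-python sketch; the path and layer count are placeholders you'd tune per model (a 13B model has roughly 40 transformer layers):

    # Partial offload sketch: most layers on the GPU, the remainder in system RAM.
    # Placeholder path; 35 of ~40 layers on GPU is a plausible fit for 12GB of VRAM.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./model-13b-q4_k_m.gguf",  # placeholder
        n_gpu_layers=35,  # layers beyond this count run on the CPU from system RAM
        n_ctx=4096,       # context also consumes VRAM; reduce it if loading fails
    )
    print(llm("Hello", max_tokens=16)["choices"][0]["text"])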

What This Build Actually Runs

Runs well (30+ tokens/second, interactive quality)

  • Llama 3.1 8B Q4_K_M — fits fully in VRAM, 35-40 t/s
  • Mistral 7B — similar performance to Llama 8B
  • Phi-4 Mini (3.8B) — fast, good for code assistance
  • Gemma 3 4B — Google's small model, runs clean

Runs, but slowly (15-30 t/s or with CPU offloading)

  • Phi-4 14B Q4_K_M — tight in 12GB; you may need Q3_K_M to fit fully, and it runs slower at ~20-25 t/s
  • Llama 2 13B Q4_K_M — fits in 12GB (the model file is ~8GB), runs at ~25-30 t/s
  • Mistral 22B Q3_K_M — partial CPU offloading required, expect 10-15 t/s

Won't run usably on this build

  • Llama 3 70B — the Q4_K_M weights alone are ~40GB, far beyond 12GB of VRAM; CPU offloading into 32GB of system RAM works technically but at 2-5 t/s, which is painful for interactive use (see the sizing sketch after this list)
  • Any 30B+ model at Q4_K_M — same problem
  • Batch inference / parallel requests — this build is single-user interactive only
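The arithmetic behind these cutoffs is simple enough to check yourself: weight memory is roughly parameter count times bits per weight, plus overhead for the KV cache and buffers. Here's a back-of-envelope helper, where the 4.8 bits-per-weight figure for Q4_K_M and the 20% overhead factor are rough assumptions rather than measured constants:

    # Back-of-envelope VRAM estimate: params * bits/8, plus ~20% for KV cache and buffers.
    def est_vram_gb(params_b: float, bits_per_weight: float = 4.8, overhead: float = 1.2) -> float:
        weights_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
        return weights_gb * overhead

    for name, params in [("8B", 8), ("13B", 13), ("70B", 70)]:
        print(f"{name} Q4_K_M: ~{est_vram_gb(params):.0f} GB")
    # 8B  -> ~6 GB   (fits in 12GB with room for context)
    # 13B -> ~9 GB   (tight but workable)
    # 70B -> ~50 GB  (hopeless without heavy CPU offload)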

The Honest Verdict

This is a real local LLM machine. Llama 3.1 8B at 35-40 tokens per second is a comfortable interactive experience — fast enough that you're not watching a cursor blink. Mistral 7B for code assistance, Phi-4 Mini for quick tasks, and occasional 13B model use are all legitimate use cases at this budget.

What it isn't: a development workstation, a batch processing rig, or a machine for experimenting with frontier models. If those requirements matter, see the full local AI budget guide across every price tier or our $1,200 local LLM build guide — the $1,200 tier gets you significantly more headroom.

The build also isn't futureproofed. The Ryzen 5 5600X sits on AM4, a mature platform rather than a growth platform, and its DDR4 memory won't carry over to newer DDR5-only CPUs. But at $500, you're buying today's capability at today's prices, not building for five years from now.

For pure budget GPU analysis, see the best GPUs under $300 for local LLM use. For a detailed comparison of the RTX 3060 12GB against the RTX 4060 Ti 16GB at the next tier up, see our RTX 4060 Ti 16GB vs RTX 3060 12GB comparison.

Shopping Notes

eBay: Best source for used RTX 3060s. Search "RTX 3060 12GB" and filter completed listings to gauge real prices. Avoid listings with "as is" or "for parts" disclaimers. Buy from sellers with 100+ positive feedback and returns accepted.

GPU condition check: When the card arrives, run GPU-Z (Windows) or nvidia-smi immediately. Check core clock, memory clock, and temperature under load. Then run a 30-minute llama.cpp inference session and monitor temps: 80C+ under sustained load may indicate thermal paste degradation.
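If you'd rather log temperatures than watch them, nvidia-smi's query mode is scriptable. Here's a small polling sketch using standard nvidia-smi query fields; the interval and duration are arbitrary:

    # Poll GPU temperature, clock, and memory use while a long inference session runs.
    import subprocess, time

    QUERY = ["nvidia-smi",
             "--query-gpu=temperature.gpu,clocks.sm,memory.used",
             "--format=csv,noheader"]

    for _ in range(60):  # ~30 minutes at 30-second intervals
        print(subprocess.check_output(QUERY, text=True).strip())
        time.sleep(30)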

RAM pricing: DDR4 32GB kits (2x16GB) run $55-70 in early 2026. Don't pay more than $70 — prices are stable and kits are widely available. Any DDR4-3200 CL16 kit works fine for this build.

