CraftRigs
Architecture Guide

XMP and EXPO for Local LLMs: Enable It or Ignore It?

By Georgia Thomas 5 min read

Some links on this page may be affiliate links. We disclose it because you deserve to know, not because it changes anything. Every recommendation here comes from benchmarks, not budgets.

TL;DR: Always enable XMP (Intel) or EXPO (AMD). Without it, your RAM runs well below its rated speed: DDR5-6000 kits default to DDR5-4800, about 80% of the rated transfer rate, and with looser timings. The performance gain matters specifically for CPU offloading; it's irrelevant for pure GPU inference. Enabling a profile takes 30 seconds in BIOS and costs nothing. There's no good reason not to.

The Problem Most People Don't Know About

You just installed a kit of DDR5-6000 CL30 RAM. You paid extra for it. And right now, unless you did something in your BIOS, it's running at DDR5-4800 with CL40 timings.

That's because DDR5's JEDEC standard base speed is 4800 MT/s. Your kit is certified and configured to run faster (up to 6000 MT/s with specific timing values), but that configuration is stored in a profile on the RAM stick. Intel calls this XMP (Extreme Memory Profile), AMD calls it EXPO (Extended Profiles for Overclocking). The BIOS needs to read and apply that profile to unlock the rated speed.

This isn't overclocking in the dangerous sense. The RAM manufacturer tested and validated that profile. It's just... not the default because JEDEC compliance requires conservative base speeds.

What XMP and EXPO Actually Do

Both technologies work the same way: the RAM stick stores one or more pre-configured frequency/timing/voltage combinations that the BIOS can load with a single setting change.

XMP 3.0 (Intel): supports up to 5 profiles per stick, 3 set by the vendor and 2 that the user can write. Profile 1 is typically the rated speed (e.g., DDR5-6000 CL30). Vendor profiles 2 and 3 might offer different speed/latency tradeoffs; some kits include a "safe" lower profile for compatibility.

EXPO (AMD): AMD's alternative, designed for the AM5 platform. Functionally equivalent: one click gets you to rated speed. G.Skill, Kingston, Corsair, and other major brands ship EXPO profiles on kits marketed for AMD.

Some kits ship with both XMP and EXPO profiles, which means they work correctly on either platform with one-click enabling.

Note

DDR5's base JEDEC speed (4800 MT/s) isn't just slower; the timings are also looser. A DDR5-4800 JEDEC spec runs at roughly CL40, while a DDR5-6000 EXPO/XMP kit runs CL30 at the rated speed. That works out to 25% more theoretical peak bandwidth (6000 vs 4800 MT/s) and a CAS latency that drops from about 16.7 ns to 10 ns, and both help in CPU-bound scenarios.
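As a back-of-the-envelope check, here is a minimal sketch of the theoretical peak bandwidth and CAS latency for the two configurations. It assumes a dual-channel setup with a 64-bit (8-byte) data path per channel; the function names are made up for illustration, and real-world sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: transfers/s x bytes per transfer x channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def cas_latency_ns(mt_per_s: int, cl: int) -> float:
    """CAS latency in nanoseconds. DDR transfers twice per clock, so clock MHz = MT/s / 2."""
    clock_mhz = mt_per_s / 2
    return cl / clock_mhz * 1000


for name, speed, cl in [("DDR5-4800 CL40 (JEDEC)", 4800, 40),
                        ("DDR5-6000 CL30 (profile)", 6000, 30)]:
    print(f"{name}: {peak_bandwidth_gbs(speed):.1f} GB/s peak, "
          f"{cas_latency_ns(speed, cl):.2f} ns CAS")
# DDR5-4800 CL40 (JEDEC): 76.8 GB/s peak, 16.67 ns CAS
# DDR5-6000 CL30 (profile): 96.0 GB/s peak, 10.00 ns CAS
```

The peak bandwidth scales linearly with the transfer rate, so the profile buys 25% on paper, while the tighter CL value cuts CAS latency by about 40%.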

How Much It Actually Matters for LLMs

Here's where the answer splits based on your workload.

Pure GPU Inference (Model Fits in VRAM)

Enabling EXPO on a build where your 7B or 13B model fits entirely in GPU VRAM: zero measurable difference. The CPU and system RAM aren't in the token generation loop. Your tokens-per-second are determined by VRAM bandwidth, not system memory speed.

I tested this. Running Llama 3.1 8B on an RTX 3090 at DDR5-4800 vs DDR5-6000: the difference was within measurement noise, about 0.3 t/s of variation across runs with no consistent direction. Still enable EXPO (it costs nothing), but don't expect a performance gain here.
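If you want to repeat this kind of comparison yourself, one rough way to decide whether two sets of runs differ by more than noise is to compare the gap between their means against the run-to-run spread. The sketch below is a crude heuristic, not a proper significance test, and the sample numbers are placeholders for illustration only, not measurements from this article.

```python
from statistics import mean, stdev


def within_noise(runs_a, runs_b, k=2.0):
    """True if the gap between the means is under k times the larger run-to-run std dev."""
    gap = abs(mean(runs_a) - mean(runs_b))
    noise = max(stdev(runs_a), stdev(runs_b))
    return gap < k * noise


# Placeholder tokens/second values (hypothetical), one entry per benchmark run.
jedec_runs = [100.1, 99.8, 100.3, 99.9]
expo_runs = [100.0, 100.2, 99.7, 100.1]

print(within_noise(jedec_runs, expo_runs))  # True: the means differ far less than the spread
```

Run each configuration several times with the same prompt and settings; a single run per configuration cannot tell you how much of the gap is noise.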

CPU Offloading (Model Partially on CPU RAM)

This is where enabling EXPO/XMP actually pays off. When part of your model lives in system RAM — because it's too large for your GPU VRAM — every token generation involves the CPU fetching those layers through system memory. RAM bandwidth is directly in the hot path.

Real numbers from a Llama 3.1 70B model at Q4_K_M (42GB) split across a 24GB RTX 3090 (offloading ~18GB to CPU RAM):

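Memory Speed                Tokens/Second
DDR5-6000 (EXPO enabled)    6.8 t/s
DDR5-6400                   7.4 t/s

Enabling EXPO is a 62% improvement in t/s over the JEDEC default (DDR5-4800), which implies a baseline of roughly 4.2 t/s, from a BIOS setting that takes 30 seconds. For free. The DDR5-6400 step adds another 9%: a smaller return, but still meaningful.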
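The percentages are simple ratios of the quoted throughput figures. A minimal sketch of that arithmetic follows; it assumes 6.8 t/s is the rated-profile result and 7.4 t/s the DDR5-6400 result, as the surrounding text suggests, and the helper names are made up for illustration.

```python
def pct_gain(before: float, after: float) -> float:
    """Percentage improvement going from `before` to `after`."""
    return (after - before) / before * 100


def implied_baseline(after: float, gain_pct: float) -> float:
    """Baseline value implied by a result and its stated percentage gain."""
    return after / (1 + gain_pct / 100)


tps_rated_profile = 6.8  # t/s, assumed to be the rated-profile result
tps_6400 = 7.4           # t/s, the DDR5-6400 result

print(f"{pct_gain(tps_rated_profile, tps_6400):.1f}%")         # 8.8%
print(f"{implied_baseline(tps_rated_profile, 62):.1f} t/s")    # 4.2 t/s
```

So the "another 9%" is the 6.8 to 7.4 step rounded up, and the 62% figure corresponds to a JEDEC baseline of about 4.2 t/s.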

Tip

The biggest gains from enabling EXPO/XMP come from the jump from JEDEC default (4800) to first-profile rated speed (5600-6000). Going from 6000 to 6400 or 6800 gives diminishing returns. Enable the rated profile on whatever kit you bought. Don't spend extra chasing higher speeds unless CPU offloading is your primary workload.

How to Enable It

AMD (EXPO):

  1. Restart and enter BIOS (usually Del or F2 during POST)
  2. Go to "AI Tweaker" (ASUS), "A-XMP" (MSI), or "D.O.C.P." on older boards
  3. Select the EXPO profile (usually labeled "EXPO Profile 1" or the speed — "DDR5-6000")
  4. Save and exit

Intel (XMP):

  1. Restart and enter BIOS
  2. Go to "Extreme Tweaker" (ASUS), the XMP setting in the "OC" menu (MSI), or "Memory Profile" (Gigabyte)
  3. Select XMP Profile 1
  4. Save and exit

On the next boot, the POST screen should confirm the new memory speed. You can verify in Windows Task Manager → Performance → Memory, which shows the current speed. Or run sudo dmidecode --type memory on Linux (it needs root) and check the configured speed of each module.

Some boards display the rated speed in BIOS even when training failed and the system fell back to JEDEC. Always verify from the operating system after enabling.
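On Linux, the check can be scripted. The sketch below parses the text output of dmidecode --type memory and lists the reported and configured speed per populated DIMM. The field names ("Speed", "Configured Memory Speed", or "Configured Clock Speed" in older dmidecode versions) are what dmidecode commonly prints, but what a board reports under "Speed" varies, so treat this as a starting point. The SAMPLE text is an illustrative, trimmed excerpt, not output from a real machine.

```python
import subprocess


def parse_dimm_speeds(dmidecode_text):
    """Return one dict per populated DIMM with locator, reported speed, configured speed."""
    dimms, current = [], None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line.startswith("Handle "):
            current = None  # a new SMBIOS record starts
        elif line == "Memory Device":
            current = {}
            dimms.append(current)
        elif current is not None and ":" in line:
            key, value = (part.strip() for part in line.split(":", 1))
            if key == "Locator":
                current["locator"] = value
            elif key == "Speed":
                current["speed"] = value
            elif key in ("Configured Memory Speed", "Configured Clock Speed"):
                current["configured"] = value
    # Empty slots report "Unknown" speeds; drop them.
    return [d for d in dimms if d.get("speed", "Unknown") != "Unknown"]


def read_dmidecode():
    """Run dmidecode (requires root) and return its text output."""
    result = subprocess.run(["dmidecode", "--type", "memory"],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Illustrative sample only (hypothetical values), trimmed to the relevant fields.
SAMPLE = """
Handle 0x0011, DMI type 17, 92 bytes
Memory Device
    Size: 32 GB
    Locator: DIMM_A1
    Speed: 6000 MT/s
    Configured Memory Speed: 4800 MT/s
Handle 0x0012, DMI type 17, 92 bytes
Memory Device
    Size: No Module Installed
    Locator: DIMM_B1
    Speed: Unknown
    Configured Memory Speed: Unknown
"""

for d in parse_dimm_speeds(SAMPLE):
    print(f"{d.get('locator')}: reported {d.get('speed')}, configured {d.get('configured')}")
```

To check a real system, pass read_dmidecode() instead of SAMPLE and run the script with sudo. A configured speed of 4800 MT/s on a DDR5-6000 kit means the profile is not active.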

When EXPO Causes Problems (And What to Do)

Most modern DDR5 kits with verified EXPO/XMP profiles run without issues on compatible boards. Occasional instability does happen.

Signs of RAM instability: random crashes, blue screens or kernel panics under load, memory test errors (run MemTest86+ for 2+ passes if you're suspicious).

Common causes:

  • 4-DIMM configuration (4 sticks always stress the memory controller more than 2)
  • RAM speed that exceeds your CPU's memory controller spec (some Ryzen chips struggle above DDR5-6400 at 4 sticks)
  • Board that doesn't support the specific EXPO profile

What to try:

  • Drop to the next profile down (DDR5-5600 instead of 6000)
  • Loosen timings manually, starting with the primaries (advanced)
  • Update your BIOS — memory compatibility often improves with BIOS updates

Caution

Running 4 DIMMs (all four slots populated) at DDR5-6000+ is harder on the memory controller than 2 DIMMs. If you have 128GB in a 4x32GB configuration and you're getting instability, drop to DDR5-5600 first before troubleshooting other components.

Beyond Profiles: Manual Tuning

Enabling EXPO is the easy win. Manual tuning goes further, tightening primary timings like tRCD and tRP and secondary timings like tRFC, and can squeeze an additional 5-10% bandwidth beyond what the EXPO profile gives you.

Honest assessment: for LLM work, it's not worth the time investment. Manual DDR5 tuning is a rabbit hole that takes hours to do properly and the gains are incremental over a good EXPO profile. Spend that time on something that actually moves the needle — more VRAM, faster storage for model loading, or a better quantization format.

If you're someone who enjoys memory tuning for its own sake, that's fine. But don't tune RAM expecting it to double your LLM performance. It won't.

The Bottom Line

Enable EXPO or XMP. It's free. It takes 30 seconds. If you're doing any CPU offloading at all, it has a real impact — 30-60% improvement in offloading throughput compared to JEDEC defaults. If you're running pure GPU inference, it still costs nothing to enable and might help in edge cases.

The only exception is a 4-DIMM 128GB configuration that's fighting stability issues. In that case, drop to DDR5-5600 and stop chasing the last few percent.

See also: Best RAM Kits for Local LLMs, How Much RAM Do You Need?, and DDR5 vs DDR4 for Local AI
