GDDR7
The latest GPU memory standard, used in RTX 50-series cards, offering a large bandwidth jump over GDDR6X (roughly 1.8x on the flagship RTX 5090).
GDDR7 is the seventh generation of GDDR (Graphics Double Data Rate) memory, introduced in NVIDIA's RTX 50-series GPUs (Blackwell architecture) in 2025. It replaces GDDR6X as the high-end GPU memory standard and delivers a substantial bandwidth improvement over its predecessor.
Performance vs GDDR6X
GDDR7 reaches higher per-pin data rates than GDDR6X by switching to PAM3 signaling (three voltage levels, encoding 3 bits per two cycles); on the flagship RTX 5090 this combines with a wider 512-bit memory bus to nearly double total bandwidth:
- RTX 4090 (GDDR6X): 1,008 GB/s
- RTX 5090 (GDDR7): 1,792 GB/s
- RTX 5080 (GDDR7): approximately 960 GB/s
The RTX 5090's bandwidth advantage over the 4090 is roughly 78% — and since LLM decode speed is memory-bandwidth bound, that translates nearly directly to faster token generation at the same model size.
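The bandwidth-bound relationship can be sketched numerically: during decode, each generated token must stream every active weight from VRAM once, so memory bandwidth divided by model size gives an upper bound on tokens per second. The 20GB model size below is an assumed figure (roughly a 32B model at ~5 bits per weight), and real throughput lands below this ceiling due to KV-cache reads and kernel overhead.

```python
# Bandwidth-bound ceiling on LLM decode speed: tok/s <= bandwidth / model_bytes.
# Illustrative numbers only; real throughput is lower (KV-cache traffic,
# kernel launch overhead, imperfect bandwidth utilization).

def decode_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/sec for a model that must be read once per token."""
    return bandwidth_gb_s / model_gb

MODEL_GB = 20.0  # assumed: ~32B params at ~5 bits/weight

for name, bw in [("RTX 4090 (GDDR6X)", 1008), ("RTX 5090 (GDDR7)", 1792)]:
    ceiling = decode_tokens_per_sec(bw, MODEL_GB)
    print(f"{name}: <= {ceiling:.0f} tok/s ceiling")
```

The ratio between the two ceilings is the same ~1.78x as the raw bandwidth ratio, which is why the decode-speed gain tracks the bandwidth gain so closely.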
VRAM Capacity on RTX 50-Series
The RTX 5090 launches with 32GB of GDDR7, and the RTX 5080 with 16GB. For local LLMs:
- 32GB on the 5090 opens up 32B models at higher quantization levels (Q5, Q6) that were marginal on 24GB cards, though Q8 still won't fit (a 32B model at ~8.5 bits/weight needs roughly 34GB for weights alone)
- 16GB on the 5080 is the same capacity as the RTX 4080 — adequate for 13B models at decent quality settings
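The capacity arithmetic behind these fit claims is simple enough to sketch. Weight memory is parameter count times bits per weight; the bits-per-weight figures below are approximations for common GGUF quantization schemes, and the flat overhead allowance is an assumed placeholder (real KV-cache size depends on context length and model architecture).

```python
def model_vram_gb(params_b: float, bits_per_weight: float,
                  overhead_gb: float = 2.0) -> float:
    """Approximate VRAM need: weights plus a flat allowance for KV cache
    and activations. overhead_gb is a rough assumed figure."""
    weight_gb = params_b * bits_per_weight / 8  # billions of params -> GB
    return weight_gb + overhead_gb

# Approximate effective bits/weight for common quantization levels
for quant, bits in [("Q4_K_M", 4.8), ("Q6_K", 6.6), ("Q8_0", 8.5)]:
    need = model_vram_gb(32, bits)
    print(f"32B @ {quant}: ~{need:.1f} GB -> fits in 32GB: {need <= 32}")
```

Run against a 32GB budget, Q4 and Q6 variants of a 32B model fit with room for context, while Q8 does not, which is what makes the 5090's capacity bump meaningful for this model class.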
Power Considerations
GDDR7's higher bandwidth comes with higher power draw. The RTX 5090 has a 575W TDP, requiring a robust PSU (NVIDIA recommends 1000W) and good case airflow. This is a meaningful consideration for a dedicated local AI rig.
Why It Matters for Local AI
GDDR7 is the most significant bandwidth jump in GPU memory in recent years. For LLM workloads specifically, this means the RTX 50-series — particularly the 5090 — will generate tokens materially faster than RTX 40-series on equivalent models. If you're building a new rig in 2025–2026, GDDR7 cards are worth the premium if tokens-per-second is your priority.