The AMD Ryzen AI Max series — particularly the AI Max+ 395 — is one of the most compelling platforms for local LLM inference because of its unified memory architecture. With 128GB of LPDDR5X shared between CPU and iGPU, you can theoretically run 70B models entirely in GPU memory. The problem: Linux does not give the GPU all that memory by default.
Out of the box, the iGPU on Ryzen AI Max typically exposes only a fraction of total RAM as usable VRAM through GTT (Graphics Translation Table) memory. If you check rocm-smi after a fresh install, you might see only 8–16GB of VRAM available. The remaining memory is there — the OS just has not been told to hand it to the GPU.
This guide covers exactly how to change that.
Quick Summary
- Default VRAM: Only 8–16GB accessible out of the box on most Linux distros
- Fix: Set the amdttm.pages_limit and amdttm.page_pool_size kernel parameters; amdgpu.gttsize is deprecated and does nothing on modern kernels
- Practical max: ~108GB on AI Max+ 395 with 128GB RAM; 96GB is a stable, safe target
Understanding GTT Memory vs. Dedicated VRAM
On discrete AMD GPUs (RX 7900 XTX, for example), VRAM is physically on the card and GTT memory is an additional pool of system RAM the GPU can address. The two pools are distinct.
On Ryzen AI Max, there is no discrete VRAM. The iGPU uses a portion of the system LPDDR5X directly. GTT memory is the VRAM. The driver needs to know how large that pool can grow, and by default it is conservative — typically set to a fraction of installed RAM.
This is a kernel TTM (Translation Table Manager) decision, not a firmware setting. You cannot change it in BIOS/UEFI. It requires a kernel parameter.
The Deprecated Method (Do Not Use)
Before kernel 6.1, the common advice was:
amdgpu.gttsize=98304
This parameter accepted a size in megabytes. It no longer works. On Linux 6.1+, this parameter is silently ignored. If you search forum posts and find someone telling you to use amdgpu.gttsize, that advice is outdated. You will add it to your kernel command line, reboot, and see no change in rocm-smi.
The Modern Method: amdttm Kernel Parameters
The correct parameters for current kernels are:
- amdttm.pages_limit: the maximum number of 4KB pages the TTM allocator can manage
- amdttm.page_pool_size: the size of the pre-allocated page pool
Both control the same underlying limit from different angles. Setting pages_limit is the primary lever.
The Formula
pages_limit = target_GB × 1024 × 1024 / 4
Breaking that down: you are converting gigabytes to 4KB pages.
- 1 GB = 1,073,741,824 bytes
- 1 page = 4,096 bytes
- Pages per GB = 1,073,741,824 / 4,096 = 262,144
In other words, the multiplier is 262,144 pages per gigabyte; every value in the table below is target_GB × 262,144.
Common targets:
| Target VRAM | pages_limit value |
|---|---|
| 32 GB | 8,388,608 |
| 48 GB | 12,582,912 |
| 64 GB | 16,777,216 |
| 96 GB | 25,165,824 |
| 108 GB | 28,311,552 |
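The table values can be reproduced with shell arithmetic. A quick sketch; TARGET_GB is whatever target you choose, shown here with the 96GB example:

```shell
# pages_limit = target_GB * pages-per-GB, where one GB holds 262,144 pages of 4KB
TARGET_GB=96
PAGES_LIMIT=$(( TARGET_GB * 1024 * 1024 / 4 ))
echo "$PAGES_LIMIT"   # 25165824, matching the 96 GB row above
```

Swapping in 108 for TARGET_GB yields 28,311,552, matching the last row of the table.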
For the AI Max+ 395 with 128GB RAM, 96GB is the recommended target. This leaves ~32GB for the OS, running applications, and kernel overhead. Pushing to 108GB is possible but increases the risk of OOM kills during inference if the model's KV cache grows large.
Applying the Parameters: Fedora
Fedora uses GRUB2. The parameters go in /etc/default/grub.
Step 1: Edit the GRUB config
sudo nano /etc/default/grub
Find the line that starts with GRUB_CMDLINE_LINUX= and add the parameters inside the quotes:
GRUB_CMDLINE_LINUX="... amdttm.pages_limit=25165824 amdttm.page_pool_size=25165824"
Keep any existing parameters intact. Only append.
Step 2: Regenerate the GRUB config
On modern Fedora (34 and later), the same command covers both BIOS and UEFI systems:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Do not write a full config to /boot/efi/EFI/fedora/grub.cfg. On current releases that file is a small stub that redirects to /boot/grub2/grub.cfg, and overwriting it can break boot.
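Alternatively, Fedora ships grubby, which appends kernel arguments to every installed boot entry without hand-editing /etc/default/grub. A sketch for the 96GB target; the grubby call itself needs root, so it is left commented out:

```shell
# Kernel arguments for the 96GB target (values from the table above)
ARGS="amdttm.pages_limit=25165824 amdttm.page_pool_size=25165824"
echo "$ARGS"

# Apply to all boot entries (requires root); uncomment to run:
# sudo grubby --update-kernel=ALL --args="$ARGS"
```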
Step 3: Reboot
sudo reboot
Applying the Parameters: Ubuntu / Debian
sudo nano /etc/default/grub
Add to GRUB_CMDLINE_LINUX_DEFAULT:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdttm.pages_limit=25165824 amdttm.page_pool_size=25165824"
Then:
sudo update-grub
sudo reboot
Applying the Parameters: Arch Linux
Edit /etc/default/grub the same way, then:
sudo grub-mkconfig -o /boot/grub/grub.cfg
sudo reboot
If you use systemd-boot instead of GRUB, add the parameters to your loader entry in /boot/loader/entries/.
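With systemd-boot, the parameters go on the options line of the loader entry. A sketch with a hypothetical entry filename and a placeholder root= value, which you should adapt to your own entry:

```
# /boot/loader/entries/arch.conf (hypothetical filename)
title   Arch Linux
linux   /vmlinuz-linux
initrd  /initramfs-linux.img
options root=UUID=<your-root-uuid> rw amdttm.pages_limit=25165824 amdttm.page_pool_size=25165824
```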
Verifying the Change
After reboot, use two methods to confirm GTT memory is available.
Method 1: dmesg
sudo dmesg | grep -i gtt
You should see a line similar to:
[drm] amdgpu: 98304M of GTT memory ready.
If the value matches your target (approximately — it will be in MB), the parameter took effect.
Method 2: rocm-smi
rocm-smi --showmeminfo gtt
Look for the GTT total. On a correctly configured AI Max+ 395 with 96GB target, you should see approximately 98,304 MB total GTT.
Method 3: /sys filesystem
cat /sys/class/drm/card0/device/mem_info_gtt_total
This returns bytes. Divide by 1,073,741,824 to get GB.
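For example, assuming the limit landed on exactly 96GB, the sysfs value would be 96 × 1,073,741,824 bytes:

```shell
# Hypothetical sysfs reading for a 96GB GTT pool (96 * 1073741824 bytes)
BYTES=103079215104
echo $(( BYTES / 1073741824 ))   # 96
```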
What To Do If It Does Not Work
Kernel too old: amdttm module parameters require a reasonably recent kernel. Linux 6.6+ is recommended for Ryzen AI Max. Check with uname -r. On Fedora 40/41, you should already be on a supported kernel. On older Ubuntu LTS releases, you may need to install a mainline kernel.
ROCm not installed: If ROCm drivers are not installed, the amdgpu kernel module may load in a limited mode that ignores TTM parameters. Install ROCm 6.x before attempting this configuration.
Parameter not being passed: Verify GRUB is actually using your edited config file. Some systems have multiple GRUB configs. Run grep amdttm /proc/cmdline after reboot (no root needed); if the parameters do not appear there, GRUB is not using the file you edited.
Practical VRAM Allocation for Inference
Once GTT memory is unlocked, your inference stack (Ollama, llama.cpp, etc.) needs to know to use it. Most tools detect VRAM automatically via ROCm. A 96GB allocation makes the following models fully GPU-resident:
[Table: model sizes and whether each fits in 96GB GTT; four of the five listed models fit, and only the 405B-class model does not.]
For 405B and larger models, CPU+GPU hybrid inference with llama.cpp is still required; see our guide on CPU+GPU hybrid inference with llama.cpp for layer offloading strategy. For a full review of the hardware that makes best use of this configuration, see our GMKtec EVO-X2 review and the AMD Strix Halo vs Mac Mini M4 comparison.
Token throughput on the AI Max+ 395 with models fully in GTT memory is significantly better than CPU-only inference: expect 15–30 t/s on a 70B Q4_K_M model, depending on context length and batch size. This is competitive with a single RTX 4090 for prompt processing, though slightly behind on generation speed due to LPDDR5X bandwidth versus GDDR6X.
Memory Bandwidth Consideration
One caveat about the AI Max approach: LPDDR5X bandwidth tops out around 273 GB/s on the AI Max+ 395. A discrete RTX 4090 provides 1,008 GB/s of GDDR6X bandwidth. For pure generation speed on large models, the RTX 4090 wins. Where the AI Max shines is total capacity — 96GB of fast-enough unified memory at a fraction of the cost of dual RTX 4090s, in a power envelope under 120W.
If you are running a 70B model and need maximum throughput rather than maximum capacity, a discrete GPU build still wins. But if you need to run 70B+ models on a single laptop or mini PC without external GPUs, the Ryzen AI Max with properly configured GTT memory is the most practical option available today.