Got the Framework desktop working with 96GB allocated to the GPU. The software that works is LM Studio using Vulkan; ROCm crashes the model, and Ollama crashes GNOME
Benchmarks to follow:
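For anyone who wants to reproduce the numbers: LM Studio's local server speaks an OpenAI-compatible API (default port 1234), so a rough tokens-per-second check is a few lines of Python. A minimal sketch; the model identifier below is a placeholder, so check `/v1/models` for the name your install actually reports.

```python
import time
import requests  # pip install requests

# LM Studio's local server defaults to an OpenAI-compatible API on port 1234.
BASE_URL = "http://localhost:1234/v1"
MODEL = "openai/gpt-oss-120b"  # hypothetical identifier; check /v1/models

def bench(prompt: str, max_tokens: int = 256) -> None:
    """Time one completion and report tokens/sec from the usage field."""
    start = time.monotonic()
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
        timeout=600,
    )
    resp.raise_for_status()
    elapsed = time.monotonic() - start
    usage = resp.json().get("usage", {})
    completion = usage.get("completion_tokens", 0)
    print(f"{completion} tokens in {elapsed:.1f}s "
          f"({completion / elapsed:.1f} tok/s)")

if __name__ == "__main__":
    bench("Explain the difference between VRAM and GTT on APUs.")
```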
Want to remark on a sentiment I've heard from the AT dev community, in specific words even, from @Boris, @npub1pl2e...j63z, and @rude1.blacksky.team
If I use the BIOS to allocate 96GB of my Framework's memory to the GPU, I can get the 120B-param GPT-OSS to respond very quickly, but within two prompts GNOME crashes due to a failed VRAM allocation.
Step 1 is to debug that; step 2 is to debug why dynamic allocation doesn't work.
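For step 1, the amdgpu driver exposes VRAM and GTT byte counters in sysfs, so you can watch the allocation climb right up to the moment GNOME falls over. A minimal sketch, assuming Linux with amdgpu and that the iGPU is `card0` (check `/sys/class/drm/` for the right node); GTT is the dynamically shared system-memory pool, which is also what step 2 is about.

```python
import time
from pathlib import Path

# amdgpu sysfs node for the GPU; the card index may differ on your machine.
DEV = Path("/sys/class/drm/card0/device")

def read_mib(name: str) -> float:
    # Each counter is a plain byte count in a single-line file.
    return int((DEV / name).read_text()) / (1024 ** 2)

def main() -> None:
    vram_total = read_mib("mem_info_vram_total")
    gtt_total = read_mib("mem_info_gtt_total")
    print(f"VRAM total: {vram_total:.0f} MiB, GTT total: {gtt_total:.0f} MiB")
    # Poll once a second; run alongside the model to see how close usage
    # gets to the cap right before the failed allocation kills GNOME.
    while True:
        vram = read_mib("mem_info_vram_used")
        gtt = read_mib("mem_info_gtt_used")
        print(f"VRAM {vram:8.0f}/{vram_total:.0f} MiB | "
              f"GTT {gtt:8.0f}/{gtt_total:.0f} MiB")
        time.sleep(1.0)

if __name__ == "__main__":
    main()
```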