Does Vulkan work?

bjxjvhbkh · May 17, 2025, 11:25am

Vulkan has a huge positive impact on LLM inference with llama.cpp, especially with multimodal models.

I’m thinking about getting a Radxa Orion O6, but I could not find anything about the state of Vulkan support.

willy · May 17, 2025, 3:48pm

I built llama.cpp with Vulkan enabled, but it was super slow, even when I offloaded only one layer to it. I’m clearly ignorant of all the GPU-related technologies, so I don’t understand what exactly Vulkan is (driver, library, etc.), nor how it compares or relates to CUDA, OpenCL, etc. I don’t even know if it really used the GPU or fell back to emulation on the CPU. All these things are totally obscure to me; there are probably too many layers of abstraction and cryptic names for me :-/

nyanmisaka · May 17, 2025, 4:30pm

https://gist.github.com/nyanmisaka/ce03b5f61cb02389a97094b5172f5bf0

mali-g720mc10-immortals-r53p0-00eac0-vulkaninfo.log

radxa@orion-o6:~$ vulkaninfo
'DISPLAY' environment variable not set... skipping surface info
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.239


Instance Extensions: count = 16

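On the question above of whether Vulkan really ran on the GPU or fell back to the CPU: one rough check is to look at the `deviceType` that `vulkaninfo` reports for each physical device. Software implementations report `PHYSICAL_DEVICE_TYPE_CPU`, while a real integrated GPU should report `PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU`. Below is a minimal Python sketch, assuming the full log contains `deviceName = ...` and `deviceType = ...` lines as recent `vulkaninfo` builds print them. The helper names `list_devices` and `is_cpu_fallback` are made up for illustration and are not part of any tool.

```python
import re

# Assumed line layout: "deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU"
# and "deviceName = <name>", each on its own line with optional indentation.
FIELD_RE = re.compile(r"^\s*(deviceName|deviceType)\s*=\s*(.+?)\s*$")


def list_devices(vulkaninfo_text):
    """Collect one {deviceName, deviceType} dict per physical device.

    A new device record starts whenever a field repeats, so the order
    of the two fields within a device does not matter.
    """
    devices = []
    current = {}
    for line in vulkaninfo_text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key in current:
            # Field seen again: the previous device record is complete.
            devices.append(current)
            current = {}
        current[key] = value
    if current:
        devices.append(current)
    return devices


def is_cpu_fallback(device):
    """True if the device is a software (CPU) Vulkan implementation."""
    return device.get("deviceType") == "PHYSICAL_DEVICE_TYPE_CPU"
```

To use it, save the output of `vulkaninfo` to a file, read the file into a string, and pass that string to `list_devices`. If the only device listed is a CPU type, the slow results would be explained by software emulation. This only shows which devices the Vulkan loader exposes, not which one llama.cpp actually selected.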