Fogwise Airbox: Vision Model Support for Radxa Fogwise Airbox

Hello everyone,

I am exploring the Radxa Fogwise Airbox and its compatibility with various vision models. Specifically, I’m interested in:

Phi-3 Vision https://huggingface.co/microsoft/Phi-3-vision-128k-instruct
llava-llama3 https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf
llava https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZOO.md

Does anyone know if there are pre-converted versions of any of these models, or models with similar capabilities, available for the Radxa Fogwise Airbox?
Any information on their performance and setup would be greatly appreciated.

Thanks in advance!

Best regards,
Matthias

Not specifically the models you asked about, but you can see what is available in the model zoo.

Hi, @M_Kraft

For now we don’t have any pre-converted multimodal models, but you can still convert one yourself using the latest version of TPU-MLIR.

best,
Morgan