Orion O6 Debug Party Invitation

You mean there is no output in UEFI, but Linux works?

In UEFI there is no output from the GT 1030, but there is output from the on-board display.

NVIDIA proprietary driver:

@geerlingguy now I can’t wait for you to cover this board

About the GPUs I tested: Radeon RX 6400, GeForce RTX 4060, GeForce GT 1030

  • Only Radeon RX 6400 works on UEFI, but the Linux kernel crashes in the amdgpu driver
  • GeForce RTX 4060 doesn’t show up in lspci, so I suspect the PCIe link isn’t established
  • GeForce GT 1030 works on Fedora Workstation 41 (wayland)

These are the results when using a mainline-based kernel in ACPI mode. If you use a CIX kernel in Device Tree mode, the results may be different.

NVIDIA GeForce GT 1030
It doesn’t work on UEFI.

Maybe it’s a compatibility issue between EDKII and the UEFI GOP version used in GT1030 VBIOS?

https://www.nvidia.com/en-us/drivers/nv-uefi-update-x64/

  • GeForce RTX 4060 doesn’t show up in lspci, so I suspect the PCIe link isn’t established
  • GeForce GT 1030 works on Fedora Workstation 41 (wayland)

The GT1030 uses PCIe Gen3 while the RTX4060 uses PCIe Gen4, which may be the problem. Support for Cadence PCIe IP in upstream is not yet complete.

https://patchwork.kernel.org/project/linux-pci/patch/20250123070935.1810110-1-18255117159@163.com/

Can’t you force PCIe to Gen3 in the BIOS?

…to be fair I’m only covering the very bottom right part in this photo.

And thank you Radxa team for the detailed Getting Started guide. Hope to get this running quite soon!

The default governors may perform worse than expected in some apps. Setting the CPU and GPU governors to performance can improve results.

# cpu
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# gpu
echo performance | sudo tee /sys/devices/platform/soc@0/15000000.gpu/devfreq/15000000.gpu/governor
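
Before switching governors it can help to record what is currently active, so the change can be reverted later. A minimal sketch, assuming the same sysfs paths as the two commands above; the function name show_governors and its optional root argument are hypothetical and only there for illustration:

```shell
# Print "<file>: <governor>" for each governor file that exists.
# $1 is an optional sysfs root (default /sys); this parameter is an
# assumption added for illustration, not something the BSP provides.
show_governors() {
    for f in "${1:-/sys}"/devices/system/cpu/cpu*/cpufreq/scaling_governor \
             "${1:-/sys}"/devices/platform/soc@0/15000000.gpu/devfreq/15000000.gpu/governor; do
        # Globs that match nothing stay literal, so skip unreadable paths.
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$f" "$(cat "$f")"
    done
}
```

Calling show_governors with no argument reads the live /sys tree; on a kernel without the GPU devfreq node the GPU line is simply omitted.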

radxa@orion-o6:~$ glmark2-es2-wayland
=======================================================
    glmark2 2023.01
=======================================================
    OpenGL Information
    GL_VENDOR:      ARM
    GL_RENDERER:    Mali-G720-Immortalis
    GL_VERSION:     OpenGL ES 3.2 v1.r49p0-00eac0.b97811108d91b3a6cd0a9d90e51f9da5
    Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
    Surface Size:   800x600 windowed
=======================================================
[build] use-vbo=false: FPS: 9015 FrameTime: 0.111 ms
[build] use-vbo=true: FPS: 9245 FrameTime: 0.108 ms
[texture] texture-filter=nearest: FPS: 9097 FrameTime: 0.110 ms
[texture] texture-filter=linear: FPS: 9035 FrameTime: 0.111 ms
[texture] texture-filter=mipmap: FPS: 9915 FrameTime: 0.101 ms
[shading] shading=gouraud: FPS: 8299 FrameTime: 0.121 ms
[shading] shading=blinn-phong-inf: FPS: 7943 FrameTime: 0.126 ms
[shading] shading=phong: FPS: 8077 FrameTime: 0.124 ms
[shading] shading=cel: FPS: 8054 FrameTime: 0.124 ms
[bump] bump-render=high-poly: FPS: 4527 FrameTime: 0.221 ms
[bump] bump-render=normals: FPS: 9443 FrameTime: 0.106 ms
[bump] bump-render=height: FPS: 8805 FrameTime: 0.114 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 8780 FrameTime: 0.114 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 6532 FrameTime: 0.153 ms
[pulsar] light=false:quads=5:texture=false: FPS: 8829 FrameTime: 0.113 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 3839 FrameTime: 0.261 ms
[desktop] effect=shadow:windows=4: FPS: 7163 FrameTime: 0.140 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1449 FrameTime: 0.690 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1407 FrameTime: 0.711 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 2160 FrameTime: 0.463 ms
[ideas] speed=duration: FPS: 4056 FrameTime: 0.247 ms
[jellyfish] <default>: FPS: 6343 FrameTime: 0.158 ms
[terrain] <default>: FPS: 708 FrameTime: 1.413 ms
[shadow] <default>: FPS: 7009 FrameTime: 0.143 ms
[refract] <default>: FPS: 1615 FrameTime: 0.619 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 9377 FrameTime: 0.107 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 8472 FrameTime: 0.118 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 9597 FrameTime: 0.104 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 9585 FrameTime: 0.104 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 8515 FrameTime: 0.117 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 8545 FrameTime: 0.117 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 8429 FrameTime: 0.119 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 8387 FrameTime: 0.119 ms
=======================================================
                                  glmark2 Score: 7036
=======================================================

radxa@orion-o6:~$ glmark2-wayland
=======================================================
    glmark2 2023.01
=======================================================
    OpenGL Information
    GL_VENDOR:      Mesa
    GL_RENDERER:    zink (Mali-G720-Immortalis)
    GL_VERSION:     4.0 (Compatibility Profile) Mesa 23.0.4 (git-b87980692c)
    Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
    Surface Size:   800x600 windowed
=======================================================
[build] use-vbo=false: FPS: 3568 FrameTime: 0.280 ms
[build] use-vbo=true: FPS: 3743 FrameTime: 0.267 ms
[texture] texture-filter=nearest: FPS: 4106 FrameTime: 0.244 ms
[texture] texture-filter=linear: FPS: 4124 FrameTime: 0.243 ms
[texture] texture-filter=mipmap: FPS: 3921 FrameTime: 0.255 ms
[shading] shading=gouraud: FPS: 3219 FrameTime: 0.311 ms
[shading] shading=blinn-phong-inf: FPS: 2941 FrameTime: 0.340 ms
[shading] shading=phong: FPS: 3190 FrameTime: 0.313 ms
[shading] shading=cel: FPS: 3112 FrameTime: 0.321 ms
[bump] bump-render=high-poly: FPS: 2123 FrameTime: 0.471 ms
[bump] bump-render=normals: FPS: 3788 FrameTime: 0.264 ms
[bump] bump-render=height: FPS: 3112 FrameTime: 0.321 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 3868 FrameTime: 0.259 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 2947 FrameTime: 0.339 ms
[pulsar] light=false:quads=5:texture=false: FPS: 3705 FrameTime: 0.270 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 1416 FrameTime: 0.706 ms
[desktop] effect=shadow:windows=4: FPS: 2332 FrameTime: 0.429 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1221 FrameTime: 0.820 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1435 FrameTime: 0.697 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1435 FrameTime: 0.697 ms
[ideas] speed=duration: FPS: 1341 FrameTime: 0.746 ms
[jellyfish] <default>: FPS: 1825 FrameTime: 0.548 ms
[terrain] <default>: FPS: 440 FrameTime: 2.275 ms
[shadow] <default>: FPS: 2321 FrameTime: 0.431 ms
[refract] <default>: FPS: 797 FrameTime: 1.256 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 3792 FrameTime: 0.264 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 3948 FrameTime: 0.253 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 3786 FrameTime: 0.264 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 4004 FrameTime: 0.250 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 4147 FrameTime: 0.241 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 4004 FrameTime: 0.250 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 3936 FrameTime: 0.254 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 3986 FrameTime: 0.251 ms
=======================================================
                                  glmark2 Score: 2957
=======================================================
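
To compare the two runs above without reading the per-scene lines, the final scores can be pulled out of saved logs. This is only a sketch: the function names are hypothetical, and it assumes the log ends with a "glmark2 Score: N" line as in the output shown here.

```shell
# Read a glmark2 log on stdin and print the value of its "glmark2 Score:" line.
glmark2_score() {
    awk '/glmark2 Score:/ { score = $NF } END { if (score != "") print score }'
}

# Print the ratio of two scores with two decimals.
score_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}
```

With the scores from this post, score_ratio 7036 2957 prints 2.38, i.e. the native GLES driver scored roughly 2.4 times the Zink path in this run.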

8K60 AV1 hardware acceleration in Chromium works out of the box in the Beta images (vendor kernel), but it should be noted that this requires the specific versions of the Chromium and Mutter compositor packages that are built into the image.

So, before upgrading your packages via sudo apt update && sudo apt upgrade, run sudo apt-mark hold chromium chromium-common libmutter-11-0 to ensure these packages are pinned.
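
It may be worth confirming that the holds are in place before upgrading. A small sketch, assuming apt-mark showhold prints one package name per line; the helper name missing_holds is hypothetical:

```shell
# Read the held-package list on stdin and print every package given as an
# argument that is NOT in that list. No output means all of them are held.
missing_holds() {
    held=$(cat)
    for pkg in "$@"; do
        # -x: whole-line match, -F: fixed string, so "chromium" does not
        # accidentally match "chromium-common".
        printf '%s\n' "$held" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}
```

Usage: apt-mark showhold | missing_holds chromium chromium-common libmutter-11-0. If a name is printed with an architecture suffix on your system, the comparison would need adjusting.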

I guess you’re talking about your 2nd Geekbench 6 run against all the others: https://browser.geekbench.com/search?q=orion (archived version as of now)

When we compare an earlier run of this Geekbench 6 ‘benchmark’ with your result made with the performance governor, we can see the O6 still being outperformed by the older Cix EVB, even though both were running at the same clock speed (at least single-threaded: 2.5 GHz in reality, while GB6 ‘reports’ nonsense like 0 MHz and 2.6 GHz): https://browser.geekbench.com/v6/cpu/compare/10147782?baseline=8545345

So there’s room for improvements with current settings which is good news to me :)

Not only in Geekbench 6, but also in my FFmpeg and glmark2 GPU tests. The default governors (cpu/schedutil and gpu/simple_ondemand) do not perform well with a lot of software.

From output of v4l2-compliance -d3:
Total for mvx device /dev/video3: 45, Succeeded: 32, Failed: 13, Warnings: 3

I tried Chromium v132 from the Debian repo, and the H.264, HEVC and VP9 decoders are detected, but when playing H.264 video the V4L2 decoder failed because VIDIOC_STREAMOFF failed. The BSP does not provide a standard V4L2 stateful API, so it has to use a patched Chromium.

Yes, MVX reuses the V4L2 interface and adds some of its own extensions.

With these two patches you can test the v4l2m2m decoder in FFmpeg.
0001-v4l2m2m-dec.zip (2.2 KB)

The path of the MVX demo is at /usr/share/cix/bin/mvx_{decoder,encoder}

That is Mesa Zink running over libmali?

Yes. libmali(vulkan) -> zink -> opengl (not es)

clinfo/clpeak results: mali-g720mc10-immortals-r49p0-00eac0-clinfo-clpeak.log

vulkaninfo/vkpeak results: mali-g720mc10-immortals-r49p0-00eac0-vulkaninfo-vkpeak.log

I couldn’t resist booting ‘my’ O6 with the Radxa-supplied Debian image relying on Cix’ BSP kernel. While I don’t think the board is ready to be benchmarked at this stage, I let sbc-bench -r run in Netio consumption measuring mode anyway. The ‘Results validation’ section tells us:

  • Measured clockspeed not lower than advertised max CPU clockspeed
  • Background activity (%system) OK
  • Too much other background activity: 1% avg, 2% max -> https://tinyurl.com/mr2wy5uv
  • No throttling
  • schedutil cpufreq governor configured but neither dynamic-power-coefficient nor sched-energy-costs defined

So schedutil can’t work properly (we’ll see if it ever will), and we have too much background activity (killing /usr/bin/cix_audio_switch.sh helps; disabling the desktop environment does not, quite the opposite). Idle consumption right now, at least with ‘my’ board (only an NVMe SSD attached, the OS running from an external 10 Gbps USB SSD, nothing else, i.e. headless), is a weird 16 W with every governor/policy set to powersave, and ~17.5 W with everything set to performance.

The consumption difference when running the demanding 7-zip benchmark is +11.5 W (28 W) compared to idle, so there’s something seriously wrong.

Aside from that, the BSP kernel has a few more policies/governors than a stock arm64 distro kernel, which isn’t that surprising.

cix_audio_switch.sh

I removed this. It seems to execute once a minute and produces a bunch of logs.

  • schedutil cpufreq governor configured but neither dynamic-power-coefficient nor sched-energy-costs defined

Linux on ARM uses device-tree to configure this stuff.

I’m sure CIX and Radxa don’t plan to waste time on this device tree image; they are already testing an ACPI-based 6.6 BSP kernel.

It’s in an endless loop with a sleep 0.5 delay.

Another interesting observation: I started sbc-bench in ‘Geekbench mode’ (-G). In this mode sbc-bench tests each cluster individually (sorted by CPU type) and then tests all CPU cores twice (to highlight how unreliable Geekbench 6 is across different test runs). Results here: https://0x0.st/88Pr.txt

With CD8180/P1 we have two types of cores (A520 and A720) in a somewhat strange setup:

CPU sysfs topology (clusters, cpufreq members, clockspeeds)
                 cpufreq   min    max
 CPU    cluster  policy   speed  speed   core type
  0        0        0      800    2500   Cortex-A720 / r0p1
  1        0        1      800    1800   Cortex-A520 / r0p1
  2        0        1      800    1800   Cortex-A520 / r0p1
  3        0        1      800    1800   Cortex-A520 / r0p1
  4        0        1      800    1800   Cortex-A520 / r0p1
  5        0        5      800    2300   Cortex-A720 / r0p1
  6        0        5      800    2300   Cortex-A720 / r0p1
  7        0        7      800    2200   Cortex-A720 / r0p1
  8        0        7      800    2200   Cortex-A720 / r0p1
  9        0        9      800    2400   Cortex-A720 / r0p1
 10        0        9      800    2400   Cortex-A720 / r0p1
 11        0        9      800    2400   Cortex-A720 / r0p1
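
The policy and speed columns of a table like this can be read straight from the cpufreq sysfs nodes. The following is a rough sketch, not how sbc-bench itself does it: the function name and the optional root argument are hypothetical, and it assumes the standard related_cpus, cpuinfo_min_freq and cpuinfo_max_freq files, which report kHz.

```shell
# For each cpufreq policy print its name, member CPUs and min/max speed in MHz.
# $1 is an optional sysfs root (default /sys), added here only for illustration.
list_cpufreq_policies() {
    for p in "${1:-/sys}"/devices/system/cpu/cpufreq/policy*; do
        [ -r "$p/related_cpus" ] || continue
        # related_cpus is space separated; the frequency files are in kHz.
        printf '%s cpus=%s min=%d max=%d\n' \
            "${p##*/}" \
            "$(tr ' ' ',' < "$p/related_cpus")" \
            "$(( $(cat "$p/cpuinfo_min_freq") / 1000 ))" \
            "$(( $(cat "$p/cpuinfo_max_freq") / 1000 ))"
    done
}
```

On the topology above, policy1 would be expected to show up as cpus=1,2,3,4 with min=800 and max=1800.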

The cpufreq properties match with reality (measured clockspeeds) and we should assume cpu0 as an A720 core able to run at 2.5 GHz will be the fastest (which is the case with benchmarks like 7-zip)? Not with Geekbench:

                                0       1-4      5-11   all 1st   all 2nd

 Single-Core Score           1059       268      1278      1282      1277
 Multi-Core Score            1059       801      5711      6572      6537

The fastest A720 core is among those that clock between 2.2 and 2.4 GHz, for whatever reason. Now modifying sbc-bench to recognize clusters not by core type but by cpufreq properties, to retest.

Edit: Testing with Geekbench 6 again reveals that the fastest CPU core is not cpu0 but amongst cpu9-cpu11 which according to cpufreq settings are clocked lower than cpu0 (2.4 vs. 2.5 GHz): https://0x0.st/88ZR.txt

Testing all eight A720 cores individually reveals that cpu11 is the fastest core. While cpufreq settings show a 2.4 GHz cluster consisting of cpu9-cpu11, in reality cpu11 is clocked at 2.5 GHz and has some gains over cpu0:

Using 7-zip benchmark single-threaded and @willy’s mhz utility it looks like this:

for i in 11 0 5 6 7 8 9 10 ; do echo -e "\ncpu${i}:" ; taskset -c $i 7zr b -mmt=1 | grep -E "Avr|Tot" ; echo $(( $(taskset -c $i /usr/local/src/mhz/mhz 3 100000 | awk -F" cpu_MHz=" '{s+=$2} END {printf "%.0f", s}') / 3 )) ; done

cpu11:
Avr:             100   3647   3642  |              100   4035   4031
Tot:             100   3841   3836
2498

cpu0:
Avr:             100   3577   3571  |              100   4022   4018
Tot:             100   3800   3794
2498

cpu5:
Avr:             100   3455   3448  |              100   3684   3679
Tot:             100   3570   3564
2298

cpu6:
Avr:             100   3460   3454  |              100   3688   3683
Tot:             100   3574   3568
2298

cpu7:
Avr:             100   3418   3412  |              100   3527   3523
Tot:             100   3472   3467
2198

cpu8:
Avr:             100   3414   3408  |              100   3532   3528
Tot:             100   3473   3468
2199

cpu9:
Avr:             100   3614   3607  |              100   3882   3877
Tot:             100   3748   3742
2398

cpu10:
Avr:             100   3533   3528  |              100   3868   3863
Tot:             100   3701   3695
2398
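
To make the per-core differences easier to compare, the output above can be condensed to one line per core with the total rating divided by the measured clock. This is just a sketch under the assumption that the input has exactly the shape shown (a "cpuN:" line, a "Tot:" line whose last column is the rating, then a bare MHz number); the function name is hypothetical.

```shell
# Condense the loop output into "cpu rating MHz rating-per-MHz" lines.
summarize_7zip() {
    awk '
        /^cpu[0-9]+:$/ { cpu = substr($1, 1, length($1) - 1) }  # strip the trailing colon
        /^Tot:/        { tot = $NF }                            # last column is the total rating
        /^[0-9]+$/     { printf "%s %d MIPS %d MHz %.2f MIPS/MHz\n", cpu, tot, $1, tot / $1 }
    '
}
```

Fed with the cpu11 block above, it prints cpu11 3836 MIPS 2498 MHz 1.54 MIPS/MHz; cpu0 at the same measured clock comes out at 1.52.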

I’m not sure how much faster it is than the G610 on an RK3588; it looks like about 3x to me?

The good thing about the OpenGL 4.0 support via Zink is that it means the GPU has geometry shaders; the lack of them is why the G610 was limited to 3.1.