Orion O6 Debug Party Invitation

Yep. The file compression test at least shows MB/s, and what we can see is that in the MT test it delivers twice the performance of the Rock 5B+, which is quite decent. Also, such tools are never able to measure performance on the big cores only (or by excluding the little-core cluster, which is not there for performance purposes). That would significantly help and even stabilize measurements: it's even possible that some tests do not finish at the same time on all cores and affect the overall result!
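If you want to try it yourself, pinning the benchmark is usually enough; a minimal sketch using taskset, assuming the A720 cores end up numbered 0 and 5-11 as in the sbc-bench output quoted further down (adjust the core list to your actual topology):

    # run the 7-Zip benchmark on the big cores only (core numbers are an assumption)
    taskset -c 0,5-11 7z b

    # same benchmark restricted to the little cores, for comparison
    taskset -c 1-4 7z b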


Long-time (20y+) Linux user / sysadmin / network admin here who would love to hack on general Debian support as well as test various NICs with this board, and who isn't afraid to use an SPI flasher and recompile some kernels. I would also try to get DPDK/VPP up and running and do some benchmarks.


BredOS developer here; we develop an Arch Linux based OS for many SBCs. We think BredOS supporting the Orion O6 would be a good fit, since there's already a thread called “Archlinux for the Orion O6”.


I mainly work on development related to the Linux kernel and DRM. If I have an O6, I might be able to help with some display-related upstream work.


The day has come ;D

Looking forward to seeing code while waiting for benchmarks and reviews.

Here is the output from running sbc-bench: https://0x0.st/8opg.bin

EDIT: it was run on the official Debian 12 image.


Thank you Naoki. I'm very pleased to see that it looks pretty good out of the box, with measured CPU frequencies matching the advertised OPPs, and excellent RAM timings for tests run from the A720 cores! The CPU core ordering is very strange, however; I don't know if it's caused by the core declaration order in the DTB or something like that:

     CPU    cluster  policy   speed  speed   core type
      0        0        0      800    2500   Cortex-A720 / r0p1
      1        0        1      800    1800   Cortex-A520 / r0p1
      2        0        1      800    1800   Cortex-A520 / r0p1
      3        0        1      800    1800   Cortex-A520 / r0p1
      4        0        1      800    1800   Cortex-A520 / r0p1
      5        0        5      800    2300   Cortex-A720 / r0p1
      6        0        5      800    2300   Cortex-A720 / r0p1
      7        0        7      800    2200   Cortex-A720 / r0p1
      8        0        7      800    2200   Cortex-A720 / r0p1
      9        0        9      800    2400   Cortex-A720 / r0p1
     10        0        9      800    2400   Cortex-A720 / r0p1
     11        0        9      800    2400   Cortex-A720 / r0p1

=> 1 big 2.5G core, 4 small 1.8G cores, 2 big 2.3G cores, 2 big 2.2G cores, 3 big 2.4G cores. That's 5 different clusters, compared to the expected 3 clusters each made of 4 identical cores.
Regardless, despite the many different frequencies, they are already reasonably good (which explains the performance gains you measured on the kernel build).
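To double-check how the frequency domains are grouped, the cpufreq policies can be read directly from sysfs; a minimal sketch using only standard cpufreq attributes, nothing board-specific assumed:

    # list each cpufreq policy, the CPUs it covers and its min/max frequency (kHz)
    for p in /sys/devices/system/cpu/cpufreq/policy*; do
      echo "$p: cpus=$(cat $p/related_cpus) min=$(cat $p/cpuinfo_min_freq) max=$(cat $p/cpuinfo_max_freq)"
    done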

In fact, for the little cores (A520), where the DRAM timings are particularly bad, I suspect it's exactly the same issue as on the Rock 5, where the DMC doesn't leave powersave mode when only little cores are running; the timings will very likely be better if you switch the DMC to performance mode. For tests involving all cores at once it should not change anything, since the big cores will push the DMC into high performance mode anyway. On the Rock 5, running this significantly improves the little-core performance for me:

echo performance > /sys/devices/platform/dmc/devfreq/dmc/governor

I suspect you’ll see the same here. But again it will only affect workloads running exclusively on small cores, so nothing very important for your general tests.
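If you try it, it's worth checking first what the devfreq node actually exposes; a minimal sketch assuming the same /sys/devices/platform/dmc path as on the Rock 5 (the node name may well differ on the O6):

    # see which governors the DMC devfreq device supports and which one is active
    cat /sys/devices/platform/dmc/devfreq/dmc/available_governors
    cat /sys/devices/platform/dmc/devfreq/dmc/governor

    # force full DRAM speed for the test, then restore whatever governor was active before
    echo performance > /sys/devices/platform/dmc/devfreq/dmc/governor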


I would call it smart: since this device is supposed to run with ‘standard’ aarch64 OS images that lack any optimization wrt IRQ/SMP affinity, making cpu0 the beefiest one makes a lot of sense. Though this ‘5 cluster’ setup is challenging to monitor properly when hacking around with settings.

At least all A720 cores have the same cache size regardless of their current clockspeed differences.

I also saw that Radxa has packaged irqbalance into their reference OS image, so good luck with stuff like network performance when the IRQs get spread across the little cores.

BTW: I've bought an RTL8126 and two RTL8157 and am doing some 5GbE testing right now in preparation for hacking around with the two RTL8126 on the O6. Aside from iperf3, would you suggest better networking tests?
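For the IRQ-spreading concern above, this is roughly what I plan to check and pin by hand; a minimal sketch where the interface name enp1s0, the IRQ number and the big-core list are all assumptions to be adjusted per board (iperf3's -A option sets CPU affinity):

    # see on which CPUs the NIC's interrupts currently land
    grep enp1s0 /proc/interrupts

    # pin one of the NIC IRQs to the big cores (123 is a placeholder IRQ number)
    echo 0,5-11 > /proc/irq/123/smp_affinity_list

    # several parallel streams, client process pinned to a big core
    iperf3 -c <server> -P 4 -A 5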

@hrw: /proc/cpuinfo for you https://0x0.st/8o9l.txt :)


Looks like I am out of excuses and will have to order one. The only decisions to make are:

  • when to order (I do not want to get a call from the courier while on vacation)
  • which RAM size (SBC use: 16GB, desktop use: 32GB, or devel use: 64GB)

CD8180 added to AArch64 SoC features table.


This SoC uses DynamIQ, which means the actual clusters can be smaller than the performance domains.

Edit: from the official Radxa doc it looks like it is a single DSU cluster with all 12 cores inside, and the A520 cores do not have any L2 at all?
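Once boards arrive this should be easy to confirm from sysfs; a minimal sketch using the standard cacheinfo attributes, with cpu1 assumed to be one of the A520 cores as in the table above:

    # list the cache levels, types and sizes visible to one of the little cores
    for c in /sys/devices/system/cpu/cpu1/cache/index*; do
      echo "$(cat $c/level) $(cat $c/type) $(cat $c/size)"
    done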


It does look like the driver for the RTL8126 is still missing in their 6.1 BSP, and the test is running with a separate RTL8125 PCIe card:

  • Realtek Device 8126: Speed 8GT/s, Width x1, driver in use: ,
  • Realtek Device 8126: Speed 8GT/s, Width x1, driver in use: ,
  • MEDIATEK Device 7925: Speed 5GT/s, Width x1, driver in use: ,
  • Realtek RTL8125 2.5GbE: Speed 5GT/s, Width x1, driver in use: r8169, ASPM Disabled
  • 119.2GB “WTPCIe-SSD-128GB” SSD as /dev/nvme0: Speed 8GT/s, Width x4, 0% worn out, drive temp: 25°C, ASPM Disabled

What is “their 6.1 BSP”?

Just to avoid people trying to make things worse, I'll add that the two RTL8126 are on-board devices; the other three are mine, so don't complain that they didn't come with the purchase ;)


Correct me if I'm wrong. From these lines it seems there is no kernel driver for the onboard RTL8126 NICs in the 6.1 BSP kernel that the test is running on. Does that mean we have to plug in an additional NIC just to access the internet if the OS ships in its current state? Or is that just a detection error and the driver is actually available in this kernel?

sbc-bench was run on official Debian 12. Please contact the Debian project for more information…

Ah, thank you for that info; that makes sense, because the driver for the card is indeed missing in Debian 12. Sorry for misunderstanding the environment.

From measurements it looks like 64K L2 for the A520. We'll see soon. Developer samples are on their way :)

I hope to publish findings with the RTL8126 prior to Tuesday (when the O6 is expected to arrive here).

Correct me if I understand the kernel situation wrong:

  • We can run a mainline kernel, for example via the official Debian/Fedora/openSUSE arm64 ISO
  • Drivers for the VPU/GPU/NPU that are missing from the mainline kernel are installed via extra DKMS packages

I am curious also. Are the NPU drivers available at launch? Are there any details about the NPU?

Working with the NPU under Rockchip has been a challenge, so I'm interested to understand how we will work with the NPU on this board. There's nothing on the CIX website that I can find either.

Edit: Anyone able to share a dmesg output?

  • We can run a mainline kernel, for example via the official Debian/Fedora/openSUSE arm64 ISO

Yes, the official Debian 12 image has been tested in UEFI ACPI mode.

  • Drivers for the VPU/GPU/NPU that are missing from the mainline kernel are installed via extra DKMS packages

These drivers do not exist upstream and Cix has not started upstreaming them, so DKMS or kernel patches are needed for now.
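For reference, wrapping such an out-of-tree driver for DKMS only needs a small dkms.conf next to the sources plus the usual add/build/install steps; a minimal sketch with a hypothetical module name (cix_npu is a placeholder, not the real driver):

    # /usr/src/cix_npu-1.0/dkms.conf (name and version are placeholders)
    PACKAGE_NAME="cix_npu"
    PACKAGE_VERSION="1.0"
    BUILT_MODULE_NAME[0]="cix_npu"
    DEST_MODULE_LOCATION[0]="/updates/dkms"
    AUTOINSTALL="yes"

    # register, build and install against the running kernel
    dkms add -m cix_npu -v 1.0
    dkms build -m cix_npu -v 1.0
    dkms install -m cix_npu -v 1.0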
