I’ve now added a reproducer to the ramspeed repo and updated the procedure on the rock5-itx thread above.
News about the ROCK 5B Plus! ;)
For a real-world RAM use case, try llama.cpp from the Phoronix Test Suite or from GitHub.
In my tests here it caps at 20 GB/s or so, i.e. about 5 t/s on 7B Q4 models.
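For anyone wondering how 20 GB/s maps to 5 t/s, here is a rough sketch of the arithmetic. It assumes generation is purely RAM-bound, so every token streams the whole set of weights once, and that a 7B Q4 model is roughly 4 GB of weights (my estimate, not a measured figure). The function name `est_tokens_per_sec` is made up for illustration.

```shell
#!/bin/sh
# Rough upper bound on token rate when generation is purely RAM-bound:
# each generated token reads all the weights once, so
#   tokens/s ~= memory bandwidth / model size.
# $1 = bandwidth in GB/s, $2 = model size in GB (both are estimates).
est_tokens_per_sec() {
    LC_ALL=C awk -v bw="$1" -v size="$2" 'BEGIN { printf "%.1f\n", bw / size }'
}
```

Calling `est_tokens_per_sec 20 4` gives the 5 t/s ballpark quoted above. It ignores the CPU side entirely, so it is only a ceiling.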
Has the SPI been cleared? I saw that mismatching DDR and SPL blobs can disable the DMC leading to a performance impact.
The thing is that llama.cpp also depends on calculations, so you never know if you’re CPU-bound or RAM-bound until you can test on another machine with a different CPU and the same RAM, or different RAM and the same CPU. I happen to have access to an 80-core Ampere Altra made of Neoverse-N1 cores that are exactly the same as Cortex-A76. Llama.cpp is quite fast there, and there are 6 DDR4 channels. But once limited to 4 cores, it’s basically the same speed as on the rk3588, showing that the 4 A76 there are delivering what they can and are a more important limiting factor than the DRAM speed.
@willy might answer this (I personally haven’t conducted any tests on RK3588 for quite some time)
That’s an interesting bit of information! With the BSP kernel running, looking below /sys/class/devfreq/dmc (or checking whether this directory exists) or at /sys/kernel/debug/clk/clk_summary might already be sufficient?
what should I look for in this directory?
I looked at max_freq on the Rock 5B and Rock 5B+, and they are 2112000000 and 2736000000 respectively…
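In case it helps others check the same thing, here is a minimal sketch that dumps the relevant devfreq attributes in one go. The attribute names (`governor`, `min_freq`, `max_freq`, `cur_freq`) are the standard Linux devfreq sysfs files; the function name `dmc_report` and the optional directory argument are just for illustration.

```shell
#!/bin/sh
# Print the DMC devfreq state, or report that the node is missing
# (a missing node would suggest the DMC driver is disabled).
# $1 = devfreq directory, defaults to the path mentioned in this thread.
dmc_report() {
    dir="${1:-/sys/class/devfreq/dmc}"
    if [ ! -d "$dir" ]; then
        echo "dmc devfreq node not found: $dir"
        return 1
    fi
    for f in governor min_freq max_freq cur_freq; do
        # Skip attributes that are absent or unreadable on this kernel.
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}
```

Running `dmc_report` with no argument on the board should show whether `cur_freq` actually moves under load.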
Now please run sbc-bench on the 5B+ and post the results link here so we can check whether the DRAM latency issue is present on your system. If not, Joshua might have nailed it with ‘DMC disabled’, and @willy may have tested with the DRAM clock set to the absolute minimum instead of 2400 MHz.
Amazing, you’re totally right, that was the problem! I’m seeing that the dmc runs at 2.4 GHz during the filling of the memory and goes down to 534 during the scan. Apparently both dmc_ondemand and simple_ondemand are dumb enough to ignore memory reads!!! And I can’t find a configurable one to fix that. Thus I did this:
# cat /sys/class/devfreq/dmc/max_freq > /sys/class/devfreq/dmc/min_freq
and now my ramlat numbers are wayyyyyy better:
willy@rock5:/tmp$ taskset -c 4 ./ramlat -s -n
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 1.752 1.753 1.752 1.753 1.752 1.753 1.753 3.327
8k: 1.752 1.753 1.752 1.753 1.752 1.752 1.753 3.417
16k: 1.752 1.753 1.752 1.753 1.752 1.753 1.753 3.417
32k: 1.752 1.753 1.752 1.753 1.752 1.753 1.753 3.420
64k: 1.754 1.753 1.754 1.753 1.754 1.754 1.755 3.420
128k: 5.596 5.453 5.564 5.450 5.526 6.081 7.460 13.29
256k: 8.115 8.074 8.125 8.084 8.095 8.254 9.772 15.81
512k: 13.82 13.36 13.75 13.36 13.74 13.93 15.79 22.96
1024k: 41.87 39.68 41.90 39.56 41.99 40.28 40.86 46.22
2048k: 52.99 47.90 52.01 47.76 52.38 48.69 48.04 54.20
4096k: 93.04 89.75 93.46 89.62 93.83 88.62 88.12 90.95
8192k: 118.5 108.0 114.8 106.9 113.0 107.3 106.7 110.9
16384k: 139.5 131.8 136.8 130.8 136.2 130.6 129.6 133.0
I’m going to re-run the ramwalk test now.
Edit: ramwalk on rock5-itx is now 8% slower than on rock5b instead of 80% slower! Much better! I’ll need to retest with the 2736 MHz DDR init code, but for this I need to force it to boot from the SD, which means making a contact using a resistor, so I’ll do it once I’m home.
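The min_freq trick above can be wrapped so that it is easy to undo. This is only a sketch: the function names `dmc_pin_max` and `dmc_restore_min` are made up, it needs root on real hardware, and the directory argument exists only so it can be tried against a scratch directory first.

```shell
#!/bin/sh
# Pin the DMC at its maximum frequency by raising min_freq to max_freq.
# Prints the previous min_freq so it can be restored later.
# $1 = devfreq directory, defaults to the path used in this thread.
dmc_pin_max() {
    dir="${1:-/sys/class/devfreq/dmc}"
    [ -w "$dir/min_freq" ] || { echo "cannot write $dir/min_freq" >&2; return 1; }
    old_min=$(cat "$dir/min_freq")
    cat "$dir/max_freq" > "$dir/min_freq" || return 1
    echo "$old_min"
}

# Put back a previously saved min_freq value.
# $1 = saved value, $2 = devfreq directory (same default as above).
dmc_restore_min() {
    echo "$1" > "${2:-/sys/class/devfreq/dmc}/min_freq"
}
```

Typical use would be `old=$(dmc_pin_max)`, run the benchmark, then `dmc_restore_min "$old"`. The setting does not survive a reboot.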
From my understanding, the DMC is disabled to maintain stability as the memory cannot train/initialize properly when the DDR and SPL blobs do not match.
This can be observed in U-Boot when attaching a serial console:
ERROR: loader&trust unmatch!!! Please update loader if need enable dmc
ERROR: current trust bl31 need match with loader ddr bin V1.13 or newer
ERROR: current loader need match with trust bl31 V1.38-V1.40
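If you capture the serial console to a file, a quick check for this message could look like the sketch below. The function name `dmc_mismatch_in_log` and the log file are hypothetical; the string searched for is copied from the first error line quoted above.

```shell
#!/bin/sh
# Return 0 if a captured boot log contains the loader/trust mismatch
# error quoted in this thread, non-zero otherwise.
# $1 = path to a saved serial console capture (hypothetical file).
dmc_mismatch_in_log() {
    grep -q 'loader&trust unmatch' "$1"
}
```

For example: `dmc_mismatch_in_log boot.log && echo "blobs mismatch, DMC likely disabled"`.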
OK, nice catch indeed. I don’t have this one, otherwise I’d have jumped on it already, as you can imagine. It’s indeed very possible in that case that the DMC devfreq gets disabled and that u-boot leaves default settings on, with possibly safe values leading to sub-optimal performance.
For now what’s great is that we’re finding software-only causes to the issues encountered till now, which is very encouraging. I still have no idea what the “stability issues” related to DDR5 at 2736 MHz are, but the 2400 setting is much more bearable now with DMC at full perf.
Edit: just booted it at 2736 from the SD by touching wires, and guess what?
ERROR: loader&trust unmatch!!! Please update loader if need enable dmc
ERROR: current trust bl31 need match with loader ddr bin V1.13 or newer
ERROR: current loader need match with trust bl31 V1.38-V1.40
So now this explains why the memory was always fast when booted like this. I hadn’t noticed this previously since booting this way requires all my attention on the wire I’m touching, while the default boot at 2.4 doesn’t report this.
The next step will probably be to try to rebuild ddr_bin with other settings, but quite frankly, this is becoming really cumbersome with almost no info, just for the sake of trying to catch up with rock 5b’s performance.
The wait is over! Radxa has published the promotion on their WeChat and the presale on the JD website!
My heart literally skipped a beat when I saw it. Haha.
It’s more beautiful than I thought it would be
Awesome, what is the JD website? I don’t have WeChat but I want to order two or three.
Urgh… I have the bad feeling it’s already too late. Visiting the forum regularly wasn’t enough I guess xD Was waiting to build my NAS…
JD is an online store website like Alibaba/Taobao.
Anyways, you may take a look at the China shop links for advance information.
I think arace.tech would also follow up shortly.
Nice, thank you for clarifying!
I also noticed that it was announced on the Radxa site yesterday. All I can say about it is that we should get a more vibrant PCB color! Pink was awesome, and so far nobody has purple.
just joking
At the same time, the Orange Pi5 Max was announced. This one is the most compact RK3588 (yes, without S) board, in the ROCK 5A format. Of course they could not wire up everything from the SoC, and the second RAM chip is missing, but they managed to pack everything into that small size, including a full-size 2280 M.2 connector.
The R5 B+ is a nice evolution over the 5B. I also noticed that the eMMC size is described as up to 128GB, so I hope it will not be wasted on Roobi OS only.
No worries Dominik, their BSP can build a normal Debian image for the 5B+.
I know, but it’s just a matter of adding a new build target on GitHub, and it would save others a lot of time.
A minimal image should always be accessible.
Wow. These new boards are lookin’ really nice. Almost a shame we’re completely dropping the plans of using Radxa products ever again…
Oh? My experience with Radxa is good! I think Radxa would be interested in knowing your problems.