ROCK 5 in ITX form factor

Yeah, exactly something like this. Good to see they’re evaluating it. The only thing is that LPCAMM seems “large” to me; after all, we’re talking about SoCs supporting only two 32-bit DRAM chips or four 16-bit ones. But maybe this will permit interleaving and improved average latencies. We’ll see.

A few more pictures can be seen where my journey with this board over the next weeks will be documented:


My sample arrived this morning too (only 4 days late thanks to UPS!) and I took a bunch of pictures for a quick close-up of the board. The photos likely do it zero justice but it’s a really pretty-looking board, as tkaiser said!


Review update: MMC storage, USB-C, SATA power and 1st real problem identified: SATA write performance sucks


Great photos, thanks

Review update: Testing 2.5GbE networking, OS defaults vs. optimal settings, achieving weird iperf3 results.


Review update: Testing today’s b1 Radxa build, more recent boot BLOBs and Armbian defaults.

TL;DR: newer DRAM initialization BLOBs do not solve the problem that LPDDR5 @ 2736 MHz is not significantly faster than LPDDR4X @ 2112 MHz, Radxa’s b1 builds from today don’t solve the reported problems, and someone at Armbian managed to ruin NAS/server performance by choosing an inappropriate cpufreq governor

Or maybe better TL;DR: grab an Alder Lake-N x64 box when you want to run an energy efficient server


And another one: SMB Multichannel.

TL;DR: 600 MB/s over two 2.5GbE connections, don’t buy an Alder Lake-N x64 box
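For anyone wanting to reproduce the 600 MB/s number: multichannel is not on by default in Samba. A minimal smb.conf sketch (parameter names from the smb.conf manpage; the addresses and speed/RSS declarations are placeholders for a box with two 2.5GbE NICs, since on Linux Samba can’t discover these itself):

```ini
[global]
    # real Samba parameter, still flagged experimental on Linux
    server multi channel support = yes
    # declare speed and RSS capability per interface -- replace the
    # placeholder addresses with the box's real ones
    interfaces = "192.168.1.10;capability=RSS,speed=2500000000" "192.168.2.10;capability=RSS,speed=2500000000"
```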

Really interested in one of these to replace my aging NAS, watching this with interest.

I’m curious if the board-powered fan is controllable with a temperature sensor on the board itself?

Hi, @tkaiser

After we read your post, we put it in high priority and did the test immediately:

We also reported this to Rockchip, worked with them closely, and got a shocking fact from Rockchip’s engineer:

LPDDR5 latency is expected to be higher than LPDDR4X’s, just like LPDDR4 latency is higher than LPDDR3’s.

We also checked with AI. First, we asked:

is lpddr5 latency higher than lpddr4?

AI answered: No, LPDDR5 has lower latency compared to LPDDR4.

Then we asked:

why real world test shows lpddr5 has higher latency?

Here is the answer:

Based on the provided search results, there are a few key points explaining why real-world tests may show higher latency for LPDDR5 compared to LPDDR4:

  1. Current LPDDR5 implementations have higher rated latencies (CAS latencies around CL40) compared to typical LPDDR4 latencies (CL20 or lower). While LPDDR5’s higher data rates can offset some of this latency penalty, the raw latency numbers are still higher initially.
  2. Benchmarking tools like AIDA64 may not accurately measure LPDDR5 latency yet, as the technology is new. Some users have reported AIDA64 showing unexpectedly high latency numbers for LPDDR5 that don’t match real-world performance.
  3. LPDDR5 operates at an effective lower clock speed (800MHz x4 channels) compared to the rated “data rate” speed of 6400MT/s. This can make latency appear higher when measured by tools not optimized for LPDDR5.
  4. While LPDDR5 reduces command/address latency through architectural changes, the data transfer latency itself may still be higher in early LPDDR5 implementations compared to mature LPDDR4 modules.

So in summary, the higher rated CAS latencies of initial LPDDR5 chips combined with potential benchmarking tool inaccuracies can make LPDDR5’s real-world latency appear higher than LPDDR4, even though LPDDR5 reduces other latency components. As LPDDR5 matures, its latency is expected to improve.

Full answer:

We are still investigating this, will update here after more findings.

Well, somehow expected, but I would’ve thought 2736 MHz vs. 2112 MHz would compensate for that. Obviously I was wrong, or hopefully Rockchip revisits the topic and is maybe able to improve timings?

We’ll see, but at least for now we might conclude that workloads that benefit from lower memory latency (there aren’t that many) run better with LPDDR4X on RK3588.

Edit: though Geekbench on platforms other than x86 is more a joke than a reputable benchmark, here are numbers comparing Rock 5B (LPDDR4X @ 2112 MHz) and Rock 5 ITX (LPDDR5 @ 2736 MHz):

Basically same performance at the moment (until maybe Rockchip sends out a new DRAM initialization BLOB that improves performance)

Yes, since it’s a ‘12V PWM enabled fan header’ you can then combine it with any thermal source you want.

But for RK3588 you won’t need a fan unless you cram the board with a few pieces of hot spinning rust into a small enclosure w/o any ventilation, since this SoC is pretty energy efficient.
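If you do want to tie the fan to the SoC temperature in userspace, something like the loop body below works. Only a sketch: the thermal zone and hwmon paths are assumptions and need checking against the actual board/kernel, and the 45–70 °C thresholds are arbitrary.

```shell
# Assumed paths -- verify thermal_zone0 is the SoC sensor and that the
# fan header shows up as a pwm-fan hwmon device on your kernel
TEMP_NODE=/sys/class/thermal/thermal_zone0/temp
PWM_NODE=/sys/class/hwmon/hwmon0/pwm1

# map temperature in millidegrees C to a PWM duty of 0..255,
# linear between 45 degC (off) and 70 degC (full speed)
temp_to_duty() {
  local t=$1
  if   [ "$t" -le 45000 ]; then echo 0
  elif [ "$t" -ge 70000 ]; then echo 255
  else echo $(( (t - 45000) * 255 / 25000 ))
  fi
}

# apply once (run from cron or a loop); silently does nothing when
# the assumed sysfs nodes are absent
if [ -r "$TEMP_NODE" ] && [ -w "$PWM_NODE" ]; then
  temp_to_duty "$(cat "$TEMP_NODE")" > "$PWM_NODE"
fi
```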

My glorious plan involves a Jonsbo N2 case, the rock5 ITX, and four HDDs. So if the disks are busy I may well need a case fan.

Review update: Settings matter for performance

TL;DR: how the Armbian guys managed to halve performance of one specific use case with a snap of the fingers, without even noticing of course, and without doing any evaluation before or after.

@tkaiser I thought that DDR5 pretty much always had worse (or equivalent) latency than DDR4. I get that you expected an improvement, but how does this compare to the same test on other systems, for example x86?

There is a clear benefit of DDR5 in your memcpy numbers, though.

Not just me :wink:

No idea, I’m not a hardware guy, just started to read about the latency stuff recently…

That doesn’t seem to translate into ‘software in general’ getting faster. Synthetic benchmarks are always a problem: unless it’s known which use case they relate to, they just generate random numbers. For example 7-zip (sbc-bench's main metric, for some reason), which depends more on low latency than on high bandwidth, will generate slightly lower scores on RK3588 with LPDDR5 at just 5472 MT/s, while my naive assumption would be that most other software benefits more from higher memory bandwidth.

But if we look at Geekbench6 then LPDDR4X and LPDDR5 generate more or less same scores (don’t look at the total scores but the individual ones):

But Geekbench in itself is a problem as it uses memory access patterns that are not that typical (at least according to RPi guys that try to explain the low multi GB6 scores their RPi 5 achieves) and we don’t know which tests are sensitive to memory latency and which to bandwidth. I did a test weeks ago with Rock 5B comparing LPDDR4X clocked at 528 MHz and 2112 MHz so we know at least which individual tests are not affected by memory clock at all (on RK3588 – on other CPUs with different cache sizes this may differ):

But this also is not sufficient to understand GB6 scores. It would need a system where CAS latency and memory clock can be adjusted in a wide range and even then the question remains: how do the generated scores translate to real world workloads.

I think at the moment we can conclude that the faster LPDDR5 clock does not result in a significantly faster system while in some areas where memory bandwidth is everything (video stuff for example) measurable improvements are possible.

Thanks for the detailed answer. My point was just that before we treat the lack of improvement in latency as an issue, we should ask whether it was supposed to be this way. Memory latency hasn’t changed in a meaningful way since the DDR3 days because the higher clocks are offset by higher delays.

And whether this translates to a more responsive system is another issue altogether. I believe that higher memory throughput is likely to result in a bit higher GPU performance. But it depends on other factors too.

Currently trying to wrap my head around idle consumption, which is clearly too high (I measured 4W with everything set to powersave; once an NVMe SSD is mounted, it breaks my heart to look at the SmartPower device).

Turns out ASPM is completely disabled (checked on latest b1 build and Armbian legacy):

root@rock-5-itx:/home/radxa# lspci -vvPPDq | awk '/ASPM/{print $0}' RS= | grep --color -P '(^[a-z0-9:./]+|:\sASPM (\w+)?( \w+)? ?((En|Dis)abled)?)';
0001:10:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd RK3588 (rev 01) (prog-if 00 [Normal decode])
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
pcilib: sysfs_read_vpd: read failed: Input/output error
0001:10:00.0/11:00.0 SATA controller: ASMedia Technology Inc. ASM1164 Serial ATA AHCI Controller (rev 02) (prog-if 01 [AHCI 1.0])
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
0003:30:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd RK3588 (rev 01) (prog-if 00 [Normal decode])
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
0003:30:00.0/31:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
pcilib: sysfs_read_vpd: read failed: Input/output error
0004:40:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd RK3588 (rev 01) (prog-if 00 [Normal decode])
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
0004:40:00.0/41:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+

root@rock-5-itx:/home/radxa# zgrep ASPM /proc/config.gz 

Checked on Rock 5B with 5.10 BSP kernel it’s the same.

Anyone got an idea why? @RadxaYuntian @jack

Do we need to fiddle around with setpci to bring idle consumption into a sane range?
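In case anyone wants to experiment: the sketch below first tries the kernel’s own ASPM policy knob (which only exists with CONFIG_PCIEASPM; the empty zgrep above suggests the BSP kernel lacks it), then shows the setpci route. ASPM is selected by bits 1:0 of the Link Control register at offset 0x10 into the PCIe capability (0 = disabled, 1 = L0s, 2 = L1, 3 = L0s+L1). The device address in the usage comment is taken from the lspci output above; whether the ASM1164 and RTL8125 behave well with ASPM forced on is untested.

```shell
# kernel route (file missing when CONFIG_PCIEASPM is not set)
p=/sys/module/pcie_aspm/parameters/policy
if [ -w "$p" ]; then echo powersave > "$p"; fi

# setpci route: merge new ASPM bits (0..3) into a previously read
# Link Control register value
set_aspm_bits() {
  printf '0x%04x\n' $(( ($1 & ~3) | $2 ))
}

# Usage sketch (needs root; enable on the endpoint first, then on the
# upstream bridge, here the SATA controller behind 0001:10:00.0):
# old=0x$(setpci -s 0001:11:00.0 CAP_EXP+10.w)
# setpci -s 0001:11:00.0 CAP_EXP+10.w=$(set_aspm_bits "$old" 2)
```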


Any idea when this board might be available to buy from Okdo in the UK?

Fairly sure this was the reason we changed. The post above showed a way to adjust this setting though.