ROCK 5 in ITX form factor

I also received a sample not long ago. IMO any common 115x 1U low profile passive cooler should be overkill for the low power consumption of RK3588. Also, since SoC and RAM chips have different thicknesses, you may need to use thermal paste and thermal pads for them respectively. But considering that the RAM chip is LPDDR5, it doesn’t really matter even if it doesn’t touch the heat sink.

edit: This is the model I’m using. It will not conflict with the onboard RTC battery holder.


@nyanmisaka Thank You for your feedback
Indeed, the SoC, eMMC and memory chips only represent 10~15 W of heat to dissipate.
I think cutting a large and light aluminium heatsink like this one is a good and cheaper alternative. Then, using adhesive thermal pads of different heights might allow a sufficient sticky contact. Otherwise, I will have to find a refurbished LGA 115x 1U heatsink or simply wait for delivery of this heatsink

Final review update now that the device is ready to be bought and it’s confirmed that eMMC won’t be user-accessible by default: https://github.com/ThomasKaiser/Knowledge/commit/2489492d03db2961d6ac249cc6eefca4687ffa32


thanks for details,
hopefully we will get sooner or later more juice out of ddr5 as we all expect. For now it’s just the number and some chance that higher capacities will be more affordable :slight_smile:

Interesting! I’m now trying my 4th cooler on my board as the first 3 just wouldn’t work and they were marked as being LGA 115x compatible. The mounting posts for the backplate were just too wide to go through the holes on the motherboard so here’s hoping 4th time’s the charm :smiley:


Hi all!

I’m also a lucky recipient of this really great board (thanks Radxa!). I’m sharing a few first-contact comments before I forget them:

  • the board looks amazingly clean and well arranged. I’m normally not sensitive to PCB colors but this black varnish that hides the copper tracks combined with all the golden pads makes me think of some high-end audio gear :slight_smile: Those who love to have a window on their enclosures to exhibit their boards to friends will probably be proud to have something new to show:
  • the connectors and pinouts are well marked. For example you clearly see RJ45-1 and RJ45-2 etc, as well as the front panel’s pinout. That’s one of the advantages of the black paint: the white marking is perfect and uniform so you don’t have any doubt when reading anything:
  • I think that a reset button close to the SPDIF connector (top right above) would be nice during testing or setup. For now I’m making shorts with a screwdriver on the f-panel connector.
  • I was surprised to find that the 12V in is on a 2.5mm connector. I searched among all my power blocks (20+) and couldn’t find even a single one in 2.5, all voltages included:

    I found two salvaged male connectors coming from previous PSUs, so I connected one of them to a male 2.1mm connector to make an adapter:

As an alternative I could have used a 12V->ATX adapter since I have one but I preferred to test the board without extra components first.

  • I thought it would be easier to find a heat sink, but I realized I didn’t have the fixing back plate! I started the board without any heat sink and found that it remains totally cool when idle. Regardless, I managed to assemble a heat sink + fan + plate kit from parts salvaged from two different devices that I could mount. I wanted to test it to see if the fan speed could be adjusted:



    I think it could be useful to provide a default plastic plate + long pins that allow attaching “something” with a pair of elastics or serflexes for use during early tests, since it’s likely not going to heat much anyway. I noticed that the fan starts at full speed and that its speed drops significantly at the end of the boot, probably once fan control is set up to follow the temperature.

  • The UART connector is ordered GND-TXD-RXD, as on more and more boards these days it seems, so I could directly connect a CH340-based adapter that I’m using with some other boards and that supports the default 1.5 Mbaud of Rockchip chips:

So from a hardware perspective, that’s quite awesome.

However, for now I’m stuck. The board boots into a “roobi” system which, as I found on the site, is a new pre-installer (good idea, that could definitely help). However, I end up at a login prompt and tried many login/password combinations (root, rock, roobi, radxa, even nothing and I don’t remember what), and nothing is accepted. I failed to find the info. I know I’m not good at finding such possibly obvious info but I searched an hour or so, which is already a bit too much for a login/password pair. (At least during this search the SoC remained totally cool without its heat sink). Thus any hint to log in would be more than welcome!

I looked at the boot loader and found that you have approx 1s to choose between 1 and 2 (both end up with this login prompt, the second saying that root account is locked, maybe I did too many attempts?):

Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.

Press Enter to continue.
...
Debian GNU/Linux 11 roobi ttyFIQ0

roobi login:

It’s also possible to interrupt the boot by pressing Ctrl-C instead of choosing 1 or 2 and end up in u-boot. Partitions 1 and 2 seem to be FAT, part 3 is ext2/3/4 and u-boot by default accesses its kernel there from /boot. But for now I didn’t manage to boot it by hand to pass args and bypass the login.

There are some bluetooth hci0 timeout errors that pollute the console for one minute or so and after that they’re gone.

That’s all for this evening. Once I manage to log into it, I’ll be interested in giving it a try with an M.2 10GbE NIC I bought not long ago, and with 4 SSDs.

…is most likely meant to be operated with a connected display and keyboard/mouse? Maybe even accessing http://roobi.local/ from another machine on your network works if you use mDNS/ZeroConf?
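A quick way to check that suggestion from another machine is to see whether anything answers on the installer's HTTP port. This is only a minimal sketch: the helper name `installer_reachable` is made up for illustration, port 80 is assumed from the plain `http://` URL, and resolving `roobi.local` assumes the machine running the script has a working mDNS resolver (Avahi/nss-mdns, Bonjour or similar).

```python
import socket


def installer_reachable(host="roobi.local", port=80, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: the default host comes from the URL quoted in the
    post, the default port is an assumption (plain HTTP).
    """
    try:
        # create_connection() resolves the name and connects in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name resolution failures, refused connections and timeouts.
        return False
```

Calling `installer_reachable()` with no arguments probes `roobi.local`; if the name does not resolve, passing the address handed out by the DHCP server instead would tell whether the problem is mDNS or the service itself.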

I guess all you can do the console way is to somehow login, then wipe the eMMC to get the board booting from TF card or something similarly destructive?


You think so? I’m not against offering the possibility to install a board via a keyboard and screen (I even encourage it), but I would never expect that to be the only solution. Yes, I came across that page as well and I just considered it irrelevant to my use case, and it spoke about another vendor’s hardware so I didn’t insist.

Hmm I have a HDMI-to-USB capture adapter that I bought for the rare cases I need to connect to HDMI without having to carry a heavy display, I can try. But if that’s it, it would be gross, because the board presents a valid boot prompt, nothing that lets you imagine you’re not supposed to use that. At minimum there should be a usable root/root or so access on the console for those who want a simple install method.

BTW I have not found the board on the network, but I only connected the cable long after I exhausted all my imagination of unusable logins, so maybe it didn’t request an IP address.

But I’ll retry with the HDMI adapter just in case it shows anything, thanks for the suggestion.

Well, this is stuff for end users. It seems they simply made an Electron app to interact with the user and to set things up. It is accessible via a display and, being Electron, it might simply work over a network connection too.

And in case you still want to try local access maybe give ps:ps a try? But how to continue then? Accessing http://127.0.0.1 with a text browser? :crazy_face:

Edit: Maybe executing roobi might then work (see here)

I understand this but it doesn’t mean that those who know what they want to set up should go via the complicated path. It’s like when some distros didn’t let you run fdisk and/or would always reorder your partitions. On spinning rust you’d definitely want to place the swap closest to the most commonly used data, and instead you’d end up with swap at the end, making the system super slow due to large seeks. I remember finding myself having to install in two steps using a secondary disk to later move the data.
I’ll give the ps:ps a try BTW, thanks for spotting this.

Last point regarding end users, end users install their PC by booting using a USB stick plugged into a USB port, not by using whatever proprietary install system. For example I had no trouble installing my LX2K that boots any regular distro from UEFI. There’s an EDK2 port for RK3588 that also covers Rock5B, I never tested it but it could be a more universal and conventional boot and setup method that targets end users.

Correct. And ‘their PC’ is x86.

In this stupid ARM world where each and every SoC needs its own bootloader (at least proprietary DRAM initialization) this simply doesn’t work which is the reason various SBC vendors came up with the idea of preflashing stuff like OOWOW, Roobi or whatever to present the user a list of OS images that contain all the proprietary crap necessary to boot on this specific device.

And this won’t change unless all ARM SoC and SBC vendors start to adopt Arm SystemReady (or whatever the current spec is called that usually has zero relevance for end users since nobody adopts anything of this) and preflash a SPI NOR with a bootloader (already containing all the proprietary crap needed for this specific device) being able to boot generic aarch64 OS images.


I agree and that’s exactly my point. That’s also why I asked FriendlyELEC to install the SPI NOR on their latest NanoPC-T6 and to flash it (though I don’t think this last point has been done yet). As a PC user I don’t like UEFI because it’s more complicated than the legacy mode without adding visible value. But for ARM it’s the least painful solution that works for everyone out of the box, and the code exists for RK3588.

BTW, ps:ps worked fine on the login prompt (with some script syntax errors but that’s all):

roobi login: ps
Password: 
Linux roobi 5.10.110-33-rockchip #65700d485 SMP Wed Apr 3 04:26:57 UTC 2024 aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Fri May 17 19:29:29 UTC 2024 on tty1
-bash: [: : integer expression expected
-bash: [: : integer expression expected
-bash: [: : integer expression expected
ps@roobi:~$

I also found that the board indeed got a DHCP address, and sending a browser to it indeed shows the roobi interface. Another point that is not great for on-the-table setup is this:

Due to remote access to this device, authentication is required. After clicking the start button,
please press the power button three times within 60 seconds to complete the verification.

/me using the screwdriver again to short pins.

It then allows you to choose an OS image among two, and claims “the medium is not detected” (even when I click on the refresh circle on the right). This is exactly the reason I absolutely hate and despise such proprietary installation tools. Nobody uses them, so they are extremely lightly tested and full of bugs and limitations that make them a total pain to deal with.

Bah, for now I’ll run my tests from this image. It’s already a full-fledged OS, even gcc is installed on it! I don’t know what the intent was for a pre-installation image but it will be useful anyway :wink:

Is there an installation medium (other than eMMC which should not be user-accessible on Rock 5 ITX by default anyway since it’s Roobi’s place to live)?

No, I have not yet added any other one, I naturally expected such a board to have its data on either SATA or M2 and the OS on the eMMC, I mean, what anyone would likely expect on a server board! Sacrificing the eMMC for a boot loader is nonsense, particularly when that means that you’ll either need to move your OS to the data disks or dedicate one drive to the OS. I really don’t understand such baroque choices. If the goal is to make it as irritating as using a Synology NAS I can understand, but I’m not seeing how one sees value in doing that :-/

I’m starting to get a feeling that the board is really awesomely designed from a hardware perspective, maybe one of the best products from Radxa to date, and that some incomprehensible software choices are going to hinder its adoption as word spreads about painful setup and use :-/ I’m a bit confused.

Also, there’s a micro-SD slot on the board that supports UHS mode and reads at 88MB/s from a 64G SD I have here. If there’s one place where the Roobi gadget should be placed, it’s exactly on a micro-SD. Everyone will easily find a spare one, download the Roobi image, place it on the SD and install the OS from there, then eject the SD once finished. No need to sacrifice the eMMC for that single-use stuff. And I would strongly prefer to install the OS on an eMMC rather than on a micro-SD.

Nope, users fail to select the right OS image (please note: there’s not one Roobi OS install image, these are device-specific since the SBC vendor doesn’t ship with SPI NOR preflashed with something that would allow booting generic aarch64 images).

And (even most experienced) users especially with Radxa are not even able to find the images for a specific device.

As for the target audience: Radxa advertises this thing as ‘ARM PC’ and desktop users want their OS on a NVMe SSD since they have heard that MB/s are that great and that important (nobody in that world understands the difference between random and sequential I/O). The average desktop user in front of such an ‘ARM PC’ will never notice there’s eMMC inside since it only contains some ‘firmware’ allowing the OS image of choice to be flashed to an installed SSD in the M.2 slot.

For desktop I agree that you want to install the OS on M.2. For a server with no other PCIe slot you’ll want to use M.2 to connect PCIe devices (extra SATA slots or network cards).

And it doesn’t change the fact that Radxa could provide the roobi SD image for this board as they used to do in the past for their other OS images. Again, I’m fine with providing a user-friendly installer, but not at the cost of removing the only viable OS storage of the board. Right now I’m running off this distro which is going to be the main OS of this platform. This looks like a particularly weird and ridiculous situation where every storage device was shifted one place:

  • SPI NOR: empty, not used
  • eMMC: should contain the main OS, instead contains u-boot and this roobi gadget that takes all the room left (7G partition dedicated to this!!!)
  • M2: no longer usable for PCIe / SATA if you’re forced to move your OS to an NVME SSD due to this roobi you’ll never ever use again that steals your eMMC.
  • SD: not used during the install process
  • USB: not used during the install process

A correct setup would be:

  • SPI NOR: boot loader (u-boot, edk2, whatever etc)
  • eMMC: main OS. Optionally reserve a few MB for a recovery OS if the NOR is too tight.
  • SATA: data storage
  • M.2: either main data storage (SSD), complementary data storage (SATA), network, other?
  • SD/USB: usable during boot to plug either a standard installation image for the main OS (ubuntu, freebsd, maybe even windows, I don’t know), and usable as well for reinstallation by using the user-friendly roobi installer.

And what’s sad is that everything is present and properly wired on the product for this, it’s just totally mixed up at the software level!

I’ve conducted some comparative tests to measure the effect of the different DRAM generations. For this I’ve run llama.cpp on the Rock5B, the Rock5 ITX (under roobi), and ADLINK’s AADK based on an Ampere Altra Q80-26 (80 cores at 2.6 GHz). The Altra uses Neoverse-N1, which is essentially the same core as the A76. LLMs are interesting because they’re often limited by the memory bandwidth during generation. Since my Rock5B has 4GB RAM, I’ve used the Phi3-3B model quantized at Q6_K (3.1 GB) and a small context of 512 tokens. I’m only using the big cores for this test.

  • The Rock5B has its two big clusters running at 2256 and 2272 MHz respectively (hence 2264 avg). It uses LPDDR4x at 4224 MT/s. It parses at 5.58 tokens/s and produces 4.63 tokens/s.
  • The Rock5 ITX has its two big clusters at 2287 and 2223 MHz respectively (2255 avg), and LPDDR5 at 5472 MT/s. It parses at 5.71 tokens/s and produces 4.85 tokens/s, hence 4.7% faster generation for 0.4% lower CPU frequency.
  • The Rock5 ITX with only 3 threads instead of 4 drops to 3.80 t/s generation, above the theoretical 3.64 (4.85 × 3/4) we would get if we were CPU-bound, showing that the DRAM B/W is already the limiting factor when running with 4 threads.
  • The Altra limited to 4 threads has its cores running at 2600 MHz and 6 single-DIMM 64-bit DDR4 channels at 2933 MT/s. It parses at 7.19 tokens/s and produces 5.96 tokens/s. Hence it’s respectively 28.8 and 28.7% faster than the 5B for 14.8% higher CPU frequency and 4.17x higher memory bandwidth.
  • The highest generation speed the Altra reaches is around 40 threads at 22.80 tokens/s, or 3.8 times faster than with 4 threads, 4.92 times faster than Rock 5B or 4.7 times faster than Rock 5 ITX.
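The ratios quoted in the list above can be rechecked with a few lines of Python. This is only a sketch that replays the arithmetic on the rounded figures from the bullets; the variable and function names are made up for illustration, and small last-digit differences with the post are expected since the inputs are already rounded.

```python
# Figures copied from the bullets above: tokens/s and average big-core MHz.
ROCK5B = {"parse": 5.58, "gen": 4.63, "mhz": 2264}
ROCK5_ITX = {"parse": 5.71, "gen": 4.85, "mhz": 2255}
ALTRA_4T = {"parse": 7.19, "gen": 5.96, "mhz": 2600}
ITX_GEN_3_THREADS = 3.80  # Rock5 ITX generation speed with 3 threads
ALTRA_GEN_PEAK = 22.80    # Altra generation speed around 40 threads


def pct_diff(new, old):
    """Relative difference of `new` versus `old`, in percent."""
    return (new / old - 1.0) * 100.0


# LPDDR5 board versus LPDDR4x board.
itx_gen_gain = pct_diff(ROCK5_ITX["gen"], ROCK5B["gen"])
itx_freq_delta = pct_diff(ROCK5_ITX["mhz"], ROCK5B["mhz"])

# If generation were purely CPU-bound, 3 threads would deliver 3/4 of the
# 4-thread rate. Measuring more than that points at the DRAM as the limit.
cpu_bound_3_threads = ROCK5_ITX["gen"] * 3 / 4
dram_limited = ITX_GEN_3_THREADS > cpu_bound_3_threads

# Altra restricted to 4 threads versus Rock5B.
altra_gen_gain = pct_diff(ALTRA_4T["gen"], ROCK5B["gen"])
altra_freq_gain = pct_diff(ALTRA_4T["mhz"], ROCK5B["mhz"])

# Altra peak versus each 4-thread result.
peak_ratios = {
    "Altra 4 threads": ALTRA_GEN_PEAK / ALTRA_4T["gen"],
    "Rock5B": ALTRA_GEN_PEAK / ROCK5B["gen"],
    "Rock5 ITX": ALTRA_GEN_PEAK / ROCK5_ITX["gen"],
}

print(f"ITX vs 5B: generation {itx_gen_gain:+.2f}%, CPU clock {itx_freq_delta:+.2f}%")
print(f"3 threads: CPU-bound estimate {cpu_bound_3_threads:.2f} t/s, "
      f"measured {ITX_GEN_3_THREADS:.2f} t/s, DRAM-limited: {dram_limited}")
print(f"Altra 4t vs 5B: generation {altra_gen_gain:+.2f}%, CPU clock {altra_freq_gain:+.2f}%")
for name, ratio in peak_ratios.items():
    print(f"Altra peak is {ratio:.2f}x {name}")
```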

This shows that 6*2933 MT/s has a hard upper bound of 22.8 tok/s. Scaling by the transfer rate, this fixes an upper bound of 5.47 t/s for the Rock5B’s 4224 MT/s RAM and 7.09 t/s for the Rock5 ITX’s 5472 MT/s. Of course the CPUs also play a limiting factor here, but we’ve shown above that DRAM counts for the 4-thread test. If we perform a quick ratio calculation, 4.85/22.8*6*2933 shows that the Rock5-ITX delivers as if it was running DDR4 RAM at 3743 MT/s or 1872 MHz, and the same calculation with 4.63 puts the Rock5B at about 3574 MT/s, i.e. roughly 1790 MHz or just under DDR4-3600 (but again CPU does count here).
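The bandwidth scaling used in this paragraph can be written down as a short sketch. It follows the post's own simplification, which compares raw MT/s figures directly (treating each board's memory as one interface of the same width as an Altra channel) and takes the Altra's 22.8 t/s peak as the ceiling for 6*2933 MT/s. The function names are invented for the example.

```python
ALTRA_MTS = 6 * 2933    # six DDR4 channels at 2933 MT/s
ALTRA_PEAK_TPS = 22.80  # best generation speed measured on the Altra


def bandwidth_bound(mts):
    """Upper bound on tokens/s for a memory running at `mts` MT/s,
    scaled linearly from the Altra peak (assumption taken from the post)."""
    return ALTRA_PEAK_TPS * mts / ALTRA_MTS


def equivalent_ddr4_mts(tokens_per_s):
    """DDR4 transfer rate that would give `tokens_per_s` under the same scaling."""
    return tokens_per_s / ALTRA_PEAK_TPS * ALTRA_MTS


bound_rock5b = bandwidth_bound(4224)  # LPDDR4x on the Rock5B
bound_itx = bandwidth_bound(5472)     # LPDDR5 on the Rock5 ITX
equiv_rock5b = equivalent_ddr4_mts(4.63)
equiv_itx = equivalent_ddr4_mts(4.85)

print(f"Rock5B:    bound {bound_rock5b:.2f} t/s, behaves like DDR4-{equiv_rock5b:.0f}")
print(f"Rock5 ITX: bound {bound_itx:.2f} t/s, behaves like DDR4-{equiv_itx:.0f}")
```

Since the scaling is purely linear, it says nothing about latency or about how much of the remaining gap is due to the CPU, which is why the post treats these numbers as bounds rather than predictions.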

I would genuinely have expected a slightly higher gain between 5B and ITX (maybe 10-15%), but I’ve read above that there’s still this pending question about why LPDDR5 is not that much faster. With that said, it still is slightly faster (4.7% for a 0.4% slower CPU) but not by much.

Regardless that remains very good performance and it should be sufficient for most use cases. But anything we find to make LPDDR5 perform significantly better than LPDDR4X would be welcome I guess.

For those interested in reproducing these tests, I’ve used tag b2918 of llama.cpp, with Phi-3-mini-128k-instruct.Q6_K.gguf. The command line and (trimmed) output are:

willy@roobi:~/llama.cpp$ time taskset -c 4-7 ./main -c 512 -s 1 --temp 0.1 -n -1 --threads 4  -m ../models/Phi-3-mini-128k-instruct.Q6_K.gguf -e -p "<|im_start|>system\nYou're a super-smart AI assistant that never writes hallucinations, and you respond to the user's questions accurately.<|im_end|>\n<|im_start|>user\nPlease explain to me what could be the benefits of running an LLM on a low-power processor like a Cortex A76 or a Neoverse-N1.<|im_end|><|im_start|>Assistant\n"
(...)
<s> <|im_start|>system
You're a super-smart AI assistant that never writes hallucinations, and you respond to the user's questions accurately.<|im_end|>
<|im_start|>user
Please explain to me what could be the benefits of running an LLM on a low-power processor like a Cortex A76 or a Neoverse-N1.<|im_end|><|im_start|>Assistant
Running a Large Language Model (LLM) on a low-power processor like the Cortex A76 or Neoverse-N1 could have several potential benefits.

1. **Energy Efficiency**: Low-power processors are designed to consume less power, which can lead to significant energy savings. This is particularly beneficial in large-scale deployments where energy consumption can be a major concern.

2. **Cost Savings**: Lower power consumption translates to lower energy costs. This can result in significant cost savings, especially in large-scale deployments.

3. **Environmental Impact**: Lower energy consumption also means a reduced environmental impact. This is particularly important in the context of climate change and the global effort to reduce carbon emissions.

4. **Heat Generation**: Lower power processors generate less heat. This can reduce the need for cooling systems, which can further reduce energy consumption and costs.

5. **Performance**: While it's important to note that low-power processors may not offer the same level of performance as high-power processors, they can still provide adequate performance for many applications.

In conclusion, running an LLM on a low-power processor can offer benefits in terms of energy efficiency, cost savings, environmental impact, heat generation, and adequate performance. However, the specific benefits would depend on the specific requirements and constraints of the application.<|endoftext|> [end of text]
(...)
system_info: n_threads = 4 / 8 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | 

llama_print_timings:        load time =     842.03 ms
llama_print_timings:      sample time =      17.60 ms /   342 runs   (    0.05 ms per token, 19431.82 tokens per second)
llama_print_timings: prompt eval time =   18920.53 ms /   108 tokens (  175.19 ms per token,     5.71 tokens per second)
llama_print_timings:        eval time =   70334.79 ms /   341 runs   (  206.26 ms per token,     4.85 tokens per second)
llama_print_timings:       total time =   89350.71 ms /   449 tokens

And for those wondering how x86 would behave here, the N5105 in my Odroid-H3 with its 4 cores at 2.8 GHz has two DDR4-3200 DIMMs. It delivers only 1.85 tokens/s on this test, or only 38% of the Rock5-ITX! In this case it’s likely that the lack of AVX counts. But given that newer CPUs such as N100 have cut their DRAM bandwidth in half to segment their market, they won’t do much better than the Rock5 here anyway. This, I think, makes RK3588 closer to recent x86 chips which have been purposely castrated, and it means that Rock5-ITX very likely has some chances to be compared to other PC motherboards for various setups.

Here are the raw numbers from this test:

system_info: n_threads = 4 / 4 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | 
(...)
llama_print_timings:        load time =    1400.98 ms
llama_print_timings:      sample time =      15.33 ms /   309 runs   (    0.05 ms per token, 20161.82 tokens per second)
llama_print_timings: prompt eval time =   49784.80 ms /   108 tokens (  460.97 ms per token,     2.17 tokens per second)
llama_print_timings:        eval time =  166469.62 ms /   308 runs   (  540.49 ms per token,     1.85 tokens per second)
llama_print_timings:       total time =  216500.45 ms /   416 tokens
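For anyone collecting several runs, the tokens-per-second figures can be recomputed from the raw `llama_print_timings` lines instead of being copied by hand. The sketch below assumes the line layout shown in the outputs above (it may differ in other llama.cpp versions); the regular expression and function name are illustrative, not part of llama.cpp.

```python
import re

# Matches e.g. "llama_print_timings: prompt eval time = 18920.53 ms / 108 tokens"
TIMING_RE = re.compile(
    r"llama_print_timings:\s+(?P<name>[a-z ]+?) time =\s+(?P<ms>[\d.]+) ms"
    r"(?:\s*/\s*(?P<count>\d+)\s+(?:runs|tokens))?"
)


def tokens_per_second(log_text):
    """Return {'prompt eval': rate, 'eval': rate} computed from a timing dump."""
    rates = {}
    for match in TIMING_RE.finditer(log_text):
        name = match.group("name")
        # Only the prompt parsing and generation phases are of interest here.
        if name not in ("prompt eval", "eval") or match.group("count") is None:
            continue
        seconds = float(match.group("ms")) / 1000.0
        rates[name] = int(match.group("count")) / seconds
    return rates


# Timing lines copied from the two outputs above.
ROCK5_ITX_LOG = """
llama_print_timings: prompt eval time =   18920.53 ms /   108 tokens
llama_print_timings:        eval time =   70334.79 ms /   341 runs
"""
N5105_LOG = """
llama_print_timings: prompt eval time =   49784.80 ms /   108 tokens
llama_print_timings:        eval time =  166469.62 ms /   308 runs
"""

itx = tokens_per_second(ROCK5_ITX_LOG)
n5105 = tokens_per_second(N5105_LOG)
print(f"Rock5 ITX: parse {itx['prompt eval']:.2f} t/s, generate {itx['eval']:.2f} t/s")
print(f"N5105:     parse {n5105['prompt eval']:.2f} t/s, generate {n5105['eval']:.2f} t/s")
print(f"N5105 generation is {n5105['eval'] / itx['eval']:.0%} of the Rock5 ITX")
```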