Probable DMC tunings with 5.10 BSP kernel on RK3399

Follow-up on https://github.com/geerlingguy/sbc-reviews/issues/2#issuecomment-1429014314

@RadxaYuntian I tried to log into https://github.com/radxa-build/rock-pi-4a/releases/download/20230207-0238/rock-pi-4a_debian_bullseye_kde_2023-02-07T0329+0000_gpt.img.xz running on my early Rock 4 pre-production unit but to no avail:

tk@mac-tk ~ % ssh rock@rock-pi-4a
ssh: connect to host rock-pi-4a port 22: Connection refused

tk@mac-tk ~ % nmap -sT rock-pi-4a
Starting Nmap 7.93 ( https://nmap.org ) at 2023-02-14 11:38 CET
Nmap scan report for rock-pi-4a (192.168.83.24)
Host is up (0.0057s latency).
rDNS record for 192.168.83.24: rock-pi-4a.fritz.box
Not shown: 998 closed tcp ports (conn-refused)
PORT    STATE SERVICE
139/tcp open  netbios-ssn
445/tcp open  microsoft-ds

Nmap done: 1 IP address (1 host up) scanned in 0.99 seconds

SMB ports open but SSH not? Strange…

Is there an image that can be easily tried out (and that of course means SSH enabled) or could you please run sbc-bench -j on such an image to get an idea whether there are tunables or not?

TIA!

I’ll make sure this is disabled as well.

Once you have flashed the image to an SD card you should have a FAT32 config partition available. You can edit before.txt (it runs before config.txt and is used as the first-boot config script) and comment out the line that disables the SSH service.

That did the trick (though not with macOS [1]) but ssh rock@rock-pi-4a still doesn’t work. Several minutes later, after visiting https://github.com/radxa-repo/rbuild, I now know the new default logon credentials.

IMO it would be wise to announce this right where the images are available for download. Imagine each and every user of your images wasting minutes to research something as silly as default logon credentials…

Now I’m in and I already found something I need to adjust in sbc-bench (since wrong SoC guessed and not detecting vendor/BSP kernel):

root@rock-pi-4a:/home/radxa# sbc-bench.sh -m
NXP i.MX8QM, Kernel: aarch64, Userland: arm64

CPU sysfs topology (clusters, cpufreq members, clockspeeds)
                 cpufreq   min    max
 CPU    cluster  policy   speed  speed   core type
  0        0        0      408    1416   Cortex-A53 / r0p4
  1        0        0      408    1416   Cortex-A53 / r0p4
  2        0        0      408    1416   Cortex-A53 / r0p4
  3        0        0      408    1416   Cortex-A53 / r0p4
  4        1        4      408    1800   Cortex-A72 / r0p2
  5        1        4      408    1800   Cortex-A72 / r0p2

Thermal source: /sys/devices/virtual/thermal/thermal_zone0/ (cpu-thermal)

Time       big.LITTLE   load %cpu %sys %usr %nice %io %irq   Temp
11:25:12:  408/ 816MHz  0.46  13%   1%   9%   0%   2%   0%  42.2°C
^C

root@rock-pi-4a:/home/radxa# sbc-bench.sh -k
Linux rock-pi-4a 5.10.110-1-rockchip #92c0648fa SMP Mon Feb 6 03:23:17 UTC 2023 aarch64 GNU/Linux

Kernel 5.10.110 is not latest 5.10.166 LTS that was released on 2023-02-01.

Please check https://endoflife.date/linux for details. It is somewhat likely
that a lot of exploitable vulnerabilities exist for this kernel as well as
many unfixed bugs. Better upgrade to a supported version ASAP.

I’ll report back later about probable findings…

[1] There’s something with the partition macOS doesn’t like:

tk@mac-tk ~ % diskutil list disk5
/dev/disk5 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *31.9 GB    disk5
   1:       Microsoft Basic Data ⁨config⁩                  16.8 MB    disk5s1
   2:                        EFI ⁨⁩                        5.5 GB     disk5s2

tk@mac-tk ~ % diskutil info disk5s1
   Device Identifier:         disk5s1
   Device Node:               /dev/disk5s1
   Whole:                     No
   Part of Whole:             disk5

   Volume Name:               config
   Mounted:                   No

   Partition Type:            Microsoft Basic Data
   File System Personality:   MS-DOS FAT16
   Type (Bundle):             msdos
   Name (User Visible):       MS-DOS (FAT16)

   OS Can Be Installed:       No
   Media Type:                Generic
   Protocol:                  USB
   SMART Status:              Not Supported
   Volume UUID:               4DB6C674-2B7B-3813-94A9-3CC1617546B7
   Disk / Partition UUID:     5BDC09A8-D4B7-4BDD-B378-472AC29C93AD
   Partition Offset:          16777216 Bytes (32768 512-Byte-Device-Blocks)

   Disk Size:                 16.8 MB (16777216 Bytes) (exactly 32768 512-Byte-Units)
   Device Block Size:         512 Bytes

   Volume Total Space:        0 B (0 Bytes) (exactly 0 512-Byte-Units)
   Volume Free Space:         0 B (0 Bytes) (exactly 0 512-Byte-Units)

   Media OS Use Only:         No
   Media Read-Only:           No
   Volume Read-Only:          Not applicable (not mounted)

   Device Location:           External
   Removable Media:           Removable
   Media Removal:             Software-Activated

   Solid State:               Info not available


tk@mac-tk ~ % diskutil mount disk5s1
Volume on disk5s1 failed to mount
If you think the volume is supported but damaged, try the "readOnly" option

@RadxaYuntian: the strings binary was missing since your image doesn’t ship with the binutils package. I guess this will break a lot of scripts, so I suggest adding it to your list of default packages…

Here we go: sbc-bench -j collected basic info: https://github.com/ThomasKaiser/sbc-bench/blob/master/results/reviews/Rock-4A-5.10-BSP.md

Throttling happened so I need to add a fan to the setup, now walking through all four available DRAM clockspeeds (328/416/666/856 MHz) to get an idea how this affects performance (using Geekbench as well as the usual set of benchmarks sbc-bench fires up):

root@rock-pi-4a:/home/radxa# for RAMClock in $(< /sys/devices/platform/dmc/devfreq/dmc/available_frequencies) ; do
> echo ${RAMClock} >/sys/devices/platform/dmc/devfreq/dmc/max_freq
> Netio=192.168.83.72/2 MODE=unattended sbc-bench.sh -G
> Netio=192.168.83.72/2 MODE=unattended sbc-bench.sh
> done

Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 12:13:57 up 2 min,  1 user,  load average: 1.31, 1.04, 0.43,  cpu: 12%
Too busy for benchmarking: 12:14:03 up 2 min,  1 user,  load average: 1.21, 1.02, 0.43,  cpu: 6%
Too busy for benchmarking: 12:14:08 up 2 min,  1 user,  load average: 1.11, 1.00, 0.42,  cpu: 7%
Too busy for benchmarking: 12:14:13 up 2 min,  1 user,  load average: 1.02, 0.99, 0.42,  cpu: 7%
Too busy for benchmarking: 12:14:18 up 2 min,  1 user,  load average: 0.94, 0.97, 0.42,  cpu: 7%
Too busy for benchmarking: 12:14:23 up 2 min,  1 user,  load average: 0.86, 0.95, 0.42,  cpu: 6%

I rebooted in between and now there’s so much background activity that sbc-bench refuses to start benchmarking. Maybe it resolves automatically after some time, maybe I need to investigate how to stop background tasks…
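
The waiting behaviour shown above can be approximated with a small helper. This is only a minimal sketch and not sbc-bench’s actual logic: the function names load_is_low and wait_for_idle, the 0.2 threshold and the 5 second polling interval are assumptions picked for illustration.

```sh
# Hypothetical helpers, not part of sbc-bench.

# Succeeds if the 1-minute load average is below the given maximum.
# $1: maximum acceptable load, $2: loadavg file (defaults to /proc/loadavg)
load_is_low() {
    awk -v max="$1" '{ exit !($1 < max) }' "${2:-/proc/loadavg}"
}

# Block until the system has calmed down (default threshold is an assumption).
wait_for_idle() {
    until load_is_low "${1:-0.2}"; do
        sleep 5
    done
}
```

Calling wait_for_idle before starting a benchmark run would then simply stall until the 1-minute load drops below the chosen value.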

One step further: benchmarks at the four different DRAM clockspeeds finished (somewhat similar to how I did it with RK3588). I had to start from scratch again since with full desktop there always was too much background activity. One systemctl set-default multi-user.target ; reboot later and after setting echo performance >/sys/devices/platform/dmc/devfreq/dmc/governor to ensure we’re staying always at the same DRAM clockspeed it looks like this:

DRAM      7-zip single   7-zip multi   memcpy (MB/s)   memset (MB/s)   4M latency (ns)   64M latency (ns)   Geekbench (single/multi)
328 MHz   1510           5240          1490            3762            205.7/289.8       280.0/361.1        257/615
416 MHz   1566           5530          1924            4610            188.7/258.2       255.0/324.0        265/670
666 MHz   1681           6190          2931            7075            148.1/199.4       205.6/255.0        286/769
856 MHz   1715           6360          3456            8357            139.7/187.3       194.7/237.9        292/800

Attention: with DRAM clockspeed at 666 MHz and 856 MHz the Geekbench tests triggered some slight throttling, but nothing significant enough to invalidate the results. Full test output with all links in [1].

The link in the leftmost column is the regular sbc-bench result while the link on the right is Geekbench 5.5.1, always in comparison mode against the lowest DRAM clock. I personally hate Geekbench combined scores (since soooo useless) and Geekbench in general, but consumers and also reviewers love them, so ‘being on the wrong DRAM frequency while benchmarking’ can result in a drop in Geekbench combined scores of almost 15% for the single-core tests and ~30% for multi-core.
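
The two percentages can be reproduced from the Geekbench column of the table (257/615 at 328 MHz, 292/800 at 856 MHz). The sketch below assumes they are meant as the gain relative to the 328 MHz scores; pct_gain is a hypothetical helper name.

```sh
# Hypothetical helper: relative gain of $2 over $1 in percent, one decimal.
pct_gain() {
    awk -v low="$1" -v high="$2" \
        'BEGIN { printf "%.1f\n", (high - low) / low * 100 }'
}

pct_gain 257 292   # Geekbench single-core, 328 MHz vs. 856 MHz -> 13.6
pct_gain 615 800   # Geekbench multi-core,  328 MHz vs. 856 MHz -> 30.1
```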

When we compare 328 MHz vs. 856 MHz https://browser.geekbench.com/v5/cpu/compare/20608717?baseline=20603273 and look through the individual results we have to come to the conclusion that these numbers are garbage anyway. The AES-XTS single score differs by 5% while multi differs by 117% when we compare lowest with highest DRAM clockspeed. Which of course makes no sense whatsoever. Since Geekbench is closed source we’ll never know what it actually does but these numbers tell us the golden rule of passive kitchen-sink benchmarking applies here too:

You benchmark A, but actually measure B, and conclude you’ve measured C.

Another very interesting observation is that the memset scores are higher on the A53 than on the A72 in all four runs. That also doesn’t make much sense, but @willy observed this in the past too and IIRC came to the conclusion that the RK3399 memory controller needs at least one A53 busy at the same time for the A72 to run at full performance.
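
To put a number on this, the ratio can be pulled straight out of the sbc-bench output in [1]. A minimal sketch, assuming the "memset: : N MB/s (core)" line format seen there; memset_ratio is a hypothetical helper and not part of sbc-bench.

```sh
# Hypothetical helper: reads sbc-bench style memory performance lines on
# stdin and prints the Cortex-A53 / Cortex-A72 memset ratio.
memset_ratio() {
    awk '
        $1 == "memset:" && /Cortex-A53/ { a53 = $3 }
        $1 == "memset:" && /Cortex-A72/ { a72 = $3 }
        END { if (a72 > 0) printf "%.2f\n", a53 / a72 }'
}

# Values from the 328 MHz run
memset_ratio <<'EOF'
memcpy: : 983.6 MB/s (Cortex-A53)
memset: : 4270.0 MB/s (Cortex-A53)
memcpy: : 1489.5 MB/s (Cortex-A72)
memset: : 3761.5 MB/s (Cortex-A72)
EOF
```

For the 328 MHz run this prints 1.14, so the A53 lead is in the range of percent, not multiples.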

Now measuring idle consumption with my NetIO powermeter (at the wall, which means all losses from the cable and the RPi USB-C power brick are included): Netio=192.168.83.72/2 sbc-bench.sh -g

  • at 328 MHz we have Idle temperature: 23.3°C, idle consumption: 1910mW
  • at 856 MHz we’re at Idle temperature: 26.2°C, idle consumption: 2400mW

That’s a 0.5W idle consumption difference and as such significant.

Next step is fiddling around with upthreshold now that we are convinced the energy savings are worth the effort…

[1] Full output of testings:

root@rock-pi-4a:~# for RAMClock in $(< /sys/devices/platform/dmc/devfreq/dmc/available_frequencies) ; do echo ${RAMClock} >/sys/devices/platform/dmc/devfreq/dmc/max_freq; Netio=192.168.83.72/2 MODE=unattended sbc-bench.sh -G; Netio=192.168.83.72/2 MODE=unattended sbc-bench.sh; done

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 328 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19 taking care of Geekbench

Installing needed tools: Done.
Checking cpufreq OPP. Done.
Executing RAM latency tester. Done.
Executing Geekbench. Done.
Checking cpufreq OPP again. Done (44 minutes elapsed).

ATTENTION: Throttling might have occured on CPUs 4-5 (Cortex-A72). Check the log for details.

Scores not valid. Throttling occured and/or too much background activity.

Full results uploaded to http://ix.io/4o28. Please check the log for anomalies (e.g. swapping
or throttling happenend).



Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 14:35:30 up 55 min,  1 user,  load average: 3.17, 3.82, 3.18,  cpu: 63%
Too busy for benchmarking: 14:35:35 up 55 min,  1 user,  load average: 2.92, 3.76, 3.16,  cpu: 0%
Too busy for benchmarking: 14:35:40 up 55 min,  1 user,  load average: 2.68, 3.70, 3.14,  cpu: 0%
Too busy for benchmarking: 14:35:45 up 55 min,  1 user,  load average: 2.47, 3.64, 3.12,  cpu: 0%
Too busy for benchmarking: 14:35:50 up 55 min,  1 user,  load average: 2.27, 3.58, 3.11,  cpu: 0%
Too busy for benchmarking: 14:35:55 up 55 min,  1 user,  load average: 2.09, 3.52, 3.09,  cpu: 0%
Too busy for benchmarking: 14:36:00 up 55 min,  1 user,  load average: 1.92, 3.46, 3.07,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 328 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19

Installing needed tools: Done.
Checking cpufreq OPP. Done (results will be available in 13-19 minutes).
Executing tinymembench. Done.
Executing RAM latency tester. Done.
Executing OpenSSL benchmark. Done.
Executing 7-zip benchmark. Done.
Checking cpufreq OPP again. Done (19 minutes elapsed).

Memory performance (all 2 CPU clusters measured individually):
memcpy: : 983.6 MB/s (Cortex-A53)
memset: : 4270.0 MB/s (Cortex-A53)
memcpy: : 1489.5 MB/s (Cortex-A72)
memset: : 3761.5 MB/s (Cortex-A72)

7-zip total scores (3 consecutive runs): 5238,5234,5243, single-threaded: 1510

OpenSSL results (all 2 CPU clusters measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      91606.16k   292053.16k   624336.81k   910605.99k  1050296.32k  1059334.83k (Cortex-A53)
aes-128-cbc     322716.01k   759075.86k  1134594.99k  1274997.42k  1346920.45k  1352073.22k (Cortex-A72)
aes-192-cbc      87847.33k   264550.10k   521214.55k   709013.50k   791745.88k   797529.43k (Cortex-A53)
aes-192-cbc     302335.61k   695754.67k   961834.92k  1128278.02k  1184470.36k  1189691.39k (Cortex-A72)
aes-256-cbc      85831.29k   246778.69k   457563.73k   596390.91k   654319.62k   654366.04k (Cortex-A53)
aes-256-cbc     293114.35k   638501.33k   890982.91k   974229.85k  1019292.33k  1018833.58k (Cortex-A72)

Full results uploaded to http://ix.io/4o2g


Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 14:54:40 up  1:14,  1 user,  load average: 5.34, 5.20, 3.71,  cpu: 55%
Too busy for benchmarking: 14:54:45 up  1:14,  1 user,  load average: 4.91, 5.11, 3.69,  cpu: 0%
Too busy for benchmarking: 14:54:50 up  1:14,  1 user,  load average: 4.52, 5.03, 3.67,  cpu: 0%
Too busy for benchmarking: 14:54:55 up  1:14,  1 user,  load average: 4.16, 4.94, 3.65,  cpu: 0%
Too busy for benchmarking: 14:55:00 up  1:14,  1 user,  load average: 3.82, 4.86, 3.63,  cpu: 0%
Too busy for benchmarking: 14:55:05 up  1:15,  1 user,  load average: 3.52, 4.78, 3.61,  cpu: 0%
Too busy for benchmarking: 14:55:10 up  1:15,  1 user,  load average: 3.24, 4.70, 3.59,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 416 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19 taking care of Geekbench

Installing needed tools: Done.
Checking cpufreq OPP. Done.
Executing RAM latency tester. Done.
Executing Geekbench. Done.
Checking cpufreq OPP again. Done (41 minutes elapsed).

ATTENTION: Throttling might have occured on CPUs 4-5 (Cortex-A72). Check the log for details.

Scores not valid. Throttling occured and/or too much background activity.

Full results uploaded to http://ix.io/4o2M. Please check the log for anomalies (e.g. swapping
or throttling happenend).



Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 15:36:07 up  1:56,  1 user,  load average: 3.40, 3.69, 3.26,  cpu: 81%
Too busy for benchmarking: 15:36:12 up  1:56,  1 user,  load average: 3.13, 3.63, 3.24,  cpu: 0%
Too busy for benchmarking: 15:36:17 up  1:56,  1 user,  load average: 2.72, 3.53, 3.21,  cpu: 0%
Too busy for benchmarking: 15:36:22 up  1:56,  1 user,  load average: 2.50, 3.47, 3.19,  cpu: 0%
Too busy for benchmarking: 15:36:27 up  1:56,  1 user,  load average: 2.30, 3.41, 3.17,  cpu: 0%
Too busy for benchmarking: 15:36:32 up  1:56,  1 user,  load average: 2.12, 3.36, 3.16,  cpu: 0%
Too busy for benchmarking: 15:36:37 up  1:56,  1 user,  load average: 1.95, 3.30, 3.14,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 416 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19

Installing needed tools: Done.
Checking cpufreq OPP. Done (results will be available in 13-19 minutes).
Executing tinymembench. Done.
Executing RAM latency tester. Done.
Executing OpenSSL benchmark. Done.
Executing 7-zip benchmark. Done.
Checking cpufreq OPP again. Done (18 minutes elapsed).

Memory performance (all 2 CPU clusters measured individually):
memcpy: : 1142.2 MB/s (Cortex-A53)
memset: : 5225.6 MB/s (Cortex-A53)
memcpy: : 1923.8 MB/s (Cortex-A72)
memset: : 4609.1 MB/s (Cortex-A72)

7-zip total scores (3 consecutive runs): 5541,5527,5530, single-threaded: 1566

OpenSSL results (all 2 CPU clusters measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      91691.65k   292048.45k   624245.76k   910673.92k  1050741.42k  1060596.39k (Cortex-A53)
aes-128-cbc     322618.94k   759185.47k  1134807.30k  1275786.58k  1343362.39k  1348375.89k (Cortex-A72)
aes-192-cbc      87838.07k   264344.00k   521245.78k   709337.77k   791860.57k   798294.02k (Cortex-A53)
aes-192-cbc     302435.41k   696013.38k   968719.87k  1130672.47k  1180095.83k  1188440.75k (Cortex-A72)
aes-256-cbc      85825.19k   246334.10k   457761.02k   596491.61k   654494.38k   658724.18k (Cortex-A53)
aes-256-cbc     293093.83k   638400.28k   890970.62k   977024.34k  1019947.69k  1018473.13k (Cortex-A72)

Full results uploaded to http://ix.io/4o2U


Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 15:54:31 up  2:14,  1 user,  load average: 5.03, 4.86, 3.52,  cpu: 70%
Too busy for benchmarking: 15:54:36 up  2:14,  1 user,  load average: 4.62, 4.78, 3.50,  cpu: 0%
Too busy for benchmarking: 15:54:41 up  2:14,  1 user,  load average: 4.25, 4.70, 3.48,  cpu: 0%
Too busy for benchmarking: 15:54:46 up  2:14,  1 user,  load average: 3.91, 4.62, 3.47,  cpu: 0%
Too busy for benchmarking: 15:54:51 up  2:14,  1 user,  load average: 3.60, 4.54, 3.45,  cpu: 0%
Too busy for benchmarking: 15:54:56 up  2:14,  1 user,  load average: 3.31, 4.47, 3.43,  cpu: 0%
Too busy for benchmarking: 15:55:01 up  2:14,  1 user,  load average: 3.04, 4.39, 3.41,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 666 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19 taking care of Geekbench

Installing needed tools: Done.
Checking cpufreq OPP. Done.
Executing RAM latency tester. Done.
Executing Geekbench. Done.
Checking cpufreq OPP again. Done (37 minutes elapsed).

ATTENTION: Throttling might have occured on CPUs 4-5 (Cortex-A72). Check the log for details.

Scores not valid. Throttling occured and/or too much background activity.

Full results uploaded to http://ix.io/4o3g. Please check the log for anomalies (e.g. swapping
or throttling happenend).



Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 16:31:55 up  2:51,  1 user,  load average: 3.22, 3.49, 3.05,  cpu: 87%
Too busy for benchmarking: 16:32:00 up  2:51,  1 user,  load average: 2.97, 3.44, 3.03,  cpu: 0%
Too busy for benchmarking: 16:32:05 up  2:52,  1 user,  load average: 2.73, 3.38, 3.02,  cpu: 0%
Too busy for benchmarking: 16:32:10 up  2:52,  1 user,  load average: 2.51, 3.32, 3.00,  cpu: 0%
Too busy for benchmarking: 16:32:15 up  2:52,  1 user,  load average: 2.31, 3.27, 2.98,  cpu: 0%
Too busy for benchmarking: 16:32:20 up  2:52,  1 user,  load average: 2.12, 3.21, 2.97,  cpu: 0%
Too busy for benchmarking: 16:32:25 up  2:52,  1 user,  load average: 1.95, 3.16, 2.95,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 666 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19

Installing needed tools: Done.
Checking cpufreq OPP. Done (results will be available in 13-19 minutes).
Executing tinymembench. Done.
Executing RAM latency tester. Done.
Executing OpenSSL benchmark. Done.
Executing 7-zip benchmark. Done.
Checking cpufreq OPP again. Done (17 minutes elapsed).

Memory performance (all 2 CPU clusters measured individually):
memcpy: : 1564.8 MB/s (Cortex-A53)
memset: : 7424.0 MB/s (Cortex-A53)
memcpy: : 2931.3 MB/s (Cortex-A72)
memset: : 7075.4 MB/s (Cortex-A72)

7-zip total scores (3 consecutive runs): 6173,6174,6213, single-threaded: 1681

OpenSSL results (all 2 CPU clusters measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      91780.68k   292112.66k   624614.40k   910882.13k  1050907.99k  1061792.43k (Cortex-A53)
aes-128-cbc     277549.83k   707053.50k  1102535.17k  1258137.94k  1344798.72k  1347081.56k (Cortex-A72)
aes-192-cbc      87396.49k   263418.86k   520020.39k   708818.60k   791751.34k   798031.87k (Cortex-A53)
aes-192-cbc     302496.07k   696039.13k   967566.25k  1130389.50k  1183959.72k  1185824.77k (Cortex-A72)
aes-256-cbc      85863.56k   247107.80k   457826.22k   596919.30k   654800.21k   657812.14k (Cortex-A53)
aes-256-cbc     292824.94k   638654.87k   891191.98k   977772.89k  1019942.23k  1016758.27k (Cortex-A72)

Full results uploaded to http://ix.io/4o3v


Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 16:48:48 up  3:08,  1 user,  load average: 4.65, 4.34, 3.19,  cpu: 78%
Too busy for benchmarking: 16:48:53 up  3:08,  1 user,  load average: 4.28, 4.27, 3.17,  cpu: 0%
Too busy for benchmarking: 16:48:58 up  3:08,  1 user,  load average: 3.94, 4.20, 3.15,  cpu: 0%
Too busy for benchmarking: 16:49:03 up  3:09,  1 user,  load average: 3.62, 4.13, 3.13,  cpu: 0%
Too busy for benchmarking: 16:49:08 up  3:09,  1 user,  load average: 3.33, 4.06, 3.12,  cpu: 0%
Too busy for benchmarking: 16:49:13 up  3:09,  1 user,  load average: 3.06, 3.99, 3.10,  cpu: 0%
Too busy for benchmarking: 16:49:19 up  3:09,  1 user,  load average: 2.82, 3.92, 3.08,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 856 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19 taking care of Geekbench

Installing needed tools: Done.
Checking cpufreq OPP. Done.
Executing RAM latency tester. Done.
Executing Geekbench. Done.
Checking cpufreq OPP again. Done (36 minutes elapsed).

ATTENTION: Throttling might have occured on CPUs 4-5 (Cortex-A72). Check the log for details.

Scores not valid. Throttling occured and/or too much background activity.

Full results uploaded to http://ix.io/4o3W. Please check the log for anomalies (e.g. swapping
or throttling happenend).



Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 17:25:00 up  3:44,  1 user,  load average: 2.95, 3.31, 2.97,  cpu: 90%
Too busy for benchmarking: 17:25:05 up  3:45,  1 user,  load average: 2.71, 3.26, 2.95,  cpu: 0%
Too busy for benchmarking: 17:25:11 up  3:45,  1 user,  load average: 2.66, 3.23, 2.95,  cpu: 0%
Too busy for benchmarking: 17:25:16 up  3:45,  1 user,  load average: 2.44, 3.18, 2.93,  cpu: 0%
Too busy for benchmarking: 17:25:21 up  3:45,  1 user,  load average: 2.25, 3.13, 2.92,  cpu: 0%
Too busy for benchmarking: 17:25:26 up  3:45,  1 user,  load average: 2.07, 3.08, 2.90,  cpu: 0%
Too busy for benchmarking: 17:25:31 up  3:45,  1 user,  load average: 1.90, 3.02, 2.88,  cpu: 0%

Status of performance related governors found below /sys (w/o cpufreq):
dmc: performance / 856 MHz (performance dmc_ondemand simple_ondemand)
ff9a0000.gpu: simple_ondemand / 200 MHz (performance dmc_ondemand simple_ondemand)

sbc-bench v0.9.19

Installing needed tools: Done.
Checking cpufreq OPP. Done (results will be available in 13-19 minutes).
Executing tinymembench. Done.
Executing RAM latency tester. Done.
Executing OpenSSL benchmark. Done.
Executing 7-zip benchmark. Done.
Checking cpufreq OPP again. Done (16 minutes elapsed).

Memory performance (all 2 CPU clusters measured individually):
memcpy: : 1773.0 MB/s (Cortex-A53)
memset: : 8405.7 MB/s (Cortex-A53)
memcpy: : 3455.5 MB/s (Cortex-A72)
memset: : 8356.9 MB/s (Cortex-A72)

7-zip total scores (3 consecutive runs): 6382,6334,6351, single-threaded: 1715

OpenSSL results (all 2 CPU clusters measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      91782.10k   292124.25k   624542.55k   910899.20k  1050962.60k  1061797.89k (Cortex-A53)
aes-128-cbc     277560.58k   707125.76k  1102400.60k  1263162.03k  1346205.01k  1345798.14k (Cortex-A72)
aes-192-cbc      87856.03k   264469.21k   521433.09k   709458.26k   792488.62k   798823.77k (Cortex-A53)
aes-192-cbc     302425.07k   683676.31k   961968.90k  1128577.37k  1188350.63k  1190193.83k (Cortex-A72)
aes-256-cbc      85877.07k   247073.58k   457931.61k   597041.83k   654731.95k   658314.58k (Cortex-A53)
aes-256-cbc     293192.91k   638598.36k   891142.83k   977415.85k  1020065.11k  1020477.44k (Cortex-A72)

Full results uploaded to http://ix.io/4o44

Now to the benefits of an adjusted upthreshold parameter: none

root@rock-pi-4a:/home/radxa# for i in 20 25 30 35 40 ; do echo -e "$i: \c"; echo $i >/sys/devices/platform/dmc/devfreq/dmc/upthreshold; 7zr b | grep "^Avr:"; sleep 30 ; done
20: Avr:             556    885   4916  |              526   1501   7887
25: Avr:             554    889   4920  |              525   1501   7886
30: Avr:             568    882   5006  |              526   1502   7895
35: Avr:             561    884   4959  |              526   1502   7898
40: Avr:             555    886   4916  |              525   1502   7891
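
How small the differences are can be quantified by feeding these lines into a short awk filter. This is just a sketch: rating_spread is a hypothetical helper, and it assumes the third number on each side of the | is the overall rating (compression left, decompression right), as in 7-Zip’s benchmark layout.

```sh
# Hypothetical helper: reads "Avr:" lines on stdin and reports how far the
# best and worst ratings are apart, relative to the worst one.
rating_spread() {
    awk '
        /Avr:/ {
            # locate "Avr:" so an optional "20: " style prefix does not matter
            for (i = 1; i <= NF; i++) if ($i == "Avr:") break
            c = $(i + 3); d = $(i + 7)   # compression / decompression rating
            if (n == 0 || c < cmin) cmin = c
            if (n == 0 || c > cmax) cmax = c
            if (n == 0 || d < dmin) dmin = d
            if (n == 0 || d > dmax) dmax = d
            n++
        }
        END {
            if (n == 0) exit 1
            printf "compression spread: %.1f%%\n", (cmax - cmin) / cmin * 100
            printf "decompression spread: %.1f%%\n", (dmax - dmin) / dmin * 100
        }'
}
```

With the five lines above as input this reports a compression spread of 1.8% and a decompression spread of 0.2%, which is small enough to be run-to-run noise rather than an effect of upthreshold.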

And this is for a simple reason: on RK3399, DRAM remains at the highest clockspeed (almost) all the time with this kernel anyway. See the trans_stat node (it hasn’t changed when switching to simple_ondemand):

root@rock-pi-4a:/home/radxa# grep . /sys/devices/platform/dmc/devfreq/dmc/* 2>/dev/null
/sys/devices/platform/dmc/devfreq/dmc/available_frequencies:328000000 416000000 666000000 856000000
/sys/devices/platform/dmc/devfreq/dmc/available_governors:dmc_ondemand simple_ondemand
/sys/devices/platform/dmc/devfreq/dmc/cur_freq:856000000
/sys/devices/platform/dmc/devfreq/dmc/downdifferential:20
/sys/devices/platform/dmc/devfreq/dmc/governor:dmc_ondemand
/sys/devices/platform/dmc/devfreq/dmc/max_freq:856000000
/sys/devices/platform/dmc/devfreq/dmc/min_freq:328000000
/sys/devices/platform/dmc/devfreq/dmc/name:dmc
/sys/devices/platform/dmc/devfreq/dmc/polling_interval:50
/sys/devices/platform/dmc/devfreq/dmc/system_status:0x1
/sys/devices/platform/dmc/devfreq/dmc/target_freq:856000000
/sys/devices/platform/dmc/devfreq/dmc/timer:deferrable
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:     From  :   To
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:           : 328000000 416000000 666000000 856000000   time(ms)
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:  328000000:         0         0         0         0         0
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:  416000000:         0         0         0         0         0
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:  666000000:         0         0         0         0         0
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:* 856000000:         0         0         0         0    443083
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:Total transition : 0
/sys/devices/platform/dmc/devfreq/dmc/upthreshold:40

In contrast, Rock 5B remains on the lowest DRAM clock almost all the time (my testing image is still running an older kernel based on the original 5.10.66 branch and not your new default branch that is on 5.10.110):

root@rock-5b:~# grep . /sys/devices/platform/dmc/devfreq/dmc/* 2>/dev/null
/sys/devices/platform/dmc/devfreq/dmc/available_frequencies:528000000 1068000000 1560000000 2112000000
/sys/devices/platform/dmc/devfreq/dmc/available_governors:dmc_ondemand userspace powersave performance simple_ondemand
/sys/devices/platform/dmc/devfreq/dmc/cur_freq:528000000
/sys/devices/platform/dmc/devfreq/dmc/downdifferential:20
/sys/devices/platform/dmc/devfreq/dmc/governor:dmc_ondemand
/sys/devices/platform/dmc/devfreq/dmc/load:0@528000000Hz
/sys/devices/platform/dmc/devfreq/dmc/max_freq:2112000000
/sys/devices/platform/dmc/devfreq/dmc/min_freq:528000000
/sys/devices/platform/dmc/devfreq/dmc/name:dmc
/sys/devices/platform/dmc/devfreq/dmc/polling_interval:50
/sys/devices/platform/dmc/devfreq/dmc/system_status:0x1
/sys/devices/platform/dmc/devfreq/dmc/target_freq:528000000
/sys/devices/platform/dmc/devfreq/dmc/timer:deferrable
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:     From  :   To
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:           : 528000000106800000015600000002112000000   time(ms)
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:* 528000000:         0         0         0         1    759930
/sys/devices/platform/dmc/devfreq/dmc/trans_stat: 1068000000:         1         0         0         0        60
/sys/devices/platform/dmc/devfreq/dmc/trans_stat: 1560000000:         0         0         0         0         0
/sys/devices/platform/dmc/devfreq/dmc/trans_stat: 2112000000:         1         1         0         0        53
/sys/devices/platform/dmc/devfreq/dmc/trans_stat:Total transition : 4
/sys/devices/platform/dmc/devfreq/dmc/upthreshold:25

Some hours ago sbc-bench recorded on Rock 4A within 26 minutes the DRAM being at 328 MHz for a full ~2.3 seconds:

     From  :   To
           : 328000000 416000000 666000000 856000000   time(ms)
  328000000:         0         0         0         1      2283
  416000000:         0         0         0         0         0
  666000000:         0         0         0         0         0
* 856000000:         1         0         0         0   2197350
Total transition : 2
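
The residency per DRAM clock can be turned into percentages with a few lines of awk. A minimal sketch: dram_time_share is a hypothetical helper that expects the raw trans_stat content on stdin (without the path prefixes that grep adds in the listings above) and relies only on the first and last column of each row.

```sh
# Hypothetical helper: prints the share of time spent at each DRAM clock.
dram_time_share() {
    awk '
        { sub(/^[ \t]*\*?[ \t]*/, "") }   # drop indentation and the "*" marker
        /^[0-9]+:/ {
            freq = $1; sub(/:$/, "", freq)
            ms[freq] = $NF; total += $NF; order[++n] = freq
        }
        END {
            for (i = 1; i <= n; i++) {
                f = order[i]
                share = (total > 0) ? 100 * ms[f] / total : 0
                printf "%d MHz: %.2f%%\n", f / 1000000, share
            }
        }'
}

# On the board itself:
# dram_time_share < /sys/devices/platform/dmc/devfreq/dmc/trans_stat
```

For the table above this yields 0.10% at 328 MHz and 99.90% at 856 MHz.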

I fiddled around a bit with other tunables but to no avail. So if you want your RK3399 users to benefit from the 0.5W lower idle consumption you might ask your RK contact (Kever?) what’s going on with their 5.10.110 and RK3399…

Until this is resolved your 5.10 OS images should show both higher benchmark scores and (minimally) higher real-world performance compared to the old 4.4 offerings. Or did you also provide RK3399 images with 4.19 in between?

Thanks for your investigation. I have updated rbuild to include binutils and will update the download page. The image you are using is not officially released yet. We will prepare some documentation to go along with the official release, since this is built using a new image builder.

If you go to the GitHub release page instead of our Wiki download page, you will also find a Debian CLI image. We only advertise the GUI image to reduce the support burden.

Will look into this.

Thanks for looking into this. I’ll check the code and ask Rockchip about it.

We jump straight from 4.4 to 5.10 for Linux. Android is less customized so we are using 4.19 per Rockchip SDK settings.

BTW: it would be great if the rbuild generated images contained some info about rbuild itself in e.g. /etc/radxa-rbuild-release, something like version number, git source and build variables. Something similar to /etc/armbian-release. This helps identify build frameworks, see for example the ‘Build scripts’ line here: https://github.com/ThomasKaiser/sbc-bench/blob/master/results/reviews/Rock-5b.md

I actually have a /etc/build_info generated during the build, but it was not working right :confused:

Edit: fixed

Great! Can you please post the contents of an example /etc/radxa_image_info file here? That saves me some time since I won’t need to try out a new image :slight_smile:

I updated the template, and it should look like this now:

radxa@rock-3c:~$ cat /etc/radxa_image_fingerprint 
RBUILD_BUILD_DATE='Fri, 17 Feb 2023 10:20:59 +0800'
RBUILD_REVISION='5078d6c19a1bf319aa0a866b79471e5c71b8a1ed-dirty'
RBUILD_COMMAND='./rbuild rock-3c .xfce'
RBUILD_KERNEL='linux-image-4.19.193-1-rk356x'
RBUILD_KERNEL_VERSION='4.19.193-1-6003ebd76b8f'
RBUILD_UBOOT='u-boot-rk356x'
RBUILD_UBOOT_VERSION='2017.09-1-26d3b69'

Thanks! Based on the example my code will now produce something like this:

Build system:   Radxa rbuild 5078d6c19a1bf319aa0a866b79471e5c71b8a1ed-dirty, rock-3c .xfce, u-boot-rk356x 2017.09-1-26d3b69, 17 Feb 2023